Pochi is programmed in Pochi assembly, the language that the assembler turns into Pochi roms. The assembly is a kind of ColorForth that is compiled directly to bytecode. There is no REPL yet, so the only way to run a Pochi program is to feed the assembly to the assembler, which then outputs a rom that the vm can run.
Currently the only way to run Pochi programs is to write the code in the custom editor, which can assemble a rom on save, and then run this rom with the Pochi cli vm.
In the future we can imagine a lot of different ways to run Pochi programs, such as a REPL, streaming bytecode directly from the assembler to the vm, or even a vm instance streaming instructions to another vm instance. This is possible thanks to the fact that there is almost no distinction between fetching and executing from RAM or from I/O ports like stdin or files.
First an overview of the different colors and their role.
| color | role |
|---|---|
| red | Define a word |
| green | Compile call to a word |
| yellow | Macro word (like an immediate word) |
| gray | Assembly word |
| magenta | Compile address of red word |
| “magenta string” | Compile a counted string |
| white | Comments |
| blue | Edit-time formatting word |
A color can have a somewhat different role when the word is a literal number; where this is the case, it is detailed in the relevant color section. For instance, a green word compiles a call, but a green number compiles a literal.
Numbers are like C numbers: they are in decimal base by default, in octal base if prefixed with 0, and in hexadecimal base if prefixed with 0x.
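As an illustration of the number rules above, here is a minimal Python sketch of how such a literal could be parsed. The function name `parse_number` and the handling of a leading minus sign are assumptions made for this example, not a description of the actual assembler.

```python
def parse_number(token: str) -> int:
    """Parse a C-style integer literal (hypothetical helper, not the real assembler)."""
    sign = 1
    body = token
    # Assumption: negative literals are written with a leading '-'.
    if body.startswith("-"):
        sign, body = -1, body[1:]
    if body.lower().startswith("0x"):
        value = int(body[2:], 16)      # 0x prefix: hexadecimal
    elif len(body) > 1 and body.startswith("0"):
        value = int(body[1:], 8)       # leading 0: octal
    else:
        value = int(body, 10)          # otherwise: decimal
    return sign * value


print(parse_number("42"), parse_number("017"), parse_number("0x1F"))  # 42 15 31
```

Note that, as in C, a token such as `08` is rejected because 8 is not an octal digit.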
Note: The white and gray colors are currently visually inverted in the editor, so comments look gray and assembly words look white. Keep in mind that whenever we refer to a color, we are referring to the role of the word, not the perceived color.
Red words are similar in function to a : definition in Forth. In Pochi, a red word simply defines
a label that can be referenced by green or magenta words.
Red numbers represent an absolute offset (in cells) from the start of the rom. Anything defined
after a red number will have an address starting from that offset. Red numbers are powerful: you
can easily override previous definitions. As a silly example, putting a red 0 at the end
of your source would result in a completely empty rom.
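The effect of red words and red numbers can be pictured with a toy model. The sketch below is only an illustration in Python: the class `ToyAssembler`, its methods, and the rule that the rom ends at the current offset are assumptions chosen to match the behavior described above, not the real implementation.

```python
class ToyAssembler:
    """Hypothetical model of red words (labels) and red numbers (offsets)."""

    def __init__(self):
        self.cells = {}    # address -> value
        self.labels = {}   # red word name -> address
        self.here = 0      # next available cell

    def red_word(self, name):
        self.labels[name] = self.here   # a red word is just a label

    def red_number(self, offset):
        self.here = offset              # absolute offset, in cells

    def emit(self, value):
        self.cells[self.here] = value
        self.here += 1

    def rom(self):
        # Assumption: the rom contains every cell below the current offset.
        return [self.cells.get(addr, 0) for addr in range(self.here)]


asm = ToyAssembler()
asm.red_word("first")
asm.emit(11)
asm.emit(22)
asm.red_number(0)       # go back to the start of the rom
asm.red_word("again")
asm.emit(99)            # overrides the cell that held 11
asm.red_number(2)       # move past the two cells again
print(asm.rom())        # [99, 22]
asm.red_number(0)       # a red 0 at the very end
print(asm.rom())        # []
```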
Green words compile a call to a red word. They act like a function call or, if
you are familiar with Forth, like normal words in a colon definition.
A green word is compiled to a jump instead when the following word is a gray ;.
Tail call optimization is that easy!
Green numbers are 32-bit signed integers. They are compiled to a simple @p (fetch), and the number
itself is compiled in the next available RAM cell.
Be aware that a green number always needs at least 2 cells: one for the @p and one for the data.
Yellow words are akin to macros, or Forth immediate words, but not exactly. Yellow words can only reference previously defined red words or one of the built-in yellow words. When a yellow word is being assembled, the assembler feeds it into a running Pochi VM (the asm VM), which then executes it.
One way to prevent cluttering your rom with words that will only be used as macros is to use red words to define them somewhere higher in memory.
A yellow number simply pushes that number on the working stack of the asm VM.
The following built-in yellow words are not sent to the asm VM, but directly handled by the assembler:
init

The init yellow word makes it convenient to load a whole rom into RAM
with Pochi instructions. It adds the following loader at the beginning of the rom:
0001 @p >r ..
0002 N
0003 @p !+ unext ..
Where N is the number of cells until just before init. This loads all the
cells of the rom into RAM, thanks to the a register being initialized to 0 and to
how the I/O port (files device) works. The final call to your “main”
word then kick-starts the execution of your program.
You may want to always put a single green word call (e.g. to a “main” word) followed by a gray ; (which turns the call into a jump) right after the init, to make sure that you are doing a slot 0, 1 or 2 jump and can therefore reach all memory. Not doing so can be fine, but you may expose yourself to weird behaviors due to the page mechanism when jumping.
..

The .. (align) yellow word simply fills the rest of the current instruction cell with . (noop).
pad

The pad yellow word needs a number n on the stack. It first fills the current cell
with . (noop) and then “pads” n cells with zeros, starting from the first available cell.
'

The ' (tick) yellow word is almost like its Forth counterpart. It looks up the next word and
puts its address on the working stack of the asm VM.
,

The , (comma) yellow word is almost like its Forth counterpart. It compiles the number from the
working stack to the next available cell in RAM.
lit

The lit yellow word compiles a @p opcode and compiles the number that is on the stack to
the next available cell in RAM. A yellow 7 followed by a yellow lit is equivalent to a green 7.
for ... next

The yellow for and next are convenience words that take care of putting the count on the
return stack for you. They also handle the jump address for next, or decide whether it can use
unext instead, which is much more efficient.
Note that the yellow next expects an address (left by the yellow for) on the
asm VM stack.
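The hand-off between for and next can be sketched as follows. This Python model is purely illustrative: the class `LoopAssembler`, the symbolic opcodes, and in particular the rule "use unext when the loop body fits in the current cell" are assumptions, since the exact condition used by the assembler is not stated here.

```python
class LoopAssembler:
    """Hypothetical model: one list of symbolic opcodes per instruction cell."""

    def __init__(self):
        self.code = [[]]   # cells, each holding symbolic opcodes
        self.stack = []    # stands in for the asm VM working stack

    def here(self):
        return len(self.code) - 1

    def op(self, name):
        self.code[-1].append(name)

    def new_cell(self):
        self.code.append([])

    def for_(self):
        self.op(">r")                    # put the count on the return stack
        self.new_cell()                  # assumption: the body starts on a fresh cell
        self.stack.append(self.here())   # leave the loop start address for next

    def next_(self):
        start = self.stack.pop()         # address left by for
        if start == self.here():
            self.op("unext")             # assumption: body fits in a single cell
        else:
            self.op(("next", start))     # otherwise jump back to the loop start


asm = LoopAssembler()
asm.for_()
asm.op("@p")
asm.op("!+")
asm.next_()
print(asm.code)   # [['>r'], ['@p', '!+', 'unext']]
```

If the body spills over into another cell before next is reached, the same model emits a regular next carrying the loop start address.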
if/-if ... else ... then

In a similar way, the yellow if, -if, else and then are convenience words that handle the
jumping constructs needed to create the desired branching.
Note that the yellow else and then expect an address (left by the yellow if/-if or else)
on the asm VM stack.
include

The yellow include word does what one might expect: it includes another Pochi source file.
A good way to reason about include is to imagine the include statement being replaced with
the content of the included file.
Note that when a file is included, the meaning of a red 0 changes: it is
no longer an absolute offset, but is relative to where the include word is called. Any red number
other than 0 works as usual.
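The rule for red numbers inside an included file can be summarized in a few lines. The helper below is a hypothetical Python sketch; the name `resolve_red_number` and the `include_base` parameter (the address at which include was called) are made up for this illustration.

```python
from typing import Optional


def resolve_red_number(n: int, include_base: Optional[int] = None) -> int:
    """Return the offset a red number refers to (hypothetical helper).

    include_base is None for the top-level file, otherwise the address
    at which the include word was called.
    """
    if include_base is not None and n == 0:
        return include_base   # red 0 in an included file is relative
    return n                  # every other red number stays absolute


print(resolve_red_number(0))                      # 0
print(resolve_red_number(0, include_base=120))    # 120
print(resolve_red_number(64, include_base=120))   # 64
```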
Using more than 1 level of include is discouraged; in other words, don’t use include
inside an included file. It is possible to do, but you need to be familiar with the details
of how include works if you don’t want to run into weird behaviors.
Gray words map 1:1 to the available opcodes, so a gray word just compiles the
corresponding opcode. In Forth this would be akin to using the CODE word.
Gray numbers are like green numbers, but instead of putting the number in the next available cell, a gray number tries to put it in the next available slot. Gray numbers are 6-bit and unsigned, which means that their value is between 0 and 63 inclusive, or between 0 and 3 if the number is in slot 6.
It does so by first compiling a @s (fetch-slot) opcode and then putting the number in the
next slot. If there is only one slot left, that slot becomes a . (noop) and the gray number is
compiled in the next available cell along with its corresponding @s.
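The value ranges quoted above follow directly from the number of bits a slot offers. As a small illustration in Python (the helper name `fits_in_slot` is an assumption for this example):

```python
def fits_in_slot(value: int, slot_bits: int = 6) -> bool:
    """Check that an unsigned gray number fits in a slot of slot_bits bits."""
    return 0 <= value < (1 << slot_bits)


print(fits_in_slot(63))               # True: largest 6-bit value
print(fits_in_slot(64))               # False: needs 7 bits
print(fits_in_slot(3, slot_bits=2))   # True: largest value for a 2-bit slot
print(fits_in_slot(4, slot_bits=2))   # False
```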
Magenta words compile a literal address: a @p opcode, followed by the address of the
red word (label) being referred to in the next available RAM cell.
A magenta number compiles the number directly in the next available RAM cell without compiling
a @p. This is useful if you want a value somewhere that you will address through the a or b
registers, as it won’t take 2 cells like a green number.
Like green numbers though, magenta numbers are signed 32-bit values.
Magenta strings compile into a counted string, the number of bytes (the length) is written in the first available cell and then the string is written in the following cells, 4 bytes per cell.
Magenta strings are special in 2 ways:
They don’t fit in slots at all: they consider each 32-bit cell to be 4 bytes (of 8 bits each),
which is why we need special byte addressing opcodes (c@, c!, c@+ and c!+).
Since these opcodes are byte addressing, don’t forget to adjust the address of the string with
something like a 4* a! or a 4/ a!.
The other peculiarity of strings is that multiple magenta string words will be normal space separated words in the pre-parsed representation, but they will be considered part of a single string by the assembler. This means that a magenta string can contain spaces.
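To make the counted string layout more concrete, here is a small Python sketch. The byte order inside a cell, the zero padding of the last cell and the UTF-8 encoding are all assumptions made for this example, and the function names are hypothetical; only the "length first, then 4 bytes per cell" layout and the factor of 4 between cell and byte addresses come from the description above.

```python
def pack_counted_string(text: str) -> list:
    """Pack text as [length, cell, cell, ...] with 4 bytes per 32-bit cell."""
    data = text.encode("utf-8")            # assumption: strings are UTF-8 bytes
    cells = [len(data)]                    # first cell: number of bytes
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4].ljust(4, b"\0")            # assumption: zero padding
        cells.append(int.from_bytes(chunk, "little"))    # assumption: little-endian
    return cells


def byte_address(cell_address: int) -> int:
    """Convert a cell address to a byte address (the 4* adjustment)."""
    return cell_address * 4


cells = pack_counted_string("hello world")   # spaces are part of the string
print(cells[0], len(cells))                  # 11 4
print(byte_address(10))                      # 40
```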
White words act as comments; they are ignored by the assembler. The editor lets you toggle their visibility on/off. This is useful if you want to focus on the code, and it means you don’t have to be afraid of commenting things right in the middle of the code.
Blue words are “edit-time” words: their only purpose is the visual formatting of the text in the editor. Like white words, they are ignored by the assembler, and the editor lets you toggle their visibility on/off.
Currently there are only 4 blue words:

| word | role |
|---|---|
| cr | Inserts a newline |
| > | Inserts a tab |
| . | Inserts a space |
| EOF | Special word which signals the end of the file |

A call or a jump opcode uses 1 slot for the opcode and the rest of the cell for
the address. This means that sometimes the address has fewer bits than needed for the whole
addressable range; this is the case for slot 3, 4 and 5 jumps. In order to avoid using . (noop)
and encoding a slot 0 jump on the next instruction, we first check whether the address we jump
to is on the same “page”.
Being on the same page means that all the high order bits of the current instruction address and the address we jump to, are the same. In the extreme case of a slot 5 jump, this means that only the 2 lower order bits can differ, in other words, a slot 5 jump can only reach the 3 other addresses that are on the same “page”.
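The page test can be expressed with a couple of shifts. The following Python sketch is only an illustration: the function names are hypothetical, and `address_bits` stands for the number of address bits left in the cell after the jump opcode (2 in the extreme case mentioned above).

```python
def same_page(current: int, target: int, address_bits: int) -> bool:
    """True if all bits above the low address_bits are equal in both addresses."""
    return (current >> address_bits) == (target >> address_bits)


def page_addresses(current: int, address_bits: int) -> list:
    """All addresses on the same page as current, including current itself."""
    base = (current >> address_bits) << address_bits
    return [base + i for i in range(1 << address_bits)]


print(same_page(0x102, 0x101, 2))   # True: only the 2 low-order bits differ
print(same_page(0x103, 0x104, 2))   # False: a high-order bit differs
print(page_addresses(0x102, 2))     # [256, 257, 258, 259]
```

With 2 address bits a page holds 4 addresses, so a jump can only reach the 3 addresses other than the current one.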
What? A VM running inside the assembler? Yes.
The main reason is that this gives us intuitive macros almost for free. Here is how it works: the VM only has one input and one output device set up, which allows it to read the macro words and write the output back to the next available cells. In effect, you can consider that the rom gets assembled directly into the RAM of the asm VM, and that the rom saved on disk is a “dump” of the asm VM’s RAM.
Another benefit of the asm VM is that we can use its working stack to pass values while assembling. This is
used when compiling an if ... then statement, where the if leaves its address on the stack so that then is
able to “patch” the address to jump to when the condition is not true. You could go crazy, if you wanted to,
and start messing with the stack by using yellow words that modify the top of the stack between if and then,
although I don’t recommend this.
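The patching described above is the classic backpatching technique. Here is a minimal Python sketch of it, assuming for simplicity that every jump occupies a whole cell (in Pochi a jump shares its cell with other opcodes); the class `BranchAssembler` and the symbolic `["if", target]` cells are made up for this illustration.

```python
class BranchAssembler:
    """Hypothetical model of if/else/then using a stack of pending addresses."""

    def __init__(self):
        self.cells = []   # assembled cells (symbolic)
        self.stack = []   # stands in for the asm VM working stack

    def emit(self, value):
        self.cells.append(value)
        return len(self.cells) - 1        # address of the emitted cell

    def if_(self):
        # Emit a conditional jump with an unknown target and remember where it is.
        self.stack.append(self.emit(["if", None]))

    def else_(self):
        if_addr = self.stack.pop()
        jump_addr = self.emit(["jump", None])     # skips the else branch
        self.cells[if_addr][1] = len(self.cells)  # false case lands after the jump
        self.stack.append(jump_addr)

    def then_(self):
        addr = self.stack.pop()
        self.cells[addr][1] = len(self.cells)     # patch the pending jump


asm = BranchAssembler()
asm.if_()
asm.emit("true-branch")
asm.else_()
asm.emit("false-branch")
asm.then_()
print(asm.cells)
# [['if', 3], 'true-branch', ['jump', 4], 'false-branch']
```

This also shows why changing the top of the stack between if and then is risky: then would patch whatever address it finds there.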