romforth/romforth
Ultra Portable, Small, Baremetal Forth for various processors
romforth is a small, portable, baremetal version of Forth which has been ported to various 8/16/32/64-bit microcontrollers with support for both big and little endian architectures. So far, it has been ported to the following instruction sets: x86 (16-bit, 32-bit and 64-bit), PDP11, 68000, SPARC, Z80, MSP430, ARM64, RISC-V(rv32, rv64), WASM, Padauk, 6502.
So, at a superficial level, romforth is just yet another Forth which has been ported to a wide variety of CPUs and is meant to be runnable directly from the Flash/ROM of the microcontroller.
Dig deeper though, and you might see that romforth can also be considered a "porting toolkit" which enables a "stepwise porting framework" that allows you to port Forth "relatively easily" to any of your favorite CPU architectures.
The various ports that have already been done can then be considered a proof of concept of this approach to porting Forth.
At this point you might reasonably ask: Why use Forth for microcontrollers? I'll go ahead and claim that Forth is a perfect fit for small, resource constrained systems. Perhaps elsewhere too, but opinions widely differ on that. Here's some sample code that might be used to blink an LED every second:
loop{
green led_on
500 millisec sleep
green led_off
500 millisec sleep
}loop
romforth is yet another Forth-like implementation meant to (eventually) run on most "low end" (i.e. ROM and RAM constrained) microcontrollers.
See PORTING for a list of architectures to which romforth has been ported as well as all the steps that are required for porting to a new architecture.
Unlike many Forth implementations which require a large number of "words" to be implemented before you can do anything with them, romforth is meant to be a "shrink to fit" implementation where the user decides how much of the Forth functionality they really need and implements only what they really require.
There are four "levels" of Forth that could be implemented in this way. For lack of better names, I'll refer to these "levels" as:
- 1/4 : "oneforth" (pun intended), which has only primitives and a data stack. ROM : ~256 bytes, RAM : in the single digit byte range?
- 2/4 : "twoforth" which has definitions and a return stack (in addition to the primitives and data stack that "oneforth" has). ROM : ~256 bytes, RAM : 16 bytes (or thereabouts in two digit byte range)
- 3/4 : "threeforth" which has a static dictionary (in addition to the primitives and definitions and the two stacks that "twoforth" has). ROM : ~512 bytes, RAM : 16 bytes (or thereabouts in two digit byte range)
- 4/4 : a full fledged "fourforth"/full/regular Forth with a dynamic dictionary in addition to the rest of the bells and whistles from "threeforth". ROM : ~1024 bytes, RAM : 512 bytes or more - to hold the dictionary
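As a rough illustration of the "shrink to fit" idea, here is a minimal Python sketch that picks the richest level fitting a given ROM/RAM budget. The figures are the approximate ones from the level descriptions; the RAM value of 9 bytes for "oneforth" is my assumed upper bound for "single digit", and `richest_level` is a hypothetical helper, not part of romforth. Real ports can need more ROM than these design targets.

```python
# Approximate (ROM, RAM) needs in bytes, taken from the level descriptions.
# The oneforth RAM figure (9) is an ASSUMED upper bound for "single digit".
LEVELS = [
    ("oneforth",   256,   9),
    ("twoforth",   256,  16),
    ("threeforth", 512,  16),
    ("fourforth", 1024, 512),
]

def richest_level(rom_bytes, ram_bytes):
    """Return the highest level whose approximate needs fit, or None."""
    fitting = [name for name, rom, ram in LEVELS
               if rom <= rom_bytes and ram <= ram_bytes]
    return fitting[-1] if fitting else None

print(richest_level(512, 64))    # threeforth
print(richest_level(2048, 128))  # threeforth (not enough RAM for fourforth)
print(richest_level(128, 4))     # None
```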
These levels are further broken down into a total of ~73 individual steps which implement incremental pieces of Forth functionality. Each of these steps comes with tests that can be run to verify each additional piece of functionality as it is added, one at a time, which helps in having a working implementation at each and every step along the way.
By design, the most compact implementation is limited to a little over 256 bytes of native code, with the rest of the functionality implemented in machine independent code. Currently, ~100 bytes of machine independent Forth related functionality is used in the x86 implementation, so the overall Flash/ROM requirement on x86 is currently about 350 bytes.
As mentioned earlier, there is also a regression test suite which is called at boot as part of init (which can be optionally #ifdef'ed out) and that currently takes an additional ~500 bytes of machine independent code.
romforth is meant to be extended using new "purpose built" Forth words which can be defined in terms of existing Forth words. The new definitions can be added in the machine independent "defs.4th".
The new words that have been defined can then be called directly as part of the CPU boot code/initialization by adding them to "rom.4th".
Running "make" will generate a "forth" binary which can be flashed into ROM (although in all of the implementations that have been completed so far, the testing is done using emulation, not actual hardware). I hope to run this on actual hardware real soon now.
The following (mostly) traditional Forth runtime words are available:
- Data Stack operators : dup drop nip dip swap over rot 2drop third fourth
- Data Stack get/set ops : pick stick
- Arithmetic : + - inc(1+) dec(1-) neg
- Memory access : ! @ c! c@
- Comparison operators : > < >= <= 0= ~
- I/O : key emit p@ p!
- Memory allocation : here alloc
- Return stack operators : >r r>
- Call/return : enter exit call exec
- Loop index : i
- Literals : lit
- Bit operators : & | ^ << >> inv
- Control flow : j(branch) jz(branch0) jnz(branchnz)
- Stack switch operators : sp@! rp@!
- Halt : bye
- Conditionals : if{ ... }else{ ... }if
- unconditional loop : loop{ ... }loop
- while loop : loop{ ... }while{ ... }loop
- until loop : loop{ ... }until{ ... }loop
- for loop : for{ ... }for
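If you are new to Forth, the stack effects of a few of the data stack operators above can be modelled with a Python list. This is only a sketch assuming the conventional Forth meanings of dup, drop, swap, over, nip and rot; it is not how romforth implements them, and the romforth-specific words (dip, third, fourth, stick) are left out because I am not modelling their exact behaviour here.

```python
# Illustrative model only: the top of the stack is the END of the list.
# Stack effects follow conventional Forth usage (an assumption here).

def dup(s):   # ( a -- a a )
    s.append(s[-1])

def drop(s):  # ( a -- )
    s.pop()

def swap(s):  # ( a b -- b a )
    s[-1], s[-2] = s[-2], s[-1]

def over(s):  # ( a b -- a b a )
    s.append(s[-2])

def nip(s):   # ( a b -- b )
    del s[-2]

def rot(s):   # ( a b c -- b c a )
    s.append(s.pop(-3))

stack = [1, 2, 3]
rot(stack)    # [2, 3, 1]
swap(stack)   # [2, 1, 3]
over(stack)   # [2, 1, 3, 1]
nip(stack)    # [2, 1, 1]
dup(stack)    # [2, 1, 1, 1]
drop(stack)   # [2, 1, 1]
print(stack)
```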
At the "oneforth" and "twoforth" levels of the implementation, what is not available, compared to regular Forth, is the dictionary. We make up for that by providing an "umbilical hosted" solution: a set of scripts which run on the host and turn the Forth code into byte code which can be run on top of the ~350 byte "runtime". This "umbilical host" provides control flow structures such as conditionals and loops which are identical to the ones available at runtime in the dictionary at the "threeforth"/"fourforth" levels of the implementation.
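To give a feel for what host-side compilation means, here is a minimal Python sketch of the general technique: the host turns tokens into bytes and resolves the `if{ ... }if` jump itself, so the target needs no dictionary. Everything specific in it is an assumption made for illustration: the opcode numbers, the one-byte literal encoding, the relative jump offset and the `compile_tokens` helper are hypothetical and are not romforth's actual byte code format or scripts.

```python
# HYPOTHETICAL opcode numbers, chosen only for this illustration.
OPCODES = {"lit": 0, "dup": 1, "+": 2, "emit": 3, "jz": 4, "bye": 5}

def compile_tokens(source):
    """Translate whitespace-separated Forth-like tokens into bytes.

    Integers become `lit <value>` (a single byte here, for simplicity).
    `if{` emits `jz <offset>` with a placeholder which `}if` patches.
    """
    code = []
    fixups = []  # indexes of jz offset bytes still waiting to be patched
    for tok in source.split():
        if tok == "if{":
            code += [OPCODES["jz"], 0]       # 0 is a placeholder offset
            fixups.append(len(code) - 1)
        elif tok == "}if":
            if not fixups:
                raise ValueError("}if without if{")
            at = fixups.pop()
            # Forward distance measured from the byte after the offset.
            code[at] = len(code) - (at + 1)
        elif tok in OPCODES:
            code.append(OPCODES[tok])
        else:
            value = int(tok)                 # raises ValueError on unknown words
            if not 0 <= value <= 255:
                raise ValueError("literal does not fit in one byte: " + tok)
            code += [OPCODES["lit"], value]
    if fixups:
        raise ValueError("unterminated if{")
    return bytes(code)

image = compile_tokens("1 if{ 65 emit }if bye")
print(list(image))  # [0, 1, 4, 3, 0, 65, 3, 5]
```

In this toy encoding the patched offset (3) skips the three bytes of `65 emit`, so a zero on the stack lands directly on `bye`.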
So, using about 350 bytes of "runtime", we can write architecture independent loops (while, until, for), conditionals (if, else), allocate memory, and even run threads or invoke semi/full-coroutines (if your proclivities tend that way).
It is reasonable to ask what distinguishes this Forth implementation from the zillion other implementations that already exist. I think the main distinguishing features are that it is portable, small and runs baremetal.
All of the code here is licensed under the Affero GPL 3.0 open source license. For exact details about the license please see the LICENSE file. For the thinking/intent behind the choice of AGPL, see LICENSE.README
For a long winded explanation of why this project exists, see RATIONALE
To make a local copy: git clone https://github.com/romforth/romforth
To build/test, the following dependencies need to be installed:
(Note 1: the version number in parentheses is the "known to work" version used
in testing; older/newer versions may or may not work)
(Note 2: If you are only doing a partial build for a specific processor by
running "make" within a subdirectory, a subset of these dependencies is
sufficient. For example, for the x86 port, a C compiler and the nasm
x86 assembler might be more than sufficient.)
- perl (v5.22.1)
- nasm (version 2.15.05 compiled on Aug 28 2020)
- gcc (version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2))
- binutils ((GNU Binutils) 2.38)
  - pdp11
  - m68k
  - sparc
  - msp430
  - aarch64
  - riscv64
  - riscv32
- simh (open-simh V4.0-0 Current git commit id: 33aad325)
- zig (0.11.0)
- qemu (7.1.0, via nix)
- sdcc(ucsim) (4.2.0 #13081 (Linux), ucsim: 0.6.4)
- m4 ((GNU M4) 1.4.17)
- msp430-gcc ((GCC) 4.6.3 20120301 (mspgcc LTS 20120406 unpatched))
- mspdebug (from github.com/romforth/mspdebug, until my changes are upstreamed)
- wasmtime (wasmtime-cli 9.0.4)
- perl6 (version 2015.11 built on MoarVM version 2015.11)
After installing all the above dependencies, running "make" will build each of
the ports and run each of them in an emulator and should complete successfully.
To do a partial build, you can go into any individual processor specific
directory and run "make" within that subdirectory.
The table below lists the current ROM usage (in bytes) of the various ports:
.--------------.------------.-----------------.-----------------.
| CPU          | Primitives | Forth bytecodes | Forth bytecodes |
|              |            | (total)         | TESTROM (total) |
+--------------+------------+-----------------+-----------------+
| x86          | 264        | 1619 (1883)     | 3047 (3311)     |
| x86-as       | 264        | 1619 (1883)     | 3047 (3311)     |
| PDP11        | 276        | 2140 (2416)     | 4272 (4548)     |
| x86-user     | 445        | 3245 (3690)     | 6548 (6993)     |
| C/x86        | 445        | 3928 (5848)     | 6985 (7430)     |
| x86-32       | 385        | 2229 (2614)     | 4288 (4673)     |
| x86-sys      | 518        | 3245 (3763)     | 6548 (7066)     |
| m68k         | 378        | 2280 (2658)     | 4348 (4726)     |
| sparc        | 956        | 2208 (3164)     | 4315 (5271)     |
| z80          | 491        | 1735 (2231)     | 3178 (3674)     |
| z80-c        | 1588       | 1563 (3151)     | 2927 (4515)     |
| z180-sdcc    | 1590       | 1563 (3153)     | 2927 (4517)     |
| z80n-sdcc    | 1584       | 1563 (3147)     | 2927 (4511)     |
| msp430-as    | 270        | 2540 (2810)     | 4672 (4942)     |
| arm-zigcc    | 1272+..    | 3036 (4308)     | 5762 (7034)     |
| arm64-sys    | 1060       | 3336 (4396)     | 6664 (7724)     |
| riscv-zigcc  | 1124+..    | 3036 (4160)     | 5762 (6886)     |
| rv64-sys     | 692        | 3336 (4028)     | 6664 (7356)     |
| rv32-sys     | 692        | 2302 (2994)     | 4374 (5066)     |
| msp430-stc   | 0          | 4328 (4328)     | 10060 (10060)   |
| z80-stc      | 0          | 3456 (3456)     | 7090 (7090)     |
| msp430-fcode | 344        | 2064 (2408)     | 3308 (3652)     |
| msp430-f     | 344        | 602 (946)       | 1408 (1752)     |
| nqh-zig      | 9387       | 341 (9728)      | 1009 (10396)    |
| wasm32-c     | 1735       | 2840 (4575)     | 5552 (7287)     |
| pdk14-stc*   | 0          | 15 (125)        | 503 (613)       |
| pdk14-thr1*  | 264        | 40 (304)        | 609 (873)       |
| 6502-sdcc    | 2624       | 2060 (4684)     | 3495 (6119)     |
| 6502-stc     | 0          | 859 (859)       | 3386 (3386)     |
'--------------'------------'-----------------'-----------------'
*Note: The units for pdk14-stc and pdk14-thr1 are in "words" which can be 13/14/15/... bits long depending on the processor family.
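For reading the table: in the rows sampled below, the number in parentheses is simply the primitives column plus the bytecode column. That is my reading of the figures rather than something stated elsewhere, and only these four rows are checked here; other rows may include extra overhead. A small Python sketch of the arithmetic, with the values copied from the table and `total_rom` as a hypothetical helper name:

```python
# (primitives, bytecodes, listed total) copied from four rows of the table.
rows = {
    "x86":       (264, 1619, 1883),
    "PDP11":     (276, 2140, 2416),
    "m68k":      (378, 2280, 2658),
    "msp430-as": (270, 2540, 2810),
}

def total_rom(primitives, bytecodes):
    """ROM total as read from the table: native primitives plus bytecodes."""
    return primitives + bytecodes

# True where the listed total equals primitives + bytecodes.
matches = {cpu: total_rom(p, b) == t for cpu, (p, b, t) in rows.items()}
print(matches)
```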