For some time I have been working on a tiny, Forth-like language called SIMPL (Serially Interpreted Minimal Programming Language). Whilst its origins lie in Ward Cunningham’s “Txtzyme”, originally written in C for the Arduino and Teensy, I have decided to pare it down even more and write it in assembly language.
One of my challenges is to create an extensible language that can be ported to virtually any microcontroller and that provides the basis of a complete, interactive programming environment, requiring only a terminal program for communications, consisting of text input and output.
A Primitive Command Set
The language will allow interactive programming of the microcontroller by way of a virtual machine. ASCII characters will be interpreted as commands, which operate on data held either on the stack or in system memory.
Some of these commands are treated as primitives – and executed directly within a few machine cycles. Others are longer program operations which are synthesised from the primitives and called as subroutines. It is the ability to build up complex sequences of primitives into loops and other control structures that gives this language its extensibility.
Primitive commands include arithmetic and logical operations such as ADD, SUB, MUL, DIV, AND, OR, XOR and NOT.
These are often not much more than a couple of register operations, written in the native assembly language of the target processor.
Other primitives allow data to be fetched from, or stored to, memory and the stack.
Other commands are used for stack manipulation such as PUSH, POP, DUP, DROP, SWAP and OVER.
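To make the stack primitives concrete, here is a minimal sketch in C rather than MSP430 assembly. The names (`vm_push`, `vm_pop` and so on), the 16-cell stack depth and the use of `assert` for bounds checking are my own assumptions for illustration and are not taken from the SIMPL source.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_DEPTH 16              /* assumed depth, chosen for illustration */

int16_t vm_stack[STACK_DEPTH];      /* 16-bit cells, matching a 16-bit MCU */
int vm_sp = 0;                      /* index of the next free cell */

/* PUSH and POP: everything else is built on these two. */
void vm_push(int16_t v) { assert(vm_sp < STACK_DEPTH); vm_stack[vm_sp++] = v; }
int16_t vm_pop(void)    { assert(vm_sp > 0); return vm_stack[--vm_sp]; }

/* DUP: ( a -- a a ) */
void vm_dup(void)  { int16_t a = vm_pop(); vm_push(a); vm_push(a); }

/* DROP: ( a -- ) */
void vm_drop(void) { (void)vm_pop(); }

/* SWAP: ( a b -- b a ) */
void vm_swap(void) { int16_t b = vm_pop(); int16_t a = vm_pop(); vm_push(b); vm_push(a); }

/* OVER: ( a b -- a b a ) */
void vm_over(void)
{
    int16_t b = vm_pop();
    int16_t a = vm_pop();
    vm_push(a); vm_push(b); vm_push(a);
}

/* ADD: ( a b -- a+b ), shown as one example of an arithmetic primitive */
void vm_add(void)  { int16_t b = vm_pop(); int16_t a = vm_pop(); vm_push((int16_t)(a + b)); }
```

In assembly each of these would likely collapse to a handful of register and memory moves; the C version is only meant to show the stack effects.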
For convenience, the instructions to send a character to the terminal or receive a character from the input buffer are included as primitives. Routines to print a decimal number to the terminal or produce a hexadecimal dump of memory, on the other hand, are subroutines composed of several primitive instructions built into program control structures such as loops.
With the right mix of primitive instructions, and the ability to create subroutines using these, virtually any programming task can be achieved, and the user is effectively coding in the native machine language of the virtual machine. Because the virtual machine can be hosted on almost any microcontroller, applications can then be ported from platform to platform.
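As a rough illustration of how a number-printing routine can be built from a single "send a character" primitive plus a loop, here is a sketch in C. The names `emit`, `tx_buf`, `tx_reset`, `print_decimal` and `print_hex_byte` are hypothetical, and `emit` writes into a small buffer that merely stands in for the UART; on real hardware it would write to the transmit register instead.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the UART: characters "sent" are collected here (assumption). */
char tx_buf[32];
unsigned tx_len = 0;

void tx_reset(void) { memset(tx_buf, 0, sizeof tx_buf); tx_len = 0; }

/* The only output primitive: send one character. */
void emit(char c)
{
    if (tx_len < sizeof tx_buf - 1) {
        tx_buf[tx_len++] = c;
        tx_buf[tx_len] = '\0';      /* keep the buffer a valid C string */
    }
}

/* Print an unsigned 16-bit value in decimal using divide, remainder and emit. */
void print_decimal(uint16_t n)
{
    char digits[5];                 /* 65535 has at most five digits */
    int count = 0;
    do {                            /* peel off digits, least significant first */
        digits[count++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    while (count > 0)               /* send them most significant first */
        emit(digits[--count]);
}

/* Print one byte as two hexadecimal digits, the building block of a hex dump. */
void print_hex_byte(uint8_t b)
{
    static const char hex[] = "0123456789ABCDEF";
    emit(hex[b >> 4]);
    emit(hex[b & 0x0F]);
}
```

A memory dump is then just a loop that calls `print_hex_byte` on successive addresses, which is the sense in which these routines are synthesised from primitives.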
Tiny languages have existed for decades – literally since the first days of stored-program computers, and later in the early days of resource-limited microprocessors – and there is a certain amount of discussion on just how small they can be and still remain useful.
My challenge is to get the complete language kernel into just 1024 bytes of program memory, and this will contain the serial communications routines, a text editor, a compiler and an interactive interpreter, plus whatever else will fit. The language will be able to accept a program in the form of an ASCII text file, and also to dump areas of memory, in either text or hexadecimal format, to the terminal screen or a terminal capture.
I have chosen the MSP430 series of microcontrollers as the target. The reason is that the MSP430 is a good 16-bit MCU, low power, and now available in variants that offer lots of non-volatile FRAM and peripherals such as 24-bit ADCs.
Plus its assembly language is not so difficult to get to grips with (unlike the more complex ARMs), and as such it makes a good first choice of affordable 16-bit processor for learning about the mechanics of Forth.
I have bitten the bullet and started to port it into MSP430 assembly language for speed and compactness.
My aspirations lie in creating a minimal, tokenised, interpreted language where the words or tokens are single ASCII characters. This removes a lot of the more complex aspects of Forth, such as dictionary searches, and it also makes for a very mnemonic-rich language, in that I can choose which symbols I use for the various stack operations. So I use " for DUP, $ for SWAP, % for OVER and ' for DROP.
Whilst to some in the Forth community this may appear heretical, it makes sense to the way my brain is wired. More conventionally, I have &, |, ^ and ~ for the logic operations AND, OR, XOR and INVERT, and the usual +, -, * and / for the maths operators.
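The single-character token idea can be sketched in a few lines of C: each character indexes straight into a `switch`, so no dictionary search is needed. This is only an illustration and not the actual SIMPL kernel. It assumes the DUP and DROP symbols are the plain ASCII double and single quote, that runs of digits are accumulated into a number and pushed, that spaces and unknown characters are ignored, and that division by zero yields 0; the names `interpret`, `push` and `pop` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

#define DEPTH 16                    /* assumed stack depth */

int16_t ds[DEPTH];                  /* data stack of 16-bit cells */
int dsp = 0;                        /* index of the next free cell */

void push(int16_t v) { assert(dsp < DEPTH); ds[dsp++] = v; }
int16_t pop(void)    { assert(dsp > 0); return ds[--dsp]; }

/* Execute a string of single-character tokens, left to right. */
void interpret(const char *src)
{
    while (*src) {
        char c = *src++;
        int16_t a, b;

        if (isdigit((unsigned char)c)) {        /* number literal: accumulate digits */
            int16_t n = (int16_t)(c - '0');
            while (isdigit((unsigned char)*src))
                n = (int16_t)(n * 10 + (*src++ - '0'));
            push(n);
            continue;
        }
        switch (c) {
        case '"':  a = pop(); push(a); push(a); break;                       /* DUP  */
        case '\'': (void)pop(); break;                                       /* DROP */
        case '$':  b = pop(); a = pop(); push(b); push(a); break;            /* SWAP */
        case '%':  b = pop(); a = pop(); push(a); push(b); push(a); break;   /* OVER */
        case '+':  b = pop(); a = pop(); push((int16_t)(a + b)); break;
        case '-':  b = pop(); a = pop(); push((int16_t)(a - b)); break;
        case '*':  b = pop(); a = pop(); push((int16_t)(a * b)); break;
        case '/':  b = pop(); a = pop(); push(b ? (int16_t)(a / b) : 0); break;
        case '&':  b = pop(); a = pop(); push((int16_t)(a & b)); break;      /* AND */
        case '|':  b = pop(); a = pop(); push((int16_t)(a | b)); break;      /* OR  */
        case '^':  b = pop(); a = pop(); push((int16_t)(a ^ b)); break;      /* XOR */
        case '~':  a = pop(); push((int16_t)~a); break;                      /* INVERT */
        default:   break;                       /* spaces and unknown tokens are skipped */
        }
    }
}
```

For example, `interpret("2 3+")` leaves 5 on the stack, and `interpret("7\"*")` duplicates 7 and multiplies, leaving 49. In the assembly version the `switch` would more likely be a jump table indexed by the character code.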
Inspired by Ting’s eForth and its minimal word set, I can get most of the primitives into the 32 punctuation symbols, leaving the capital letters for user words, and the lower case for other system words – that are constructed from primitives – for example h to set a port high and l to set it low, m for millisecond delay, u for microsecond delay.
The language is evolving on a daily basis and I can now do basic maths and logic operations and some of the stack manipulations, yet the whole kernel (so far) with UART support is just 582 bytes long, which translates into about 500 lines of code. I have posted the code so far up on GitHub Gist here
This is very much a work in progress, and I have not yet completed the primitive definitions, but it’s my way of learning how these obscure languages work, and good fun to tinker with. You just need a Launchpad with an MSP430G2553, but with a little fiddling with the UART routines it will run on virtually any MSP430.
What’s it good for, you may ask? Well, I see it as a lingua franca to allow widely different machines to communicate with each other efficiently at a low level, while still being human readable. I also see it being ported easily to a specialist Forth processor and being used for controlling CNC machines and 3D printers, and also for rendering any common file format, such as Gerber or G-code, that uses a mix of single-letter characters and numbers to represent machine control instructions.
The MSP430 is only the start of the project, hopefully leading on to custom stack processors that execute Forth primitives directly.