This post is concerned with minimal computing – a subject that is close to my heart. I write this on the day that my employer suffered a major ransomware attack on its servers and IT systems, bringing the company's activities to a virtual standstill. To me, this proves the fragility of the technology that we rely on in our day-to-day lives. There has to be a better, much simpler way.
In the 1960s, the world of computing was evolving quickly as many new machines came onto the market. For code developers at that time, this presented a problem: every new machine had a different assembly language, and this, along with the other quirks of the machine, had to be mastered before efficient coding could be done.
High-level languages such as Fortran, Algol and Cobol first appeared in the late 1950s. Whilst these eliminated the need for the developer to handle the machine language directly, they brought their own problems and abstracted control of the machine away from the programmer.
The young developers of that age wanted something better, and both Ken Thompson of Bell Labs, of UNIX fame, and Charles H. Moore, inventor of Forth, came up with their own solutions: each set out to create a comfortable programming environment in which he could code efficiently.
Thompson was part of the team at Bell Labs, including Brian Kernighan and Dennis Ritchie, who brought us the C programming language and the UNIX operating system – the model for Linux and the foundation of much open source software.
Moore decided that he preferred a language that was closer to the “metal” and interactive in nature, so as to avoid the edit, compile, debug cycle typical of a compiled language such as C. Moore also believed that the target processor should be able to host its own toolchain, and not rely on external resources from a more powerful machine. Key to this were the dual capabilities of Forth, summarised by Wikipedia: “Forth features both interactive execution of commands and the ability to compile sequences of commands for later execution.” It is also extensible, in that it gives the programmer the means to create new commands.
Inspired by Chuck Moore’s Forth, and the minimal instruction set computer (MISC) ICs that he subsequently developed, I decided to get to the heart of minimal interactive computing and devise a computing environment that could be applied to the smallest, resource-limited microcontrollers and offer the four features of Forth that I felt were most important:
- Interactive – commands may be typed at the keyboard for immediate execution
- Commands can be combined into programs then compiled for later execution
- Extensible – from a small kernel of commands a whole application can be built
- No external tools required other than a serial terminal
These, I believe, are the minimum requirements for computing: an interactive computing environment that can be accessed with nothing more than a serial terminal.
Now, some versions of Forth can be very comprehensive and run to about 16K bytes, though most are between 4K and 6K. My quest was for a much-reduced “Forth-like” environment that was so small that it could almost become part of the bootloader, and be always present at start-up. My initial experimentation was with Arduino, as it was so widely available, but I then progressed to ARM and MSP430 because of their larger word sizes. I settled on the MSP430 as my “model processor” because of its 16-bit word size (ideal for Forth) and the fact that it has a very clean, orthogonal instruction set with a rich set of registers. I found coding in MSP430 assembly language relatively straightforward, and thus I use it to test out new ideas.
As a bare minimum, the interactive environment needs to be able to do the following:
- Read consecutive serial characters from a UART and place them into an input buffer.
- Identify numerical strings from this buffer and convert them into integers to store in RAM.
- Use non-numerical characters as the index for a jump table, allowing the processor to branch to blocks of code on the basis of the ASCII value of the character.
- Execute code at the jump address, then return to fetch the next character from the input buffer.
These actions are best performed by an inner interpreter running within a loop, which co-ordinates the actions and ensures that the characters from the buffer are interpreted in sequence until the end of the buffer is reached at a newline character. This is all that is required to form the basis of an interactive character interpreter framework, and in MSP430 assembly language it may be achieved in about 300 bytes, including the initialisation of peripherals (UART, GPIO etc.) and the get_char and put_char UART routines.
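The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the MSP430 code: the function name `interpret`, the `handlers` dictionary standing in for the jump table, and the hypothetical `.` command used in the usage lines are all assumptions made for the example.

```python
def interpret(line, handlers):
    """Interpret one buffered line, one character at a time.

    Digits are accumulated into an integer and pushed onto a stack.
    Any other character is used as the key into `handlers`, which
    stands in for the jump table; unknown characters are skipped.
    """
    stack = []
    i = 0
    while i < len(line) and line[i] != "\n":   # newline marks the buffer end
        ch = line[i]
        if "0" <= ch <= "9":
            value = 0
            while i < len(line) and "0" <= line[i] <= "9":
                value = value * 10 + (ord(line[i]) - ord("0"))
                i += 1
            stack.append(value & 0xFFFF)       # assume 16-bit cells
            continue                           # i already points past the number
        handler = handlers.get(ch)             # "jump table" lookup
        if handler is not None:
            handler(stack)                     # run the routine, then return
        i += 1                                 # fetch the next character
    return stack


# Usage: a hypothetical "." command that pops a number and records it.
printed = []
interpret("12.345.\n", {".": lambda stack: printed.append(stack.pop())})
print(printed)
```

A dictionary is used here purely for readability; on a small microcontroller the same lookup would more likely be a table of addresses indexed by the character code.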
The other main aspect of a Forth-like language is the ability to store sequences of commands from the input buffer in memory, and “compile” them, so that they are available to be run later.
Forth gives each of these sequences a name, or “word”, and uses a process called the “colon definition” to commit the sequence to memory in an orderly arrangement, such that it may be retrieved and executed later. Forth does this with a dictionary structure, which it scans to find the word just typed. Whilst convenient, this offered more sophistication than I needed, so I adopted a much more basic approach.
In order to keep this process extremely simple, I decided that naming a sequence would just mean allocating a known character to it, such that it can be executed upon receipt of that character. The fixed address at which the sequence is stored can then also be derived from the value of that character, so that it may be located by a jump table.
So with this simple approach we can, for example, type the sequence 100L. The number decode routine will put 100 (decimal) onto the stack and then jump to the code that is addressed by decoding ASCII L. This might be, for example, a routine that sends a number to a parallel 8-bit port which has LEDs attached. The routine picks up the value of 100 from the stack and lights the LEDs to give a binary representation of 100.
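As a rough sketch of this idea, the Python below reserves one fixed-size slot per capital letter and computes the slot address directly from the character code. The slot size of 32 bytes, the zero-byte terminator and the names `define`, `lookup` and `slot_address` are assumptions chosen for illustration; the text does not specify them.

```python
SLOT_SIZE = 32                        # assumed bytes reserved per definition
memory = bytearray(26 * SLOT_SIZE)    # one slot for each of A..Z


def slot_address(name):
    """Fixed address of the slot belonging to a single capital letter."""
    if len(name) != 1 or not ("A" <= name <= "Z"):
        raise ValueError("name must be a single letter A..Z")
    return (ord(name) - ord("A")) * SLOT_SIZE


def define(name, body):
    """Store the command sequence `body` in the slot for `name`."""
    data = body.encode("ascii")
    if len(data) >= SLOT_SIZE:        # keep room for the zero terminator
        raise ValueError("definition too long for its slot")
    addr = slot_address(name)
    memory[addr:addr + SLOT_SIZE] = data + bytes(SLOT_SIZE - len(data))


def lookup(name):
    """Return the stored sequence for `name` (empty if undefined)."""
    addr = slot_address(name)
    raw = memory[addr:addr + SLOT_SIZE]
    return raw.split(b"\0", 1)[0].decode("ascii")


define("B", "100L")
print(slot_address("B"), lookup("B"))
```

The trade-off is that no dictionary search is needed, at the cost of a fixed upper limit on the length of each definition.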
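To make the 100L example concrete, here is a minimal Python sketch of what the LED routine might do with the value it takes from the stack. The function name `led_pattern`, the most-significant-bit-first ordering, and the use of `#` and `.` for lit and unlit LEDs are illustrative assumptions.

```python
def led_pattern(value):
    """Return the states of 8 LEDs for `value`, most significant bit first."""
    bits = format(value & 0xFF, "08b")    # only the low 8 bits reach the port
    return bits.replace("1", "#").replace("0", ".")


print(led_pattern(100))   # 100 is 01100100 in binary
```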
This was how the programming toolkit started – just the means to decode a single integer number and use it within a given routine to perform some I/O action.
It was decided that capital letters would be used for the user routines, allowing a full 26 different actions, initially passing a single numerical parameter to the routine via the stack. This was deemed a little limited, so a mechanism was derived to put more parameters onto the stack – so that arithmetical operations could be done.
Forth uses whitespace to separate words, but I thought that the space character could be used to separate sequences of numbers and place them on the stack one after another:
14 29       Puts 14 on the stack, then puts 29 on the stack
We can then introduce code that provides the basic maths operators + - * /
14 29+      Adds 14 and 29, leaving 43 on the top of the stack
14 29-      Subtracts 14 from 29, leaving 15 on the top of the stack
Following on from the above example, we can then add in the “L” LEDs command:
14 29+L     Illuminates the LEDs with the binary pattern for 43
14 29-L     Illuminates the LEDs with the binary pattern for 15
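The examples above can be reproduced with a short Python sketch. It follows the operand order given in the text, where 14 29- leaves 15 (the first number is subtracted from the second); note that this differs from standard Forth, where 14 29 - would leave -15. Division is left out because the text does not show its operand order. The function name `run`, the 16-bit masking and the `leds` list that simulates the port are assumptions made for the example.

```python
def run(line):
    """Run one line of commands; return (stack, values sent to the LEDs)."""
    stack = []
    leds = []          # simulated writes to the 8-bit LED port
    number = None      # number currently being accumulated, if any

    def push_pending():
        nonlocal number
        if number is not None:
            stack.append(number & 0xFFFF)   # assume 16-bit cells
            number = None

    for ch in line:
        if "0" <= ch <= "9":
            number = (number or 0) * 10 + int(ch)
            continue
        push_pending()                      # any non-digit ends the number
        if ch == "+":
            top, second = stack.pop(), stack.pop()
            stack.append((second + top) & 0xFFFF)
        elif ch == "-":
            top, second = stack.pop(), stack.pop()
            stack.append((top - second) & 0xFFFF)   # per the text: 14 29- gives 15
        elif ch == "*":
            top, second = stack.pop(), stack.pop()
            stack.append((second * top) & 0xFFFF)
        elif ch == "L":
            leds.append(stack.pop() & 0xFF)
        # a space (or any unknown character) only separates numbers
    push_pending()
    return stack, leds


print(run("14 29+L"))
print(run("14 29-L"))
```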
So the fledgling language began – a means to perform a series of operations expressed as a string of serial characters. As the language evolved, the following conventions emerged:
Language primitives: ASCII symbols and punctuation marks ! " % ^ & * ( ) _ + etc.
Built-in functions: lower case characters a to z
User “words”: upper case characters A to Z
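These conventions mean that the role of any input character can be decided from its ASCII range alone. The Python sketch below illustrates this; the function name `classify` and the category labels it returns are hypothetical, and digits and the space are included because they were described earlier as number characters and the number separator.

```python
def classify(ch):
    """Classify a single input character by its ASCII range."""
    if "0" <= ch <= "9":
        return "digit"
    if "A" <= ch <= "Z":
        return "user word"
    if "a" <= ch <= "z":
        return "built-in function"
    if ch == " ":
        return "separator"
    if "!" <= ch <= "~":                 # remaining printable ASCII
        return "primitive"
    return "ignored"


for ch in "14 29+L":
    print(repr(ch), classify(ch))
```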
Assembly Language Implementation
Tiny Forths had been explored before, notably on the AVR, a register-rich 8-bit processor, but these resulted in implementations of about 2K bytes. My aim was to reduce the core of the language to less than 1024 bytes.
In the winter and spring of 2017, I decided to code up the language in MSP430 assembly language. This served two purposes: I got to learn a little MSP430 assembly language, and it illustrated what sort of resources were required of a processor in order to implement a compact version of this language.
The MSP430 was chosen because it is a 16-bit processor with a very orthogonal instruction set and a rich set of 16 registers. To me, it represented a blank canvas on which to paint the essence of the language.
The MSP430 code was quite compact with the kernel of the language fitting into some 300 bytes and a full implementation in under 900 bytes.
In the next part I will take the ideas around a self-contained 1K byte interactive language toolkit further.