More About SIMPL

SIMPL – a small Forth Like Language

Charles (Chuck) Moore, the inventor of the Forth computer language, is one of my all-time computing heroes.

In a lifelong career in computing, starting in the late 1950s, he has been a mathematician, programmer, hardware engineer and designer of several unique processors that execute the Forth language directly as their native instruction set.

Last year, I was fortunate enough to have the opportunity to meet him at Forth Day – at Stanford University, in November 2016 – and he has been an inspiration for some of my recent work.

I first came across FORTH as a teenager in the early 1980’s when the home computer was dominated by the BASIC programming language, and it fascinated me that Forth offered a radically different alternative.

It has taken me the last 35 years to get around to understanding what makes Forth tick. As part of my mid-life computing crisis, I have taken time out to learn some of the fundamentals of this fascinating language.

Chuck Moore always looks for ways to simplify the task in hand – and it is his search for simplicity that has led me to some recent developments with a tiny programming language – that I call SIMPL.

SIMPL stands for Serial Interpreted Minimal Programming Language, and it was initially inspired by Ward Cunningham’s Txtzyme for Arduino or Teensy. I took Txtzyme, added several new commands, and made it an extensible language which could be ported to virtually any microcontroller.

SIMPL was originally written in C for Arduino, and then MSP430 and ARM. Now I am exploring writing the core routines in MSP430 assembly language for speed, efficiency and compactness.

The extended version of SIMPL – called SIMPLEX, includes a tool-kit to allow editing, loading, compiling and examination of memory – all  hosted by the target processor – in about 5K bytes.

MISC Processors and eForth

 

Inspired by Chuck Moore’s Minimal Instruction Set Computer (MISC) processor designs from the mid 1980s, I have chosen a similar approach. At the same time that Chuck Moore was designing MISC processors in gate arrays, Bill Muench and Dr. Chen-Hanson Ting were developing eForth, a complete Forth that could be synthesised from only around 30 primitive instructions.

Muench and Ting showed that once these 30 primitives had been coded in the native assembly language of the target processor, and the inner and outer interpreters had been ported, the whole eForth language could be created very quickly.

Ting once remarked that to put eForth on any new microprocessor, starting from scratch generally took him about 2 weeks.

A Comfortable Computing Environment

eForth creates a comfortable computing environment within any new, unfamiliar processor, rather like establishing a base-camp in hostile terrain. Additionally, once you have studied the processor architecture and instruction set sufficiently to write the eForth primitives in its assembly language, you have probably become quite intimately conversant with its inner workings.

Forth is an extensible interactive language, which provides a complete computing environment hosted by the target machine. It dispenses with the need for sophisticated compiler tool chains or  IDEs, and allows all coding and interaction to be done from a simple serial terminal.  At the same time it provides an editor, loader, compiler, and the means to examine memory by way of a hex dump.

Registers and memory mapped I/O can be poked or examined interactively, and the low level drivers for I/O can be developed by experimentation from the ground up.  Initially the only I/O requirements are to allow a character to be sent or received via the serial UART – and virtually every modern microcontroller has at least one UART.

eForth however is some 6K to 8K bytes in a typical implementation, so I asked myself whether there was a way of creating an extensible language from an even smaller starting point – and so the idea of SIMPL came into being.

The Mechanics of Interpreted Languages

eForth, SIMPL and other tiny Forth like languages rely on efficiently coding a small set of primitive instructions onto a stack-based, 16 bit virtual machine, running in the native assembly language of the host processor.

The virtual machine consists of an inner interpreter that fetches the next instruction token from memory, performs a decode operation (either by jump table or switch-case structure) and executes the code associated with that operation. The efficient coding and execution of the primitives and the inner interpreter is what gives these threaded, interpreted languages their speed.
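The fetch, decode and execute cycle described above can be sketched in C using a jump table. This is only an illustration, not the SIMPL or eForth source: the token numbers, the `VM` structure and the names `vm_run`, `op_lit` and so on are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token numbers; the text does not give real assignments. */
enum { OP_HALT, OP_LIT, OP_ADD, OP_DUP, OP_COUNT };

typedef struct {
    int16_t stack[16];     /* 16-bit data stack */
    int sp;                /* number of items currently on the stack */
    const uint8_t *ip;     /* instruction pointer into the token stream */
    int running;
} VM;

static void push(VM *vm, int16_t v) { assert(vm->sp < 16); vm->stack[vm->sp++] = v; }
static int16_t pop(VM *vm) { assert(vm->sp > 0); return vm->stack[--vm->sp]; }

static void op_halt(VM *vm) { vm->running = 0; }

static void op_lit(VM *vm) {
    /* The two bytes after the token hold a little-endian 16-bit literal. */
    uint16_t lo = vm->ip[0];
    uint16_t hi = vm->ip[1];
    vm->ip += 2;
    push(vm, (int16_t)(uint16_t)(lo | (hi << 8)));
}

static void op_add(VM *vm) {
    int16_t b = pop(vm);
    int16_t a = pop(vm);
    push(vm, (int16_t)(a + b));
}

static void op_dup(VM *vm) {
    int16_t a = pop(vm);
    push(vm, a);
    push(vm, a);
}

/* The jump table: the token value indexes straight into the primitives. */
typedef void (*Primitive)(VM *);
static const Primitive jump_table[OP_COUNT] = { op_halt, op_lit, op_add, op_dup };

/* Inner interpreter: runs tokens until OP_HALT, then returns the top of stack. */
int16_t vm_run(const uint8_t *code) {
    VM vm = { .sp = 0, .ip = code, .running = 1 };
    while (vm.running) {
        uint8_t token = *vm.ip++;     /* fetch */
        assert(token < OP_COUNT);     /* reject tokens outside the table */
        jump_table[token](&vm);       /* decode and execute in one indexed call */
    }
    return pop(&vm);
}
```

With these assumed tokens, the stream OP_LIT 2, OP_LIT 3, OP_ADD, OP_DUP, OP_ADD, OP_HALT leaves 10 on the stack. A table lookup costs the same for every token, whereas a chain of compares gets slower the further down the list a primitive sits.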

Ideally there will be fewer than 32 primitives, so in a hardware cpu implementation they may be represented by a 5-bit token and mapped directly into the cpu machine instructions. This is the approach that Chuck Moore used with his Forth processors, and it has been widely adopted since, for example in James Bowman’s J1 Forth CPU.

However in a software implementation we need to encode these in the native assembly language of the target processor, be it an 8-bit, 16-bit or 32 bit processor. Forth was developed at a time when 16-bit minicomputers were just appearing – and so traditionally it has used a 16 bit integer word size.  The late 1970’s and early 80’s saw Forth being adapted to 8-bit microprocessors – some of which were a good fit to the Forth virtual machine model – and some were not.

An example of a tiny Forth for 8-bit AVR

In one very compact implementation – T. Nakagawa’s “Tiny Forth” written for the 8-bit AVR, 25 primitives were encoded into just 286 instructions – an  average of about 11 machine language instructions per primitive.

Despite the limitations of an 8-bit architecture, Nakagawa’s AVR assembly language implementation is very efficient: about 840 instructions of AVR code and 137 bytes of tables, which assemble to around 1.8K bytes.

Typically the cpu follows the pseudo-code below: it fetches the next primitive instruction from memory, tests its token value and branches to the next entry in the table if it is not a match.

In hardware this is equivalent to the instruction fetch and decode.  Getting this process to run efficiently in assembly language is key to the speed of the resulting language.

Test the token character of the primitive
Is it the one we are looking for - no, branch to next in the table
Get the operands off the stack and into a pair of working registers
Perform the operation
Put the result back onto the stack
Correct the stack pointer
Return to the inner interpreter

Here is the AVR implementation of ADD

prim6: ; +              ; This is the ADD primitive with token=5
 cpi r16, 5             ; Is the token=5?
 brne prim7             ; No - branch to next primitive
 ld r19, X+             ; get operand 1 off the stack
 ld r18, X+
 ld r17, X+             ; get operand 2 off the stack
 ld r16, X+
 add r16, r18           ; add the low bytes
 adc r17, r19           ; add the high bytes plus any carry
 st -X, r16             ; Put the result back on the stack
 st -X, r17
 ret                    ; Return to the inner interpreter
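To make the byte-level work in the listing explicit, here is a C sketch of the same 16-bit addition done a byte at a time with a carry. It is an illustration only: the array `stack_mem`, the pointer `sp` (standing in for the AVR X register) and the helper names are assumptions, and the local variables are named after the registers used in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wide data stack that grows downward, like the X-register stack above. */
static uint8_t stack_mem[64];
static uint8_t *sp = stack_mem + sizeof stack_mem;   /* empty stack: one past the end */

void push16(uint16_t v) {
    assert(sp - stack_mem >= 2);
    *--sp = (uint8_t)(v & 0xFF);    /* low byte first, as in st -X, r16 */
    *--sp = (uint8_t)(v >> 8);      /* high byte lands at the lower address */
}

uint16_t pop16(void) {
    assert(sp + 2 <= stack_mem + sizeof stack_mem);
    uint8_t hi = *sp++;
    uint8_t lo = *sp++;
    return (uint16_t)((hi << 8) | lo);
}

/* Mirrors the ADD primitive: four loads, add, add-with-carry, two stores. */
void prim_add(void) {
    assert(sp + 4 <= stack_mem + sizeof stack_mem);
    uint8_t r19 = *sp++;            /* operand 1, high byte */
    uint8_t r18 = *sp++;            /* operand 1, low byte  */
    uint8_t r17 = *sp++;            /* operand 2, high byte */
    uint8_t r16 = *sp++;            /* operand 2, low byte  */

    uint16_t low = (uint16_t)(r16 + r18);    /* add r16, r18 */
    uint8_t carry = (uint8_t)(low >> 8);     /* carry out of the low byte */
    r16 = (uint8_t)low;
    r17 = (uint8_t)(r17 + r19 + carry);      /* adc r17, r19 */

    *--sp = r16;                    /* st -X, r16 */
    *--sp = r17;                    /* st -X, r17 */
}
```

The post-increment loads and pre-decrement stores adjust the stack pointer as they go, so no separate stack pointer correction is needed.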

In an 8-bit processor such as the AVR, it takes 4 instructions to load the working registers with two 16-bit operands (Forth uses a 16-bit integer size), and two instructions to perform an arithmetical or logical operation on them. Passing the result back to the stack is another two instructions, followed by a return. With the overhead of the two-instruction decode and the return, the basic arithmetical and logical operations are typically 11 instructions. In this light, an 8-bit Forth generally runs at about one tenth or less of the instruction rate of the processor. The decode takes 2 instructions even when there is no match, so the frequently used primitives should be placed closer to the start of the list.

Nakagawa’s Tiny Forth is concise and well documented, and I have reproduced it in this GitHub Gist.

Whilst the coding of the primitives takes up around a third of the total program, the thirty short routines to implement the inner and outer interpreter, and the support routines such as serial input and output, multiplication and division take up the remaining two thirds.

With SIMPL, I strive to reduce all of these overheads, and this means an implementation using a 16-bit microcontroller – which is fundamentally a far better match to the 16-bit Forth Virtual Machine.

The MSP430 – A good match to SIMPL.

The MSP430 has been around for many years, originally developed by Texas Instruments in 1993 as a low power, mixed signal, RISC processor with a von Neumann architecture. It has an uncomplicated 16-bit register set and a very wide range of peripheral modules, including 24-bit ADCs, Low Energy Accelerators for FFT and DSP math operations and a large selection of standard peripherals.

In recent years it has been one of the few processors to offer on-chip non-volatile FRAM, which has a much faster writing speed than Flash and, with 10E14 write cycles, is virtually indestructible.

For a 16-bit processor, such as the MSP430, 16-bit integer maths and register operations are single machine instructions.

The ADD primitive becomes:

add: add @stack+, tos     ; Add 2nd stack member to the top of stack
     $NEXT                ; Call macro that executes the NEXT instruction

Clearly this increases the efficiency of executing the primitives and reduces the code space requirements, provided that the overheads of decoding an instruction token, and then fetching the next virtual machine instruction are kept to a minimum.

 

 

Simplifying Forth

Forth uses the concept of a “Dictionary” to organise the Words or subroutines. The Words are written in natural language, and the dictionary search function normally uses the first three characters of the word and the length in order to find that word within the dictionary. This is a powerful and flexible approach, but it adds a significant coding overhead to the outer interpreter.

The approach I use in SIMPL was inspired by Ward Cunningham’s Txtzyme Interpreter – where just matching a single character using  switch-case statements causes the code associated with that character to be executed.  Written in assembly language this can be done with a hash-table,  or just by jumping to a program address that has been calculated in some way from the ascii character value.
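A minimal sketch of single-character dispatch with a switch-case, in the spirit of the approach just described. It is not the Txtzyme or SIMPL source: the function name `interpret`, the stack depth and the small set of command characters handled here are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

#define STACK_DEPTH 16
static int16_t stack[STACK_DEPTH];
static int sp;                       /* number of items on the stack */

static void push(int16_t v) { assert(sp < STACK_DEPTH); stack[sp++] = v; }
static int16_t pop(void) { assert(sp > 0); return stack[--sp]; }

/* Interpret a NUL-terminated command string and return the top of stack. */
int16_t interpret(const char *src) {
    sp = 0;
    while (*src) {
        char c = *src++;
        if (isdigit((unsigned char)c)) {
            /* Collect a run of decimal digits into one number and push it. */
            int16_t n = (int16_t)(c - '0');
            while (isdigit((unsigned char)*src))
                n = (int16_t)(n * 10 + (*src++ - '0'));
            push(n);
            continue;
        }
        switch (c) {                 /* one character selects one action */
        case '+': { int16_t b = pop(); int16_t a = pop(); push((int16_t)(a + b)); break; }
        case '-': { int16_t b = pop(); int16_t a = pop(); push((int16_t)(a - b)); break; }
        case '*': { int16_t b = pop(); int16_t a = pop(); push((int16_t)(a * b)); break; }
        case ' ': break;             /* only separates adjacent numbers */
        default:  break;             /* unknown characters are ignored in this sketch */
        }
    }
    return pop();
}
```

For example, `interpret("2 3+")` leaves 5 on the stack. There is no dictionary search at all: the cost of finding a command is one switch on the character value, which a compiler will usually turn into a jump table.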

Critics might say that this approach limits you to only a few dozen unique commands – and yes it does, but if your application needs more than this – perhaps it’s worth rethinking how you are going to solve your application.

As Einstein said, “Make things as simple as possible, but not simpler.”

There are 95 printable ASCII codes between 32 and 126. Setting aside the 10 numeric digits leaves 85, which can be grouped into:

26 upper case characters

26 lower case characters

33 symbols and punctuation characters
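The grouping can be checked with a few lines of C that walk the printable ASCII range and classify each code. The structure and function names are made up for this example, and the space character (code 32) is counted among the symbols here.

```c
#include <assert.h>
#include <ctype.h>

typedef struct { int upper, lower, digits, symbols; } AsciiCounts;

/* Classify every printable ASCII code from 32 (space) to 126 (~). */
AsciiCounts count_printable(void) {
    AsciiCounts n = {0, 0, 0, 0};
    for (int c = 32; c <= 126; c++) {
        if (isupper(c))      n.upper++;
        else if (islower(c)) n.lower++;
        else if (isdigit(c)) n.digits++;
        else                 n.symbols++;    /* punctuation, symbols and space */
    }
    /* Sanity check: the four groups must cover all 95 printable codes. */
    assert(n.upper + n.lower + n.digits + n.symbols == 95);
    return n;
}
```

Code 127 is the non-printing DEL character, which is why the loop stops at 126.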

SIMPL uses most of the symbols and punctuation characters as primitive instructions, such as &  +  !  % etc. The lower case characters are used for invoking higher level commands synthesised from the primitives. These were influenced by Txtzyme and tend to be more microcontroller specific: they may include commands to exercise the I/O and peripherals, read or write to a digital port, get a reading from an ADC or provide a hex dump of memory. Finally, the upper case characters are reserved for the user’s application commands, and provide the mechanism by which the language is extended.

 

The Primitive Instructions
Most of the primitive Forth words are located in the ASCII characters 32 to 63, which leaves 64 to 126 available for the user’s vocabulary and constructs made from primitives.

There are approximately 32 printable ascii punctuation and symbol characters – and with a bit of pre-processing they can be used to form the basic machine primitives.

Stack Commands (7) PUSH, POP, DUP, DROP, SWAP, OVER, NOP

, PUSH
. POP
" DUP
' DROP
$ SWAP
% OVER
SP NOP

Maths & Logical Operations (9) ADDC, SUB, MUL, DIV, AND, OR, XOR, NOT, SHR 

+ ADDC
- SUB
* MUL
/ DIV
& AND
^ XOR
` SHR
| OR
~ INV (NOT/COMPL)

MUL and DIV may be replaced by Shift Left and Shift Right.

Transfer Instructions (8)

: CALL (COMPILE)
; RET
( LOOP-START
) LOOP-END
< LESS
= EQU
> MORE
\ JUMP - condition follows

IN/OUT (2)

? KEY (INPUT)
. PRINT (OUTPUT)

Memory Instructions (3)

@ LOAD
! STORE
# LIT

Others (5)

_ _ String Print
[ ] String Store
{ } Array of elements - switch/case

 

The use of single ascii characters as tokens, makes the language a lot less verbose than Forth, and snippets of source code can be sent in very few characters – for example as an SMS message between 2 systems equipped with GSM modems.

It is a fairly simple task to have a word table – where the verbose form of the words are stored, and can be printed out, to expand the source into something more readable – for example

"'% -> DUP DROP OVER

Three ASCII characters expand to 11 letters plus 2 separating spaces.
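Such a word table might be sketched in C as below. This is an assumption about how it could be done rather than the author's implementation: `word_name` and `expand` are hypothetical names, and only the four stack words needed for the example are included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Verbose name for a symbol, or NULL if this sketch has no entry for it. */
static const char *word_name(char c) {
    switch (c) {
    case '"':  return "DUP";
    case '\'': return "DROP";
    case '$':  return "SWAP";
    case '%':  return "OVER";
    default:   return NULL;
    }
}

/* Expand src into out (capacity cap), separating words with single spaces.
   Returns the number of characters written, or 0 if out is too small. */
size_t expand(const char *src, char *out, size_t cap) {
    size_t len = 0;
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (; *src; src++) {
        const char *name = word_name(*src);
        if (name == NULL)
            continue;                          /* skip symbols with no table entry */
        size_t need = strlen(name) + (len ? 1 : 0);
        if (len + need + 1 > cap)
            return 0;                          /* would overflow the buffer */
        if (len)
            out[len++] = ' ';
        memcpy(out + len, name, strlen(name) + 1);   /* copy including the NUL */
        len += strlen(name);
    }
    return len;
}
```

Calling `expand("\"'%", buf, sizeof buf)` fills `buf` with `DUP DROP OVER`. The reverse mapping, from verbose words back to single characters, would need a string search and is left out here.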

SIMPL does not use the space as word separator, but in the special case where you have several numbers to put on the stack, the space is used to indicate “push onto stack”

The SIMPL Virtual Machine

SIMPL is based around a virtual stack machine that provides an interactive programming environment.  SIMPL primitive instructions are effectively the native assembly language of that virtual machine, and have been chosen to provide a compact but versatile instruction set.

Ultimately, I wish to create an open-core processor using FPGA devices that can execute SIMPL code directly – but in the meantime I need to rely on a virtual machine written in MSP430 assembler to do this for me.

SIMPL is Forth-like in that it uses words or tokens that cause blocks of code to be executed sequentially. It differs from Forth in that the words consist of single ascii characters that are decoded directly causing the cpu to call a routine at a given address, execute the subroutine found there and then return control to the inner interpreter.
The inner interpreter is very compact on the MSP430, with the main interpreter loop fitting into about 300 bytes. On top of this interpreter loop, you have the code for the low level primitive symbols listed above, and then higher level words such as “toggle a port pin”, “n milliseconds delay”, “read an adc channel”, “output a string to serial port” and “produce a hex dump of memory”, which are represented by the lower case alphabet characters.

This layer of the language is what I refer to as the “Arduino Layer” – all those useful helper functions that are processor specific and deal with timing and I/O peripherals.

The next layer is the user’s application code, where the user’s words use capital letters. For example, the classic washing machine program:

FILL
WASH
EMPTY
FILL
RINSE
EMPTY
SPIN

This would be condensed in SIMPL as FWEFRES, which is a code reduction factor of more than 5.

But each of these can take a parameter. For example, if WASH consists of an AGITATE cycle (A) repeated 50 times, then in Forth:

: WASH 50 0 DO AGITATE LOOP ;

In SIMPL we also use a colon definition to define W. In fact the ASCII code for W is used to store this code snippet at a given address, calculated as (W-61)*32. So as soon as the interpreter encounters W, it jumps to that code address.

:W50(A)

The code in the parentheses is repeated 50 times.
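The storage and loop mechanism can be sketched in C as follows. This is a simplified model, not the SIMPL source: the 32-byte slot size and the (name-61)*32 offset are taken from the text, but the buffer `mem`, the functions `define`, `run` and `execute`, and the counter that stands in for the AGITATE action are assumptions. For brevity `A` is hard-wired as that stub rather than being a user definition.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 32
#define MEM_SIZE  (('Z' - 61 + 1) * SLOT_SIZE)   /* room up to the slot for 'Z' */

static char mem[MEM_SIZE];
static int agitate_count;            /* stand-in for the real AGITATE action */

/* Offset of a definition, using the formula quoted in the text. */
size_t def_offset(char name) {
    assert(name >= 'A' && name <= 'Z');
    return (size_t)(name - 61) * SLOT_SIZE;
}

/* ":W50(A)" stores the body "50(A)" in the slot belonging to W. */
void define(const char *src) {
    assert(src[0] == ':' && isupper((unsigned char)src[1]));
    assert(strlen(src + 2) < SLOT_SIZE);
    strcpy(mem + def_offset(src[1]), src + 2);
}

/* Find the ')' that closes a loop body, or the end of the text if missing. */
static const char *find_close(const char *p) {
    int depth = 0;
    for (; *p; p++) {
        if (*p == '(') depth++;
        else if (*p == ')') { if (depth == 0) return p; depth--; }
    }
    return p;
}

/* Execute the characters in [p, end). */
static void run(const char *p, const char *end) {
    int n = 0;                                   /* most recent number seen */
    while (p < end) {
        char c = *p++;
        if (isdigit((unsigned char)c)) {
            n = n * 10 + (c - '0');
        } else if (c == '(') {
            const char *close = find_close(p);
            for (int i = 0; i < n; i++)
                run(p, close);                   /* repeat the body n times */
            p = *close ? close + 1 : close;
            n = 0;
        } else if (c == 'A') {
            agitate_count++;                     /* stub primitive */
        } else if (isupper((unsigned char)c)) {
            const char *def = mem + def_offset(c);
            run(def, def + strlen(def));         /* jump to the stored snippet */
        }
    }
}

void execute(const char *src) { run(src, src + strlen(src)); }
```

After `define(":W50(A)")`, calling `execute("W")` runs the stub 50 times, and `def_offset('W')` evaluates to (87-61)*32 = 832. A word that invokes itself would recurse without limit in this sketch.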

 

In Conclusion

SIMPL is a compact, extensible, interactive language that provides a minimum tool-kit to allow small applications to be developed on microcontrollers.

It provides a development environment, which may be ported easily to virtually any microcontroller with very low overheads.

The use of single ascii characters as primitive instructions and commands makes it compact but retains a certain level of human readability – that some machine languages do not.  As there is a one to one substitution between the ascii character and the Word – a translation table may easily be created to expand each symbol out to its full text representation.

SIMPL can act as a Lingua Franca between widely different classes of computing machines, or as messaging between several specialised processors on the same board. A lot of meaning can be conveyed in a few characters, and so a simple microcontroller could communicate tasks between dedicated co-processors in just a few characters.

At its heart, SIMPL has a text interpreter that reads character strings from memory and translates these into machine actions. This is a very powerful technique applicable to all sorts of everyday tasks such as desktop manufacturing, CNC milling/routing, 3D printing, laser cutting, graphics drawing etc.

 

 

 


About monsonite

mostly human