SIMPL – a Small Forth Inspired Language

SIMPL really is a very small language. The interpreter and primitives kernel fit into less than 1 Kbyte on the MSP430. It can be ported to virtually any microcontroller that has a reasonable number of registers. SIMPL is like Forth, but simplified!

SIMPL was inspired by Ward Cunningham’s TXTZYME, a minimal language written in C that allows a microcontroller such as an Arduino or Teensy to perform basic operations (printing text, looping, toggling port pins and reading an ADC), all controlled from a serial terminal. It had only 13 instructions and was quite limited in what it could do.

I was intrigued by TXTZYME and how it worked, and quickly realised that it could easily be extended to include maths and logical operations and conditional execution and, like Forth, could have new words created from its set of low-level primitives.

In the last 4 years, as inspiration leads me, I have developed SIMPL much further and ported it to other microcontrollers such as MSP430 and ARM M4. There are now 400MHz ARM M7 microcontrollers available – and porting SIMPL to these monsters should be relatively SIMPL.

This year, taking my inspiration from a recent trip to Forth Day at Stanford University with Forth CPU expert James Bowman and eForth guru Dr. C.H. Ting, I have sought to write SIMPL in MSP430 assembly language, as a learning exercise and so that I can experiment further with the mechanics of the language. All I’m doing is following Chuck Moore’s journey, but 50 years later. I just want to create the tool set that makes programming computers easier and more productive. I don’t want 20 million lines of code just to run an operating system. I want the machine to spend all its time on my application!

I see SIMPL as a universal “Shorthand” allowing us to write code at a fairly low level for a wide variety of microcontrollers – at least for simple applications.  I think of it as a debugging toolbox, or Smart Bootloader, that once installed on a microcontroller, grants you the ability to load code into memory,  communicate with the mcu at a fairly basic level, execute code and examine the contents of memory or print out results. It also gives the means to exercise the mcu peripherals – such as GPIO, ADC, UART, SPI etc.

Just like Chuck Moore, I don’t want to learn a whole new assembly language for every new processor I take on. I want to put a version of SIMPL (coded in C or Arduino C++) on the chip, and gain a familiar set of basic tools.

My goal is to produce an extensible language that fits into just 1k bytes – which is effectively an “Access All Areas Pass”  for the mcu.

The MSP430 has been chosen because it is a low-power 16-bit processor, and newer devices have up to 256 Kbytes of nonvolatile FRAM. MSP430 devices also have some neat features, such as a 24-bit ADC and a very fast UART allowing serial communications at up to 8 Mbaud.

Once the SIMPL virtual machine  has been written in MSP430 assembly language – it is a relatively easy task to port it to any other modern register based processor – including ARM, x86 etc.  But for now I am just finding my way around the MSP430 assembly language – which will suffice as a very capable virtual machine.

Ultimately, the plan is to port SIMPL to its own custom stack processor, such as James Bowman’s J1, based on a soft core running in an FPGA. Most of the primitive instructions are executed directly by the J1 hardware, so the port will consist of decoding the primitives from ASCII using a lookup table and generating the 16-bit-wide instructions that the J1 uses.

What I see SIMPL useful for:

It’s a low-overhead toolkit that allows you to explore the inner workings of microcontrollers.

It uses many of the ideas of Forth, but without getting bogged down with the Forth virtual machine. Forth could be considered to be an extension of SIMPL, and learning SIMPL teaches you a lot of what you need to know for Forth.

It allows the newcomer to explore a microcontroller with just a serial terminal. Ideal as an educational language, showing what can be achieved in a very low byte count on virtually any microcontroller.

Very compact – a 140 character tweet or SMS could do a lot in SIMPL

Automates the process of getting code to run on a new microcontroller: a useful alternative to the traditional bootloader, with less than 1K of overhead.

Fast: all primitives are written in assembly language. Instructions are decoded in logic or by lookup table, which avoids the overhead of a dictionary search.

Uses a very simple syntax. Many machine tools, 3D printers and laser cutters could be controlled from text files that are essentially lists of SIMPL instructions.

SIMPL has just 32 primitive instructions, based on the non-alphanumeric symbols used in ASCII. Lowercase alphabetic characters are used for function calls to code that exercises processor-specific hardware, or for examining the contents of memory. The uppercase characters are reserved for user-defined words, created from primitives and lowercase words.
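The three-way split of the character set can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the category labels are hypothetical, and treating digits as a separate "number" class is an assumption rather than something stated above.

```python
import string

def classify(ch):
    """Sort one ASCII character into the role the text gives it (illustrative labels)."""
    if ch in string.ascii_uppercase:
        return "user word"           # reserved for user-defined words
    if ch in string.ascii_lowercase:
        return "hardware function"   # processor-specific helper routines
    if ch in string.digits:
        return "number"              # assumption: digits build numeric literals
    if ch in string.punctuation:
        return "primitive"           # the non-alphanumeric symbols
    return "other"                   # space, control characters, non-ASCII

for ch in "W", "a", "7", "+":
    print(ch, "->", classify(ch))
```

Python's `string.punctuation` happens to contain exactly 32 characters, which matches the count of primitive symbols quoted above.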

SIMPL is a highly mnemonic language, where the choice of character reflects the operation being performed. This is not a new idea: it was used on the very first Cambridge-built EDSAC machine, where instructions were loaded in from paper tape. SIMPL pays homage to this machine and the simple way in which things were once done.

SIMPL is Forth-like in that it uses words or tokens that cause blocks of code to be executed sequentially. It differs from Forth in that the words consist of single, printable ASCII characters that are decoded directly, causing the CPU to call a routine at a given address, execute the subroutine found there and then return control to the inner interpreter.

There are 95 printable ASCII characters (codes 32 to 126, counting the space), so this defines how many unique codes the machine will interpret as instructions.

The inner interpreter is very compact on the MSP430 with the main interpreter loop fitting into about 300 bytes.
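To make the decode-and-call idea concrete, here is a minimal Python sketch of a table-driven inner loop. It is not the MSP430 code: the routine names, the 128-entry `TABLE` and the three symbols wired up are all assumptions chosen for illustration.

```python
# Hypothetical data stack shared by the routines below.
stack = []

def add():
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def dup():
    stack.append(stack[-1])

def swap():
    stack[-1], stack[-2] = stack[-2], stack[-1]

# One slot per 7-bit ASCII code: the character itself is the index,
# so no dictionary search by name is needed.
TABLE = [None] * 128
TABLE[ord("+")] = add
TABLE[ord('"')] = dup
TABLE[ord("$")] = swap

def interpret(source):
    """Fetch one character at a time and call the routine it selects."""
    for ch in source:
        code = ord(ch)
        routine = TABLE[code] if code < 128 else None
        if routine is not None:
            routine()  # run the routine, then return here for the next character

stack.extend([2, 3])
interpret('+"')   # add the two numbers, then duplicate the result
print(stack)
```

On a real microcontroller the table would hold routine addresses and the loop would be a handful of instructions, but the control flow is the same shape.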

On top of this interpreter loop, you have the code for the low-level primitive symbols (listed below), and then higher-level words, such as “toggle a port pin”, “n milliseconds delay”, “read an ADC channel”, “output a string to serial port” and “produce a hex dump of memory”, which are represented by the lowercase alphabet characters.

This layer of the language is what I refer to as the “Arduino Layer” – all those useful helper functions that are processor specific.

The next layer is the user’s application code, where the user’s words use capital letters. For example, the classic washing machine program:

FILL
WASH
EMPTY
FILL
RINSE
EMPTY
SPIN

This would be condensed in SIMPL as FWEFRES, which is a code reduction factor of more than 5 (37 characters, counting the separators between words, down to 7).
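The reduction factor can be checked with a few lines of Python. Taking the first letter of each word is just a convenience for this particular example; in general each single-letter name would have to be chosen so that no two words collide.

```python
verbose = ["FILL", "WASH", "EMPTY", "FILL", "RINSE", "EMPTY", "SPIN"]

# Condensed form: one capital letter per word (works here because the
# initials happen to be distinct for distinct words).
condensed = "".join(word[0] for word in verbose)

# Length of the verbose program, counting one separator between words.
verbose_len = len(" ".join(verbose))

ratio = verbose_len / len(condensed)
print(condensed, verbose_len, len(condensed), round(ratio, 2))
```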

But each of these can have a parameter. For example, suppose the WASH consists of an AGITATE cycle, A, repeated a number of times.

In Forth:

: WASH 50 0 DO AGITATE LOOP ;

In SIMPL we also use a colon definition to define W. There is no need for the ; because when this snippet is entered it has a null terminator added, and this is the cue for the interpreter to return and fetch the NEXT word.

:W50(A)

The code in the parentheses is repeated 50 times.

In fact the ASCII code for W is used to store this code snippet at a given address, calculated as (W-61)*32. So as soon as the interpreter encounters W, it jumps to that code address.
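As a sketch of how a definition might land in memory, the Python below applies the formula exactly as written in the text, (W-61)*32, to a flat byte array. The names `slot_address`, `define` and `memory`, the 4096-byte array size and the fixed 32-byte slot limit are assumptions made for illustration.

```python
SLOT_SIZE = 32   # bytes reserved per definition, from the formula in the text
BASE_CODE = 61   # offset quoted in the text, used here as written

memory = bytearray(4096)  # hypothetical stand-in for the microcontroller's RAM

def slot_address(name):
    """Address of the code slot for a one-character word name."""
    return (ord(name) - BASE_CODE) * SLOT_SIZE

def define(source):
    """Store a ':W50(A)' style definition in W's slot, null terminated."""
    if not source.startswith(":") or len(source) < 2:
        raise ValueError("expected ':' followed by a one-character name")
    name, body = source[1], source[2:].encode("ascii")
    if len(body) >= SLOT_SIZE:
        raise ValueError("definition does not fit in one slot")
    addr = slot_address(name)
    # The trailing zero byte is the interpreter's cue to return.
    memory[addr:addr + len(body) + 1] = body + b"\x00"
    return addr

addr = define(":W50(A)")
print(addr, bytes(memory[addr:addr + 6]))
```

Because the address is computed directly from the character code, calling W later needs no lookup: the interpreter can jump straight to `slot_address("W")`.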

Most of the primitive Forth words are located in the ASCII characters 32 to 63, which leaves 64 to 126 available for the user’s vocabulary and constructs made from primitives.

There are approximately 32 printable ascii punctuation and symbol characters – and with a bit of pre-processing they can be used to form the basic machine primitives.

Stack Commands (7) PUSH, POP, DUP, DROP, SWAP, OVER, NOP

,        PUSH
.        POP
"        DUP
'        DROP
$        SWAP
%        OVER
SP        NOP

Maths & Logical Operations (9) ADDC, SUB, MUL, DIV, AND, XOR, SHR, OR, INV

+        ADDC
-        SUB
*        MUL
/        DIV
&        AND
^        XOR
`        SHR
|        OR
~        INV (NOT/COMPL)

Transfer Instructions (8)

:        CALL (COMPILE)
;        RET
(       LOOP-START
)        LOOP-END
<       LESS
=        EQU
>        MORE
\        JUMP – condition follows

IN/OUT (2)

?        KEY (INPUT)
.        PRINT (OUTPUT)

Memory Instructions (3)

@        LOAD
!        STORE
#        LIT

Others (6)

_ _        String Print
[ ]        String Store
{ }        Array of elements – switch/case

The use of single ASCII characters as tokens makes the language a lot less verbose than Forth, and snippets of source code can be sent in very few characters, for example as an SMS message between two systems equipped with GSM modems.
It is a fairly simple task to have a word table, where the verbose forms of the words are stored and can be printed out, to expand the source into something more readable. For example:

"'% -> DUP DROP OVER

Three ASCII characters expand to 11 letters plus the separating spaces.
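A word table of this kind is easy to sketch. The Python below assumes the plain ASCII quote characters for DUP and DROP and uses a hypothetical `WORDS` mapping holding just a few entries from the stack command list; a full table would cover every primitive.

```python
# Partial word table: SIMPL symbol -> verbose name (illustrative subset).
WORDS = {
    '"': "DUP",
    "'": "DROP",
    "$": "SWAP",
    "%": "OVER",
}

def expand(source):
    """Replace each known symbol with its verbose name, space separated."""
    return " ".join(WORDS.get(ch, ch) for ch in source)

print(expand("\"'%"))
```

Characters without a table entry are passed through unchanged, so a partly filled table still gives readable output.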

SIMPL does not use the space as word separator, but in the special case where you have several numbers to put on the stack, the space is used to indicate “push onto stack”
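One way to picture the role of the space is a loop that accumulates decimal digits and pushes the finished number when a space arrives. This is a minimal sketch: `push_numbers` is a hypothetical helper, and pushing a final number at the end of the input is an assumption made so the example is self-contained.

```python
def push_numbers(source):
    """Accumulate digits into a number; a space pushes it onto the stack."""
    stack = []
    acc = None                       # number currently being built, if any
    for ch in source:
        if ch in "0123456789":
            acc = (0 if acc is None else acc * 10) + int(ch)
        elif ch == " " and acc is not None:
            stack.append(acc)        # space means "push onto stack"
            acc = None
    if acc is not None:              # assumption: end of input also pushes
        stack.append(acc)
    return stack

print(push_numbers("10 11 12"))
```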

The intention is to write an implementation of SIMPL in MSP430 assembly language. This not only helps me learn MSP430 asm code, but it’s a good exercise in creating a SIMPL virtual machine using a very conventional register-based microcontroller, to gain experience of the language and its limitations. For example, have I got the right mix of primitives to allow a complete language to be synthesised? The choice of primitives was based on Chuck Moore’s MuP21 instruction set and C.H. Ting’s eForth model.

Writing in assembler exposes you to the raw roots of the processor, where you have to think hard about what you want the code to do, and always be prepared to rewrite and refine every routine.  Having ported an application to one register-rich processor, porting to a new device like an ARM is much easier, as you have already done the hard work of developing the register functional model.

As the microcontroller is running the SIMPL machine, the SIMPL primitives should give access to the widest range of the host’s instruction set, but with the simplification that the stack takes the place of having multiple registers to write to. A further simplification is that the top and next locations on the stack are held in registers rather than RAM.
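The effect of keeping the top two stack items in registers can be modelled in Python with two attributes standing in for the registers and a list standing in for RAM. The class and attribute names (`Stack`, `tos`, `nos`, `ram`) are hypothetical, and reading 0 from an empty stack is an assumption that keeps the sketch short.

```python
class Stack:
    """Data stack with the top (tos) and next (nos) items cached in 'registers'."""

    def __init__(self):
        self.tos = 0    # top of stack, held in a register
        self.nos = 0    # next on stack, held in a register
        self.ram = []   # everything deeper lives in RAM

    def push(self, value):
        self.ram.append(self.nos)   # spill the old next item to RAM
        self.nos = self.tos
        self.tos = value

    def pop(self):
        value = self.tos
        self.tos = self.nos
        self.nos = self.ram.pop() if self.ram else 0   # refill from RAM
        return value

    def add(self):
        # A binary operation works register to register; only the refill
        # of nos touches RAM.
        self.tos = self.nos + self.tos
        self.nos = self.ram.pop() if self.ram else 0

s = Stack()
s.push(2)
s.push(3)
s.add()
print(s.tos)
```

Most primitives take one or two operands, so with this layout they run on registers alone, which suits a register-rich processor like the MSP430.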

As the VM needs to perform tests and comparisons on numbers – the usual comparison operators are provided  > , < and  = .

These are used in conjunction with the parentheses operators which provide the means of skipping or looping sections of code depending on the result of the comparison.

As an example:

10 11>(_Print if Greater_)
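The interplay between a comparison and the parentheses can be sketched with a toy interpreter in Python. The semantics here are assumptions, not a description of the MSP430 code: a comparison consumes two numbers and leaves 1 or 0, with the earlier number on the left as in Forth; the opening parenthesis takes the top of the stack as a repeat count, so 0 skips the block; and skipping does not handle nested parentheses.

```python
def run(source):
    """Interpret a tiny SIMPL-like subset; returns the strings it 'printed'."""
    stack, out = [], []
    loops = []     # each entry: [index of loop body start, repeats remaining]
    acc = None     # number being accumulated from digits
    i = 0
    while i < len(source):
        ch = source[i]
        i += 1
        if ch in "0123456789":
            acc = (0 if acc is None else acc * 10) + int(ch)
            continue
        if acc is not None:            # any non-digit ends the number
            stack.append(acc)
            acc = None
        if ch in "<=>":
            top, nxt = stack.pop(), stack.pop()
            result = {"<": nxt < top, "=": nxt == top, ">": nxt > top}[ch]
            stack.append(1 if result else 0)
        elif ch == "(":
            count = stack.pop()
            if count == 0:
                i = source.index(")", i) + 1   # skip the whole block
            else:
                loops.append([i, count])
        elif ch == ")":
            loops[-1][1] -= 1
            if loops[-1][1] > 0:
                i = loops[-1][0]               # go round again
            else:
                loops.pop()
        elif ch == "_":
            end = source.index("_", i)         # text runs to the closing _
            out.append(source[i:end])
            i = end + 1
    return out

print(run("11 10>(_Print if Greater_)"))
print(run("3(_x_)"))
```

Under these assumptions a true comparison runs the block once, a false one skips it, and an explicit count such as 3 repeats it, so one construct serves as both a conditional and a loop.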

About monsonite

mostly human