The 1949 EDSAC Instruction Set and Architecture

The 1949 EDSAC
This information has been taken from the EDSAC explanatory poster produced by Cambridge university.  I have added some of my own notes for clarity.
The 1949 Instruction set EDSAC’s instructions in 1949 was very simple and were executed at a rate of about 600 per second. They were as follows:
AnS: A += m[n] AnL: AB += w[n]
SnS: A −= m[n] SnL: AB −= w[n]
HnS: R += m[n] HnL: RS += w[n]
VnS: AB += m[n] * R VnL: ABC += w[n] * RS
NnS: AB −= m[n] * R NnL: ABC −= w[n] * RS
TnS: m[n] = A; ABC = 0 TnL: w[n] = AB; ABC = 0
UnS: m[n] = A UnL: w[n] = AB CnS: AB += m[n] & R
CnL: ABC += w[n] & RS
RnS, RnL: Shift ABC right arithmetically by the number of places corresponding to the position of the least significant one in the shift instruction. For example, R0L, R1S, R16S and R0S shift by 1, 2, 6 and 15 places, respectively.
LnS, LnL: Shift ABC left arithmetically by the number of places corresponding to the position of the least significant one in the shift instruction. For example, L0L, L1S, L16S, L64S and L0S shift by 1, 2, 6, 8 and 13 places, respectively.
EnS: if A >= 0 goto n
GnS: if A < 0 goto n
InS: Place the next paper tape character in the least significant 5 bits of m[n].
OnS: Output the character in the most significant 5 bits of m[n].
FnS: Verify the last character output.
XnS: No operation.
YnS: Add a one to bit position 35 of ABC, counting the sign bit as bit zero. This effectively rounds ABC up to 34 fractional bits.
ZnS: Stop the machine and ring a bell.
The EDSAC Instruction Set (1949)
What caught my attention is that the instructions use a single character mnemonic – taken directly from the upper case characters available at that time from the 5-bit “Murray” teleprinter code – which was in common use in the Creed teleprinters used at that time.
The mnemonics were chosen to give a reasonable amount of readability such as A for ADD and S for SUBtract.
This makes common sense to make it as easy as possible and is a simple first stage mapping of human readable mnemonic characters to a 5 bit binary teleprinter code, which was used directly as the 5 bit instruction.
There were only two register spaces accessible from the instruction – visible to the user:
The 71-bit Accumulator
The 35-bit Multiplier Register
Further more there were 1024 18-bit words of delay line storage
The contents of the Order Register (Instruction Register) and Sequence Control Tank (Program Counter) as well as the multiplicand register were viewable on the operator’s console.
For my own clarification – so as to build up a simple model of the EDSAC, I have attempted to group the instructions into:
Accumulator and memory only
A n Add the number in storage location n into the accumulator
S n Subtract the number in storage location n from the accumulator
T n Transfer the contents of the accumulator to storage location n and clear the accumulator
U n Transfer the contents of the accumulator to storage location n and do not clear the accumulator
Accumulator, Multiplier and Memory
H n Copy the number in storage location n into the multiplier register
V n Multiply the number in storage location n by the number in the multiplier register and add the product into the accumulator
N n Multiply the number in storage location n by the number in the multiplier register and subtract the product from the accumulator
C n Collate [logical and] the number in storage location n with the number in the multiplier register and add the result into the accumulator
Accumulator and Shifter Operations
R 2n-2 Shift the number in the accumulator n places to the right
L 2n-2 Shift the number in the accumulator n places to the left
Conditional Jumps
E n If the sign of the accumulator is positive, jump to location n; otherwise proceed serially
G n If the sign of the accumulator is negative, jump to location n; otherwise proceed serially
Input-Output
I n Read the next character from paper tape, and store it as the least significant 5 bits of location n
O n Print the character represented by the most significant 5 bits of storage location n
F n Read the last character output for verification
Misc Operations
X No operation
Y Round the number in the accumulator to 34 bits
Z Stop the machine and ring the warning bell
The numerical values in the accumulator and multiplier registers are normally thought of as signed binary fractions, but integer operations could also be done easily. For example, the order V1S can be interpreted as adding the product of the 17-bit signed integer in m[1] and to the 17-bit integer in RS and adding the result into bits 0 to 32 of the ABC. With a suitable shift, the integer result can be placed in the senior 17 bits of A ready for storing in memory
Teleprinter and perforated tape for Input Output
EDSAC used 5-bit integers (0 to 31) to represent characters using two shifts: letters and figures.
In letter shift the codes 0 to 31 respectively represented: P, Q, W, E, R, T, Y, U, I, O, J, figs, S, Z, K, lets, null, F, cr, D, sp, H, N, M, lf, L, X, G, A, B, C and V.
In figure shift the encoding was as follows: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ?, figs, “, +, (, lets, null, $, cr, ;, sp, £, ,, ., lf, ), /, #, −, ?, : and =.
In these tables, figs, cr, sp and lf denote figure shift, carriage return, space and line feed, and on the paper tape perforator their keys were labelled π, θ, φ and ∆, respectively.
In this document, these codes correspond to the ASCII characters #, @, ! and &.
The paper tape reader complemented the high order bit of each 5-bit character, so the rows , and are read as codes 0(P), 7(U) and 27(G), respectively. The machine could read paper tape at a rate of 50 characters per second and output to a Creed teleprinter at nearly 7 characters per second.
The 5-bit Murray code used by the paper tape perforator and teleprinter was somewhat different to the familiar ascii character set.  I have tried to sumarise it here
Table 2 Edsac Character and Instruction Codes

Letter Shift  Figure Shift  Binary  Decimal   Instruction
P             0             00000      0      PLACE
Q             1             00001      1
W             2             00010      2
E             3             00011      3      JUMP IF EQUAL
R             4             00100      4      RIGHT SHIFT
T             5             00101      5      TRANSFER & CLEAR
Y             6             00110      6      ROUND
U             7             00111      7      UPDATE & !CLEAR
I             8             01000      8      INPUT
O             9             01001      9      OUTPUT
J                           01010      10
#             FS            01011      11
S             "             01100      12     SUBTRACT
Z             +             01101      13     STOP
K             (             01110      14
DEL           LS            01111      15
NULL          .             10000      16
F             $             10001      17     VERIFY
@             CR            10010      18
D             ;             10011      19
!             SP            10100      20
H             £             10101      21     COPY TO MULT
N             ,             10110      22     NEGATE THROUGH MULT
M             .             10111      23
&             LF            11000      24
L             )             11001      25
X             /             11010      26     NO OP
G             #             11011      27     JUMP IF GREATER
A             -             11100      28     ADD
B             ?             11101      29
C             :             11110      30     COLLATE (AND)
V             =             11111      31     ADD THROUGH MULT

Notes

1 Erase (DEL) is represented by an asterisk (“*”) in the simulator. When this character is output, it sets the
teleprinter into letter shift.
2 Blank tape is represented by a period (“.”). This character has no effect on output.
3 The personal computer text environment has only a “newline” character. On the Edsac simulator,
the line-feed character is interpreted as a newline character, and carriage returns are thrown away.
4 The symbols ?, f, ? or p are typed as @, !, & and #, respectively.

Some capital alphas unused B,D,J,K,M,Q,W
Posted in Uncategorized | Leave a comment

An EDSAC Simulator using SIMPL

In September 2017, in Hebden Bridge – as part of the Wuthering Bytes Digital Festival, will host the EDSAC Challenge, in which we will attempt to create the 1949 Cambridge EDSAC computer  as a softcore processor using an FPGA programmed using the IceStorm open source tool chain.
EDSAC was a simple architecture with just 18 instructions and used an 18-bit word length. It could address up to 1024 words of mercury delay line memory and had an instruction cycle time of about 1.5mS but represented a 1000 fold speed up in calculation times compared to contemporary hand cranked mechanical desk calculators.
EDSAC has not only historical significance, as it defined the start of the British Computer Industry, but it was also the first general purpose machine designed for real everyday scientific computation.
It’s simple architecture is relatively easy to study and there is a simulator program that runs on any current platform – thus it is a good introduction to Von Neuman “load-store” computer architectures.
My interest lies in the fact that it has so few instructions – and thus could be termed a minimal instruction set computer or MISC, which is a current fascination of mine.
My contribution to the ESAC Challenge, is that I wish to port my SIMPL language toolkit to the 1024 words of EDSAC memory, to see whether it can run on such a minimal architecture, and provide an interactive computing environment.
With just over 4 months available before the EDSAC Challenge – it’s time to get started.
The EDSAC Challenge is to be held at Hebden Bridge Town Hall on 7th and 8th September 2017 – immediately following the 2 day Chip Hack FPGA course.  These events are all parts of the Wuthering Bytes Digital Festival that runs from Friday 1st September to Sunday 10th September inclusive.
The next post looks at the EDSAC instruction set – and some of the challenges presented by working with 1940’s teleprinter code sets.
Posted in Uncategorized | Leave a comment

Exploring the J1 Instruction Set and Architecture using SIMPL

James Bowman developed the J1 Forth CPU in 2010 – and presented a paper describing its architecture at Euroforth  that same year.  It’s a stack based processor – capable of executing Forth primitives mostly in a single cycle. It’s also an ideal candidate to explore stack machine and MISC architectures – and has been ported to FPGAs using entirely open source tool chains.

This ongoing project was inspired by ideas first discovered in the book “The Elements of Computing Systems”  – and the associated course “From NAND to Tetris”.   Having ploughed easily through the first chapter on hardware, I realised that my software skills were dreadfully lacking – so I thought I’d create a simple language  (SIMPL) from scratch and get it to run on an open source FPGA platform  – myStorm – using Clifford Wolf’s Project IceStorm open source tool chain.

In this post I explore the J1 ISA (Instruction set & architecture) and hack together a simple cross-compiler to execute code on a simulated J1 processor.

 The J1 Architecture.

Having read James Bowman’s J1 Paper several times over, I managed to build up a simplified model of his instruction set and architecture.

J1 uses 16 bit long instruction words – where each word is divided into different length fields. The following 3 images – taken from James’s 2010 EuroForth paper – explains the architecture:

J1_ISAJ1_encodingJ1 ALU Codes

Bits 15, 14 and 13  define the Instruction Class – and there are 5 classes of instruction

1 x x   Literal

000   Jump

001    Conditional Jump

010    Call

011     ALU instruction

Bit 12 if  set in ALU mode provides the return from subroutine mechanism by loading the top of the return address stack R into the PC.

Bits 11,10,9 and 8  define a 4 bit ALU opcode – allowing up to 16 arithmetical, logical and memory transfer instructions.

Bits 7,6,5 and 4 are used to control the data multiplexers – so that data can be routed around the cpu according to which of these bits are set.  Here lies a little anomaly with the J1, in that Bit 4 is not used, and it would seem more logical to use it to provide the return function bit – currently done in bit 12.

Bits 3 and 2 define how the return stack pointer is manipulated in an instruction. It can be incremented or decremented by setting these bits.

Bits 1 and 0 define the manipulation of the data stack pointer  – it has a range of  +1,0,-1,-2 depending on the setting of these bits. To pop off the stack you subtract 1 from the stack pointer, to push on the stack, you add one to the stack pointer.

As the various control fields of the J1 instruction exercise different parts of hardware – they can operate in parallel – so for example a return or exit from subroutine can be had for free.

Modifying the J1 Instruction Set.

Whilst the J1 instruction set is neat and compact – the anomaly in Bit 4 is a bit of a sticking point with me.  If we put the “return bit” into the Bit 4 field, this would free up the Bit 12 field.  Bit 12 could then be used to define more classes of instruction, or better still added to the 4 bit ALU instruction field, to make the number of op-codes 32 rather than 16.

If there were two J1 cores acting together in a single FPGA – bit 12 could be used to channel instructions to either core. One J1 could be used for computation and the other could be used as a graphics processor or similar specialised co-processor.

James has already done some changes in this area – moving the return bit to Bit 7 – plus some other internal improvements – and has subsequently released a J1b – which can be instantiated as either a 16 bit or 32 bit Forth cpu, whilst the J1a is a minimum footprint cut down 16-bit J1 with *K memory to fit into small FPGAs – like the Lattice IceStick . See James’s Github for demo. The Verilog fr the new J1b  – is here

For more information about possible modifications to the J1 – read Victor Yurkovsky’s excellent post in “FPGA Related”.  Some neat ideas for those that want to tinker with the internals of the J1.  As the J1 original design verilog is so neat and clean, it’s easy to make small incremental changes and test them out for usefulness on the J1 simulator. Victor’s spring clean freed up two bits of the instruction for future expansion and reduced the number of CPU slices from 150 to 76 – allowing the scope for multiple J1 instantiations in a single small FPGA.

 

Creating a Simulator and SIMPL to J1 Cross Compiler

I came across a very compact simulator for James Bowman’s J1 cpu – written in about 100 lines of C. I decided to port this across to an ARM based Arduino Due, so I could use it to explore the architecture further, as a step on the road to implementing a J1 on a myStorrm FPGA board.

I have put the Arduino code for this SIMPL cross-compiler/ J1 Simulator on this Github Gist

As an input the simulator expects 16 bit J1 machine code instructions in an array of RAM, and the bulk of the code is an execution model of the cpu which steps through this section of memory executing each instruction in turn. After each program step I chose to print out the main stack elements, the program counter, and a few memory cells.

However, the process of hand assembling J1 machine language needs patience and knowledge of the chip’s architecture,  and to write half a dozen native instructions to execute a simple loop counter was about as much as I would recommend for anyone. So I decided to investigate easier ways to automate this process, and hit upon the idea of a cross-compiler, using my SIMPL language and text interpreter framework to input the source code.

SIMPL uses single ascii character tokens to represent primitive machine instructions – I just needed to create a look-up table mechanism where these characters are decode into whatever the machine language of the host processor happens to be – in this case J1.  However it could be any other cpu – just by changing the contents of the look-up table, so a series of cross compilers could be written so that SIMPL could be hosted virtually (no pun) on any processor.

SIMPL exists as a virtual, stack oriented processor, hosted on real hardware cpu. More often the host processor is register based – such as ARM or MSP430 – but in the case of the J1 it is also a stack machine (albeit simulated on a register based ARM) . This means that the choice of the SIMPL minimal primitive instructions – are a very good fit to actual J1 instructions – with often a 1 to 1 match.  This makes a real J1 an excellent host for the SIMPL language – and it is this relationship that I wish to explore further.

In previous posts I have described SIMPL as a lingua franca – which allows the communication of programs and data between widely varying classes of computing machines.  Each machine just needs to host a SIMPL interpreter, and at the start of a communication exchange, the communicating machine needs to send a descriptive list of what the various words mean – in terms of the primitives that go up to make their definitions. Once this has been done, SIMPL becomes a very compact medium for transfering code, data and objects – with a code compaction of between 3 and 4 over other source code or object code techniques.

In Practice

I took the SIMPL text interpreter framework and mapped into it the J1 instructions, focusing initially on the stack, ALU and memory operations – listed below

SIMPL               Operation                J1 hex code

”                           DUP                                6081

‘                           DROP                              6183

$                         SWAP                              6180

%                        OVER                              6181

+                         ADD                                 6203

&                         AND                                6303

|                           OR                                   6403

^                          XOR                               6503

~                          INV                                6600

@                         FETCH                         6C00

!                            STORE                         6123

As can be seen from the above – they all begin 6xxx which means ALU operation – and the 2nd significant digit  is the 4 bit ALU opcode. The 3rd digit is the transfer destination of the ALU result, and the least significant digit is the stack manipulation bits.

The cross compiler produces a line of output thus – in response to a single ascii character being typed in  – in this case ^ for XOR

XOR 6503 PC=9 TOP=FFFF DS0=0 DS1=0 DS2=0 DS3=0 RTN=0 MEM20=0 MEM21=6F00 MEM22=0

This also shows the Program Counter PC, the Top of Stack plus the first three elements of the stack below TOP,  the value on the top of the return stack and three consecutive locations in memory.  This debug output is essential to ensure that the J1 simulator is correctly executing each new instruction – and a great visual insight into the working of the CPU.

The cross compiler is now producing output in J1 machine language – and because it uses the SIMPL front end text interpreter, this automates the process of putting text strings together so it can be used to compose small snippets of  J1 code and run them in an interpreted manner – allowing new words to be created from SIMPL primitives.

Future work will enhance the cross compiler allowing the larger program flow constructs such as DO-LOOP, IF-THEN and BEGIN-END  to be implemented. Whilst output from the simulator is currently just single lines of Debug – the program will be extended so as to compile into files of J1 opcodes, and SIMPL source.

 

“It’s Forth-like, Jim – but not as we know it….”

 

 

 

 

 

Posted in Uncategorized | Leave a comment

Making it Minimal – MISC Machines

As I read more about Forth, and engage with some of the World’s Forth-Thinkers, I get the impression that at some point Forth becomes addictive and productive.  I have yet to reach that level of enlightenment, but my journey continues  slowly, in fits and starts, as there  is always some other more pressing engagement on my metal capacity.

I have been for some time a proponent of minimal instruction set (MISC) machines – and Chuck Moore has been prolific in this area for the last 35 years or so.  In a progressive career in creating silicon integrated circuits that execute Forth primitives directly, he has slowly whittled away the complexity of these machines, to the point where 144 individual cores can be put on a single die – as part of his Green Arrays work.

The problem then becomes not so much a hardware challenge – but then one of software just how do you efficiently co-ordinate the tasks of 144 cores. Perhaps we have something to learn from the behaviour of bees or other colonial insects including ants and termites?

Each worker has a specific task to perform be it foraging for food, nursery duty or general housekeeping.  Each agent is running a low level behavioural model, with a higher level specialisation – based on the specific task or in response to certain stmuli.

This might be one way to organise a multicore processor  – a set of low level functions which are augmented by task or sensor specific application functions.

For the moment however, I first have to define the Instruction Set and Architecture (ISA) of my proposed Stack CPU.

Charles Moore, Chen Hanson Ting and Bill Meunch provided an important framework for MISC machines, and proved them to be an important family of cpus.  The small instruction set means simplified hardware, fewer transistors or logic cells and the ability to implement  using low cost FPGA technology.

With a minimal instruction set,  and a Forthlike language, it’s always possible to create more complex instructions by concatenating the primitive instructions.  Provided that you have a small number of useful instructions – any program can be synthesised.

Moore and Ting postulated that the minimum useful instruction set would be about 25 to 30 instructions, and this fits in well with using 5 bit tokens to represent instructions. Moore packed four, 5 bit instructions into 21 bit wide memory – and this formed a  very simple pipeline architecture, which could also be used to hold a 16 bit literal and a 5 bit instruction. This scheme was used in Moore’s ShBoom processor and also the Mup21.

The Novix NC4016 however used a different approach and used the bit fields of the instruction to directly control different parts of the hardware.

This was the same approach taken by James Bowman – with his J1 Forth CPU.  This is fully described in just 200 lines of verilog and been implemented on Xilinx and Lattice FPGA devices.

James’s device was influenced my Charles Moore’s Novix NC4016 architecture, and has been successfully incorporated into a retro gaming sheild for Arduino – Gameduino.  James is also a professional Silicon Architect, and the J1 is also used as a core on some graphic processor chips.

Using a microcode ROM it would be possible to decode 5 bit instruction tokens into multiple field wider instructions – and this would allow the simple pipelining scheme to be utilised.

The Virtual Instruction Set

Here’s a non-exhaustive list of instructions that would be useful candidates for a Stack oriented MISC computer. Most of these could execute in a single cycle on a FPGA based processor as run-time code,  but some of the program flow constructs would need to be compiled.

Stack Manipulation

DUP
DROP
OVER
SWAP
PUSH
POP
NOP
Arithmetic
ADD
SUB
MUL
DIV
Logical
AND
OR
XOR
INV
Memory & Register Transfer
FETCH
STORE
>R
R>
I/O
IN
OUT
PRINTNUM
KEY
EMIT
Program Flow
GT
EQ
LT
IF
THEN
DO
LOOP
CALL
RETURN
JMP
JZ

In total about 35 instructions,  but some savings could be made bringing it down to 32, which allows 5 bit tokens and packing of instructions into wider memory words

 This maps well to the 33 printable ascii symbols  (not alpha or numbers).  These are chosen for their strong mnemonic bias  – to form the basis of the SIMPL language.  With only 32 symbols to learn its like a shorthand version of Forth yet retaining its human readability.  With a compiler, there is no reason why the longhand version of the instruction cannot be included in the listing. It also reduces the source code from an average of  4 characters (including space) per instruction to just 1 character per instruction. This compaction of source code means that snippets of executable code can be transferred between processors in small packets – using wireless connectivity or even as SMS messages over the GSM network.

In the next post I’ll cover some aspects of mapping this minimal instruction set onto the J1 architecture, plus a look at some of the cross-compiler and simulator tools needed to allow further exploration of the MISC class of computing machines.

 

 

 

 

 

 

 

 

 

Posted in Uncategorized | Leave a comment

Making progress with SIMPL

If you have read any of my last few posts you will know that I am working on a tiny language called SIMPL which was inspired by Forth.

SIMPL stands for Serial Interpreted Minimalist Programming Language – as that’s what it started out in life as being – a serial command shell for microcontrollers, when a few simple single alpha character commands could be used to run simple programs and loops on microcontrollers such as Arduino and Teensy.

SIMPL was adapted from Ward Cunningham’s Txtzyme – a character interpreter running inside a loop, and over a period of a few years I have augmented the language to provide  more functions such as maths operators, logic operators, loops and conditional branching, and perhaps most importantly, and borrowed from Forth –  the means to compile new words using the “colon definition”.

SIMPL is a lightweight language, it does not try to even attempt to do all the things that Forth can do, and with a word set that has been reduced to a maximum of 85 unique functions, it is much smaller than amost all implementations of Forth.

In just 300 bytes, the heart of the SIMPL kernel can take serial text – either typed in manually or sent as a file from a serial terminal program, and buffer it into RAM memory. It can also take colon definitions, and based on their identifying character – normally a capital letter, store them at precalculated addresses in RAM so that they can be accessed and run later with the minimum of decoding.

This unique feature of SIMPL effectively removes the overhead of the dictionary search in Forth.  A character read from the input buffer is decoded within a few cpu cycles, and program execution is directed to that block of code.

Primitives.

In addition to the engine that drives the inner interpreter, there are a series of low level primitives that define the SIMPL kernel.  These primitive words are effectively the minimum wordset from which the rest of the language can be written. Bill Meunch, C.H. Ting and Chuck Moore all arrived at roughly the same conclusion – back in the early 1990s  – that there was a small subset of words from which a full implementation of Forth could be synthesised. It is in this spirit that SIMPL has been written.

These primitives include stack operators, math and logic operators, memory access operators and those that allow conditional branching and control program flow.

There are just 32 primitives – which allows for a 5 bit instruction scheme.  These primitives mostly consist of symbols taken from  ascii characters  0x20 to 0x40.

The primitives have been chosen to offer register compatibility with Ting’s MSP430 eForth.  This offers a proven route to extension and makes building and testing the code a lot simpler.

At it’s lowest level, SIMPL takes the place of a serial bootloader allowing source code to be loaded into the program space of the target microcontroller, using nothing more sophisticated than a serial terminal program such as terraterm.

SIMPL is written in MSP430 assembly language for efficiency, speed and compactness, and it has strong influences from Dr CH Ting’s eForth for MSP430.

As I read through Ting’s book  Zen and the Forth Language – I realised that it should be possible to make the kernel of SIMPL a cut down version of eForth, that resides in just 1k bytes of program space, and presents a programmer’s “tookit” in addition to the means to load source code into the device and run that code on a virtual machine.

Progress to Date

Inspired by CH Ting’s book on the MSP430 eForth,  I decided to restructure my rudimentary code to make it compatibe with his register model.  This will allow for easier expansion later on.

Jump Table

 I have included a “jump table” to jump to the code addresses of all of the 96 possible ascii character instructions. This framework will allow me to quickly add the other functions and run them from the code addresses stored in the jump table.

Whist this jump table in itself is 192 bytes long, just over one quarter of the code so far, it reduces other overheads of instruction decoding. The jump table structure is not going to be as fast as direct threaded code, and this is the price we pay for using a tokenised instruction set – but it is easier to understand and extend with more functionality. In hardware, this jump table would be analagous to a microcode ROM for instruction decoding.

The jump table accepts an index value  in the form of a single ascii character. This is first copied to a temporary register as we need the original character later. Then we subtract 32 from this  – to remove the offset from zero – as the first printable character is “space” which is ascii 32. Finally we multiply whats left by 2 (by adding it to itself) and add this to the program counter.  This causes the pc to be incremented by the correct number of words into the table where we find the code address of the function we wish to run. A snippet of this below shows how this selects from the first 5 instructions

; Now we need to decode the instructions using a jump table
; Jump table uses 2 bytes per instruction - so 2 x 96 = 192 bytes

next:     mov.b @ip+,R12     ; Get the next character from the instruction memory
          mov.w R12,R13      ; Copy into R13 - as needed to decode Jump Address
          sub.w #0x0020,R13  ; subtract 32 to remove offset of space
          add.w R13,R13      ; double it for word addressing
          add.w R13,pc       ; jump to table entry

JumpTbl:  jmp space          ; SP
          jmp store          ; !
          jmp dup            ; "
          jmp lit            ; #
          jmp swap           ; $

 

In this way all of the 96 possible ascii tokens are handled, including numbers – which have their own decoding routine, also accessed via this jump table.

 

Primitive Progress

Most of Ting’s set of eForth primitives have been coded in now – and I can do simple arithmetic, logical operations and memory 16 bit fetch and store.

The code to enter and decode decimal numbers and print them out to the terminal is also in place and working. – this allows me to type in 123 456+. and the machine responds 00579, as I have yet to suppress leading zeroes on my printnum routine.

I have tested memory fetch and store, and also the conditional operators < = and >.

With the conditional operators returning either 0 for false or 1 for true, the next thing to implement will be the simple looping structures, and the mechanism that allows conditional execution of code.

These structures make use of parentheses ( ), and a single parameter on the stack which is passed into the loop counter

Any code to be conditionally executed is paced between the parentheses.  The loop counter defines how many times the code within the brackets should be executed –

0   not at all  – skip everything between the brackets until the end ) is detected

1   execute once – i.e use for conditional execution

n  execute n times – decrementing the loop counter until it is zero

This is a useful structure, as as well as conditional execution and DO-LOOPs  it can also be used for comments – by setting the counter to zero, anything between the brackets is not executed – so the text characters can be treated as a comment.

I am fairly confident that I can keep the kernel below 1024 bytes, with the higher level routines built on top of this 1K kernel. The codesize is now 738 bytes.

I have created a github gist to allow you to view the code so far – as this is clearly a work in progress.

 

 

 

 

 

 

Posted in Uncategorized | Leave a comment

More Thoughts SIMPL and eForth

 

Image result for CH ting zen

I recently downloaded Dr C.H. Ting’s book via Kindle “Zen and the Forth Language” – published by fellow Forther, Juergen Pintaske.

In the book, Ting describes in detail his eForth model ported to the MSP430, and how a complete direct threaded Forth can be built using a limited set of primitives.

Ting’s book is very comprehensive, giving details of how to use TI’s Code Composer to assemble a program and create an output file in a form that can be used by the 4E4TH IDE.  There is sufficient information in his book for any curious person to discover the inner workings of Forth and have a complete working system on the inexpensive MSP430 Launchpad.

Whist at Forth Day, I had the opportunity to talk with Ting and this was the motivation I needed to take the plunge with MSP430 assembly language. Ting’s book, and it’s detail of the eForth mode convinced me that I was on the right lines with my approach to SIMPL – indeed almost to the point where my implementation of SIMPL could be considered to be a sub-set of Ting’s eForth.

Having got the basics of the SIMPL interpreter running in assembly language, the plan is now to massage the allocation of the registers, so as to make it compatible with Ting’s model. Then SIMPL really will be a simplified version of eForth.  This means that if I ever choose to extend SIMPL at a later date – there is a clear and proven route via Ting’s eForth.

SIMPL removes the dictionary search overhead that otherwise exists in Forth. The “words” are just singe ASCII characters, and it is easy to jump to an address, calculated from that value. Whilst this might appear wasteful of memory, the ultimate  MSP430 target has 256Kbytes of FRAM to play with.  It massively reduces the overhead of decoding text strings.

Can we switch between SIMPL and Forth?

SIMPL source code  can be written in such a way that it can be expanded into standard Forth, using the SIMPL interpreter to automatically generated this more verbose form.  The SIMPL primitives can all be expanded out to their conventional Forth names, as can the internal words associated with letters a-z and user words A-Z.

Conversely, small eForth application programs could be analysed for their word content and mapped onto the primitives and user symbols used in SIMPL.  Provided that only 52 user defined words have been used in writing the application – then the switch between Forth and SIMPL should be relatively straightforward.

SIMPL is essentially a shorthand or shortform of Forth. The two should be designed in such a way that they are interchangeable. SIMPL is just a subset of Forth – simplified!

 

MUP21  & eForth Primitives 

From Ting, Dr. Chen-Hanson. Zen and the Forth Language: eFORTH for the MSP430 from Texas Instruments (Kindle Locations 406-413). Juergen Pintaske. Kindle Edition.

Here’s a list of the instruction set of Chuck Moore’s MUP21 Forth cpu. Ting compared his 31 eForth primitives with these and this confirmed that they were thinking along very similar lines.

Transfer Instructions JMP, JZ, JC, CALL, RET, LOOP

Math/ Logic Instructions AND, OR, XOR, NOT, SHR, ADDC

Memory Instructions LDR, STR, LIT

Stack Instructions PUSH, POP, DUP, DROP, SWAP, OVER, NOP

These are also very similar to the instruction set of James Bowman’s J1 Forth processor, which is the intended FPGA target for my SIMPL language. So my plan is to have a maximum of 32 primitives – which can be represented by a 5 bit instruction, and three of these instructions can be pipelined into a 16 bit wide memory – again an idea borrowed from Chuck’s MUP21.

However, the J1 uses a 16 bit wide instruction, where the bit fields directly control the hardware. In my proposal, there will need to be a further level of decode – effectively a 5 bit to 16 bit ROM, which generates the J1 instructions from the 5 bit instruction token. This is the price we pay for having a token set that has been chosen for human readability – it needs a further level of decoding to suit the ISA of the Forth processor.

We can of course model the instruction set and architecture  of the J1 in MSP430 assembly language, and this might be a useful step to do.  In effect we will have created a cross assembler, where SIMPL code is assembled into J1 instructions. Whilst slower than running SIMPL directly in MSP430 assembler, it would be useful to prove the development of the language on the J1 – all be it simulated in code. When run directly on the J1, there will be an order of magnitude speed up.

Uses of SIMPL.

Writing SIMPL has been an education for me, as I have learned about the inner workings of virtual stack based machines, and how they are created within a conventional register based processor.  A learning exercise for me, can equally be a educational process for others, so I see one of the uses of SIMPL is as a learning tool to teach students about the instruction set and the architecture of simple stack processors. In the same theme as “From NAND to Tetris”  “Building a SIMPL Stack Processor” could be offered in the form of an online study course, consisting of theory, implementation on the MSP430 Launchpad, or Juergen Pintaskes MicroBox. For some, learning electronic engineering and hardware description languages –  the implementation in a dedicated soft core processor on an FPGA woud aso be a useful earning exercise.

At it’s heart, SIMPL is a text interpreter, and can be used to process text files. Many of our modern manufacturing processes, such as pcb production, CNC machining, 3D printing and laser cutting transfer their information using fairly simple text file formats – such as Gerber and G code.  These files consist of numbers and alpha characters which are translated into sequential machine operations. SIMPL can be used as a text processor to convert these files between standards, or to generated new compatible formats – which may be of interest to the open source and maker communities.  Animating a walking robot, or fying a drone through a sequence of aerobatic manoeuvres are both things that could be translated into SIMPL text files.

In a later post I will describe how SIMPL source code can be used as a lingua franca for bi-directional  exchange of information and program code between radically different classes of computing hardware – from the humble Arduino or Launchpad to the fastest laptop or Octa-core Android platform. Laptops and other mobile computing devices are convenient viewing platforms because of their high performance graphics capabilities, whist small microcontrollers generally struggle to produce anything other than a serial text output.

However if this serial text was effectively a list of drawing commands for graphics primitives – the mobile platform, running a SIMPL virtual machine (written in iForth or Processing for convenience and portability) could interpret the “display list” and produce the fully rendered graphics display from it. SIMPL can send serial text to a Laptop or tablet at some 200,000 bytes per second – perfectly fast enough to render images for datalogging, oscilloscope or IDE type applications.

 

 

Posted in Uncategorized | Leave a comment

SIMPL – a Small Forth Inspired Language

SIMPL really is a very small language. The interpreter and primitives kernel fit into less than 1Kbytes on the MSP430.  It can be ported to virtually any microcontroller that has a reasonable number of registers. SIMPL is like Forth – but simplified!

SIMPL was inspired by Ward Cunningham’s TXTZYME – a minimal language written in C that allowed a microcontroller – such as Arduino or Teensy to perform basic operations, such as printing text, looping, toggling port pins and reading an ADC all controlled from a serial terminal. It only had 13 instructions and was quite limited in what it could do.

I was intrigued by TXTZYME and how it worked –  and quickly realised that it could easily be extended to include math and logical operations, conditional execution and like Forth could have new words created from it’s set of low level primitives.

In the last 4 years, as inspiration leads me, I have developed SIMPL much further and ported it to other microcontrollers such as MSP430 and ARM M4. There are now 400MHz ARM M7 microcontrollers available – and porting SIMPL to these monsters should be relatively SIMPL.

This year – taking my inspiration from a recent trip to Forth Day at Stanford University – with Forth cpu expert James Bowman, and eForth guru Dr. C.H. Ting – I have sought to write SIMPL in MSP430 assembly language, as a learning exercise, and so I can experiment further with the mechanics of the language. All I’m doing is following Chuck Moore’s journey,  but 50 years later. I just want to create the tool set that makes programming computers easier and more productive.  I don’t want 20 million lines of code just to run an operating system. I want the machine to spend all it’s time on my application!

I see SIMPL as a universal “Shorthand” allowing us to write code at a fairly low level for a wide variety of microcontrollers – at least for simple applications.  I think of it as a debugging toolbox, or Smart Bootloader, that once installed on a microcontroller, grants you the ability to load code into memory,  communicate with the mcu at a fairly basic level, execute code and examine the contents of memory or print out results. It also gives the means to exercise the mcu peripherals – such as GPIO, ADC, UART, SPI etc.

Just like Chuck Moore, I don’t want to learn a whole new set of assembly language for every new processor I take on.  I want to put a version of SIMPL (coded in C or Arduino C++) on the chip, and gain a familiar set of basic tools.

My goal is to produce an extensible language that fits into just 1k bytes – which is effectively an “Access All Areas Pass”  for the mcu.

The MSP430 has been chosen because it is a low-power 16 bit processor and newer devices have up to 256K bytes of nonvolatile FRAM memory.  MSP430 also have some neat features – like 24bit ADC, and very fast UART allowing serial communications at up to 8 Mbaud.

Once the SIMPL virtual machine  has been written in MSP430 assembly language – it is a relatively easy task to port it to any other modern register based processor – including ARM, x86 etc.  But for now I am just finding my way around the MSP430 assembly language – which will suffice as a very capable virtual machine.

Ultimately, the plan is to port SIMPL to it’s own custom stack processor – such as James Bowman’s J1, based on a soft core running in an FPGA. Most of the primitive instructions are executed directly by the J1 hardware, so the port will consist of decoding the primitives from ascii using a look up table and generating  the 16 bit wide instructions that the J1 uses.

What I see SIMPL useful for:

It’s a low overhead tookit that allows you to explore the inner workings of microcontrollers.

It uses many of the ideas of Forth, but without getting bogged down with the Forth virtual machine. Forth could be considered to be an extension of SIMPL, and learning SIMPL teaches you a lot of what you need to know for Forth.

It allows the newcomer to explore a microcontroller with just a serial terminal. Ideal as an educational language, showing what can be achieved in a very low byte count on virtually any microcontroller.

Very compact – a 140 character tweet or SMS could do a lot in SIMPL

Automates the process of getting code to run on a new microcontroller – a useful alternative to the traditional bootloader – with lest than 1K of overhead.

Fast – all primitives are written in assembly language. Instructions are decoded in logic or by look-up table – so reduces the overhead of a dictionary search.

Uses a very simple syntax, many machine tools, 3D printers, laser cutters could be controlled from text files that are essentially lists of SIMPL instructions.

SIMPL has just 32 primitive instructions – based on the non-alphanumeric symbols used in ascii. Lower case alphabetic characters are used for function calls to code that exercises processor specific hardware, or for examining the contents of memory.  The uppercase characters are reserved for user defined words – created from primitives and lowercase words.

SIMPL is a highly mnemonic language – where the choice of character reflects the operation being performed.  This is not a new idea – it was used on the very first Cambridge – built EDCSAC machine – where instructions were loaded in from paper tape. SIMPL pays homage to this machine and the simple way in which things were once done.

SIMPL is Forth like in that it uses words or tokens that cause blocks of code to be executed sequentially. It differs from Forth in that the words consist of single, printable ascii characters that are decoded directly causing the cpu to call a routine at a given address, execute the subroutine found there and then return control to the inner interpreter.

There are 85 printable characters – so this defines how many unique codes the machine will interpret as instructions.

The inner interpreter is very compact on the MSP430 with the main interpreter loop fitting into about 300 bytes.

On top of this interpreter loop, you have the code for the low level primitive symbols – listed below, and then higher level words – such as “toggle a port pin” “n milliseconds delay” “read an adc channel” “output a string to serial port” “produce a hex dump of memory” which are represented by the lower case alphabet characters.

This layer of the language is what I refer to as the “Arduino Layer” – all those useful helper functions that are processor specific.

The next layer is the users application code where the users words use capital letters. For example – the classic washing machine program:

FILL
WASH
EMPTY
FILL
RINSE
EMPTY
SPIN

This would be condensed in SIMPL as FWEFRES
– which is a code reduction factor of >5

But each of these can have a parameter – for example if the WASH is an AGITATE cycle A

– In Forth

: WASH 50 DO AGITATE LOOP;

In SIMPL we also use a colon definition to define W. There is no need for the ; as when this snippet is entered it has a null terminator added, and this is the cue to the interpreter to return to fetch the NEXT word

:W50(A) the code in the parenthesis is repeated 50 times

In fact the ascii code for W is used to store this code snippet at a given address – calcuated as (W-61)*32. So as soon as the interpreter encounters W, it jumps to that code address

Most of the primitive Forth words are located in the ASCII characters 32 to 63 that leaves 64 to 126 available for the users vocabulary and constructs made from primitives.

There are approximately 32 printable ascii punctuation and symbol characters – and with a bit of pre-processing they can be used to form the basic machine primitives.

Stack Commands (7) PUSH, POP, DUP, DROP, SWAP, OVER, NOP

,        PUSH
.        POP
”       DUP
‘        DROP
$        SWAP
%        OVER
SP        NOP

Maths & Logical Operations (9) AND, OR, XOR, NOT, SHR, ADDC

+        ADDC
–        SUB
*        MUL
/        DIV
&        AND
^        XOR
`        SHR
|        OR
~        INV (NOT/COMPL)

Transfer Instructions (8)

:        CALL (COMPILE)
;        RET
(       LOOP-START
)        LOOP-END
<       LESS
=        EQU
>        MORE
\        JUMP – condition follows

IN/OUT (2)

?        KEY (INPUT)
.        PRINT (OUTPUT)

Memory Instructions (3)

@        LOAD
!        STORE
#        LIT

Others (6)

_ _        String Print
[ ]        String Store
{ }        Array of elements – switch/case

The use of single ascii characters as tokens, makes the language a lot less verbose than Forth, and snippets of source code can be sent in very few characters – for example as an SMS message between 2 systems equipped with GSM modems.
It is a fairly simple task to have a word table – where the verbose form of the words are stored, and can be printed out, to expand the source into something more readable – for example

“‘% -> DUP DROP OVER

Three ascii characters expanded to 11 plus 3 spaces

SIMPL does not use the space as word separator, but in the special case where you have several numbers to put on the stack, the space is used to indicate “push onto stack”

The intention is to write an implementation of SIMPL in MSP430 assembly language. This not only helps me learn MSP430 asm code, but it’s a good exercise in creating a SIMPL virtual machine using a very conventional register based microcontroller, to gain experience of the language and its limitations. For example, have I got the right mix of primitives to allow a complete language to be synthesised?  The choice of primitives was based on Chuck Moore’s MUP21 instruction set and CH Ting’s e-Forth model.

Writing in assembler exposes you to the raw roots of the processor, where you have to think hard about what you want the code to do, and always be prepared to rewrite and refine every routine.  Having ported an application to one register-rich processor, porting to a new device like an ARM is much easier, as you have already done the hard work of developing the register functional model.

As the microcontroller is running the SIMPL machine, the SIMPL primitives should give access to the widest range of the host’s instruction set, but with the simplification that the stack takes the place of having multiple registers to write to.  This is further simplified that the top and next locations on the stack are stored in registers rather than RAM.

As the VM needs to perform tests and comparisons on numbers – the usual comparison operators are provided  > , < and  = .

These are used in conjunction with the parentheses operators which provide the means of skipping or looping sections of code depending on the result of the comparison.

As an example     10 11>(_Print if Greater_)

 

 

 

 

 

 

 

 

 

 

 

Posted in Uncategorized | Leave a comment