A simulator for “mini-EDSAC” coded in C

Back in May, I entered into correspondence with Bill Purvis, who is involved with the recreation of EDSAC at The National Museum of Computing, at Bletchley Park.

Bill has created a simplified model of EDSAC, which runs on a 1K Lattice iCE40 IceStick FPGA.

Here are the main differences between his mini-EDSAC and the original:

Comparison between EDSAC and Mini-EDSAC

Mini-EDSAC is a simplified version of the EDSAC instruction set, suitable for implementation on machine with limited capacity. The main differences are listed below:

  • EDSAC uses 35-bit words, Mini-EDSAC uses 16-bit words

  • EDSAC packs two 17-bit orders into a word, Mini-EDSAC has a single 16-bit order in each word.

  • EDSAC uses the 5-bit tape code devised at Cambridge, Mini-EDSAC uses 8-bit ASCII coding, but reduced to 5-bit for the order codes.

  • EDSAC has multiply-and-add and multiply-and-subtract orders which take the content of a multiplier register, multiply it by the operand, and add or subtract it into the accumulator. Mini-EDSAC simply multiplies the accumulator by the operand, placing the result in the accumulator.

  • The EDSAC accumulator is double length 70 bits, the Mini-EDSAC accumulator is 32 bits.

  • The Mini-EDSAC shift orders use the numeric value of the address part to specify how many bits are to be shifted, EDSAC has a more subtle interpretation.

  • EDSAC handles long (35-bit) and short (17-bit) operands, Mini-EDSAC only handles 16-bit operands, hence there is no need for the Long/Short bit in the order format, reducing it to 16-bits.

  • EDSAC input order takes 5-bits from the reader, Mini-EDSAC takes all 8 bits from the ASCII-coded input.

  • EDSAC outputs the top 5 bits of the operand to the printer, Mini-EDSAC outputs the bottom 8 bits of the accumulator.

In general, the appearance is much the same, operation codes match (apart from the encoding). The input programs look much like those of EDSAC programs, using the Initial Orders 1, except the last character of each order is not required.

One noticeable difference is the use of @ in pseudo-orders. EDSAC programs used P as that is encoded as zero. @ maps to zero when the top 2 high bits are removed.

The initial orders are slightly longer – I have not been able to condense them into 31 orders, programs load from address 37 onwards. I have not yet attempted to replicate David Wheeler’s Initial Orders 2!

Bill Purvis, 25th May, 2017

Bill’s mini-EDSAC was the starting point for my simulation – written in Arduino C.  It’s relatively straight forward to get a modern microcontroller to emulate these early machines – by recreating their instruction set in the form of a switch-case statement in C.

This code uses an array of 16-bit locations in RAM  m[n]

A is  the Accumulator

It masks off the instruction field “insn” –  as the top 5 bits

It masks off the address field “n” as the lower 5 bits

 

 //--------------------------------------------------------------------
 // mini-EDSAC CPU Model
 // -------------------------------------------------------------------
 
 static void execute(int instruction) // This is the EDSAC CPU model
 
 // The Alpha code is held in the top 5 bits of the instruction
 // The bottom 11 bits hold the address

 {
 insn = (instruction & 0xF800)/2048 ;
 _pc = pc + 1;
 
 n = (instruction & 0x7FF); // n is the address field - lower 11 bits
 
 switch (insn) 
 {
 
 case 0: A += m[n]           ; break; // ADD A
 case 1: A -= m[n]           ; break; // SUB B SuBtract
 case 2: A += (m[n] & R)     ; break; // COL C - logical AND
 case 3: m[n] = A            ; break; // DEP Deposit D -no clear
 case 4: if(A>=0) {_pc = n; }  break; // JGT E
 case 5:                       break; // VER F
 case 6: if(A<0) {_pc = n; } ; break; // JLT G
 case 7: R += m[n]           ; break; // CPY H - Load R register
 case 8: m[n]=uart_getc();     break; // INP I - get char from UART
 case 9: A = A>>n            ; break; // RHS J - Right Shift
 case 10: m[n] = A; A=0      ; break; // TRC K - Transfer and clear
 case 11: A = A<<n           ; break; // LSH L - Left Shift
 case 12: A += (m[n] * R)    ; break; // ML+ M - Multiply and ADD
 case 13: A -= m[n] * R      ; break; // ML- N - Multiply and SuBtract
 case 14: uart_putc(m[n])    ; break; // OUT O - Output character to UART
 case 15:                      break; // PUT P - PUT constant in memory

// These cases >= 16 are not used on 4-bit instruction model

 case 16:                      break; //     Q - Not Used
 case 17: A = m[n]>>1        ; break; // RHS R - Right Shift
 case 18: A -= m[n]          ; break; // SUB S - Subtract
 case 19: m[n] = A; A=0      ; break; // TRC T - Transfer and Clear Acc
 case 20: m[n] = A           ; break; // UPD U - Transfer - don't Clear
 case 21: A += (m[n] * R)    ; break; // ML+ V - Multiply and ADD
 case 22:                      break; //     W - Not Used
 case 23:                      break; // NOP X - No Operation
 case 24:                      break; // RND Y - Round Accumulator
 case 25:                      break; // END Z - Stop and ring Bell!
 case 26:                      break; 
 case 27:                      break; 
 case 28:                      break; 
 case 29:                      break; 
 case 30:                      break; 
 case 31:                      break; 
 
 }

// Note - the most useful instructions crammed into top 16 cases
// to allow for 4-bit instruction selection
// So those below "P" are moved up to fill vacant slots 
// At "B", "D", "K", "M"
// Model can be extended in future for 5-bit instruction codes

pc = _pc;
 instruction = m[pc];

next_t = _t;

}
 
 
 
 // End of CPU model
 // ------------------------------------------------------------------

Posted in Uncategorized | Leave a comment

SIMPL on EDSAC

SIMPL on EDSAC

When you start coding on a new processor architecture, you always have the initial problem of how to get some sort of a bootloader into memory, to allow you to then load your application program into memory.

Originally this was done manually using toggle switches, to set up the data word for a given address and then write that bit pattern directly into memory, incrementing the address each time. Naturally this bootloader had to be as short as possible – as toggling long sequences of raw data into memory is not only tedious but prone to errors.

The engineers who devised those early bootloaders were masters of minimalism – with EDSAC requiring only 31 instructions – a program called “Initial Orders” which was fed into memory using an arrangement of hard-wired uniselectors. A uniselector was a piece of telephone switch gear – a relay operated “stepping switch” which could be used to generate a fixed output pattern for a given input – effectively a ROM.

With “Initial Orders” loaded, EDSAC could then be loaded from a teletype paper tape reader – using 5-bit teletype code to represent instructions, numbers and characters.

Having pre-packaged routines to assist with the coding of a new processor  is always useful – and being able to take this “toolkit” of known, working routines from processor to processor – makes life so much simpler.  This was what Charles Moore (the inventor of Forth) effectively did – in the 1960s, when he was coding on many different mainframe architectures. It was through the need to have a portable language – regardless of the architecture of the machine, that Moore’s toolkit eventually evolved into Forth –  How this was achieved is described in his article – “Forth – The Early Years”

Inspired by Chuck Moore, I wanted to develop my own language framework, but I needed something to start me off in the right direction.  That incentive came at a talk I attended in May 2013 – where I learned of a very simple interpreter called TXTZYME – by Ward Cunningham.

Inspired by txtzype, I began to  develop a programming framework called SIMPL. (serial interpreted minimal programming language) – which I have ported to a variety of microcontrollers including MSP430, ARM and AVR.

SIMPL is best described as a character interpreter running in a while loop, with routines to handle numerical input and numerical output.

SIMPL interprets a string of characters, and performs actions based on the characters. It is one level above machine language, providing a framework and environment that is independent of the specific details of the cpu. In theory, once you have ported SIMPL – you have a common machine interface – and no need to deal with the quirky machine code anymore!

Porting SIMPL to EDSAC

Porting SIMPL to a new cpu architecture is an ideal way of gaining an understanding of that architecture and it’s machine code.  As EDSAC has very few useful instructions, I viewed the porting of SIMPL to EDSAC as something of a coding challenge.

Fortunately it has ADD and Subtract, and these can be used with the conditional jump instructions for branching and looping. There is also AND, which can be combined with a subtraction from FFFF to negate the bits to create NAND and from NAND we can generate OR, NOR and XOR.

The ability to string primitive instructions together as short sequences to make compound ones is something that can be readily automated by the SIMPL interpreter.

As I stated in the earlier post – it seems unusual to have a hardware multiplier on such an early machine – but Maurice Wilkes predicted that EDSAC would be used for real mathematical computation – so the multiplier was purposefully included right form the start.

The other problem is that the chosen target board – an MSP430 with just 512bytes of RAM has to host the version of SIMPL running on the virtual EDSAC machine – which means cramming a meaningful version of SIMPL into about 200 instructions – which are 16 bit words.

The point of this exercise is to make programming the EDSAC model a lot easier, than raw machine language, and also to get a feel for how a “language” can be created from an instruction set that lacks many of the basic addressing modes.

Bite the Bullet – Itch that Scratch – and so in two days of spare time whist on vacation, an EDSAC simultor was coded on Day 1 – and large chunks of SIMPL were coded on Day 2.

A Model of SIMPL.

SIMPL deals with character input and character output. This is handled with the get_char and put_char routines which communicate with a serial terminal program such as TeraTerm.

Every printable ASCII character is treated as a command. This needs a jump table of 96 entries to be able to jump to any code body address, having received a character from the serial input routine.

Numbers are treated slightly differently by a routine called “number” which reads each character in turn and assembles a 16-bit integer from the given input string.

Similarly there is an output routine “printnum” which takes a 16-bit number off the stack and generates a printable string from it.

In addition to these three routines, there are the action  routines associated with each command.  These consist of a basic set of primitive instructions – mostly represented by punctuation characters and symbols. Higher level functions are represented by lower case alpha characters, and upper case alpha characters are used for User Definable words. In this respect – SIMPL is a bit like Forth, but lacks the dictionary structure.

EDSAC has a pair of Jump instructions – but it lacks a subroutine Call and Return – as these devices were not invented until 1952 – 3 years after EDSAC first ran.  So EDSAC will have to rely on a Jump to the code body for a given character, and then a Jump back to the code that fetches the NEXT character from RAM.

SIMPL needs about 1Kbytes of program memory on the MSP430 – it’s about 250 instructions for the basic kernel.

The EDSAC instruction set is indeed very basic when compared to MSP430 – and so new methods will need to be devised to get the program structure to work. EDSAC code is entirely stored in RAM – so there is the means to self modify code (scary) – but think of this as a means to an end to create the necessary jump table structure needed by SIMPL.

EDSAC only has about 16 or 18 instructions – and they were represented by capital letters – making the assembly language a lot easier to read. I chose the top 16 instructions and wrote an EDSAC simulator in C to allow me to experience this lower level of computing.

SIMPL provided the communications framework, and allowed me to dedicate a few commands to controlling the  virtual machine, such as resetting the program counter and dumping the first 32 locations of RAM in an easily disassembled manner.

EDSAC predates ascii by about 14 years – so all of the communication and I/O was via a wartime surplus teletype that used 5-bit codes. For the modern simulation this may all be updated.  The instruction set – nominally some of the capital letters A through to Z may be simplified to 4 bit numerical opcodes 0-F.

So I derived an EDSAC virtual CPU model in a few lines of C code. This runs under Energia 17 – which is the MSP430 version of the  Arduino IDE

The EDSAC cpu model allows me to hack about with the instruction set,  and work out how exactly to port an interactive language to a resource limited cpu by creating a virtual machine.

Character and Number Input and Output

Character input and output is handled by the EDSAC’s I and O instructions which load a character from tape into a specific memory address and print a given character on the teletype.  For convenience these will interact with uart routines equivalent to get_char and put_char and allow communication from a terminal application.

print_num

The EDSAC literature also includes a listing of the code for printing out the squares of numbers from 0 to 100. This is useful because it features an integer print routine – which is also an essential routine when implementing SIMPL.

number

The “initial orders” program has a routine that tests each character from a sequence of consecutive characters to see if it is a number, and by a process of successive multiplication by 10, expands the input characters as  a decimal number in memory. This is equivalent to the “number” routine in SIMPL.

Jump Table

Fundamental to SIMPL is the jump table – identify a character in the input buffer and execute  a block of code associated with that character. Get this mechanism right – and SIMPL will be a breeze on EDSAC.

One trick with the EDSAC instructions is that both the instruction field and the address field can be manipulated via the accumulator and then deposited back in memory. This allows, for example, a jump table to be created  – based on the value of an input character.  This powerful technique is just what is needed to create an interactive language such as SIMPL – which really is one big jump table.

So we have ascii characters beginning at 32 – we need to create a jump table – so lets load the character into the bottom of the accumulator as an address of where to find the “jump”.  This has to be done in two steps – as we first have to create the pseudo instruction in memory.

T0 - clear the ACC
I2 - get  the character into m[2]
A2 - ADD m[2] into ACC
A3 - ADD m[3] which has been preloaded with the instruction for Jump
U4 - store the assembled instruction at m[4]
E4 - Jump to m[4] - which is now a jump to the jump table

 

Every character now has somewhere to jump to – including numbers, lower case, upper case and punctuation symbols. Numbers can be handled by a routine which looks at each character and creates an integer in the accumulator, which can then be stored on the top of the stack.

With the jump table – we have a mechanism for addressing the code that contains the primitives.

A register model

EDSAC does not use registers in the usual sense. Whilst there is a register to hold the multiplier  – for both the “multiply and add into accumulator” and the “multiply and subtract from accumulator”.  It is also used to hold one of the operands of the AND (collate) instruction.  The multiplier register can only be loaded from a memory and not directly through the accumulator.

However – selecting a small group of memory locations near the start of RAM and giving them nominal register names – will be a good way of creating some order – and a framework for the SIMPL implementation.  Instruction Pointer, stack pointer, return stack and program counter can be implemented thus.

 

Posted in Uncategorized | Leave a comment

Opening up EDSAC

EDSAC was the first of the Cambridge University computers, brought up in 1949 and intended for real scientific and mathematical work.

This year, there is considerable revival of interest in EDSAC – as the National Museum of Computing at Bletchley Park – have recreated a faithful reproduction of the original machine.  Also – the Open Source Hardware Users Group (OSHUG) have set up the EDSAC Challenge in September – which is to capture the flavour of EDSAC but using modern, open source FPGA hardware.

Why all this fuss about EDSAC – a 68 year old machine?

EDSAC was the first practical computer designed to do real work. It also exemplifies what a bit of ingenuity can do to an otherwise technically simple machine. It shows what can be done with the simplest of load-store architectures- and also teaches some of the tricks in making a hardware limited cpu do what you need it to do.

The EDSAC instruction set was based on the 5-bit teletype codes read in from a paper tape reader, that were common at that time as they were used for military and civilian communications.

Quite unusually EDSAC was given a multiplier instruction – somewhat simplified because of the serial nature of the memory and the arithmetic. EDSAC was based on the classic Von Neuman “Load-Store” architecture and as numerical precision was considered of high importance it used a 35 bit word length and the accumulator was 71 bits to accommodate the result of a 35×35 bit multiplication.

Instructions were designated capital letters – an early form of mnemonic, which made the program listings easier to read.  This feature is peculiar to EDSAC – and something that must have been lost when computers needed more than 26 instructions.

EDSAC used 18 instructions presumably tailored to the capabilities of the ALU and delay line memory.

What was it like to code on such a machine with such a limited instruction set?  I wanted to know – so I wrote an EDSAC virtual machine that will run on Arduino or MSP430 LaunchPad. From this I can then experiment with the architecture, of EDSAC and other unsophisticated processors.  This leads on nicely to my interest in Minimal Instruction Set Computers (MISC) – which by their nature are simple to model in code and use very little hardware.

So I created a simplified model – in C, which will run on an Arduino or MSP430 Launchpad. For practical coding reasons, 16 instructions will be used, which can be represented with a 4 bit instruction code.

Whilst EDSAC was created to help perform mathematical calculations – about 150 times faster than a 1940’s desktop mechanical calculator and so precision multiplication was deemed to be important,  the aim of the simplified model is to explore the limitations of a reduced instruction set, and especially when you are trying to create a virtual machine and an interactive programming environment with limited instructions.

From an instruction synthesis point of view, we have ADD, AND and SUBTRACT, so NEGATE, NAND, OR, and XOR should all be possible.

There are two jump instructions  Jump if A>=0 and Jump if A<0.  These  should be sufficient to meet most test situations.

A ADD
C COLLATE - AND
E JUMP IF A>=0
F VERIFY
G JUMP if A<0
H LOAD MULTIPLIER REGISTER
I LOAD INPUT CHARACTER
L LEFT SHIFT
N MULTIPLY AND SUBTRACT
O OUTPUT CHARACTER
P LOAD DATA
R RIGHT SHIFT
S SUBTRACT
T TRANSFER AND CLEAR
U TRANSFER NO CLEAR
V MULTIPLY AND ADD
X NO OP
Y ROUND ACCUMULATOR 
Z STOP AND RING BELL

The instruction set was simple and obviously dictated by the serial arithmetic unit. Multiplying iss a fairly straightforward operation with serial data – even on such a limited machine – the ability to multiply and add, or multiply and subtract was given high status.

EDSAC used about 3000 vacuum tubes for it’s logic – and memory consisted of mercury delay lines – where data existed as a series of standing waves that hd to be continuously refreshed.

This poster gives a really good overall view of the EDSAC machine – and has example code for “Initial Orders” which was a very early form of bootloader – in just 31 instructions. These instructions were “dialled” into memory using a system of hardwired uniselectors – as a lot of EDSAC was based on then current (General Post Office GPO) telecommunication hardware.

With the initial orders program running the main program could be loaded into memory from paper tape.

For the over curious – there is an EDSAC simulator here. Coding on such a primitive load-store architecture is certainly not a task for the fainthearted and takes some getting used to.

It’s easy to get hung up on historical details of this machine, so I created a cpu model, that behaves like EDSAC, but not necessarily constrained to the same teletype character codes or other hardware restrictions.

Perhaps it would be easier to impose a Register model on the cpu – so for example the first 16 locations in memory are given R0 to R15 names – that will certainly make it mentally easier to remember where variables have been stored.

On top of this register model – we can then lay out the virtual machine model, with data and return stacks, instruction pointer, top and 2nd, etc.

The code for the MSP430 is here on Github Gist . It consists of a simple instruction decoder and memory model. Instructions may be written in any text editor and then sent to the simulator using a terminal program – such as Teraterm.

 

 

 

//---------------------------------------------------------------------
// EDSAC CPU Model
// --------------------------------------------------------------------
// Instructions between Q and Z are replicated lower in the table
// This allows all instructions to be accessed with a 4 bit opcode
 
 static void execute(int instruction) // This is the EDSAC CPU model
 
 // The Alpha code is held in the top 5 bits of the instruction
 // The bottom 11 bits hold the address
 {
 insn = (instruction & 0xF800)/2048 ;
 _pc = pc + 1;
 
 n = (instruction & 0x7FF); // n is the address field - lower 11 bits
 
 switch (insn) 
 
 
 {
 
 case 0: A += m[n] ; break; // ADD A
 case 1: A -= m[n] ; break; // SUB Subtract
 case 2: A += (m[n] & R) ; break; // COL C
 case 3: m[n] = A ; break; // DEP Deposit D -no clear
 case 4: if(A>=0) {_pc = n; } break; // JGT E
 case 5: break; // VER F
 case 6: if(A<0) {_pc = n; } ; break; // JLT E
 case 7: R += m[n] ; break; // CPY H - Load R register
 case 8: break; // INP I 
 case 9: A = m[n]>>1 ; break; // Right Shift
 case 10: m[n] = A; A=0 ; break; // Transfer and clear
 case 11: A = m[n]<<1 ; break; // LSH L
 case 12: A += (m[n] * R) ; break; // Multiply and ADD
 case 13: A -= m[n] * R ; break; // Multiply and Subtract- N
 case 14: break; // OUT O
 case 15: break; // PUT P
 case 16: break;
 case 17: A = m[n]>>1 ; break; // RHS R
 case 18: A -= m[n] ; break; // SUB S
 case 19: m[n] = A; A=0 ; break; // TRC T 
 case 20: m[n] = A ; break; // UPD U
 case 21: A += (m[n] * R) ; break; // ML+ V
 case 22: break;
 case 23: break; // NOP X
 case 24: break; // RND Y
 case 25: break; // END Z
 case 26: break; 
 case 27: break; 
 case 28: break; 
 case 29: break; 
 case 30: break; 
 case 31: break; 
 
 }


pc = _pc;
 instruction = m[pc];

next_t = _t;

}
 
 
 
 // End of CPU model
 // ------------------------------------------------------------------

 

 

 

Posted in Uncategorized | Leave a comment

SIMPL Gets Simpler

This week I have been working on my tiny, Forth-like language SIMPL.  in a few hours of spare time I have managed to optimise some of the routines and get the code size down from around 900 bytes to just 780.

The main way that I achieved this was by reducing the size of the jump table from 196 bytes to 120 bytes, and using less code intensive ways to achieve the same functionality.

By paring away at the kernel in this way, makes for more room to add additional functionality  – and still stay within the target of just 1024 bytes.

SIMPL provides an extensible language framework allowing:
text entry – either from terminal or by file transfer
text output
32 basic instructions (primitives)
26 lower level functions – I/O related
arithmetical and logical operations
stack manipulation
I/O manipulation
Millisecond and microsecond timing operations
Looping
memory examination and modification
The kernel of SIMPL is extensible  but in it minimum operational state fits into roughly 800 bytes and has the following main routines
set_up         – sets up the clock, the UART and the GPIO
textRead    – Read a waiting character from the uart and store it into a buffer area in RAM. The location of where it is stored is determined whether we are in immediate mode or compile mode.   Immediate mode saves the character string into a text buffer at 0x0200.  Compile mode saves into a buffer starting at 0x0220 for “A” 0x0240 for “B” and so on. textRead finishes when it encounters a newline or carriage return character. It puts ascii 0x80 on the end of the buffer to signify the end of the input and sets the instruction pointer to point at the start of the text buffer (0x0200). It then jumps to “next”
next –  this is the key routine – central to the  whole operation of SIMPL.  It implements a very basic virtual machine that can work its way through an array  of characters in memory,  making decisions on their type and executing action routines based on the numeric ascii value of the character.   It gets the next character from the buffer and test to see whether it is a numeric character, an uppercase character or something else.
number – determine whether the character is part of a string of numerical characters and form into a 16-bit integer to be placed on the stack. Jumps back to “next” on completion
alpha – determine if a character is an upper case letter, calculate the code address associated with it and modify the instruction pointer to point to that address. Jumps back to “next”
jumper –  This looks up a jump address from a jump table based on the ascii value of the character. 60 table entries are stored – allowing for all the punctuation, arithmetic and logical symbols and the lower case letters.  Each of the 60 characters has an action routine associated with it which is executed when the character is detected in RAM.  All of the action routines jump back to “next” on completion
printnum   – take a 16-bit integer and convert it into an ascii text string for display
The above routines of the kernel can be coded into about 782 bytes on the MSP430 – allowing the additional routines to fit into about 240 bytes.   These will allow access to the peripherals including ADC, GPIO, Timers, SPI etc.

 

 

Posted in Uncategorized | Leave a comment

Minimal Computing – almost 70 Years On

Some of you will know that I am a proponent of minimal computing devices.

I have more than a passing interest in retro-computing, because after all, back in the day, hardware cost so much, – for example in 1963, in the PDP-5,  every transistor cost about $20 in today’s equivalent, and had to be justified.

Now that same $20 will buy 4  Pi-Zero dev-boards, with 1GHz ARM processors- needing about 20 million lines of code just to be useful.

So I have embarked on a journey of discovery and learning, to see if we can make computers a lot simpler.

Ancient History

The first real computers appeared in the late 1940’s, and filled a room with racks of power hungry vacuum tubes. My father attended Cambridge University as an undergraduate at this time – and probably unbeknown to him, computer history was being created daily by a group of dedicated mathematicians, engineers and technicians.

EDSAC was one of these early computers, and with 3000 tubes, and about 650 instructions per second was cutting edge back in May 1949.  With just 16 useful instructions, and memory based on ultrasonic delay lines EDSAC was created to do useful computational work – to assist Cambridge academics with their research.

Nearly 70 years later, the National Museum of Computing at Bletchley Park, is building a faithful reproduction of EDSAC – hopefully with it coming online later this year.

The Open Source Hardware Users Group – are having their own tribute to EDSAC, with the “EDSAC Challenge” – to be held as part of the Wuthering Bytes, Digital Festival, in Hebden Bridge, in early September.

EDSAC will be faithfully emulated using the latest open source FPGA technology – and teams of enthusiasts will work together to contribute meaningful demonstrations of what can be done with modern technology – as a tribute to the venerable EDSAC.

My own contribution will hopefully be a port of a tiny interactive language called SIMPL – which creates an interactive programming environment in just a few hundred instructions of assembly language.

Capable of arithmetic and logical operations,  looping, printing strings, and interpreting sequences  of  alphanumerical data – SIMPL will present a challenge for the computational resources of EDSAC.

Instruction Set.

EDSAC really only has about 16 useful instructions and 1K words of memory. This is going to cramp most programmer’s style.

However, if you examine the instruction set – the basics are there, but it will just take an extra bit of ingenuity to get the most out of them.

ADD

SUBTRACT

MULTIPLY

AND

JUMP IF >=0

JUMP IF <0

SHIFT LEFT

SHIFT RIGHT

TRANSFER ACC TO MEMORY AND CLEAR

TRANSFER ACC TO MEMORY DON'T CLEAR

TRANSFER MEMORY TO MULTIPLICAND REGISTER

MULTIPLY AND ADD

MULTIPLY AND SUBTRACT

INPUT A CHARACTER

OUTPUT A CHARACTER

NOP

ROUND ACCUMULATOR

STOP

 

Having just completed coding up my SIMPL interpreted language in just about 300 instructions in MSP430 assembly language – the challenge is now to code it up in EDSAC machine language – in under 1k words.

The next few posts will report on my progress – as I battle with the EDSAC simulator.

 

 

 

 

 

 

 

 

 

Posted in Uncategorized | Leave a comment

Looping in SIMPL – the mechanics of a tiny language

SIMPL
SIMPL is a character interpreter running within a loop – in pseudocode:
NEXT:             Fetch next Character (VM Instruction) from buffer
                         Decode to form unique Jump Address
                         Jump to associated code body, execute code – which ends with Jump back                            to NEXT
This forms the heart of a very simple virtual machine which is capable of executing unique code functions based on the character found in the memory buffer.  This elementary execution model is equivalent to most modern processor systems, so with a little bit of work, SIMPL can form the basis of a simulator for most basic processor architectures.
The character may be fetched from a serial terminal input buffer and executed immediately in an interactive manner, or the Instruction Pointer can point to a general area of program memory, and execute the character instructions found there.
It is this combination of using SIMPL either interactively at the terminal or as a means of entering and editing text into memory and subsequently running it from RAM, that makes SIMPL a powerful tool – to have in the programmer’s tool kit.
As a kernel, capable of executing the basic inner kernel loop,  SIMPL may be coded in about 300 instructions on most small processors such as the MSP430 – this includes the overhead of setting up the mcu oscillator, UART and GPIO – so that it supports serial terminal interaction.  As more functions are added, the code size grows by approximately 8 to 10 bytes per function – so it’s possible to have a small but usable SIMPL running in about 1K bytes.
The users application code will further extend this by perhaps a similar amount – but it still has an extremely small footprint, making it an ideal serial command shell for resource limited processors.
SIMPL when coded in C has been ported to a number of microcontrollers including ATmega, ATtiny, ARM M4, and MSP430.
The kernel has also been rewritten in MSP430 assembly language for speed and compactness, and work is ongoing to port it to the J1 Forth mcu.
The ability to encode the basic inner interpreter virtual machine in the native assembly language of a range of processors makes this a powerful technique of getting the basis of a useful “monitor” programme onto a variety of processors.
It also means that once the kernel has been coded in the native assembly language, then the user is working with a common set of instructions running on the virtual machine, at a level somewhat higher than the native assembly.
This creates a common language instruction set – a kind of “lingua franca” that makes interaction with a wide range of processors much easier.  It also opens up the door for different processors to communicate with each other through this common character encoded language.
SIMPL has some of it’s roots in Forth, and indeed Charles Moore’s Forth language has been a source of inspiration for the development of SIMPL.  However SIMPL is not a fully fledged language, more like a multi-function tool (Leatherman) which just makes the interpreted interaction with a range of small processors a lot easier, and less painful than the usual edit-compile-run cycle of code development found within a modern C compiler IDE.
Mechanics of SIMPL

Here’s some more info about the various routines that make up the kernel

textRead       33 Instructions 90 bytes
Receive characters from the serial uart and place them into a buffer in RAM.  Two modes of operation are possible, immediate mode, where the characters are executed as instructions directly after a carriage return line feed is received, and compile mode, where the character sequences are preceded by a colon and stored in pre-calculated command buffers in RAM.
number      16 instructions 42 bytes
Number interprets sequences of consecutive digits as a 16-bit integer number and places the value in the register that is the top entry of the stack.
next               5 instructions 12 bytes
Next is the routine that all commands return the program flow back to once they have executed. It fetches the next character instruction from the RAM buffer and through a jump table technique passes program control to the code body that performs the task associated with that instruction.  The  jump table  is used to direct the character to the areas of code that will treat them correctly – for example on encountering a numerical digit, program control is passed to the number routine, whilst for upper case alphabetical characters, program control is passed to a routine that handles these separately.
jump_table       96 instructions   192 bytes
Primitives         72 instructions 166 bytes
SIMPL uses a collection of 32 instruction primitives from which other instructions can be synthesised.   The code body of these primitives is extendable to some 110 instructions or so.
Upper (called alpha in V1)    10 instructions   26 bytes
Upper  handles the capital letters – as these are user commands, and the user is able to write and store in RAM certain functionality based on these characters.  When a capital letter is encountered in the instruction buffer, Upper directs control to the correct command buffer.
Lower   86 bytes
Lower is an area of program that interprets the lower case characters and provides a higher level of program complexity than is achievable form the primitives alone.
printnum     30 instructions   106 bytes
This takes a 16 bit integer number from the stack and prints a string of ascii digit characters to the terminal.
Uart Routines         13 instructions 41 bytes
Low level communication with the uart is by way of the get_c and put_c routines
Initialisation         19 instructions    90 bytes
Here the hardware such as the oscillator, GPIO and uart are initialised for correct operation.  This is code specific to whichever microcontroller has been chosen
Interpreter            4 instructions       16 bytes
This is the main routine that runs the SIMPL interpreter combining textRead, next, number and Upper.

 

Looping 

The loop structure is fundamental to the SIMPL language – but it is one of the trickier parts to code – but with much head-scratching, and a couple of false starts, I got it implemented today.

Loop Code.
Cracking the loop code – previously written in C, was key to getting SIMPL running in MSP430 assembly language. Very quickly after the  loop code start working –  comes LED flashing, tone generation, timing delays and printing character strings.
As it stands, with the loop-code in place,  SIMPL is 874 bytes long. I am confidant that we can get SIMPL into 1024 bytes – as per the Hackaday 1K challenge.
The SIMPL loop code is given a single value k – the loop counter which is placed on the stack immediately before the loop-start character which is a left bracket (.
On finding the left bracket, we need to make a note of the current instruction pointer when we enter the loop – so we have a location to jump back to each time around the loop.
We use the temporary registers R7 and R8 to hold the loop parameters  – i.e. the start of loop and the loop count
The control is then passed back to the interpreter by $NEXT  (jmp next) which steps through the instructions (in the loop) until the right bracket is found.
We then decrement and test the loop counter to see if it is zero – at which point the loop is terminated. If non-zero,  we return the instruction pointer to the start of the loop before jumping back to next.
Confused ?- you will be – but it really is that simple! Just 7 instructions and we are looping.
left_par:                ; code enters here on getting a left parenthesis

MOV.W tos,R8             ; save tos to R8 (R8 is the loop counter k)
MOV.W ip,R7              ; loop-start = ip the current instruction pointer at start of loop
JMP next                 ; get the next character and execute it

right_par:               ; code enters here on getting a right parenthesis

DEC.W R8                 ; decrement loop counter R8
JEQ #next                ; and check for zero - terminate loop
MOV.W R7,ip              ; set instruction pointer back to the start of the loop
JMP next                 ; keep looping

 

SIMPL now runs on a 16MHz MSP430 in just 872 bytes – with serial communications at 115200 baud. Most of the primitive instructions execute in about 1uS  – allowing a port pin to be toggled at 1MHz.

I have included the latest assembly source file at this Github Gist

 

Posted in Uncategorized | Leave a comment

More About SIMPL

SIMPL – a small Forth Like Language

Charles (Chuck) Moore, the inventor of the Forth computer language is one of my all time computing heroes.

In a life long career in computing, starting in the early 1950’s – he has been mathematician, programmer, hardware engineer and designer of several unique processors that execute the Forth language directly as their native instruction set.

Last year, I was fortunate enough to have the opportunity to meet him at Forth Day – at Stanford University, in November 2016 – and he has been an inspiration for some of my recent work.

I first came across FORTH as a teenager in the early 1980’s when the home computer was dominated by the BASIC programming language, and it fascinated me that Forth offered a radically different alternative.

It’s taken me the last 35 years to get around to understand what makes Forth tick – as part of my mid-life computing crisis, and as such I have taken time out to learn some of the fundamentals of this fascinating language.

Chuck Moore always looks for ways to simplify the task in hand – and it is his search for simplicity that has led me to some recent developments with a tiny programming language – that I call SIMPL.

SIMPL stands for Serial Interpreted Minimal Programming Language – and it was initialy inspired by Ward Cunningham’s Txtzyme – for Arduino or Teensy.  I took Txtzyme, added several new commands, and made it an extensible language, which could be ported to virtually any microcontroller.

SIMPL was originally written in C for Arduino, and then MSP430 and ARM. Now I am exploring writing the core routines in MSP430 assembly language for speed, efficiency and compactness.

The extended version of SIMPL – called SIMPLEX, includes a tool-kit to allow editing, loading, compiling and examination of memory – all  hosted by the target processor – in about 5K bytes.

MISC Processors and eForth

 

Inspired by Chuck Moore’s Minimal Instruction Set Computer (MISC) designs of processors from the mid 1980’s, I have chosen a similar approach. At the same time that Chuck Moore was working on designing MISC processors in gate arrays,  Bill Meunch  and Dr. Chen Hanson Ting were developing eForth – a complete forth that could be synthesised from only around 30 primitive instructions.

Meunch and Ting showed that once these 30 primitives had been coded in the native assembly language of the target processor, plus a porting of the inner and outer interpreters, then the whole eForth language could be created very quickly.

Ting once remarked that to put eForth on any new microprocessor, starting from scratch generally took him about 2 weeks.

A Comfortable Computing Environment

eForth creates a comfortable computing environment within any new, unfamiliar processor,  – kind of like establishing a base-camp in a hostile terrain. Additionally, once you have studied the processor architecture and instruction set sufficient to write the eForth primitives in it’s assembly language – you have probably become quite intimately conversant with it’s inner workings.

Forth is an extensible interactive language, which provides a complete computing environment hosted by the target machine. It dispenses with the need for sophisticated compiler tool chains or  IDEs, and allows all coding and interaction to be done from a simple serial terminal.  At the same time it provides an editor, loader, compiler, and the means to examine memory by way of a hex dump.

Registers and memory mapped I/O can be poked or examined interactively, and the low level drivers for I/O can be developed by experimentation from the ground up.  Initially the only I/O requirements are to allow a character to be sent or received via the serial UART – and virtually every modern microcontroller has at least one UART.

eForth however is some 6K to 8K bytes in a typical implementation, so I asked myself whether there was a way of creating an extensible language from an even smaller starting point – and so the idea of SIMPL came into being.

The Mechanics of Interpreted Languages

eForth, SIMPL and other tiny Forth like languages rely on efficiently coding a small set of primitive instructions onto a stack-based, 16 bit virtual machine, running in the native assembly language of the host processor.

The virtual machine consists of an inner interpreter that fetches the next instruction token from memory, performs a decode operation (either by jump table or switch-case structure) and executes the code  associated with that operation. The efficient coding and execution of the primitives and the inner interpreter is what gives these threaded, interpreted language their speed.

Ideally there will be fewer than 32 primitives, so in a hardware cpu implementation they may be represented by a 5 bit token, and mapped directly into the cpu machine instructions – this is the approach that Chuck Moore used with his Forth processors, and has been widely been adopted since – such as James Bowman’s J1 Forth CPU.

However in a software implementation we need to encode these in the native assembly language of the target processor, be it an 8-bit, 16-bit or 32 bit processor. Forth was developed at a time when 16-bit minicomputers were just appearing – and so traditionally it has used a 16 bit integer word size.  The late 1970’s and early 80’s saw Forth being adapted to 8-bit microprocessors – some of which were a good fit to the Forth virtual machine model – and some were not.

An example of a tiny Forth for 8-bit AVR

In one very compact implementation – T. Nakagawa’s “Tiny Forth” written for the 8-bit AVR, 25 primitives were encoded into just 286 instructions – an  average of about 11 machine language instructions per primitive.

Despite the limitations of an 8-bit architecture – Nakagawa’s AVR assembly language implementation is very efficient in about 840 instructions of AVR code and 137 bytes of tables – this assembles to around 1.8K bytes.

Typically the cpu does the following pseudo-code by fetching the next primitive instruction from memory,  testing it’s token value and branching to the next in the table if it’s not a match.

In hardware this is equivalent to the instruction fetch and decode.  Getting this process to run efficiently in assembly language is key to the speed of the resulting language.

Test the token character of the primitive
Is it the one we are looking for - no, branch to next in the table
Get the operands off the stack and into a pair of working registers
Perform the operation
Put the result back onto the stack
Correct the stack pointer
Return to the inner interpreter

Here is the AVR implementation of ADD

prim6: ; +              ; This is the ADD primitive with token=5
 cpi r16, 5             ; Is the token=5?
 brne prim7             ; No - branch to next primitive
 ld r19, X+             ; get operand 1 off the stack
 ld r18, X+
 ld r17, X+             ; get operand 2 off the stack
 ld r16, X+
 add r16, r18           ; add the low bytes
 adc r17, r19           ; add the high bytes plus any carry
 st -X, r16             ; Put the result back on the stack
 st -X, r17
 ret                    ; Return to the inner interpreter

In an 8 bit processor such as the AVR,  it takes 4 instructions to load up the working registers with two 16 bit operands (Forth uses a 16-bit integer size), and two instructions to perform arithmetical or logical operations on them. Passing the result back to the stack is another two instructions followed by a return.  With the overhead of the instruction decode and return, the basic arithmetical and logical operations are typically 11 instructions.   So in this light, an 8-bit Forth generally runs at about one tenth or less of the instruction rate of the processor.   The decode operation takes 2 instructions – even if there is no-match, so the frequently used primitives should be put closer to the start of the list.

Nakagawa’s Tiny Forth is concise and well documented – and I have reproduced it in this Github Gist

Whilst the coding of the primitives takes up around a third of the total program, the thirty short routines to implement the inner and outer interpreter, and the support routines such as serial input and output, multiplication and division take up the remaining two thirds.

With SIMPL, I strive to reduce all of these overheads, and this means an implementation using a 16-bit microcontroller – which is fundamentally a far better match to the 16-bit Forth Virtual Machine.

The MSP430 – A good match to SIMPL.

The MSP430 has been around for many years, originally developed by Texas Instruments in 1993 as a low power, mixed signal, RISC processor with a von Neuman architecture. It has an un-complicated 16 bit register set, and a very wide range of peripheral modules, including 24 bit ADCs, Low Energy Accelerators for FFT and DSP math operations and a large selection of standard peripherals.

In recent years it is one of few processors to have on-chip non-volatile FRAM – which has a much faster writing speed than Flash, and with 10E14 write cycles – virtually indestructible.

For a 16-bit processor, such as the MSP430, 16-bit integer maths and register operations are single machine instructions.

The ADD primitive becomes:

add: add @stack +, tos    ; Add 2nd stack member to the top of stack
     $NEXT                ; Call macro that executes the NEXT instruction

Clearly this increases the efficiency of executing the primitives and reduces the code space requirements, provided that the overheads of decoding an instruction token, and then fetching the next virtual machine instruction are kept to a minimum.

 

 

Simplifying Forth

Forth uses the concept of a “Dictionary” to organise the Words or subroutines.  The Words are written in natural language, and the dictionary search function normally uses the first three characters of the word and the length in order to find that word within the dictionary.  This is a powerful and flexible approach, but adds quite a significant coding overheads to the outer interpreter.

The approach I use in SIMPL was inspired by Ward Cunningham’s Txtzyme Interpreter – where just matching a single character using  switch-case statements causes the code associated with that character to be executed.  Written in assembly language this can be done with a hash-table,  or just by jumping to a program address that has been calculated in some way from the ascii character value.

Critics might say that this approach limits you to only a few dozen unique commands – and yes it does, but if your application needs more than this – perhaps it’s worth rethinking how you are going to solve your application.

As Einsten said “Make things as simple as possible, but not simpler”

There are 85 printable ascii codes between  32 and 127,  plus 10 numeric digits.  If we group these into

26 upper case characters

26 lower case characters

33 symbols and punctuation characters

SIMPL uses most of the symbols and punctuation characters as primitive instructions – such as &  +  !  % etc.   The lower case characters are used for invoking higher level commands – synthesised from the primitives. These were influenced by Txtzyme, tend to be more microcontroller specific – and may include commands to exercise the I/O and peripherals, read or write to a digital port, get a reading from and ADC or provide  hex dump of memory.  Finally are the uppercase characters which are reserved for the Users application commands – and provide the mechanism by which the language is extended.

 

The Primitive Instructions
Most of the primitive Forth words are located in the ASCII characters 32 to 63 that leaves 64 to 126 available for the users vocabulary and constructs made from primitives.

There are approximately 32 printable ascii punctuation and symbol characters – and with a bit of pre-processing they can be used to form the basic machine primitives.

Stack Commands (7) PUSH, POP, DUP, DROP, SWAP, OVER, NOP

, PUSH
. POP
" DUP
' DROP
$ SWAP
% OVER
SP NOP

Maths & Logical Operations (9) ADDC, SUB, MUL, DIV, AND, OR, XOR, NOT, SHR 

+ ADDC
- SUB
* MUL
/ DIV
& AND
^ XOR
` SHR
| OR
~ INV (NOT/COMPL)

MUL and DIV my be substituted for Shift Left and Shift Right 

Transfer Instructions (8)

: CALL (COMPILE)
; RET
( LOOP-START
) LOOP-END
< LESS
= EQU
> MORE
\ JUMP - condition follows

IN/OUT (2)

? KEY (INPUT)
. PRINT (OUTPUT)

Memory Instructions (3)

@ LOAD
! STORE
# LIT

Others (5)

_ _ String Print
[ ] String Store
{ } Array of elements - switch/case

 

The use of single ascii characters as tokens, makes the language a lot less verbose than Forth, and snippets of source code can be sent in very few characters – for example as an SMS message between 2 systems equipped with GSM modems.

It is a fairly simple task to have a word table – where the verbose form of the words are stored, and can be printed out, to expand the source into something more readable – for example

“‘% -> DUP DROP OVER

Three ascii characters expanded to 11 plus 3 spaces

SIMPL does not use the space as word separator, but in the special case where you have several numbers to put on the stack, the space is used to indicate “push onto stack”

The SIMPL Virtual Machine

SIMPL is based around a virtual stack machine that provides an interactive programming environment.  SIMPL primitive instructions are effectively the native assembly language of that virtual machine, and have been chosen to provide a compact but versatile instruction set.

Ultimately, I wish to create an open-core processor using FPGA devices that can execute SIMPL code directly – but in the meantime I need to rely on a virtual machine written in MSP430 assembler to do this for me.

SIMPL is Forth-like in that it uses words or tokens that cause blocks of code to be executed sequentially. It differs from Forth in that the words consist of single ascii characters that are decoded directly causing the cpu to call a routine at a given address, execute the subroutine found there and then return control to the inner interpreter.
The inner interpreter is very compact on the MSP430 with the main interpreter loop fitting into about 300 bytes. On top of this interpreter loop, you have the code for the low level primitive symbols – listed below, and then higher level words – such as “toggle a port pin” “n milliseconds delay” “read an adc channel” “output a string to serial port” “produce a hex dump of memory” which are represented by the lower case alphabet characters.

This layer of the language is what I refer to as the “Arduino Layer” – all those useful helper functions that are processor specific and deal with timing and I/O peripherals.

The next layer is the users application code where the users words use captital letters. For example – the classic washing machine program example:

FILL
WASH
EMPTY
FILL
RINSE
EMPTY
SPIN

This would be condensed in SIMPL as FWEFRES
– which is a code reduction factor of >5

But each of thes can have a parameter – for example if the WASH is an AGITATE cycle A

– In Forth

: WASH 50 DO AGITATE LOOP;

In SIMPL we also use a colon definition to define W. In fact the ascii code for W is used to store this code snippet at a given address – calcuated as (W-61)*32. So as soon as the interpreter encounters W, it jumps to that code address

:W50(A) the code in the parenthesis is repeated 50 times.

 

In Conclusion

SIMPL is a compact, extensible, interactive language that provides a minimum tool-kit to allow small applications to be developed on microcontrollers.

It provides a development environment, which may be ported easily to virtually any microcontroller with very low overheads.

The use of single ascii characters as primitive instructions and commands makes it compact but retains a certain level of human readability – that some machine languages do not.  As there is a one to one substitution between the ascii character and the Word – a translation table may easily be created to expand each symbol out to its full text representation.

SIMPL can act as a Lingua Franca between widely different classes of computing machines, or as messaging between several specialised processors on the same board . A lot of meaning can be conveyed in a few characters, and so a simple microcontroller could communicate tasks between dedicated co-processors in just a few characters.

At it’s heart, SIMPL has a text interpreter that reads character strings from memory and translates these into machine actions.  This is a very powerful technique applicable to all sorts of everyday tasks such as desktop manufacturing, CNC milling/routing, 3D printing, laser cutting, graphics drawing etc.

 

 

 

Posted in Uncategorized | Leave a comment