SIMPL on EDSAC

SIMPL on EDSAC

When you start coding on a new processor architecture, you always have the initial problem of how to get some sort of a bootloader into memory, to allow you to then load your application program into memory.

Originally this was done manually using toggle switches, to set up the data word for a given address and then write that bit pattern directly into memory, incrementing the address each time. Naturally this bootloader had to be as short as possible – as toggling long sequences of raw data into memory is not only tedious but prone to errors.

The engineers who devised those early bootloaders were masters of minimalism – with EDSAC requiring only 31 instructions – a program called “Initial Orders” which was fed into memory using an arrangement of hard-wired uniselectors. A uniselector was a piece of telephone switch gear – a relay operated “stepping switch” which could be used to generate a fixed output pattern for a given input – effectively a ROM.

With “Initial Orders” loaded, EDSAC could then be loaded from a teletype paper tape reader – using 5-bit teletype code to represent instructions, numbers and characters.

Having pre-packaged routines to assist with the coding of a new processor  is always useful – and being able to take this “toolkit” of known, working routines from processor to processor – makes life so much simpler.  This was what Charles Moore (the inventor of Forth) effectively did – in the 1960s, when he was coding on many different mainframe architectures. It was through the need to have a portable language – regardless of the architecture of the machine, that Moore’s toolkit eventually evolved into Forth –  How this was achieved is described in his article – “Forth – The Early Years”

Inspired by Chuck Moore, I wanted to develop my own language framework, but I needed something to start me off in the right direction.  That incentive came at a talk I attended in May 2013 – where I learned of a very simple interpreter called TXTZYME – by Ward Cunningham.

Inspired by txtzype, I began to  develop a programming framework called SIMPL. (serial interpreted minimal programming language) – which I have ported to a variety of microcontrollers including MSP430, ARM and AVR.

SIMPL is best described as a character interpreter running in a while loop, with routines to handle numerical input and numerical output.

SIMPL interprets a string of characters, and performs actions based on the characters. It is one level above machine language, providing a framework and environment that is independent of the specific details of the cpu. In theory, once you have ported SIMPL – you have a common machine interface – and no need to deal with the quirky machine code anymore!

Porting SIMPL to EDSAC

Porting SIMPL to a new cpu architecture is an ideal way of gaining an understanding of that architecture and it’s machine code.  As EDSAC has very few useful instructions, I viewed the porting of SIMPL to EDSAC as something of a coding challenge.

Fortunately it has ADD and Subtract, and these can be used with the conditional jump instructions for branching and looping. There is also AND, which can be combined with a subtraction from FFFF to negate the bits to create NAND and from NAND we can generate OR, NOR and XOR.

The ability to string primitive instructions together as short sequences to make compound ones is something that can be readily automated by the SIMPL interpreter.

As I stated in the earlier post – it seems unusual to have a hardware multiplier on such an early machine – but Maurice Wilkes predicted that EDSAC would be used for real mathematical computation – so the multiplier was purposefully included right form the start.

The other problem is that the chosen target board – an MSP430 with just 512bytes of RAM has to host the version of SIMPL running on the virtual EDSAC machine – which means cramming a meaningful version of SIMPL into about 200 instructions – which are 16 bit words.

The point of this exercise is to make programming the EDSAC model a lot easier, than raw machine language, and also to get a feel for how a “language” can be created from an instruction set that lacks many of the basic addressing modes.

Bite the Bullet – Itch that Scratch – and so in two days of spare time whist on vacation, an EDSAC simultor was coded on Day 1 – and large chunks of SIMPL were coded on Day 2.

A Model of SIMPL.

SIMPL deals with character input and character output. This is handled with the get_char and put_char routines which communicate with a serial terminal program such as TeraTerm.

Every printable ASCII character is treated as a command. This needs a jump table of 96 entries to be able to jump to any code body address, having received a character from the serial input routine.

Numbers are treated slightly differently by a routine called “number” which reads each character in turn and assembles a 16-bit integer from the given input string.

Similarly there is an output routine “printnum” which takes a 16-bit number off the stack and generates a printable string from it.

In addition to these three routines, there are the action  routines associated with each command.  These consist of a basic set of primitive instructions – mostly represented by punctuation characters and symbols. Higher level functions are represented by lower case alpha characters, and upper case alpha characters are used for User Definable words. In this respect – SIMPL is a bit like Forth, but lacks the dictionary structure.

EDSAC has a pair of Jump instructions – but it lacks a subroutine Call and Return – as these devices were not invented until 1952 – 3 years after EDSAC first ran.  So EDSAC will have to rely on a Jump to the code body for a given character, and then a Jump back to the code that fetches the NEXT character from RAM.

SIMPL needs about 1Kbytes of program memory on the MSP430 – it’s about 250 instructions for the basic kernel.

The EDSAC instruction set is indeed very basic when compared to MSP430 – and so new methods will need to be devised to get the program structure to work. EDSAC code is entirely stored in RAM – so there is the means to self modify code (scary) – but think of this as a means to an end to create the necessary jump table structure needed by SIMPL.

EDSAC only has about 16 or 18 instructions – and they were represented by capital letters – making the assembly language a lot easier to read. I chose the top 16 instructions and wrote an EDSAC simulator in C to allow me to experience this lower level of computing.

SIMPL provided the communications framework, and allowed me to dedicate a few commands to controlling the  virtual machine, such as resetting the program counter and dumping the first 32 locations of RAM in an easily disassembled manner.

EDSAC predates ascii by about 14 years – so all of the communication and I/O was via a wartime surplus teletype that used 5-bit codes. For the modern simulation this may all be updated.  The instruction set – nominally some of the capital letters A through to Z may be simplified to 4 bit numerical opcodes 0-F.

So I derived an EDSAC virtual CPU model in a few lines of C code. This runs under Energia 17 – which is the MSP430 version of the  Arduino IDE

The EDSAC cpu model allows me to hack about with the instruction set,  and work out how exactly to port an interactive language to a resource limited cpu by creating a virtual machine.

Character and Number Input and Output

Character input and output is handled by the EDSAC’s I and O instructions which load a character from tape into a specific memory address and print a given character on the teletype.  For convenience these will interact with uart routines equivalent to get_char and put_char and allow communication from a terminal application.

print_num

The EDSAC literature also includes a listing of the code for printing out the squares of numbers from 0 to 100. This is useful because it features an integer print routine – which is also an essential routine when implementing SIMPL.

number

The “initial orders” program has a routine that tests each character from a sequence of consecutive characters to see if it is a number, and by a process of successive multiplication by 10, expands the input characters as  a decimal number in memory. This is equivalent to the “number” routine in SIMPL.

Jump Table

Fundamental to SIMPL is the jump table – identify a character in the input buffer and execute  a block of code associated with that character. Get this mechanism right – and SIMPL will be a breeze on EDSAC.

One trick with the EDSAC instructions is that both the instruction field and the address field can be manipulated via the accumulator and then deposited back in memory. This allows, for example, a jump table to be created  – based on the value of an input character.  This powerful technique is just what is needed to create an interactive language such as SIMPL – which really is one big jump table.

So we have ascii characters beginning at 32 – we need to create a jump table – so lets load the character into the bottom of the accumulator as an address of where to find the “jump”.  This has to be done in two steps – as we first have to create the pseudo instruction in memory.

T0 - clear the ACC
I2 - get  the character into m[2]
A2 - ADD m[2] into ACC
A3 - ADD m[3] which has been preloaded with the instruction for Jump
U4 - store the assembled instruction at m[4]
E4 - Jump to m[4] - which is now a jump to the jump table

 

Every character now has somewhere to jump to – including numbers, lower case, upper case and punctuation symbols. Numbers can be handled by a routine which looks at each character and creates an integer in the accumulator, which can then be stored on the top of the stack.

With the jump table – we have a mechanism for addressing the code that contains the primitives.

A register model

EDSAC does not use registers in the usual sense. Whilst there is a register to hold the multiplier  – for both the “multiply and add into accumulator” and the “multiply and subtract from accumulator”.  It is also used to hold one of the operands of the AND (collate) instruction.  The multiplier register can only be loaded from a memory and not directly through the accumulator.

However – selecting a small group of memory locations near the start of RAM and giving them nominal register names – will be a good way of creating some order – and a framework for the SIMPL implementation.  Instruction Pointer, stack pointer, return stack and program counter can be implemented thus.

 

Advertisements

About monsonite

mostly human
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s