Using SIMPL as a tool to bootstrap unusual processors

I have written a lot about SIMPL over the years, and it is my conviction that it has uses as a tool to help bootstrap various novel processors, and to ease the early stages of processor code development.  It’s not a fully fledged interpreted language like Forth or BASIC, but it is just enough to make writing code easier on unfamiliar processors.

SIMPL may be written in about 100 lines of C code, or ported directly to the native assembly language of the  target processor for even more compactness and speed, which is the approach I have used with the MSP430.

With this in mind, I will attempt to use SIMPL to provide an interactive coding environment for the 1949 EDSAC (or at least a simulated or an FPGA softcore version of it) as part of the Wuthering Bytes EDSAC Challenge, to be held in Hebden Bridge in early September.

This is a work in progress – so don’t expect too much just yet.

 
SIMPL
SIMPL is a character interpreter running within a loop and as such, models a virtual machine:
NEXT:   Fetch next character from buffer
        Increment Program Counter
        Decode to form unique Jump Address
        Jump to code body, execute code, end with Jump back to NEXT
This forms the heart of a very simple virtual machine which is capable of executing unique code functions based on the character found in the memory buffer.  This elementary execution model is equivalent to most modern processor systems, so with a little bit of work, SIMPL can form the basis of a simulator for most basic processor architectures.
The character may be fetched from a serial terminal input buffer and executed immediately in an interactive manner, or the Instruction Pointer can point to a general area of program memory, and execute the character instructions found there.
It is this combination of using SIMPL either interactively at the terminal, or as a means of entering and editing text into memory and subsequently running it from RAM, that makes SIMPL a powerful tool to have in the programmer’s tool kit.
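As a rough illustration of the fetch, decode and execute loop described above, here is a minimal C sketch. It is not the actual SIMPL source: the `simpl_vm` structure, the `simpl_eval` name and the handful of commands (digits build a number, space copies it aside, `+` adds, `p` prints) are assumptions chosen to keep the example short, and output is captured in a memory buffer rather than sent to a UART.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical VM state; the field names are illustrative only. */
typedef struct {
    long   x;          /* most recently entered number / result */
    long   y;          /* previously entered number */
    char   out[64];    /* output captured in memory instead of a UART */
    size_t out_len;    /* number of characters stored in out */
} simpl_vm;

/* Execute the character instructions in buf until the terminating NUL. */
void simpl_eval(simpl_vm *vm, const char *buf)
{
    const char *ip = buf;                 /* instruction pointer into the buffer */

    for (;;) {
        char ch = *ip++;                  /* NEXT: fetch character, advance pointer */

        switch (ch) {                     /* decode: the switch acts as the jump table */
        case '\0':
            return;                       /* end of buffer, back to the caller */
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            vm->x = ch - '0';             /* collect a decimal number into x */
            while (*ip >= '0' && *ip <= '9')
                vm->x = vm->x * 10 + (*ip++ - '0');
            break;
        case ' ':
            vm->y = vm->x;                /* keep the first operand in y */
            break;
        case '+':
            vm->x += vm->y;               /* add the two operands */
            break;
        case 'p': {                       /* print x as decimal plus newline */
            size_t room = sizeof vm->out - vm->out_len;
            int n = snprintf(vm->out + vm->out_len, room, "%ld\n", vm->x);
            if (n > 0 && (size_t)n < room)
                vm->out_len += (size_t)n;
            break;
        }
        default:
            break;                        /* unknown characters are ignored */
        }
    }
}
```

Calling `simpl_eval` on a line typed at the terminal gives the interactive mode; calling it on a string previously stored in RAM gives the stored-program mode, and the loop itself does not need to know which one it is running.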
As a kernel capable of executing the basic inner interpreter loop, SIMPL may be coded in about 300 bytes on most small processors such as the MSP430. This includes the overhead of setting up the MCU, oscillator, UART and GPIO etc., so that it supports serial terminal interaction.
As more functions are added, the code size grows by approximately 16 bytes per function – so it’s possible to have a small but usable SIMPL running in about 1K bytes.
The user’s application code will further extend this by perhaps a similar amount, but it still has an extremely small footprint, making it an ideal serial command shell for resource limited processors.
SIMPL when coded in C has been ported to a number of microcontrollers including ATmega, ATtiny, ARM M4, and MSP430.  The kernel has also been rewritten in MSP430 assembly language for speed and compactness, and work is ongoing to port it to the J1 Forth mcu.
The ability to encode the basic inner interpreter virtual machine in the native assembly language of a range of processors makes this a powerful technique of getting the basis of a useful “monitor” programme onto a variety of processors.  It also means that once the kernel has been coded in the native assembly language, then the user is working with a common set of instructions running on the virtual machine, at a level somewhat higher than the native assembly.  This creates a common language instruction set – a kind of “lingua franca” that makes interaction with a wide range of processors much easier.  It also opens up the door for different processors to communicate with each other through this common character encoded language.
SIMPL has some of its roots in Forth, and indeed Charles Moore’s Forth language has been a source of inspiration for the development of SIMPL.  However, SIMPL is not a fully fledged language. It is more like a multi-function tool (Leatherman) which just makes the interpreted interaction with a range of small processors a lot easier, and less painful than the usual edit-compile-run cycle of code development found within a modern C compiler IDE.
 
Assembler and Instruction Set Decoder.
At its heart, SIMPL, when the kernel is coded in C, uses techniques such as switch-case statements, a look-up table or a hash table to create the code address associated with the input character.
This technique allows not just a jump address to be associated with the character, but also other data, such as an instruction look-up for extending an 8 bit character into a longer instruction word with individually coded bit-fields. This is the method used in generating J1 assembly language instructions from the common SIMPL command set.  This is a form of microcoding and is a powerful technique when writing assembly language for different target hosts.  The expanded instructions decoded in SIMPL can be written to RAM, creating the object code for the new target.
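The look-up idea can be sketched in C as follows. This is only an illustration under stated assumptions: the `ENCODE` macro, its three bit-fields and every value in `expand_table` are placeholders invented for the example and are NOT real J1 opcodes, and `simpl_expand` and `simpl_assemble` are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* Build a 16-bit word from three hypothetical bit-fields:
   bits 15..13 = instruction class, bits 12..8 = operation, bits 7..0 = flags. */
#define ENCODE(cls, op, flags) \
    ((uint16_t)(((cls) << 13) | ((op) << 8) | (flags)))

typedef struct {
    char     ch;      /* SIMPL command character */
    uint16_t word;    /* expanded instruction word for the target */
} expand_entry;

/* Placeholder encodings for illustration only, not a real instruction set. */
static const expand_entry expand_table[] = {
    { '+', ENCODE(3, 2, 0x01) },
    { '&', ENCODE(3, 3, 0x01) },
    { ';', ENCODE(3, 0, 0x80) },
};

/* Look up one character; returns 1 and stores the word if found, else 0. */
int simpl_expand(char ch, uint16_t *word)
{
    size_t count = sizeof expand_table / sizeof expand_table[0];
    for (size_t i = 0; i < count; i++) {
        if (expand_table[i].ch == ch) {
            *word = expand_table[i].word;
            return 1;
        }
    }
    return 0;
}

/* Translate a SIMPL string into object code written to out[].
   Characters with no table entry are skipped.  Returns the words emitted. */
size_t simpl_assemble(const char *src, uint16_t *out, size_t max_words)
{
    size_t n = 0;
    for (; *src != '\0' && n < max_words; src++) {
        uint16_t word;
        if (simpl_expand(*src, &word))
            out[n++] = word;
    }
    return n;
}
```

A linear search is used here because the table is tiny; a 128-entry array indexed directly by the character would trade a little ROM for a constant-time look-up.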
Processor Simulation
SIMPL also has uses in the automation of CPU simulation.  Any small CPU (with a small instruction set) may be simulated in a few dozen lines of C code, and a processor model including memory, registers and ALU created.  SIMPL commands can be used to create assembly instructions to test this model, single step through executable code and examine the results on a hex dump display that shows the memory contents and principal register contents.  A full screen of serial data can be updated in less than 0.1 seconds, making single stepping more productive.
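To give a feel for how little code such a processor model needs, here is a minimal sketch of a made-up accumulator machine in C. Everything in it is an assumption for illustration: the four opcodes, the `INSTR` word layout (opcode in the high byte, address in the low 5 bits), the 32-word memory and the `toy_cpu`, `toy_step` and `toy_run` names do not describe the EDSAC or any real CPU.

```c
#include <stdint.h>

#define MEM_WORDS 32

/* Hypothetical opcodes for a toy accumulator machine. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

/* Instruction word: opcode in the high byte, memory address in the low 5 bits. */
#define INSTR(op, addr) ((uint16_t)(((op) << 8) | ((addr) & (MEM_WORDS - 1))))

typedef struct {
    uint16_t mem[MEM_WORDS];   /* unified program and data memory */
    uint16_t acc;              /* accumulator */
    uint8_t  pc;               /* program counter */
    int      halted;           /* set once a HALT (or unknown opcode) executes */
} toy_cpu;

/* Execute exactly one instruction: this is the single-step primitive. */
void toy_step(toy_cpu *cpu)
{
    if (cpu->halted)
        return;

    uint16_t ir   = cpu->mem[cpu->pc % MEM_WORDS];       /* fetch */
    uint8_t  op   = (uint8_t)(ir >> 8);                   /* decode opcode */
    uint8_t  addr = (uint8_t)(ir & (MEM_WORDS - 1));      /* decode address */
    cpu->pc = (uint8_t)((cpu->pc + 1) % MEM_WORDS);

    switch (op) {                                         /* execute */
    case OP_LOAD:  cpu->acc = cpu->mem[addr];                         break;
    case OP_ADD:   cpu->acc = (uint16_t)(cpu->acc + cpu->mem[addr]);  break;
    case OP_STORE: cpu->mem[addr] = cpu->acc;                         break;
    default:       cpu->halted = 1;                                   break;
    }
}

/* Run until the machine halts or max_steps is reached; returns steps taken. */
int toy_run(toy_cpu *cpu, int max_steps)
{
    int steps = 0;
    while (!cpu->halted && steps < max_steps) {
        toy_step(cpu);
        steps++;
    }
    return steps;
}
```

A monitor command for single stepping would call `toy_step` once and then dump `mem`, `acc` and `pc` to the serial terminal.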
It is this processor simulation that I chose to explore next, and my target CPU will be one of the earliest processors, the 1949 EDSAC, built at Cambridge University.
Fortunately the EDSAC had very few instructions and is well documented.  On a less fortunate note, the machine takes its instructions from 5-bit paper tape, whose character ordering differs from the more familiar ASCII character set, and it uses “figure shift” and “letter shift” to access the various characters of the teleprinter keyboard. This means that serially typed ASCII characters will have to go through a further layer of decoding to turn them into the Murray coded characters as used on the Creed teleprinters of that time, and a further translation back again into ASCII for output to a serial terminal.  However, these character translations can be done with simple look-up tables or switch-case statements.
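The shape of such a two-way translation can be sketched in C as below. Note the loud assumption: the code assignments here are placeholders (letters in alphabetical order at codes 0 to 25, digits at codes 0 to 9, and shift codes 26 and 27) and are NOT the real EDSAC or Murray tables, which would need to be filled in from the documentation. Only the mechanism, a shift state plus one look-up table per shift, is the point; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder 5-bit code assignments, NOT the real EDSAC/Murray ordering. */
#define CODE_FIGS 26u                     /* hypothetical "figure shift" code */
#define CODE_LTRS 27u                     /* hypothetical "letter shift" code */
static const char letters_tab[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const char figures_tab[] = "0123456789";

/* Decode n 5-bit codes into a NUL-terminated ASCII string; returns its length. */
size_t tape_to_ascii(const uint8_t *codes, size_t n, char *out, size_t out_size)
{
    int figs = 0;                         /* start in letter shift */
    size_t len = 0;

    if (out_size == 0)
        return 0;
    for (size_t i = 0; i < n && len + 1 < out_size; i++) {
        uint8_t c = codes[i] & 0x1F;
        if (c == CODE_FIGS) { figs = 1; continue; }
        if (c == CODE_LTRS) { figs = 0; continue; }
        if (figs) {
            if (c < 10) out[len++] = figures_tab[c];
        } else if (c < 26) {
            out[len++] = letters_tab[c];
        }                                 /* unassigned codes are dropped */
    }
    out[len] = '\0';
    return len;
}

/* Encode ASCII text as 5-bit codes, inserting shift codes when the required
   shift changes.  Characters in neither table are skipped.  Returns the count. */
size_t ascii_to_tape(const char *text, uint8_t *codes, size_t max_codes)
{
    int figs = 0;                         /* start in letter shift */
    size_t len = 0;

    for (; *text != '\0'; text++) {
        const char *p;
        uint8_t code;
        int want_figs;

        if ((p = strchr(letters_tab, *text)) != NULL) {
            code = (uint8_t)(p - letters_tab);
            want_figs = 0;
        } else if ((p = strchr(figures_tab, *text)) != NULL) {
            code = (uint8_t)(p - figures_tab);
            want_figs = 1;
        } else {
            continue;
        }
        if (want_figs != figs) {          /* emit a shift code first */
            if (len >= max_codes) break;
            codes[len++] = want_figs ? CODE_FIGS : CODE_LTRS;
            figs = want_figs;
        }
        if (len >= max_codes) break;
        codes[len++] = code;
    }
    return len;
}
```

Because the shift state is carried along the stream, the same 5-bit value decodes differently depending on what preceded it, which is why a plain one-to-one table is not enough on its own.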

About monsonite

mostly human

8 Responses to Using SIMPL as a tool to bootstrap unusual processors

  1. Hawkeyeaz1 says:

    I have actually been poking around for the last few days to see if a *minimal* FORTH like interpreter could be stuffed into the PC MBR sufficient to load a (full) FORTH (stage 1, and then the) kernel to complete the initializing of the PC (and then provide a full experience like ForthOS does). I am still unsure, but hints (like yours) seem to point to “maybe”.

  2. DM Vieau says:

    Want a challenging target embedded processor that badly needs an interpreter? Can you believe there are no ‘real, full-blown’ ports of Forth to the PIC32 (PIC32MX in my case)??? Forth, the ‘most portable language in history’ (or at least very close to the ‘top’), DOES NOT have a real version for it (due to the architecture). Do you see any problems in porting SIMPL to this target device?

  3. monsonite says:

    With SIMPL you are creating a virtual machine that executes bytecode out of RAM. Each byte causes a call to a function that can be written in flash ROM – such as a 16-bit ADD. I don’t think that the PIC’s Harvard architecture would cause any difficulties in implementing such a scheme. Have a look at the Gigatron TTL Microcomputer project. It has an 8-bit Harvard architecture which is used to implement a 16-bit von Neumann virtual machine, which executes code out of RAM. https://gigatron.io/

    It’s about 15 years since I wrote any PIC code – all in assembler for speed. I wrote routines to allow PICs to dial DTMF digits, and then communicate bi-directionally with V23 modem tones over any telephone line. A PIC, a high voltage transistor for a hook-switch and a few resistors was all that was needed.

    Sadly I have forgotten most of my PIC assembly language.

    • ga2500ev says:

      Ken,

      I remember reading about your PIC based Remote Access Terminal. Awesome project.

      I’ve been studying SIMPL. You are correct that it would be rather trivial to port to the PIC32MX line. Be aware that that line is based on the 32-bit MIPS CPU and is primarily targeted towards HLL development systems. Honestly, a bog standard C implementation of SIMPL would pretty much port unchanged.

      As I’ve wandered through your discussions and Gists on SIMPL a couple of thoughts have come to mind that I hope you may consider.

      1. I understand your goal of a compact, highly efficient monitor that can fit in a bootloader space. But I think it would be helpful to also maintain a current canonical C implementation of the project that is essentially a drop and go portable version that can be used with minimal changes. I tested an undated Gist labeled SIMPL-04-19 on an Arduino. But it is unclear if that is the latest C version. You’ve posted that SIMPL could serve as a lingua franca across different platforms, and I agree. But there needs to be a definitive reference implementation with a complete standardized mapping of all the core routines to pull that off.

      2. It feels like control structures have mutated over the development cycle. I’ve seen snippets of counting loops using k, conditionals, case statements, PIC style skips, and for loop implementations. However, what seems to be missing, at least in implementation, is a while loop. I thought I saw an example along the lines of w(){} discussion but I can’t find an implementation of it.

      3. Another missing item that seems crucial for embedded systems development is an event loop implementation. The zeconds implementation sows the seeds of the idea. But as with while, a conditional (or unconditional) event loop seems to be elusive. Might it be possible to use 0z as an infinite unconditional event loop for example?

      4. Like FORTH, SIMPL is an interactive noodling language. But also like FORTH a developer can build a collection of routines that become a pretty effective application. However, SIMPL needs some runtime support to facilitate that ability. The two crucial items needed for this are turnkey startup and a save/restore mechanism for user routines (if possible on the platform). My quick thought on turnkey startup is having a marker on one of the user routines, like ‘M’ for main, or ‘T’ for turnkey, or ‘S’ for startup that represents the starting point for the application. It seemed to me that repurposing the ‘:’ symbol as a marker for turnkey application could be effective. So once the application is finished a line something like:

      :M:0z()
      then saving the application would cause it to run automatically upon the next reboot. Of course since ‘z’ has an escape via the UART, the interactive mode can be reactivated by issuing the escape character while the application is running. Note the compiled ‘:’ is a null operation and the leading ‘:’ is consumed by txtcheck.

      5. Other random items include a flag register for event based communication, the actual separation of what you call the ‘Arduino layer’ into a separate HAL, and some mechanism for processing interrupts.

      Sorry for the brain dump. I was just trying to find somewhere to discuss the thoughts I was having about SIMPL. No hurry in answering any of these. I hope there is something you think is worth discussing here.

      ga2500ev

      • monsonite says:

        Hi ga2500ev,

        I am always happy to receive feedback, and to discuss how SIMPL can be improved.

        First some ancient history,

        It has evolved over the years from Ward Cunningham’s Txtzyme and as a result was heavily reliant on the Arduino ecosystem for the serial, I/O and timing routines.

        I did port it to an STM32F407 in standard C in 2014, and ended up rewriting the serial, I/O and timing routines to suit the STM32. The result was a 20K ROM image – that didn’t excite me about the possibilities of it being a monitor that would run from within a bootloader.

        In 2016, I ported it to the MSP430, in particular those parts with FRAM, as that would allow the user code to be saved to the non-volatile FRAM. It could of course be stored in the EEPROM on an AVR; you can get a lot of SIMPL source snippets into 1k bytes.

        In early 2017, I then moved onto implementing it in ASM on the MSP430 – it was purely as a challenge to see how small the kernel could be reduced to on a 16-bit processor. I wrote the getchar and putchar UART routines in MSP430 ASM and had something that would fit into under 1k bytes.

        The 04-19 version that you found was something that I hacked together last month – just to get it running on another Arduino variant.

        I agree that it needs a while() {} structure. Originally it only had a down counting loop until the loop counter k=0.

        I then thought about a case statement that would allow me to selectively execute the nth block of code. I was thinking about instruction decoders for bytecoded processors.

        Zeconds was an idea – sadly never really taken to a full implementation.

        Right now, I have been distracted by the Gigatron TTL computer, which is evolving rapidly with a monitor, TinyBASIC and, very soon, a usable C compiler. I’m looking into ways SIMPL could be implemented from its current instruction set, without having to implement a full-blown Forth on its 16-bit virtual machine.

        It would be great to hear your ideas – and perhaps we can work towards a canonical C implementation.

        I can be reached at ken dot boak at gmail dot com if you wish to correspond directly,

        best regards

        Ken

  4. monsonite says:

    For some reason the comment mechanism has failed – here’s a reply from ga2500ev:

    Ken,

    Thanks for the quick reply. I’m happy to keep the discussion public for the time being if that’s OK with you. I took another look at the 2019-04-19, which seems like a good starting point. I realized that there are a couple of fundamental issues that need to be addressed. The first is that the return stack is hidden. This is done by having each capital letter do a recursive call to txtEval. Second is that without an exposed return stack, control constructs cannot be nested. Both of these can be fixed by creating a return stack for buf and simply pushing buf onto the stack for each of the control structures.

    Right now it looks like ‘b’ and ‘c’ are redundant as ‘e’, ‘f’, and ‘p’ can be used to print millis and micros. Here’s a plan to repurpose them for looping constructs. Note that char *start and char *loop are redundant. Instead have a char *start, *end for the current control construct. end can be set after the while loops that skip the loop body in ‘(‘ and ‘{‘. Note that for nesting there would need to be matching counts for () or {} pairs. In any case, once the end is marked, generalized constructs can be built using ‘b’ for break, which sets buf to end, and ‘c’ for continue, which resets buf back to start. To implement nesting, push start and end onto a stack with each ‘(‘ or ‘{‘ before getting started and pop them off with each ‘)’, ‘}’, or ‘b’. A stack of 12 to 16 pointers should be well more than enough to satisfy any nesting folks may need.

    The best thing about it is now ‘j’ becomes extremely useful. Anywhere in these constructs one can do a conditional then use ‘j’ to skip a ‘b’ or a ‘c’, giving the equivalent of an if (test) break or if (test) continue.

    One final item. I see that ‘(‘ was copied from ‘{‘. However, the ‘k = x’ should not be in ‘(‘ as k is not involved in the logic loop pair ‘()’

    Hope this helps,

    ga2500ev

  5. monsonite says:

    ga2500ev,

    Thank you for the various contributed points in your reply.

    Exposing the stack and using ( or { to push the stack and ) or } to pop the stack is a bit of foresight that I failed to see.

    b for Break and c for Continue are a good choice

    Formalising what the jump j does is also a great idea.

    I have one version of SIMPL that works with the MSP430 and an external serial SRAM. d was reserved for doing a 1K hexadecimal dump from that RAM, and e was used to execute a word from a particular 32 byte boundary.

    I have some “me-time” coming up at the end of the week – I’ll try to formalise some of the words and their definitions.

    Ken
