A Minimum Interactive Language Toolkit

In previous posts I began to describe a minimum interactive language toolkit that could work standalone on virtually any small microcontroller. These are the prerequisites I stated in the previous post:

  1.  Interactive – commands may be typed at the keyboard for immediate execution
  2.  Commands can be combined into programs then compiled for later execution
  3.  Extensible – from a small kernel of commands, a whole application can be built
  4.  No external tools required other than a serial terminal – all tools run on chip.

This may appear to be a huge break from conventional wisdom, as virtually all embedded microcontroller development is done in high-level C, compiled using a package of tools hosted on a more powerful machine such as a laptop.

Microcontrollers have evolved to have a fairly large proportion of their memory as flash ROM and only modest quantities of RAM, so it’s not unusual to find a micro with 32K bytes of flash but only 2K bytes of RAM. Whilst this partition of memory resources is not ideal for interactive languages (more RAM would be nice), it’s enough to get started.

In the early days of microprocessors – it was commonplace to have a “Monitor” program – which consisted of a few simple commands to allow hexadecimal opcodes to be entered directly into memory and then executed from a given address. Some of these monitor programs also allowed a hex-dump to be sent to a terminal, to examine the contents of memory, plus primitive editing commands.

Typically the monitor would take a few hundred bytes of ROM, but it provided the absolute basics of being able to write machine code. Assemblers seldom existed, given the meager resources of the early home computers, and so a lot of coding was done by hand assembly – using pencil and paper – or by copying hexadecimal listings out of magazines.

What I am proposing here is a program that offers the same low-level support as a monitor, but rather than programming in raw hex or machine language, the user has access to a small instruction set of highly mnemonic commands. These are executed out of memory by a generic virtual machine, hosted on the MSP430 microcontroller.

This can be coded in about 562 bytes of assembly language, of which 128 bytes are a 64-entry look-up table (two bytes per entry). The MSP430 generally does not have good code density, with an average of around 2.95 bytes being required for each instruction.

The mechanics of this virtual machine are as follows:

  1. Take in a string of serial characters and place them in a buffer until the newline character is detected.
  2. Parse through the characters one at a time, creating a jump address into a look-up table based on the ASCII value of each character.
  3. Numerical character strings are handled by the number routine, which forms them into a 16-bit integer that is placed on the stack.
  4. All other characters cause program flow to jump to a table-selected address from where code is executed. Numerical parameters may be used from the stack.
  5. Where appropriate, putchar is used to provide serial output of numbers and strings.
  6. Jump back to (1) for the next line of input.
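
The steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the MSP430 prototype: the command letters `+`, `-` and `p`, the function name `interpret`, and the use of a dictionary in place of the 64-entry jump table are all assumptions made for the example.

```python
def interpret(line, out=None):
    """Run one buffered line of single-character commands (hypothetical sketch)."""
    stack = []                          # parameter stack of 16-bit integers
    out = [] if out is None else out    # stands in for putchar output

    def add():                          # '+' : add the top two stack items
        b, a = stack.pop(), stack.pop()
        stack.append((a + b) & 0xFFFF)

    def sub():                          # '-' : subtract top from next-on-stack
        b, a = stack.pop(), stack.pop()
        stack.append((a - b) & 0xFFFF)

    def print_number():                 # 'p' : pop and "print" the top of stack
        out.append(str(stack.pop()))

    def nop():                          # unassigned characters do nothing
        pass

    # Assumption: a dict stands in for the look-up table indexed by ASCII value.
    table = {'+': add, '-': sub, 'p': print_number}

    i = 0
    while i < len(line):
        ch = line[i]
        if '0' <= ch <= '9':
            # Number routine: accumulate decimal digits into a 16-bit integer.
            n = 0
            while i < len(line) and '0' <= line[i] <= '9':
                n = (n * 10 + int(line[i])) & 0xFFFF
                i += 1
            stack.append(n)
            continue
        table.get(ch, nop)()            # jump to the table-selected routine
        i += 1
    return stack, out
```

With these made-up commands, `interpret("12 30+p")` pushes 12 and 30, adds them, and prints the result, leaving the stack empty. The space between the two numbers is simply an unassigned character that ends the first number.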

The prototype has been coded up in MSP430 assembly language – and is available at the following GitHub repository.

A More Efficient Instruction Set?

Most MSP430 instructions are two bytes, but moving byte or word constants into registers, byte comparisons and byte subtractions require a further two bytes to hold the constant. Calls also require a second pair of bytes to hold the call address.

As much of the interpreter is based on the testing and handling of characters and the manipulation of ASCII values, it would appear prudent to have a class of instruction that could efficiently handle 8-bit values. This could be done by allowing an 8-bit immediate value to be coded into the instruction, with a 4-bit field for the source/destination register and a 4-bit opcode.
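
To make the 4 + 4 + 8 split concrete, here is a small sketch of packing and unpacking such a 16-bit instruction word. The field order (opcode in the top nibble, register next, immediate in the low byte) and the names `encode` and `decode` are assumptions for illustration; the text above only fixes the field widths.

```python
def encode(opcode, reg, imm8):
    """Pack a 4-bit opcode, 4-bit register and 8-bit immediate into 16 bits."""
    if not (0 <= opcode < 16 and 0 <= reg < 16 and 0 <= imm8 < 256):
        raise ValueError("field out of range")
    # Assumed layout: oooo rrrr iiii iiii
    return (opcode << 12) | (reg << 8) | imm8


def decode(word):
    """Split a 16-bit instruction word back into (opcode, reg, imm8)."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF
```

For example, if opcode 9 were chosen to mean "compare register with immediate" (a made-up assignment), comparing register 5 against the character '0' would be `encode(9, 5, ord('0'))`, which fits in a single word where the MSP430 would need two.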

This becomes a balancing act between a processor that can efficiently handle 16-bit maths operations via registers and still efficiently handle 8-bit immediates. It becomes quite a challenge to shoehorn this into a 16-bit instruction word length, but it’s no different to the challenge that the 8-bit ISA designers had when the immediate was the byte following the opcode.

Last year I reviewed the 8080/Z80 common instruction set based on how they were decoded – and I produced the following image. Whilst the 8080 and Z80 were 8-bit micros, they did have a small capability to handle 16-bit numbers by way of their register pairs plus a very few 16-bit instructions. Perhaps there could be some inspiration from the 8080 instruction set for a processor that could handle 8-bit immediates and short jumps whilst being predominantly a 16-bit machine.


When expressed in octal – and a bit of colour added  – it’s clear to see how the instructions were grouped with the most significant 2 bits defining the class, the next 3 bits defining the operation, and the bottom 3 bits defining the operand(s). It’s a clever bit of encoding – that ensured that almost all of the 256 opcodes did something useful.

If we take the top two bits to define the instruction Group, a very rough description is as follows, where only Groups 01 and 10 are entirely regular and defined by the bit fields. Groups 00 and 11 need further decoding to derive their classification.

00  xxx  yyy  – INC, DEC, double ADD, relative jumps etc

01  DST  SRC  – regular source to destination register moves

10  ALU  SRC  – register-accumulator ALU operations ADD ADC SUB SBC AND XOR OR CMP

11  FLG  JMP  – conditional CALL, RET, JMP operations
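
The 2-3-3 split is easy to see in code. The sketch below breaks an opcode into its three octal digits and names the two regular groups. The register order B, C, D, E, H, L, (HL), A is the standard 8080 operand encoding; the ALU mnemonics follow the list above, and the function names are made up for the example.

```python
REGS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]        # operand codes 0..7
ALU_OPS = ["ADD", "ADC", "SUB", "SBC", "AND", "XOR", "OR", "CMP"]


def split_opcode(op):
    """Return (group, middle field, low field) of an 8-bit opcode."""
    return (op >> 6) & 0o3, (op >> 3) & 0o7, op & 0o7


def describe(op):
    """Give a rough description of an opcode, printed in octal."""
    group, mid, low = split_opcode(op)
    if group == 1:                       # 01 DST SRC
        return f"{op:03o}: move {REGS[low]} -> {REGS[mid]}"
    if group == 2:                       # 10 ALU SRC
        return f"{op:03o}: {ALU_OPS[mid]} {REGS[low]}"
    return f"{op:03o}: group {group:02b}, needs further decoding"
```

So `describe(0o170)` reads straight off the digits as a move from B to A, and `describe(0o201)` as ADD C. The one irregularity this sketch ignores is opcode 166, which would decode as a move from (HL) to (HL) but is HLT on the real 8080.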

With Groups 00 and 11 being a bit of a mash-up, it might be possible to rehash them into something a bit more logical.

A Hybrid CPU – Neither Fish Nor Fowl

The 8080 was based on a modest set of 8 registers that could be combined to form the 16-bit pairs bc, de and hl. Limited arithmetic was permitted within these pairs, in that they could be added to hl as an accumulator, incremented or decremented. They could be loaded with immediate 16-bit constants and could be used as indirect pointers for load and store operations to memory. The hl pair had special preference in that it was used as a memory pointer and any of the other registers could load or deposit through (hl). Push and Pop operations were done on the register pairs.

With 8 registers we have the possibility of creating a small stack – and with a rich, orthogonal set of register to register moves (Group 01), there is little or no requirement to have the usual stack manipulation instructions.  Group 10 instructions allow any register to have math or logical operations done using the accumulator as the destination.  So in theory if the register file were considered to be a pseudo-stack – any of the members could interact with the accumulator – or top of stack.

A machine capable of executing Group 01 and 10 instructions would be fairly versatile, so let’s keep these in place. That leaves manipulating Groups 00 and 11 to provide the other necessary instructions.
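
As a rough sketch of what a machine limited to Groups 01 and 10 might look like, the following Python models an eight-entry register file with one register acting as the accumulator. Several things here are assumptions rather than anything settled above: the registers are 16 bits wide, register code 7 is the accumulator (mirroring the 8080's code for A), all eight codes are plain registers with no (hl) memory operand, and only carry and zero flags are kept.

```python
ACC = 7          # assumption: register code 7 is the accumulator / top of stack
MASK = 0xFFFF    # assumption: 16-bit registers

# Group 10 operations, indexed by the middle 3-bit field.
ALU = [
    lambda a, v, c: a + v,        # ADD
    lambda a, v, c: a + v + c,    # ADC
    lambda a, v, c: a - v,        # SUB
    lambda a, v, c: a - v - c,    # SBC
    lambda a, v, c: a & v,        # AND
    lambda a, v, c: a ^ v,        # XOR
    lambda a, v, c: a | v,        # OR
    lambda a, v, c: a - v,        # CMP: like SUB, but the result is discarded
]


def step(regs, flags, op):
    """Execute one Group 01 or Group 10 opcode against the register file."""
    group, mid, low = (op >> 6) & 3, (op >> 3) & 7, op & 7
    if group == 1:                          # 01 DST SRC: register move
        regs[mid] = regs[low]
        return
    if group != 2:
        raise ValueError("only Groups 01 and 10 are modelled")
    r = ALU[mid](regs[ACC], regs[low], flags["carry"])
    flags["carry"] = int(r < 0 or r > MASK)     # carry out or borrow
    flags["zero"] = int((r & MASK) == 0)
    if mid != 7:                            # CMP leaves the accumulator alone
        regs[ACC] = r & MASK
```

Starting from `regs = [5, 7, 12, 0, 0, 0, 0, 0]`, the sequence 170, 201, 272 (octal) copies register 0 into the accumulator, adds register 1, then compares against register 2, which sets the zero flag. No PUSH, POP, DUP or SWAP was needed, since the moves already reach every register.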






About monsonite

mostly human