In my exploration of minimal instruction set computer (MISC) CPUs, I have often found it a worthwhile exercise to define the instruction set and then simulate the behaviour of the CPU in software.
This can be done relatively straightforwardly using even a humble Arduino to mimic the operation of the CPU. A couple of evenings' programming and you have a simple simulator that can readily be changed to suit new architectures or new instruction sets. Whilst the original Arduino is resource limited, the same code will run on the MEGA or DUE boards, which offer a whole lot of extra RAM and, in the case of the DUE, speed.
The Arduino Uno or Duemilanove has 2 Kbytes of RAM. This is just about enough to hold short programs, although the space fills quickly if the instructions are multiple bytes long.
To create a CPU simulator, we first have to create an array in memory that will hold the program in the form of the machine instructions for the proposed processor. If the CPU needs a 16-bit instruction width, then for practical purposes this array should only be about 512 words long. At 2 bytes per word that is 1 Kbyte, which will immediately consume 50% of the available RAM.
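As a rough sketch of that program array, assuming a 16-bit word and a 512-word store on a 2 Kbyte Uno: the names `mem`, `load_program`, `mem_read` and `mem_write` are my own illustrative choices, not part of any existing simulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS     512u    /* assumed store size: 512 words of 16 bits */
#define UNO_RAM_BYTES 2048u   /* SRAM on an Uno-class board */

static uint16_t mem[MEM_WORDS];   /* simulated program/data memory */

/* Compile-time check of the sizing claim: the array is half of the RAM. */
static_assert(sizeof(mem) * 2 == UNO_RAM_BYTES, "mem should use 50% of RAM");

/* Copy a program image into simulated memory; returns the words loaded. */
size_t load_program(const uint16_t *image, size_t n_words)
{
    if (n_words > MEM_WORDS) n_words = MEM_WORDS;   /* truncate oversize images */
    for (size_t i = 0; i < n_words; i++) mem[i] = image[i];
    return n_words;
}

/* Addresses wrap modulo MEM_WORDS so a stray address cannot leave the array. */
uint16_t mem_read(uint16_t addr)              { return mem[addr % MEM_WORDS]; }
void     mem_write(uint16_t addr, uint16_t v) { mem[addr % MEM_WORDS] = v; }
```

Wrapping the address is only one possible policy; trapping on an out-of-range address would be just as reasonable in a simulator.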
We also have to define the principal registers – including Program Counter (PC), Instruction Register (I), Accumulator, and possibly the address of any memory location that is updated by the instruction.
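The registers listed above might be grouped into a single structure along these lines. This is only a sketch: the struct layout, the 16-bit widths and the `cpu_reset`/`cpu_fetch` helpers are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register set for a simple accumulator machine. */
typedef struct {
    uint16_t pc;    /* Program Counter: address of the next instruction */
    uint16_t ir;    /* Instruction Register (I): the word being executed */
    uint16_t acc;   /* Accumulator */
    uint16_t addr;  /* address of the memory location the instruction updates */
} Cpu;

/* Put every register into a known state. */
void cpu_reset(Cpu *cpu)
{
    cpu->pc = 0;
    cpu->ir = 0;
    cpu->acc = 0;
    cpu->addr = 0;
}

/* Fetch: latch the word at PC into IR, then advance PC (wrapping at the end). */
void cpu_fetch(Cpu *cpu, const uint16_t *mem, uint16_t mem_words)
{
    assert(cpu->pc < mem_words);
    cpu->ir = mem[cpu->pc];
    cpu->pc = (uint16_t)((cpu->pc + 1u) % mem_words);
}
```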
Now we come to the instruction set architecture (ISA) and the instruction decoder. First we need to choose how many instructions there will be, and what operands they will act upon.
For a MISC machine we should really be looking at about 32 or fewer instructions – mostly based on what the chosen ALU can do. We will also require program branching, loops and other conditional program flow control instructions.
In the case of the J1 Forth CPU, there are just 5 classes of instructions.
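To give a feel for how cheaply a small set of instruction classes can be told apart, here is a sketch of a class decoder. It assumes the encoding as I understand it from the published J1 description: bit 15 marks a literal, and otherwise bits 14 and 13 select jump, conditional jump, call or ALU. The enum and function names are my own, and the bit layout should be checked against the J1 documentation before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* The five J1 instruction classes (names here are illustrative). */
typedef enum { J1_LITERAL, J1_JUMP, J1_COND_JUMP, J1_CALL, J1_ALU } J1Class;

/* Classify a 16-bit instruction word from its top bits (assumed layout). */
J1Class j1_classify(uint16_t insn)
{
    if (insn & 0x8000u) return J1_LITERAL;   /* bit 15 set: literal */
    switch ((insn >> 13) & 0x3u) {           /* bits 14..13 pick the rest */
    case 0:  return J1_JUMP;
    case 1:  return J1_COND_JUMP;
    case 2:  return J1_CALL;
    default: return J1_ALU;
    }
}

/* Extract the 15-bit value carried by a literal instruction. */
uint16_t j1_literal_value(uint16_t insn)
{
    assert(insn & 0x8000u);                  /* only valid for literals */
    return insn & 0x7FFFu;
}
```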
A true load-store architecture will be able to operate on any location in the available program RAM. A register based architecture will have a group of registers that are operated on preferentially. These are either implemented in block RAM on an FPGA or possibly in off-chip RAM, occupying the low addresses of the zero page, which minimises the addressing overhead. Finally, with a stack based architecture, there will be the top of stack, the next of stack, and some elements in the return stack that have preferential access.
In the case of EDSAC, which is a traditional load-store architecture, all of its 1024 words of RAM storage are treated identically by the ALU. However, there is no reason why, to improve the programming model, the first few words in memory could not be given register names and used preferentially for moving data about.
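Giving the first few words of memory register names costs nothing more than a handful of constants in the simulator. In this sketch the names `R0` to `R3`, the choice of four registers, the 512-word `store` array and the `reg_move` helper are all hypothetical, and the 16-bit word is a simplification rather than EDSAC's real word length.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 512u   /* assumed size of the simulator's store */

/* Hypothetical register names for the lowest words of memory. */
enum { R0 = 0, R1, R2, R3, N_REGS };

static uint16_t store[MEM_WORDS];   /* every word is ordinary RAM to the ALU */

/* Move data between two named low-memory words. */
void reg_move(uint16_t dst, uint16_t src)
{
    assert(dst < N_REGS && src < N_REGS);   /* only the named words qualify */
    store[dst] = store[src];
}
```

Because the "registers" are just small addresses, an instruction format could encode them in two bits each instead of a full memory address.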
The Instruction Decoder can just be based on a switch-case statement. For a machine with just 32 instructions, this can be as little as a few lines of C code per instruction, which calculate the new state of the Accumulator, Program Counter and Memory based on whatever instruction has been executed.
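A minimal sketch of such a decoder follows. The instruction set is invented for illustration: I have assumed a 16-bit word with a 5-bit opcode in the top bits (room for 32 instructions) and an address in the low 11 bits, and the opcode names, the `Machine` struct and the `encode`, `step` and `run` functions are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 512u

/* Hypothetical opcodes; a 5-bit field leaves room for up to 32. */
enum { OP_HALT, OP_LOAD, OP_STORE, OP_ADD, OP_SUB, OP_JMP, OP_JZ };

typedef struct {
    uint16_t mem[MEM_WORDS];
    uint16_t pc, ir, acc;
    bool halted;
} Machine;

/* Pack an opcode and an address into one instruction word. */
uint16_t encode(uint16_t op, uint16_t operand)
{
    assert(op < 32u && operand < MEM_WORDS);
    return (uint16_t)((op << 11) | operand);
}

/* Fetch, decode and execute a single instruction. */
void step(Machine *m)
{
    m->ir = m->mem[m->pc % MEM_WORDS];
    m->pc = (uint16_t)((m->pc + 1u) % MEM_WORDS);

    uint16_t op = (uint16_t)(m->ir >> 11);                    /* top 5 bits */
    uint16_t a  = (uint16_t)((m->ir & 0x07FFu) % MEM_WORDS);  /* address */

    switch (op) {
    case OP_LOAD:  m->acc = m->mem[a];                        break;
    case OP_STORE: m->mem[a] = m->acc;                        break;
    case OP_ADD:   m->acc = (uint16_t)(m->acc + m->mem[a]);   break;
    case OP_SUB:   m->acc = (uint16_t)(m->acc - m->mem[a]);   break;
    case OP_JMP:   m->pc = a;                                 break;
    case OP_JZ:    if (m->acc == 0) m->pc = a;                break;
    case OP_HALT:
    default:       m->halted = true;                          break;  /* unknown opcodes halt */
    }
}

/* Run until HALT or until the step budget is used up; returns steps taken. */
uint32_t run(Machine *m, uint32_t max_steps)
{
    uint32_t n = 0;
    while (!m->halted && n < max_steps) { step(m); n++; }
    return n;
}
```

Changing the architecture then mostly means editing the opcode list and the cases in `step`, which is what makes this style of simulator quick to adapt.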