Hello world

Our CPU is finally ready to run the program every programmer must run first: hello world.

Building on the assembly from the last post, we will create the init sequence for the LCD and send characters one by one.

The CPU simulator is at the end of the page.

LCD init

More complicated devices like the LCD have a whole protocol that you need to use to communicate with them. The steps we need to go through to get our display running are not long:

    // clear the screen
    send 0x401
    // return the pointer
    send 0x402
    // set pointer direction
    // to the right
    send 0x406
    // turn cursor on
    send 0x40F
    // turn shift on
    send 0x41C
    // make LCD write 2 lines
    // in simple font
    send 0x438
    // now we can send whatever
    // we want

The sequence has to be in this order.

The LCD input is connected directly to the bus, and in Micromemory programming we created the "PRINT" instruction to send data directly to it.

"PRINT" sends the data that is in the accumulator, so to send the above commands to the LCD we first load each one into the ACC and then use the "PRINT" instruction.

    // clear the screen
    LDAI 0x401
    PRINT 0
    // return the pointer
    LDAI 0x402
    PRINT 0
    // set pointer direction
    // to the right
    LDAI 0x406
    PRINT 0
    // turn cursor on
    LDAI 0x40F
    PRINT 0
    // turn shift on
    LDAI 0x41C
    PRINT 0
    // make LCD write 2 lines
    // in simple font
    LDAI 0x438
    PRINT 0
    // now we can send whatever
    // we want
    loop
    B loop

Loading is done using LDAI. Remember, LDAI loads the number next to the instruction into the ACC. You can refresh your memory of the supported instructions here.

Printing letters

To print a letter we need to send its ASCII representation to the LCD. If we were to follow the LCD protocol we would also need to set another pin of the LCD high, but that is done automatically in my processor. So to write an 'A' we just send its bitwise representation via the PRINT instruction.

Here is a simple way to write 'Hello world' on the LCD (you can always test this assembly in the simulator at the bottom of the page).

loop LDAI 0x401 PRINT 0 LDAI 0x402 PRINT 0 LDAI 0x406 PRINT 0 LDAI 0x40F PRINT 0 LDAI 0x41C PRINT 0 LDAI 0x438 PRINT 0 LDAI 'H' PRINT 0 LDAI 'e' PRINT 0 LDAI 'l' PRINT 0 LDAI 'l' PRINT 0 LDAI 'o' PRINT 0 LDAI 32 PRINT 0 LDAI 'W' PRINT 0 LDAI 'o' PRINT 0 LDAI 'r' PRINT 0 LDAI 'l' PRINT 0 LDAI 'd' PRINT 0 LDAI '!' PRINT 0 B loop

Note that the simulator does not support the character literal for space (' '), so we insert its numeric code (32) instead.
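Typing the LDAI/PRINT pairs by hand gets tedious quickly, so here is a small Python sketch that generates the naive listing for any text. The helper name naive_listing is made up for this example and is not part of the simulator; the sketch only assumes the conventions shown above (one LDAI plus one PRINT per word, and 32 in place of ' ').

```python
# Hypothetical helper: emits the naive LDAI/PRINT listing for a given text.
INIT_WORDS = [0x401, 0x402, 0x406, 0x40F, 0x41C, 0x438]  # init sequence from the post

def naive_listing(text):
    lines = ["loop"]
    for word in INIT_WORDS:
        lines += [f"LDAI 0x{word:03X}", "PRINT 0"]
    for ch in text:
        # The simulator has no ' ' literal, so a space is written as 32.
        operand = "32" if ch == " " else f"'{ch}'"
        lines += [f"LDAI {operand}", "PRINT 0"]
    lines.append("B loop")
    return lines

listing = naive_listing("Hello World!")
print("\n".join(listing))
```

The output grows by two lines for every extra character, which is exactly the cost the rest of the post tries to get rid of.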

We are manually loading characters into the ACC and sending them to the LCD. After we do all of that we do an unconditional branch to the beginning and do all of that again. And again. And again. Is there a better way?

Using SP

If we were writing in C we would write something like this:

    int data[18] = {
        0x401, 0x402, 0x406, 0x40F, 0x41C, 0x438,
        'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '!'
    };
    while (true) {
        int i = 0;
        for (i = 0; i < 18; i++)
            print(data[i]);
    }

The code separates data and instructions. Data is stored in an array called "data" and we send its elements with a for loop. We go from 0 to the length of the array and send the data pieces one by one.

We would like to implement something like this in our assembly, but we somehow need to implement the for (and while) loop. Another important thing is that if we are using purely the accumulator, there is no way to dereference it. What does that mean? There is no way to interpret its value as a memory address and get the contents of that address.

In other words, how do we do the "data[i]" operation if i is stored in ACC and we do not have a direct instruction to fetch the value from the bus? We have to use the stack pointer.

data[i]

We created 4 nice instructions that work with the stack pointer: "SP2ACC", "ACC2SP", "LDAFS" and "STA2SP". Then, to fetch the contents of memory at the address dictated by the ACC, we do:

    ACC2SP 0
    LDAFS 0

Note that we do not need the address parameters of these instructions, but have to include them for all instructions to be of the same size. The value of ACC will stay inside SP.
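To make the dereference concrete, here is a tiny Python model of those two instructions. The semantics are assumptions taken from the description above (ACC2SP copies ACC into SP, LDAFS loads ACC from the cell SP points at); the Cpu class is invented for this sketch and is not the real microcode.

```python
# Assumed semantics, not the real microcode:
#   ACC2SP: SP = ACC
#   LDAFS:  ACC = memory[SP]
class Cpu:
    def __init__(self, memory):
        self.mem = list(memory)
        self.acc = 0
        self.sp = 0

    def acc2sp(self):
        self.sp = self.acc

    def ldafs(self):
        self.acc = self.mem[self.sp]

cpu = Cpu([0, 0, 0, 0x401, 0x402, ord("H")])
cpu.acc = 5          # the address we want to dereference
cpu.acc2sp()         # SP = 5
cpu.ldafs()          # ACC = mem[5]
print(hex(cpu.acc))  # 0x48, the ASCII code of 'H'
```

After the pair has run, SP still holds 5, which is what lets the later code use SP as the loop index.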

for loop

Let us try to do this part:

    int i = 0;
    for (i = 0; i < 18; i++)
        print(data[i]);

How do we do the for loop? We need to use conditional branching. If the condition is fulfilled we go back to the top of the loop and in that way create the looping behaviour. If the condition is not met we exit the loop, which in our case just means treating that instruction as a NOP.

    LDAI 0
    loop
    ADDI 1
    if_ACC_is_not_18_goto_loop

This code will loop 18 times. ACC will start at 0 and be incremented by 1 on every pass until it reaches 18, at which point if_ACC_is_not_18_goto_loop will not do anything (it behaves as a NOP) and execution will continue. In this way we successfully created the for loop. Now what is underneath "if_ACC_is_not_18_goto_loop"?

We need a conditional branching instruction. We have two of them: "BEQ", which branches when ACC is 0, and "BNEQ", which branches when ACC is not 0. But how do we make the branch depend on whether ACC is 18?

Easy: 18 is just an offset, so we can subtract 18 from ACC and check whether the result is 0:

    LDAI 0
    loop
    ADDI 1
    ADDI -18
    BNEQ loop

This way, if ACC was 18, then after the subtraction (addition of -18) it will be 0. The only problem is that we destroy the ACC value when we do ADDI. We can save the value and recover it:

    B begin
    0
    begin
    LDA 2
    ADDI 1
    STA 2
    ADDI -18
    BNEQ begin

This will run 18 times. We are using memory location 2 as a temporary store for the ACC. In the real solution we do not have to load and store ACC many times because we will use SP to hold the index value. ACC will then serve as a messenger between the different parts.
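The iteration count can be checked by stepping through the saved-counter loop in plain Python. This is only a sketch: mem2 stands for memory location 2, and count_iterations is a name invented for the example.

```python
def count_iterations(limit=18):
    mem2 = 0              # memory location 2, the saved counter
    iterations = 0
    while True:
        acc = mem2        # LDA 2
        acc += 1          # ADDI 1
        mem2 = acc        # STA 2
        acc -= limit      # ADDI -18
        iterations += 1
        if acc == 0:      # BNEQ falls through once ACC is 0
            break
    return iterations, mem2

print(count_iterations())  # (18, 18)
```

The counter survives each pass only because it is written back to memory before the subtraction overwrites ACC.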

Putting it together

So if we combine the data, SP, looping and branching we get this:

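    B hop
    3
    0x401
    0x402
    0x406
    0x40F
    0x41C
    0x438
    'H'
    'e'
    'l'
    'l'
    'o'
    32
    'W'
    'o'
    'r'
    'l'
    'd'
    '!'
    hop
    LDAI 3
    ACC2SP 0
    loop
    LDAFS 0
    STA 2
    ADDISP 1
    ACC2SP 0
    LDA 2
    PRINT 0
    SP2ACC 0
    // data occupies addresses 3 to 20 and SP was already
    // incremented, so SP is 21 after the last word
    ADDI -21
    BNEQ loop
    B hop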

So to recap, this:

    LDAI 3
    ACC2SP 0

is just loading the address at which our data starts, which is 3. Addresses 0 and 1 hold B hop and address 2 is the temporary variable. And this is the meat of the program:

    loop
    // get value from memory
    LDAFS 0
    // increment SP (i in the C program);
    // we save and restore ACC around it
    STA 2
    ADDISP 1
    ACC2SP 0
    LDA 2
    // print to LCD
    PRINT 0
    // check if SP (i) == 21: SP was already incremented, so it is
    // one past 20, the last memory address we need to print
    SP2ACC 0
    ADDI -21
    BNEQ loop

We also have "B hop" which enables us to do "while (true)"
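The exit constant is easy to get wrong by one, so here is a Python sketch that walks one pass of the loop. It assumes the memory layout described above (data at addresses 3 to 20, temporary at address 2) and assumes that ADDISP 1 leaves SP + 1 in ACC; that second point is inferred from the ACC2SP that follows it, not confirmed. The function run_once and its exit_sp parameter are invented for this example.

```python
DATA = [0x401, 0x402, 0x406, 0x40F, 0x41C, 0x438] + [ord(c) for c in "Hello World!"]
BASE = 3  # address of the first data word, as in the listing

def run_once(exit_sp):
    """Run one pass of the loop; stop when SP == exit_sp after the increment."""
    mem = [0, 0, 0] + DATA   # cells 0-1 stand in for "B hop", cell 2 is the temporary
    sp = BASE                # LDAI 3 / ACC2SP 0
    sent = []
    while True:
        acc = mem[sp]        # LDAFS 0
        mem[2] = acc         # STA 2
        acc = sp + 1         # ADDISP 1 (assumed: ACC = SP + 1)
        sp = acc             # ACC2SP 0
        acc = mem[2]         # LDA 2
        sent.append(acc)     # PRINT 0
        acc = sp - exit_sp   # SP2ACC 0 / ADDI -exit_sp
        if acc == 0:         # BNEQ loop not taken
            return sent

text = "".join(chr(w) for w in run_once(21)[6:])
print(text)  # Hello World!
```

Under these assumptions all 18 words are sent when the subtracted constant is 21, because SP has already moved one past the last data address (20) by the time it is tested. With 20 the pass stops one word early and the '!' is never sent.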

Check your understanding

How much faster or slower is our final approach than the naive (first) one for large strings (for example 'Hello world'*1000)?
We need to see how many instructions are needed to print out a single character. We can approximate that every instruction takes the same amount of time (most of the time is spent in the fetch phase, so this is a valid thing to do). The naive way needs 2 instructions per character and the complicated one needs 9. So for large sequences the complicated one is 9/2 = 4.5 times slower.
How much more memory efficient is our final approach compared to the naive one? (Again for large strings with many characters.)
To print one character in the simple case we need 2 instructions, which take 2 memory addresses each, so 2*2 = 4 memory addresses. The complicated one keeps its data separate, so it uses 1 address per character. So the complicated one is 4/1 = 4 times more memory efficient.
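The two ratios can be reproduced with a few lines of Python. This is a rough sketch that counts only the per-character cost and ignores the fixed overhead (the init words and the loop code itself), which is a fair simplification only for long strings; the function name cost is made up for the example.

```python
def cost(n_chars):
    naive_instr = 2 * n_chars   # LDAI + PRINT per character
    table_instr = 9 * n_chars   # nine instructions per loop pass
    naive_mem = 4 * n_chars     # 2 instructions * 2 addresses each
    table_mem = 1 * n_chars     # one data word per character
    return table_instr / naive_instr, naive_mem / table_mem

slowdown, memory_gain = cost(len("Hello world") * 1000)
print(slowdown, memory_gain)  # 4.5 4.0
```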

Simulator

[Interactive simulator: shows the acc, pc, tmp, sp, ir and dc registers (all 0 at reset) and a clock rate (Hz) control.]