Image by Nina Edmondson from Pixabay

With the board fully assembled, and USB enumeration working well enough to see the device, it’s time to see if the design is valid. Do I have a working Z180-based computer, or an expensive paperweight?

First, blink the lights

My initial boot ROM changes the CPU clock divider from EXTAL/2 to EXTAL/1 a few bytes into execution. With the FPGA reprogrammed to flash LED1 on a 24-bit divider, a 9.216MHz PHI will blink LED1 at 0.55Hz and an 18.432MHz PHI will blink it at 1.1Hz. With LED2 still on a 26-bit divider of the 100MHz oscillator, this provides an easy-to-observe visual guide to whether my boot ROM is executing at all.
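As a sanity check on those numbers, here is a minimal sketch. It assumes each LED follows the top bit of an N-bit counter, so one full blink takes 2^N input clocks; the function name is made up for illustration.

```python
def blink_hz(clock_hz: float, divider_bits: int) -> float:
    """Blink frequency of an LED that follows the top bit of an N-bit counter."""
    return clock_hz / (1 << divider_bits)


for label, clock_hz, bits in [
    ("LED1, PHI = EXTAL/2", 9.216e6, 24),
    ("LED1, PHI = EXTAL/1", 18.432e6, 24),
    ("LED2, 100MHz oscillator", 100e6, 26),
]:
    print(f"{label}: {blink_hz(clock_hz, bits):.2f}Hz")
```

Under the same assumption LED2 works out at roughly 1.5Hz, so a running boot ROM should bring LED1 from well under LED2's rate to something much closer to it.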

Which led me to discover that my boot ROM was not executing at all. The CPU was putting out a clock signal just fine, but at 9.216MHz, not 18.432MHz.

To verify the CPU was still okay after all my earlier fumbling about I put it back into the old debug board I built early on in the process, with an 18.432MHz can oscillator. My logic analyser showed it doing the expected signal dance - but I also recorded what happened as the CPU came out of /RESET, and saw that A19 stays high after /RESET has gone high. It stubbornly doesn’t fall until the address lines are being settled for the first machine cycle, which is up to 30ns after PHI rises for clock cycle T1. On top of that /M1 falls at most 35ns after PHI rises for T1. To the limits of accuracy of my very cheap 24MHz analyser, in practice /M1 falls at the same time as A19.
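For a sense of why the analyser cannot separate the two edges, here is a rough sketch. It assumes the analyser simply takes one sample per period at its 24MHz rate, and uses the two worst-case delays quoted above; the names are illustrative.

```python
SAMPLE_RATE_HZ = 24e6
sample_period_ns = 1e9 / SAMPLE_RATE_HZ  # time between two analyser samples

A19_SETTLE_MAX_NS = 30  # address lines settled, worst case after PHI rises for T1
M1_FALL_MAX_NS = 35     # /M1 falling, worst case after PHI rises for T1


def same_sample_possible(t1_ns: float, t2_ns: float, period_ns: float) -> bool:
    """Two edges closer together than one sample period can land in the same sample."""
    return abs(t1_ns - t2_ns) < period_ns


print(f"sample period: {sample_period_ns:.1f}ns")
print(same_sample_possible(A19_SETTLE_MAX_NS, M1_FALL_MAX_NS, sample_period_ns))
```

The sample period is longer than either worst-case delay, so the capture cannot say which of the two signals moved first.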

This is a significant oversight in my board design - the datasheet shows, if you look properly, that /M1 could fall before the address lines have settled. An earlier revision of my design used /MREQ instead of /M1, which can be seen in the design post diagram. Despite my excessively detailed blog posts I neglected to document why I changed to /M1, but I believe it’s because I only wanted execution accesses to ROM to trigger the flip-flop. If I’d stuck with /MREQ then A19 would be settled low before /MREQ goes low, and this would have gone a bit smoother.

The consequence of all this is that the ROM overlay DFF is reset before the first memory read even happens, and ROM never gets read.

The good news is that there’s a simple fix to get the ROM overlay active at boot: I’ve put a 10kΩ 0603 resistor between the A19 and GND pins of the CPU socket, underneath the board. This pulls A19 low during reset, and it’s still low when /M1 falls for the first opcode fetch. Thereafter, the CPU continues to hold A19 low for each access.
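The race and the fix can be put into a toy model. This is only a sketch under an assumption about the overlay logic that the post does not spell out: the flip-flop is taken to be cleared whenever /M1 is low while A19 is still high. Times are in nanoseconds after PHI rises for T1, and the function name is hypothetical.

```python
def overlay_survives_first_fetch(a19_low_at_ns: float, m1_low_at_ns: float) -> bool:
    """True if A19 is already low strictly before /M1 goes low.

    Assumed behaviour: any overlap of "A19 high" and "/M1 low" clears the
    ROM overlay flip-flop.
    """
    return a19_low_at_ns < m1_low_at_ns


# Without the pull-down: A19 only falls as the address settles, which the
# analyser shows happening at about the same moment /M1 falls.
print(overlay_survives_first_fetch(a19_low_at_ns=30, m1_low_at_ns=30))

# With the 10k pull-down: A19 has been low since reset, long before /M1 falls.
print(overlay_survives_first_fetch(a19_low_at_ns=float("-inf"), m1_low_at_ns=30))
```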

With this fix in place, the LED starts to blink at twice the speed. I now have a computer that can run code!

Communication?

After adjusting the clock speed the boot code’s next task is to set up ASCI0. As the ROM progresses from here it prints out the character sequence ‘1234567’, with each character indicating successful completion of another step of the boot-up.

Perhaps predictably by now, my serial terminal remained stubbornly blank. No output at all - but due to another questionable design choice, I wasn’t sure whether this meant the boot code was failing to transmit a character. This time it was the fact that the /RESET line also resets the FT230X USB IC. This causes the serial port to disconnect and reconnect; the FT230X datasheet doesn’t specify a time range for this, but the chip has to read its internal memory to configure the USB descriptors, enumerate as a USB device, and negotiate a connection. That’s enough work to take a non-trivial amount of time.

So - with no way to talk to the CPU to find out what’s going on, I resorted to the logic analyser. My analyser has eight lines, so it cannot monitor the entire bus at once, but it can watch some pins and check they’re behaving. I checked the ROM’s /CE and /OE pins, which were firing properly. I checked, rechecked, triple checked, and dispiritedly checked the ASCI0 lines to the FT230X. Without a reset button, I resorted to five second samples at a slow rate, before finally cluing in that I could pull /WAIT low through the bus header, and freeze the CPU to inspect what it was doing.

Freezing the CPU after it had run for a little while showed it had gone out to lunch in the middle of memory where no predictable value could be expected. It also showed that the transceiver pin for A12 was poorly soldered, but while that needed fixing it wasn’t in a position to be causing any faults.

Freezing the CPU before it finished the first opcode fetch showed the right values on address and data lines.

So - somewhere in between the first byte read, and thousands of cycles later, the CPU goes a bit nuts.

Debugging a CPU board

It’s impractical to try and whip a jumper lead in and out of the /WAIT header and expect to see anything meaningful going on. However, my board has a powerful tool available in the form of the FPGA. Most of the board’s signals are connected to an FPGA pin, so it has broad insight into the workings of the CPU. It also has a view of the /WAIT signal.

What it does need is a way to tell me what’s going on. Since the FPGA is expected to eventually run an I2C bus, I thought it’d be convenient to give it an I2C slave interface for a µC to drive. The FPGA can drive /WAIT low when /MREQ or /IORQ falls, capture address and data lines, and send all that information to the µC in a four byte burst.

Rather than trying to greenfields my own I2C module, I read over a handful of existing implementations. In the end I used one by Daniel Beer that has a truly excellent explanation. Since it’s unlicensed, my FPGA code for this single step debugger is unpublished, but it’s not rocket surgery. I did get to learn about crossing clock domains though, to have an effective signal between an I2C event indicating the CPU should proceed to the next instruction and the CPU’s much faster clock cycle.

This work uncovered a bad solder job on A5 of the FPGA, which I also fixed. And amazingly, I had completely failed to solder one side of one of the I2C pull-up resistors. I hope that’s it for bad solder jobs, because the volume of them makes a mockery of the amount of connectivity testing I did.

The FPGA sends /MREQ, /IORQ, /RD, /WR, and the high four bits of the address bus in the first byte. The next two bytes are the rest of the address bus, and the data bus follows. After all four bytes have been read, the CPU is permitted to proceed.
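A host-side decoder for that burst might look roughly like this. The exact bit positions are not given here, so this sketch assumes the four control lines occupy the top nibble of the first byte in the order listed, the address arrives high byte first, and an uppercase letter in the printed line means the active-low signal was asserted. All names are illustrative.

```python
from typing import NamedTuple


class BusCycle(NamedTuple):
    mreq: bool    # True when /MREQ was asserted (low)
    iorq: bool    # True when /IORQ was asserted (low)
    rd: bool      # True when /RD was asserted (low)
    wr: bool      # True when /WR was asserted (low)
    address: int  # 20-bit address
    data: int     # 8-bit data bus value


def decode_burst(burst: bytes) -> BusCycle:
    """Unpack one four byte burst (assumed layout, see the text above)."""
    if len(burst) != 4:
        raise ValueError("expected a four byte burst")
    flags = burst[0] >> 4
    address = ((burst[0] & 0x0F) << 16) | (burst[1] << 8) | burst[2]
    # Control lines are active low, so a 0 bit means the line was asserted.
    return BusCycle(
        mreq=not (flags & 0b1000),
        iorq=not (flags & 0b0100),
        rd=not (flags & 0b0010),
        wr=not (flags & 0b0001),
        address=address,
        data=burst[3],
    )


def format_cycle(cycle: BusCycle) -> str:
    """Render a cycle as flags, 5-digit address, data in hex and binary."""
    flags = "".join(
        letter.upper() if asserted else letter.lower()
        for letter, asserted in zip("MIRW", cycle[:4])
    )
    return f"{flags} {cycle.address:05X}  {cycle.data:02X} ({cycle.data:08b})"


print(format_cycle(decode_burst(bytes([0x50, 0x01, 0x6D, 0xED]))))
```

With this assumed layout, a memory read of $ED from address $0016D arrives as the bytes $50 $01 $6D $ED.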

On the µC side of things I just used an ESP8266 as a convenient 3.3V device, and programmed it using the Arduino IDE. After a few iterations I wound up with something that could single step, run a number of steps, run to an address access, and set breakpoints. Its output looks something like this:

MiRw 0016D  ED (11101101)
MiRw 0016E  39 (00111001)
MiRw 0016F  38 (00111000)

… which is the CPU fetching and executing out0 (CBR), a.
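To show how those three bytes map to that instruction: OUT0 on the Z180 is encoded as $ED, then a byte of the form 00rrr001 where rrr selects the source register, then the 8-bit port number. The sketch below decodes just that one instruction; the port names assume the MMU registers sit at their default I/O addresses, and the function and table names are made up.

```python
REGISTERS = {0: "b", 1: "c", 2: "d", 3: "e", 4: "h", 5: "l", 7: "a"}
MMU_PORTS = {0x38: "CBR", 0x39: "BBR", 0x3A: "CBAR"}  # default I/O addresses


def decode_out0(fetch: bytes) -> str:
    """Decode a three byte OUT0 (port), reg instruction."""
    if len(fetch) != 3:
        raise ValueError("OUT0 is three bytes long")
    prefix, opcode, port = fetch
    # Second byte must match the pattern 00rrr001.
    if prefix != 0xED or (opcode & 0b11000111) != 0b00000001:
        raise ValueError("not an OUT0 instruction")
    reg = REGISTERS.get((opcode >> 3) & 0b111)
    if reg is None:
        raise ValueError("invalid register field")
    name = MMU_PORTS.get(port, f"${port:02X}")
    return f"out0 ({name}), {reg}"


print(decode_out0(bytes([0xED, 0x39, 0x38])))
```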

The debug tool showed that without a USB connection, the boot ROM pauses waiting for the transmit data register empty (TDRE) flag to be set for ASCI0. When that flag is set, the CPU writes ‘1’ - and the serial terminal receives it. This confirms my hunch that the USB IC is still resetting when the CPU starts writing to it. And confirms that the serial link works!

I had mixed up my boot code somehow and tried to do the MMU work right in the middle of setting up the DMA registers. Once I fixed this, I could watch the DMA executing its copy loop, reading from ROM and writing to RAM byte by byte.

I also discovered that the MMU base registers are added to the logical address: if Common Area 1 starts at $F000 with a base register of $80, the physical address is $8F000. This makes sense from an implementation perspective, since only a single 8-bit adder is involved, and it was an easy fix to set CBR to $71 instead, which puts logical $F000 at physical $80000.
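The addition can be sketched as follows. The Z180 picks a base register from the top four bits of the logical address (Common Area 0 uses no base, the Bank Area uses BBR, Common Area 1 uses CBR), shifts the base left by 12 bits and adds it. The CBAR value of $F0 used here is an assumption chosen so that Common Area 1 starts at $F000; the function name is illustrative.

```python
def z180_physical(logical: int, cbar: int, bbr: int, cbr: int) -> int:
    """Translate a 16-bit logical address to a 20-bit physical address."""
    page = (logical >> 12) & 0xF  # top four bits of the logical address
    ca1_start = cbar >> 4         # first 4KB page of Common Area 1
    bank_start = cbar & 0xF       # first 4KB page of the Bank Area
    if page >= ca1_start:
        base = cbr
    elif page >= bank_start:
        base = bbr
    else:
        base = 0                  # Common Area 0 is not relocated
    # The base is added to the address, not substituted for its top bits.
    return ((base << 12) + (logical & 0xFFFF)) & 0xFFFFF


# Assumed CBAR of $F0: Common Area 1 from $F000, Bank Area from $0000.
print(f"${z180_physical(0xF000, cbar=0xF0, bbr=0x00, cbr=0x80):05X}")
print(f"${z180_physical(0xF000, cbar=0xF0, bbr=0x00, cbr=0x71):05X}")
```

With a base value of $80 the logical address $F000 lands at $8F000, while a base value of $71 places it at $80000.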

The remaining consequences of my choice to use /M1 then became apparent. A memory access to ROM space would set A19 high, with /M1 also high, no problem so far. But the next machine cycle would lower /M1 at around the same time as A19, with enough overlap between them for my flip-flop to be reset. The safest action is to jump to ROM as soon as possible, without allowing any other access to ROM space to happen first.

I re-ordered the boot code to set up the MMU first, jump to ROM space, run the DMA copy from there, then jump back to, in theory, SRAM.

A transmit test that modified code showed that yes, I had code running out of SRAM.

Rough consensus and running code

Let’s take a moment to absorb this.

I have a working computer. It has boot ROM, working memory, and enough peripherals for me to interact with it.

There’s plenty more to do from here: the Y-modem upload doesn’t work, I have nothing useful to upload, the FPGA isn’t capable of self-programming yet, and there are other boards I expect to add on over time. There’s even a laundry list of things I’d do differently on a second revision of the CPU board.

But none of that matters just at this exact moment: demonstrating running code on the board means this project is now a success.