λbje - AOC 2022 - Z80

A cropped screenshot of Nibbles, via Wikipedia.

On to Day 9 in the Z80 Advent of Code solution series, continuing the recent trend of adapting code written in 2022 but now with more exhaustive descriptions.

The day 9 puzzle’s first star is pretty straightforward. A cursor moves horizontally or vertically, dragging a point behind it that may move horizontally, vertically, or diagonally, staying less than two units of distance away. A basic algorithm for this may look something like this:

Move cursor according to directions
If the tail is more than one unit distance away in X or Y:
1. Move to one unit distance away from that dimension
2. Make the other dimension equal

The trick of the day’s first star is that the challenge is to record the number of unique positions the point visits. Not much of a trick, really: chuck the coordinates in a set and the job is done. The only problem is that Z80 assembly does not have high level constructs like sets, so I’ll have to build something to do that job.

The high level approach will be:

Initialise the head and tail to the same location
While there are motions left to take:
1. Get the motion direction and count
2. Repeat
  1. Move the head
  2. Chase with the tail
  3. Record the tail’s position, discarding duplicates
3. Until the motion count has reached zero
Count the number of positions in the tail’s history

Translating this into Z80 assembly is relatively straightforward, if perhaps a shade tedious. But then, “relatively straightforward if perhaps a shade tedious” is more or less the byline for assembly programming, so this is to be expected.

        LD      HL, DATA
START:
        LD      SP, STACK
        CALL    INIT
INLOOP:
        LD      A, (HL)
        OR      A
        JR      Z, FINISH
        INC     HL
        LD      B, (HL)
        INC     HL

        ; move the cursor B times in the A direction
MOVELOOP:
        CALL    MOVE
        CALL    CHASE
        CALL    RECORD
        DJNZ    MOVELOOP

        ; go back for more input
        JR      INLOOP

FINISH:
        CALL    RESULTS
        HALT

There are a few undefined symbols in this block of code. It’s missing an implementation of MOVE and CHASE to do the cursor and tail movement, and RESULTS to print the number of unique tail positions, respectively. It’s also missing the input DATA.

As usual, parsing the input is not the interesting part of the problem, so it will be pre-converted to a direction byte and a count byte and assembled. Here’s the sample input:

DATA:
        DEFB    'R', 4
        DEFB    'U', 4
        DEFB    'L', 3
        DEFB    'D', 1
        DEFB    'R', 4
        DEFB    'D', 1
        DEFB    'L', 5
        DEFB    'R', 2

        DEFB    0

I don’t usually worry about errors in the input, but today for a change here’s a little error routine to print a complaint and terminate:

ERROR:
        LD      BC, 1 + (ERRLEN * 256)
        LD      HL, ERRMSG
        OTIR
        HALT
ERRMSG: DEFB    'Invalid direction character', 10
.EQU    ERRLEN  $ - ERRMSG

Moving the cursor is mostly about translating the character direction reference to an x/y offset. In MOVE I use CPIR to find the direction character and use the resulting index as an offset to a movement vector to add to the cursor’s position.

        ; X, Y bytes
HEAD:   DEFB    0, 0
TAIL:   DEFB    0, 0

; IN:   A       A direction character, L, R, U, or D
; OUT:          cursor position updated
MOVE:
        PUSH    AF
        PUSH    BC
        PUSH    HL
        LD      BC, 4
        LD      HL, M_DIRS
        CPIR
        JR      NZ, ERROR

        ; BC will be 0, 1, 2, or 3
        ; corresponding respectively to L, R, U, and D
        ; so M_ADDS is in reverse order to M_DIRS
        LD      HL, M_ADDS
        ADD     HL, BC
        ADD     HL, BC
        LD      A, (HEAD)
        ADD     A, (HL)
        LD      (HEAD), A
        INC     HL
        LD      A, (HEAD+1)
        ADD     A, (HL)
        LD      (HEAD+1), A

        POP     HL
        POP     BC
        POP     AF
        RET
M_DIRS: DEFB    'LRUD'
              ; (0, 1), (0,-1), (1, 0), (-1,0)
M_ADDS: DEFW    00100H, 0FF00H, 00001H, 000FFH

It’s about now that a little test suite is in order. The assertions below will be tested live and should get a little ‘OK’ beside them.

        LD      SP, STACK
        LD      A, 'R'
        CALL    MOVE
        LD      BC, (HEAD)
        ; ASSERT BC=0001h
        LD      A, 'L'
        CALL    MOVE
        LD      BC, (HEAD)
        ; ASSERT BC=0000h
        LD      A, 'U'
        CALL    MOVE
        LD      BC, (HEAD)
        ; ASSERT BC=0ff00h
        LD      A, 'D'
        CALL    MOVE
        LD      BC, (HEAD)
        ; ASSERT BC=0000h
        HALT

And if all went well, I have the cursor moving about as expected based on the directions given. The range is only (-128, 127) in each dimension, and it’s quite possible a full puzzle input will exceed these bounds - if that happens the cursor will need to change to 16-bit values instead.

With the cursor moving correctly it’s time to look at chasing it with the tail. It doesn’t matter whether I check X or Y first, so I am arbitrarily going to choose X. As a simplification I will also take it as a precondition that the tail is never more than two away from the head in either dimension.

; IN:           (HEAD) and (TAIL) values
; OUT:          tail position updated
CHASE:
        PUSH    AF
        PUSH    BC
        PUSH    HL

        ; find difference between HEAD and TAIL in X dimension
        LD      A, (HEAD)
        LD      HL, TAIL
        SUB     (HL)
        ; if A is 2 or -2, move X
        CP      2
        JR      NZ, XNEQ2
        ; set TAIL.X to HEAD.X - 1
        LD      A, (HEAD)
        DEC     A
        LD      (TAIL), A
        JR      SETY

XNEQ2:
        CP      -2
        JR      NZ, CHASEY
        ; set TAIL.X to HEAD.X + 1
        LD      A, (HEAD)
        INC     A
        LD      (TAIL), A
        ; fall through to SETY

SETY:
        ; make Y equal
        LD      A, (HEAD+1)
        LD      (TAIL+1), A
        JR      CHASERET

CHASEY:
        ; find difference between HEAD and TAIL in Y dimension
        LD      A, (HEAD+1)
        LD      HL, TAIL+1
        SUB     (HL)

        ; if A is 2 or -2, move Y
        CP      2
        JR      NZ, YNEQ2
        ; set TAIL.Y to HEAD.Y - 1
        LD      A, (HEAD+1)
        DEC     A
        LD      (TAIL+1), A
        JR      SETX

YNEQ2:
        CP      -2
        JR      NZ, CHASERET
        ; set TAIL.Y to HEAD.Y + 1
        LD      A, (HEAD+1)
        INC     A
        LD      (TAIL+1), A
        ; fall through to SETX

SETX:
        ; make X equal
        LD      A, (HEAD)
        LD      (TAIL), A
        ; fall through to CHASERET

CHASERET:
        POP     HL
        POP     BC
        POP     AF
        RET

This is the sort of code that will be full of corner cases and logic bombs, so it’s best to do a sanity check of this code as well. I will use the examples from Advent of Code for this.

        LD      SP, STACK
        LD      BC, 0102h       ; .....
        LD      (HEAD), BC      ; ..H..
        LD      BC, 0301h       ; .....
        LD      (TAIL), BC      ; .T...
        CALL    CHASE           ; .....
        LD      BC, (TAIL)
        ; ASSERT BC=0202h
        LD      BC, 0103h       ; .....
        LD      (HEAD), BC      ; .T.H.
        LD      BC, 0101h       ; .....
        LD      (TAIL), BC
        CALL    CHASE
        LD      BC, (TAIL)
        ; ASSERT BC=0102h
        HALT

With this looking reasonably good so far it’s time to see what this looks like in action. Each pixel in the canvas below represents one point in the puzzle’s coordinate space; it will be coloured red if it’s the cursor, green if it’s the tail, and black if it’s the memory of the tail. Click <Play> to begin.

The resulting diagram looks quite a lot like the day 9 sample input’s path. I’m pretty confident at this point that the movement code is fine.

I have assumed so far that my puzzle input’s movements will be constrained to the range of a signed 8-bit integer in both dimensions. I don’t really know if that’s true, so before I go ahead and work out how to represent my tail position history I will verify this assumption. To do this I’ll need to feed my puzzle input as data instead of the sample input.

My own puzzle input is magically available at 01000h. The following code block executes the program using my input instead of the sample data:

INPUT:
        LD      HL, 01000h
        JP      START

At first glance this might send one to panic stations: it clearly underflows near the end of the path, wrapping from the top to the bottom. I don’t think all is lost, however, as the underflowed portion doesn’t overlap anything else. I could simply shift the starting point a little and avoid the problem entirely - or simply ignore it, as all the code written so far works as well for a toroidal surface as it does for a cartesian plane.

There’s a couple of options for set representations. Fewer than 10% of total spaces are visited, so a sparse representation may be suitable: a sorted list would probably do just fine. Inserts would be $O(n)$ due to the copying involved but $n$ is relatively small. There’s no space overhead at all, which is a very nice property for an algorithm used on a machine with a maximum of 64k memory. I don’t believe it’s possible to improve on the time complexity without sacrificing space, but using something like a bit field would take a fixed 8k of memory for $O(1)$ time, changing the total execution complexity from $O(n^2)$ to $O(n)$ .

I’m going to go with the bit field. It only introduces one small concern: DEFS only demarcates memory for a purpose, it does not initialise it with anything. The INIT subroutine clears the bit field to zero. Failing to do this wouldn’t be the first time I’ve discovered code works fine in an emulator but fails on a real board because memory is not cleared to zero on reset.

COUNT:  DEFW    0

INIT:
        PUSH    BC
        PUSH    DE
        PUSH    HL

        LD      C, 0
        LD      HL, BITS
        LD      DE, BITS+1
        LD      (HL), C
        LD      BC, 256*256/8 - 1
        LDIR

        POP     HL
        POP     DE
        POP     BC
        RET

; in:   position in (TAIL) to record
; out:  (COUNT) has a tally of unique positions seen
RECORD:
        PUSH    AF
        PUSH    BC
        PUSH    HL

        ; ignore sign: unique value is all that matters
        ; the index into the bitfield is (x << 5) + (y >> 3)
        ; BITS is aligned to the size of the bitfield using
        ; .ORG, so the expression below sets H to the high
        ; portion of the bitfield's address, shifted right
        ; by 5. After the shifts left, it will be correct.
        LD      H, BITS >> 13
        LD      A, (TAIL)
        LD      L, A
        ADD     HL, HL          ; << 1
        ADD     HL, HL          ; << 2
        ADD     HL, HL          ; << 3
        ADD     HL, HL          ; << 4
        ADD     HL, HL          ; << 5
        LD      B, 0
        LD      A, (TAIL+1)
        LD      C, A
        SRL     C               ; >> 1
        SRL     C               ; >> 2
        SRL     C               ; >> 3
        ADD     HL, BC

        ; I want to run BIT A & 7, (HL)
        ; which is CB 46, CB 4E, CB 56, ...
        ; or CB (46 + A << 3)
        ; so here's some self-modifying code
        LD      A, (TAIL+1)
        AND     A, 7
        RLCA
        RLCA
        RLCA
        ADD     A, 46h
        LD      (BIT+1), A

        ; and similarly SET n, (HL) being c6+n
        OR      A, 80h
        LD      (SET+1), A

BIT:    BIT     0, (HL)
SET:    SET     0, (HL)

        ; the BIT will have cleared ZF if the bit wasn't already set
        JR      NZ, RECORDED
        LD      BC, (COUNT)
        INC     BC
        LD      (COUNT), BC

RECORDED:

        POP     HL
        POP     BC
        POP     AF
        RET

Nothing too fancy, just one little thing in there: the Z80 has instructions to test and set bits indirectly at (HL), just perfect for a bitfield - except the bit to test or set is wired into the opcode. There are three ways to manage this sort of thing. First, one could forgo the use of the bit instructions and just load the value into A and use logical operations OR and AND. This is exactly what clang for the Z80 does, and it’s a pretty solid approach. Second, one could do a little jump table with all 8 variants written up, but this is a fair bit of mucking about.

Third, one could modify the BIT and SET opcodes based on the value at hand. They both encode the same way, except BIT has bit 7 reset while SET has it, uh, set. A little bit of ALU time gets the right opcode value written into program memory.

Self-modifying code was a very common trick on resource-constrained 8 bit machines, where it can shave off precious cycles or save precious bytes. It’s fallen out of favour for various reasons since, which are more than I should go into in this blog post and which have been gone into in more faithful detail elsewhere anyway.

The following code is a re-hash of the print methods seen before.

; does not preserve any registers.
RESULTS:
        ; track whether zero has been seen in E
        LD      E, '0'
        LD      HL, (COUNT)

        LD      BC, 10000
        CALL    DIGIT

        LD      BC, 1000
        CALL    DIGIT

        LD      BC, 100
        CALL    DIGIT

        LD      BC, 10
        CALL    DIGIT

        ; always print the 1s
        LD      E, '!'
        LD      BC, 1
        CALL    DIGIT

        ; newline
        LD      A, 10
        OUT     (1), A

        RET

; Prints a digit.
; IN:   HL      the number to test
;       BC      the digit to print, eg 10,000
;       E       '0' if zeros should be skipped
; OUT:  HL      the number less BC*digit
;       E       '!' if either E or the digit was not '0'
;       A       the digit printed
DIGIT:  LD      A, '0' -1
        OR      A       ; reset CF

        ; keep subtracting the digit and increasing A
SUBS:   INC     A
        SBC     HL, BC
        JR      NC, SUBS

        ; add the digit back on after overflowing it
        ADD     HL, BC

        ; if E is '0' and A is '0' print nothing
        CP      A, E
        JR      Z, NOPRINT

        LD      E, '!'
        OUT     (1), A

NOPRINT:
        RET

The program’s output appears here; you can click <Play> for either the sample input or my puzzle input and see the result, as many times as you like.

CONSOLE OUTPUT:

The bitfield is defined at the end of the program, tucked out of the way, with the stack a similar size after it. Remember: the puzzle input is at 1000h and shouldn’t be overwritten.

        .ORG    04000h
BITS:   DEFS    256*256/8
.EQU    STACK   $ + 02000h

The first star is now successfully solved. The second star, as usual, introduces a new wrinkle to the problem. Instead of a head and a tail there’s now a chain of ten points following each other around, all using the same rules for movement as the original tail point did.

Adapting to these new rules shouldn’t be too much of a worry. The basic structure of the program remains the same - read the input, move the head (unchanged!), then chase (changed!), record (unchanged!), and print results (unchanged!).

STAR2:
        LD      HL, DATA

START2:
        LD      SP, STACK
        CALL    INIT

INLOOP2:
        LD      A, (HL)
        OR      A
        JR      Z, FINISH2
        INC     HL
        LD      B, (HL)
        INC     HL

        ; move the cursor B times in the A direction
MOVELOOP2:
        CALL    MOVE
        CALL    CHAIN
        CALL    RECORD
        DJNZ    MOVELOOP2

        ; go back for more input
        JR      INLOOP2

FINISH2:
        CALL    RESULTS
        HALT

The only thing that needs to change, in fact, is the bit that has TAIL catch up to HEAD - the new CHAIN needs to introduce 8 additional points, but will start with HEAD and finish with TAIL. Because I already have CHASE written I am going to re-use it, so I’ll store the other points in POINTS here and swap them through HEAD and TAIL as needed.

POINTS:
        DEFW    0, 0, 0, 0, 0, 0, 0, 0

; IN:   (HEAD) is the start of the chain
;       (TAIL) is the end of the chain
;       (POINTS) is the 8 intermediate values
CHAIN:
        PUSH    BC
        PUSH    DE
        PUSH    HL

        ; hang onto these
        LD      DE, (HEAD)
        PUSH    DE
        LD      DE, (TAIL)
        PUSH    DE

        LD      HL, POINTS
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW
        CALL    FOLLOW

        ; move the original tail
        POP     DE
        LD      (TAIL), DE
        CALL    CHASE

        ; restore the original head
        POP     DE
        LD      (HEAD), DE

        POP     HL
        POP     DE
        POP     BC
        RET

; IN:   HL      address of next point to move
;       (HEAD)  point to follow
; OUT:  HL      HL+2
;       (HEAD)  point that was moved
;       (TAIL)  point that was moved
;       (HL-2)  point that was moved
FOLLOW:
        LD      E, (HL)
        INC     HL
        LD      D, (HL)
        DEC     HL
        LD      (TAIL), DE
        CALL    CHASE
        LD      DE, (TAIL)
        LD      (HL), E
        INC     HL
        LD      (HL), D
        INC     HL
        LD      (HEAD), DE
        RET

As expected the head goes waving all over the place but the tail never moves. There’s another set of sample data for star 2, because the tail never moves for the original data, and I’m going to try that because I’m suspicious that it can’t be so easy.

SAMPLE2:
        LD      HL, DATA2
        JP      START2
DATA2:
        DEFB    'R', 5
        DEFB    'U', 8
        DEFB    'L', 8
        DEFB    'D', 3
        DEFB    'R', 17
        DEFB    'D', 10
        DEFB    'L', 25
        DEFB    'U', 20
        DEFB    0

Surprisingly this actually seems to produce the right answer (and image), so the only thing left to do is load up the full input data and give it a whirl.

INPUT2:
        LD      HL, 01000h
        JP      START2

At this point the end of the rope is jumping around a lot, and it doesn’t look right. Something must be going wrong in my point movement: every point should always be touching the next point along the chain, and every point should only ever be moving one unit vertically, horizontally, or diagonally. If a point ever jumps two spots at once it will “break” the rope.

..........    ......H...    ......H...    ......H...
......H...    ..........    ......1...    .....21...
.....1....    .....1....    ..........    ..........
....2.....    ....2.....    ....2.....    ..........
.9..3..... -> .9..3..... -> .9..3..... -> .9..3.....
.87654....    .87654....    .87654....    .87654....
 Starting        UP 1       1 follows     2 follows

This is the last little sequence of the moment my rope breaks. It happens because point 2 winds up two away from point 1 in both the X and Y dimensions at once, which violates my assumption that the distance between two points is always less than two. I look at X first, it’s two away so I move it to one away. I then took a shortcut and said that if the point got pulled in the X direction I could just set Y to the same value as the pulling point - but that’s not true here, instead, it should move one closer to the pulling point.

This means a bit more complex code in CHASE. Instead of just yanking the point to match Y (or X) I need to move either -1, 0, or 1 towards that value. If it’s equal, I don’t move. If it’s less, I move 1. If it’s more, I move -1. Don’t forget: comparing signed numbers on the Z80 is harder than just looking at the carry flag - you need to compare both the sign and the parity flags.

; IN:           (HEAD) and (TAIL) values
; OUT:          tail position updated
CHASE_2:
        PUSH    AF
        PUSH    BC
        PUSH    HL

        ; find difference between HEAD and TAIL in X dimension
        LD      A, (HEAD)
        LD      HL, TAIL
        SUB     (HL)
        ; if A is 2 or -2, move X
        CP      2
        JR      NZ, XNEQ2_2
        ; set TAIL.X to HEAD.X - 1
        LD      A, (HEAD)
        DEC     A
        LD      (TAIL), A
        JR      SETY_2

XNEQ2_2:
        CP      -2
        JR      NZ, CHASEY_2
        ; set TAIL.X to HEAD.X + 1
        LD      A, (HEAD)
        INC     A
        LD      (TAIL), A
        ; fall through to SETY_2

SETY_2:
        ; compare Y
        LD      A, (TAIL+1)
        LD      HL, HEAD+1
        SUB     (HL)
        LD      HL, TAIL+1
        JR      Z, CHASERET_2   ; Yh = Yt
        JP      PO, NXOR_Y
        XOR     80h
NXOR_Y: JP      M, INCY_2       ; Yh > Yt
        DEC     (HL)
        JR      CHASERET_2
INCY_2: INC     (HL)
        ; fall through to CHASERET_2

CHASEY_2:
        ; find difference between HEAD and TAIL in Y dimension
        LD      A, (HEAD+1)
        LD      HL, TAIL+1
        SUB     (HL)

        ; if A is 2 or -2, move Y
        CP      2
        JR      NZ, YNEQ2_2
        ; set TAIL.Y to HEAD.Y - 1
        LD      A, (HEAD+1)
        DEC     A
        LD      (TAIL+1), A
        JR      SETX_2

YNEQ2_2:
        CP      -2
        JR      NZ, CHASERET_2
        ; set TAIL.Y to HEAD.Y + 1
        LD      A, (HEAD+1)
        INC     A
        LD      (TAIL+1), A
        ; fall through to SETX_2

SETX_2:
        ; compare X
        LD      A, (TAIL)
        LD      HL, HEAD
        SUB     (HL)
        LD      HL, TAIL
        JR      Z, CHASERET_2   ; Xh = Xt
        JP      PO, NXOR_X
        XOR     80h
NXOR_X: JP      M, INCX_2       ; Xh > Xt
        DEC     (HL)
        JR      CHASERET_2
INCX_2: INC     (HL)
        ; fall through to CHASERET_2

CHASERET_2:
        POP     HL
        POP     BC
        POP     AF
        RET

I want to test this one against the problematic inputs and the original ones:

        LD      SP, STACK
        LD      BC, 0103h       ; .....
        LD      (HEAD), BC      ; ...H.
        LD      BC, 0301h       ; ..*..
        LD      (TAIL), BC      ; .T...
        CALL    CHASE_2         ; .....
        LD      BC, (TAIL)
        ; ASSERT BC=0202h
        LD      BC, 0102h       ; .....
        LD      (HEAD), BC      ; ..H..
        LD      BC, 0301h       ; ..*..
        LD      (TAIL), BC      ; .T...
        CALL    CHASE_2         ; .....
        LD      BC, (TAIL)
        ; ASSERT BC=0202h
        LD      BC, 0103h       ; .....
        LD      (HEAD), BC      ; .T*H.
        LD      BC, 0101h       ; .....
        LD      (TAIL), BC
        CALL    CHASE_2
        LD      BC, (TAIL)
        ; ASSERT BC=0102h
        HALT

With the way I have set up the Z80 assembler and emulator for this page I can’t redefine symbols as I go. The whole page is assembled in two passes, then executions merely pick an entry point and run to HALT or a timeout. That might mean I’d need to rewrite CHAIN to call CHASE_2 and the main loop to call the new CHAIN_2 but that’s a lot of repetitive nonsense I can skip by overwriting CHASE to jump to CHASE_2 instead.

INPUT3:
        LD      A, 0c3h         ; JP
        LD      BC, CHASE_2
        LD      (CHASE), A
        LD      (CHASE+1), BC
        LD      HL, 01000h
        JP      START2

With this change in place everything is fine up until the wrap-around, where the Y dimension shifts from -128 to 127 during an UP. As I said earlier, an easy fix for the wrap-around for my particular input is to just shift the starting position a little, so I’m going to do that. A more general solution is more bits for coordinates, but that would break my bitfield set quickly. The goal is, as always, to solve the puzzle for my own input. Generality is a bonus, not a goal.

INPUT4:
        LD      A, 0c3h         ; JP
        LD      BC, CHASE_2
        LD      (CHASE), A
        LD      (CHASE+1), BC
        LD      BC, 2000h
        LD      (HEAD), BC
        LD      (TAIL), BC
        LD      (POINTS), BC
        LD      (POINTS+2), BC
        LD      (POINTS+4), BC
        LD      (POINTS+6), BC
        LD      (POINTS+8), BC
        LD      (POINTS+10), BC
        LD      (POINTS+12), BC
        LD      (POINTS+14), BC
        LD      HL, 01000h
        JP      START2

I have included the solver for star 1 here to show the two images side by side. You can race them, if you’re so inclined. You might even be able to get them to interfere with each others’ output if you time it exactly right.

Star 1

Star 2

Here’s the console repeated from above so you don’t have to scroll.

CONSOLE OUTPUT:

The series

Day 1 | Day 2 | Day 3 | Day 4 | Day 5 | Day 6 | Day 7 | Day 8 | Day 9 | Day 10 | Day 11 | Day 12 | Day 13 | Day 14 | Day 15 | Day 16 | Day 17 | Day 18 | Day 19 | Day 20 | Day 21 | Day 22 | Day 23 | Day 24 | Day 25 |