Some of the implementation choices ---------------------------------- gbz80: Load from direct space: Alternatives: 1. Via HL ld hl,#dir ld x,(hl) inc hl ld y,(hl) 2. Via a ld a,(dir) ld x,a ld a,(dir+1) ld x,a 1 is bad when x or y involve HL (1b) 8 16 32 1 = 12 + n*(8+8) - 8 20 36 68 1b = n*(12+12+8) 32 64 128 2 = n*(16+4) 20 40 80 So choose 2. Hmm. (2) is too hard to support in the current model. On stack word push 1. lda hl,x(sp) ld a,(hl+) ld h,(hl) ld l,a push hl 2. lda hl,x(sp) ld e,(hl) inc hl ld d,(hl) 1 = d + 8 + 8 + 4 2 = d + 8 + 8 + 8 Structure member get: Normally fetch pair Then add pair and constant with result in hl ld l,c ; 4 ld h,b ; 4 inc hl .. ; 6 = 8 + 6n or ld l,c ; 4 ld h,b ; 4 ld a,#0x06 ; 7 add a,c ; 4 ld l,a ; 4 ld a,#0x00 ; 7 adc a,b ; 4 ld h,a ; 4 = 38 alt: (only when result=hl and left, rigth = pair, const) ld hl,#const ; 10 add hl,pair ; 11 = 21 So (1) is best for n <= 2, (2) is just bad, (3) is good n > 2 How about: pair = pair + constant: 1: ld a,#0x08 ; 7 add a,c ; 4 ld c,a ; 4 ld a,#0x00 ; 7 adc a,b ; 4 ld b,a ; 4 = 30 2: ld hl,#const ; 10 add hl,pair ; 11 ld c,l ; 4 ld b,h ; 4 = 29 One cycle. If I cache HL later it will throw away the advantage. Choose 1. PlusIncr on pairs: 1: inc pair ; 6 = 6n 2: ld a,#0x04 ; 7 add a,c ; 4 ld c,a ; 4 ld a,#0x00 ; 7 adc a,b ; 4 ld b,a ; 4 = 30 So n <= 5 (1) is better. Frame pointer: It's nice to use HL as the temp register, but what if I used it as the frame pointer instead of ix? Instead of: ld e,5(ix) ; 19 ld d,6(ix) ; 19 = 38 ld hl,#5 ; 10 add hl,sp ; 11 ld e,(hl) ; 7 inc hl ; 6 ld d,(hl) ; 7 = 41 Things get better when you access the same set over, as you get rid of the setup. But they get worse when both ops are on the stack/in direct space. Easiest this way for now. iy may benifit...