SAM Instruction Timings

Introduction

This page describes memory and I/O contention on the SAM Coupé. Contention adds wait states to the execution of some CPU instructions, reducing the potential speed of running code. Knowledge of it is important for raster-level display effects, and general code optimisation.

Rules

Contention delays affect:

ALL other accesses are uncontended, meaning no waits for ROM or External RAM.

Delays

The size of the delay (if any) depends on the type of access and what the ASIC is doing at the time:

Depending on where code and data are located, and which I/O ports are used, the same Z80 instruction can take different lengths of time to execute. The table below aims to show timings for the most common situations.

Alignment

Memory and I/O accesses occur at different positions within a machine cycle. Even though screen and I/O port accesses are both limited to 1 in 8 cycles, each will add a different number of wait states from the same starting position.

In cases where there is a single contended access during instruction execution, the overall timings become aligned to the end of the contention region. Repeats of the same instruction then see reduced contention, and in some cases none at all. This is most often seen with code running in external RAM (no opcode fetch contention) that accesses internal RAM or a contended I/O port. The clearest example is a repeated block of LDI/OUTI instructions, which remain 16T each even when writing to a contended destination.
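The self-aligning behaviour can be illustrated with a deliberately simplified model. The sketch below assumes that a contended access may only start on a cycle that is a multiple of 8 (taken from the "1 in 8 cycles" figure above), and that the instruction is 16T long with its single contended access 10 cycles in. The function names and the offset of 10 are hypothetical, chosen only for illustration; this is not SimCoupe's actual contention logic.

```python
def wait_states(cycle: int, period: int = 8) -> int:
    """Cycles to wait until the next access slot (0 if already on one)."""
    return (-cycle) % period


def run_instruction(start: int, length: int, access_offset: int,
                    period: int = 8) -> int:
    """Return the end cycle of one instruction with a single contended access.

    `length` is the uncontended duration and `access_offset` is where the
    contended access falls within it. Both are hypothetical inputs.
    """
    access_cycle = start + access_offset
    return start + length + wait_states(access_cycle, period)


cycle = 0
durations = []
for _ in range(4):
    end = run_instruction(cycle, length=16, access_offset=10)
    durations.append(end - cycle)
    cycle = end

print(durations)  # [22, 16, 16, 16]
```

In this toy model only the first instruction pays a delay. Because the instruction length is a multiple of the slot period, every later repeat finds its access already on a slot boundary.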

The table won’t help with instructions that span regions with different contention rules, though the timing will be between the two values. The SimCoupe debugger is better used for this, and for general code/timing analysis.

Test Setup

The timings below were generated programmatically using the same CPU core and contention rules as SimCoupe v1.2.5.

To ensure a representative timing value, each instruction is run twice with the same starting environment (contention rules and RAM/register values). The first run is from a fixed point in the appropriate contention region, and the second run uses this ending cycle position as its start point. The time for the second run is taken as the instruction time.

The first run is purely for starting alignment, to avoid unwanted bias. It also ensures that the timing is representative for repeats of the same instruction, which is important for unrolled blocks of LDI/OUTI.
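The two-run procedure can be sketched as follows. This is a minimal illustration rather than the actual test harness: `measure` and `toy_instruction` are hypothetical names, and the toy instruction reuses an assumed model (16T long, one contended access 10 cycles in, access slots every 8 cycles). The toy carries no register or RAM state, so "same starting environment" is trivially satisfied here.

```python
from typing import Callable


def measure(execute: Callable[[int], int], fixed_start: int) -> int:
    """Time an instruction using the two-run method.

    `execute(start)` runs the instruction from cycle `start` and returns
    the cycle at which it finishes.
    """
    aligned_start = execute(fixed_start)  # first run: alignment only
    end = execute(aligned_start)          # second run: the one that is timed
    return end - aligned_start


def toy_instruction(start: int) -> int:
    """Hypothetical 16T instruction with one contended access at offset 10."""
    access_cycle = start + 10
    wait = (-access_cycle) % 8  # wait for the next 8-cycle access slot
    return start + 16 + wait


# The reported time is the same whatever the fixed starting point is.
results = [measure(toy_instruction, start) for start in range(8)]
print(results)
```

Timing only the second run removes the dependence on the arbitrary starting position, which is why the tabulated values suit back-to-back repeats of the same instruction.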

Timings

The column headers below show pairs of Code/Data locations (Internal or External RAM), plus whether the raster is over Border or Screen. For ROM cases use the Ext column, since both are uncontended.

When multiple values are given (e.g. 24/19), they represent the taken/not-taken timings for conditional instructions, or the loop/no-loop case for block repeat instructions such as LDIR.

Values in brackets represent the time for contended ASIC ports (F8-FF).

Undocumented instructions are shown in italics.

Hover over the timing values to see what contributes to the total time. WAIT MREQ and WAIT IORQ show memory and I/O contention delays, respectively.
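For anyone processing the table programmatically, the cell notation described above (a single value, a pair separated by a slash, and an optional bracketed variant for contended ASIC ports) can be parsed with a short helper. This is a sketch only: `Timing` and `parse_cell` are hypothetical names, not part of SimCoupe or any existing tool.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Timing(NamedTuple):
    normal: Tuple[int, ...]                    # e.g. (24, 19) or (4,)
    contended_port: Optional[Tuple[int, ...]]  # bracketed value, if present


# One value or an a/b pair, optionally followed by the same form in brackets.
_CELL = re.compile(r"^(\d+(?:/\d+)?)(?:\s*\((\d+(?:/\d+)?)\))?$")


def _split(text: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in text.split("/"))


def parse_cell(cell: str) -> Timing:
    """Parse a table cell such as '4', '20/12', '12 (16)' or '24/19 (32/27)'."""
    match = _CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised timing cell: {cell!r}")
    bracketed = match.group(2)
    return Timing(_split(match.group(1)),
                  _split(bracketed) if bracketed else None)


print(parse_cell("24/19 (32/27)"))
print(parse_cell("12 (16)"))
print(parse_cell("4"))
```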

Opcodes (Code/Data, raster) | Int/Int Border | Int/Int Screen | Ext/Int Border | Ext/Int Screen | Ext/Ext n/a
ADC|SBC HL,ss 16 24 15 15 15
ADD HL,ss 12 16 11 11 11
ADD IX,rr 16 24 15 15 15
ADD|ADC|SUB|SBC A,(HL) 8 16 8 8 7
ADD|ADC|SUB|SBC A,(IX+d) 20 32 20 24 19
ADD|ADC|SUB|SBC A,n 8 16 7 7 7
ADD|ADC|SUB|SBC A,r 4 8 4 4 4
AND|OR|XOR|CP (HL) 8 16 8 8 7
AND|OR|XOR|CP (IX+d) 20 32 20 24 19
AND|OR|XOR|CP n 8 16 7 7 7
AND|OR|XOR|CP r 4 8 4 4 4
BIT b,(HL) 12 24 12 16 12
BIT b,(IX+d) 24 40 20 24 20
BIT b,r 8 16 8 8 8
CALL cc,pq 20/12 40/24 20/10 24/10 17/10
CALL pq 20 40 20 24 17
CPD/CPI 16 24 16 16 16
CPIR|CPDR 24/19 32/27 24/19 24/19 21/16
DAA/CPL/CCF/SCF 4 8 4 4 4
DI/EI 4 8 4 4 4
DJNZ e 16/12 16/16 13/8 13/8 13/8
EX (SP),HL 24 40 24 40 19
EX (SP),IX 28 48 28 40 23
EX AF,AF’ 4 8 4 4 4
EX DE,HL 4 8 4 4 4
EXX 4 8 4 4 4
IM m 8 16 8 8 8
IN A,(n) 12 (16) 16 (24) 11 (16) 11 (16) 11 (16)
IN r,(C) 12 (16) 16 (24) 12 (16) 12 (16) 12 (16)
IN X,(C) 12 (16) 16 (24) 12 (16) 12 (16) 12 (16)
INC|DEC (HL) 12 24 12 16 11
INC|DEC (IX+d) 24 40 24 32 23
INC|DEC IX 12 16 10 10 10
INC|DEC r 4 8 4 4 4
INC|DEC rr 8 8 6 6 6
INIR|INDR 24/19 (24/19) 32/27 (32/27) 24/19 (24/19) 24/19 (32/27) 21/16 (24/19)
IND|INI 20 (24) 32 (32) 16 (24) 16 (24) 16 (16)
JP (HL) 4 8 4 4 4
JP (IX) 8 16 8 8 8
JP cc,pq 12/12 24/24 10/10 10/10 10/10
JP pq 12 24 10 10 10
JR cc,e 12/8 16/16 12/7 12/7 12/7
JR e 12 16 12 12 12
LD (BC|DE),A 8 16 8 8 7
LD (HL),n 12 24 12 16 10
LD (HL),r 8 16 8 8 7
LD (IX+d),n 24 40 20 24 19
LD (IX+d),r 20 32 20 24 19
LD (nn),A 16 32 16 16 13
LD (nn),dd 24 48 24 32 20
LD (nn),HL 20 40 20 24 16
LD (nn),IX 24 48 24 32 20
LD A,(BC|DE) 8 16 8 8 7
LD A,(nn) 16 32 16 16 13
LD A,I 12 16 9 9 9
LD A,R 12 16 9 9 9
LD dd,(nn) 24 48 24 32 20
LD dd,nn 12 24 10 10 10
LD HL,(nn) 20 40 20 24 16
LD I,A 12 16 9 9 9
LD IX,(nn) 24 48 24 32 20
LD IX,nn 16 32 14 14 14
LD r,(HL) 8 16 8 8 7
LD r,(IX+d) 20 32 20 24 19
LD r,RL|RR|RLC|RRC|SLA|SRA|SLL|SRL (IX+d) 28 48 24 32 23
LD R,A 12 16 9 9 9
LD r,n 8 16 7 7 7
LD r,r’ 4 8 4 4 4
LD SP,HL 8 8 6 6 6
LD SP,IX 12 16 10 10 10
LDIR|LDDR 24/19 40/35 24/19 32/27 21/16
LDD|LDI 20 32 20 24 16
NEG 8 16 8 8 8
NOP 4 8 4 4 4
OTIR|OTDR 24/19 (24/19) 32/27 (32/27) 24/19 (24/19) 24/19 (32/27) 21/16 (24/19)
OUT (C),r 12 (16) 16 (24) 12 (16) 12 (16) 12 (16)
OUT (C),0 12 (16) 16 (24) 12 (16) 12 (16) 12 (16)
OUT (n),A 12 (16) 16 (24) 11 (16) 11 (16) 11 (16)
OUTD|OUTI 20 (24) 24 (32) 16 (16) 16 (24) 16 (16)
POP AF/BC/DE/HL 12 24 12 16 10
POP IX 16 32 16 24 14
PUSH AF/BC/DE/HL 16 24 12 16 11
PUSH IX 20 32 16 24 15
RES|SET b,(HL) 16 32 16 24 15
RES|SET b,(IX+d) 28 48 24 32 23
RES|SET b,r 8 16 8 8 8
RET 12 24 12 16 10
RET cc 16/8 24/8 12/5 16/5 11/5
RETI/RETN 16 32 16 24 14
RLA|RRA|RLCA|RRCA 4 8 4 4 4
RLD|RRD 20 32 20 24 18
RL|RR|RLC|RRC|SLA|SRA|SLL|SRL (HL) 16 32 16 24 15
RL|RR|RLC|RRC|SLA|SRA|SLL|SRL (IX+d) 28 48 24 32 23
RL|RR|RLC|RRC|SLA|SRA|SLL|SRL r 8 16 8 8 8
RST p 16 24 12 16 11

LDD/LDI/LDDR/LDIR

With these copy instructions the source (HL) and destination (DE) locations may be in different contention regions. Rather than complicate the table above, the table below shows the timings for just these mismatched cases.

(Code, Src→Dst, raster) | Int, Int→Ext, Border | Int, Int→Ext, Screen | Int, Ext→Int, Border | Int, Ext→Int, Screen | Ext, Int→Ext, Border | Ext, Int→Ext, Screen | Ext, Ext→Int, Border | Ext, Ext→Int, Screen
LDIR|LDDR 24/19 32/27 24/19 32/27 24/19 24/19 24/19 24/19
LDI|LDD 16 24 20 24 16 16 16 16
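As a worked example of using these figures, the sketch below estimates the total cost of copying a block with LDIR versus an unrolled run of LDI instructions. It uses the values for code in external RAM copying Int→Ext over the border (LDIR 24/19, LDI 16). The helper names are hypothetical, and the estimate assumes every iteration sees the aligned steady-state timing from the table, with no interrupts and no crossing between border and screen.

```python
def ldir_cycles(count: int, loop: int, no_loop: int) -> int:
    """Total T-states for LDIR copying `count` bytes (count >= 1).

    Every iteration except the last takes the loop time; the final
    iteration, where BC reaches zero, takes the no-loop time.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return loop * (count - 1) + no_loop


def unrolled_ldi_cycles(count: int, per_ldi: int) -> int:
    """Total T-states for `count` back-to-back LDI instructions."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return per_ldi * count


# Ext code, Int→Ext, Border: LDIR 24/19, LDI 16
ldir_total = ldir_cycles(256, loop=24, no_loop=19)
ldi_total = unrolled_ldi_cycles(256, per_ldi=16)
print(ldir_total, ldi_total)  # 6139 4096
```

In this case the unrolled LDI block is roughly a third faster, at the cost of two bytes of code per byte copied.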