SPRUIE9D May 2017 – May 2024 DRA74P, DRA75P, DRA76P, DRA77P
The pipeline is fully interlocked: the CPU stalls when a source operand register has a pending load. Data bypassing for read-after-write dependencies is implemented at the end of the EXE and WB stages to increase instructions per cycle (IPC).
Load data incurs a single-cycle load-use penalty because it is written back to the register file at the WB stage. In the following example, the ADD instruction stalls for one cycle to allow the load to complete:
LDW *+R0(0), R0 ; Load a word into R0
ADD 4, R0, R0 ; Increment R0 to the next word address
MVK 100, R1 ; Move a value 100 into R1
Because the CPU allows a nondependent instruction to continue executing, this stall is avoided if the MVK instruction (which has no dependency on the load data) is placed in the load delay slot. The CPU then executes all three instructions without a stall:
LDW *+R0(0), R0 ; Load a word into R0
MVK 100, R1 ; Move a value 100 into R1
ADD 4, R0, R0 ; Increment R0 to the next word address
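The effect of the two orderings above can be checked with a small model. The following Python sketch is illustrative only and is not part of the device documentation: it assumes that the only hazard modeled is the single-cycle load-use penalty described here, that an instruction stalls for exactly one cycle when it reads the destination register of the load issued immediately before it, and that bypassing removes all other read-after-write stalls. The names `Instr` and `count_load_use_stalls` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical, simplified view of one instruction."""
    op: str                 # mnemonic, e.g. "LDW", "ADD", "MVK"
    srcs: Tuple[str, ...]   # source registers read by the instruction
    dst: str                # destination register written by the instruction


def count_load_use_stalls(program: List[Instr]) -> int:
    """Count stall cycles under the assumed single-cycle load-use penalty.

    Assumption: only the instruction immediately after a load can stall,
    and it stalls for one cycle if it reads the load's destination register.
    """
    stalls = 0
    pending: Optional[str] = None  # destination of a load issued in the previous slot
    for ins in program:
        if pending is not None and pending in ins.srcs:
            stalls += 1  # dependent instruction waits one cycle for the load data
        # A load leaves its destination pending for the next slot only.
        pending = ins.dst if ins.op == "LDW" else None
    return stalls


# First ordering: ADD directly follows the load it depends on.
stalled = [
    Instr("LDW", ("R0",), "R0"),
    Instr("ADD", ("R0",), "R0"),
    Instr("MVK", (), "R1"),
]

# Second ordering: the independent MVK fills the load delay slot.
reordered = [
    Instr("LDW", ("R0",), "R0"),
    Instr("MVK", (), "R1"),
    Instr("ADD", ("R0",), "R0"),
]

print(count_load_use_stalls(stalled))
print(count_load_use_stalls(reordered))
```

Under these assumptions the model reports one stall cycle for the first ordering and none for the second, which matches the behavior described in the text. It does not model any other stall sources the real pipeline may have.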