I just checked my sources:
> Why? Well, it's quite simple if you look in the csum_partial code:
> .set noreorder
> .set noat
> andi $1,%5,2 # Check alignment
> beqz $1,2f # Branch if ok
> subu $1,%4,2 # delay slot, Alignment uses up two bytes
> bgez $1,1f # Jump if we had at least two bytes
> move %4,$1 # delay slot
> j 4f
> addiu %4,2 # delay slot; len was < 2. Deal with it
> 1: lw %2,(%5)
> addiu %4,2
> addu %0,%2
> sltu $1,%0,%2
> addu %0,$1
> We will reach label '1' if we have at least 2 bytes to check and the
> address is aligned to an *ODD* halfword boundary. So the CPU would raise
> an address fault (unaligned word access?), and we do not even increment
> the address pointer, so *all* word accesses would be unaligned from
> then on.
> So, the patch would be to write
> 1: lhu %2,(%5)
> addiu %5,2
Wrong, this must be %4.
> instead of the 'lw' line.
1: lhu %2,(%4)
That's what my sources have contained for a long time.
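
For anyone who finds the operand numbers hard to follow, here is a rough C sketch of what the alignment prologue is meant to do. It is not the kernel code: the names csum_sketch, add_carry and fold16 are made up for illustration, the buffer is assumed to be at least halfword aligned, and a trailing odd byte is simply ignored. The point is only that the first access on an odd halfword boundary must be a 16-bit load, and that the pointer must advance by 2 while the length drops by 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add with end-around carry, like the addu / sltu / addu triple. */
static uint32_t add_carry(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);   /* (sum < v) is 1 exactly when the add wrapped */
}

/* Hypothetical sketch, not the real csum_partial. */
static uint32_t csum_sketch(const unsigned char *buf, size_t len, uint32_t sum)
{
    /* Assumption of this sketch: buf is at least halfword aligned. */
    assert(((uintptr_t)buf & 1) == 0);

    /* Odd halfword boundary: consume one halfword (the lhu case),
     * then step the pointer AND the length so the word loop is aligned. */
    if (((uintptr_t)buf & 2) && len >= 2) {
        uint16_t h;
        memcpy(&h, buf, sizeof h);
        sum = add_carry(sum, h);
        buf += 2;
        len -= 2;
    }

    /* Word loop: buf is word aligned here if the prologue ran when needed. */
    while (len >= 4) {
        uint32_t w;
        memcpy(&w, buf, sizeof w);
        sum = add_carry(sum, w);
        buf += 4;
        len -= 4;
    }

    /* Trailing halfword; a trailing odd byte is ignored in this sketch. */
    if (len >= 2) {
        uint16_t h;
        memcpy(&h, buf, sizeof h);
        sum = add_carry(sum, h);
    }
    return sum;
}

/* Fold the 32-bit accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

If the prologue is right, the same bytes give the same folded sum whether the buffer starts on a word boundary or 2 bytes past one. With the 'lw' version the second case would fault on the very first load.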