On Mon, Aug 17, 1998 at 08:45:00PM +0200, Ulf Carlsson wrote:
> Got this little message which crashed the machine:
>
> Got a bus error IRQ, shouldn't happen yet
> $0 : 00000000 1004fc01 88008048 00000000
> $4 : 88008000 88008000 fffffc18 00000001
> $8 : 88009fe0 3004fc01 8803d2f8 00000003
> $12: 00000038 000003e2 88350a08 89f5be18
> $16: 00000000 00000000 8800a16c 00000f00
> $20: a8747310 9fc45da0 00000000 9fc45da0
> $24: 00000001 2ab0c110
> $28: 88008000 88009e90 9fc45f0c 88013650
> epc : 880262b8
> Status: 1004fc03
> Cause : 00004000
> Spinning...
>
> That's in 'schedule'
>
> 88026298: 03c0e821 move $sp,$s8
> 8802629c: 8fbf0040 lw $ra,64($sp)
> 880262a0: 8fbe003c lw $s8,60($sp)
> 880262a4: 8fb40038 lw $s4,56($sp)
> 880262a8: 8fb30034 lw $s3,52($sp)
> 880262ac: 8fb20030 lw $s2,48($sp)
> 880262b0: 8fb1002c lw $s1,44($sp)
> 880262b4: 8fb00028 lw $s0,40($sp)
> 880262b8: 03e00008 jr $ra
> 880262bc: 27bd0048 addiu $sp,$sp,72
>
> 00000000880262c0 <__wake_up>:
> 880262c0: 27bdfff8 addiu $sp,$sp,-8
> 880262c4: afbe0000 sw $s8,0($sp)
>
> Do you know anything about this Ralf? Maybe it's fixed in some version I don't
> have yet?

No, I have no idea what might be causing this. I used to get bus errors
now and then myself, but for me they have disappeared. Looks like I fixed
them en passant.

The bad thing about a bus error is that it may be delayed for a very long
time, leaving you with a useless program counter. What happens is that
the CPU writes to some invalid address, but the write access over the
system bus is delayed because the write-back cache policy is in use.
Later, maybe even much later, when the cache line gets written back to
memory for some reason, the system board signals a bus error interrupt.
At this point the program counter may already be completely useless.
I'm sure this has happened in your case as well.
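
To illustrate the delay, here is a toy model in C. It is only a sketch,
not kernel code and not a model of any particular board: the struct, the
function names and the address window are all made up for illustration.
A store merely marks a cache line dirty; the bus only sees the address
when the line is evicted, and the program counter reported at that point
is wherever the CPU has got to in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical window of physical addresses the bus will accept. */
#define VALID_BASE 0x08000000u
#define VALID_SIZE 0x04000000u

/* Toy CPU with a single cache line; all fields are illustrative. */
struct toy_cpu {
    uint32_t pc;         /* current program counter */
    bool     line_dirty; /* a store is still sitting in the cache */
    uint32_t line_addr;  /* physical address of that store */
    uint32_t store_pc;   /* pc of the instruction that did the store */
};

/* The bus rejects anything outside the valid window. */
static bool bus_accepts(uint32_t paddr)
{
    return paddr - VALID_BASE < VALID_SIZE;
}

/* Write-back policy: the store only dirties the line. Nothing goes
 * out on the bus here, so no error can be reported yet. */
static void cpu_store(struct toy_cpu *c, uint32_t paddr)
{
    c->line_dirty = true;
    c->line_addr  = paddr;
    c->store_pc   = c->pc;
}

/* Execute n unrelated instructions (4 bytes each on MIPS). */
static void cpu_run(struct toy_cpu *c, unsigned n)
{
    c->pc += 4u * n;
}

/* Evict the dirty line. Returns true if the bus signals an error;
 * *reported_pc is the pc an interrupt handler would see, which is
 * the current pc and not the pc of the offending store. */
static bool cpu_writeback(struct toy_cpu *c, uint32_t *reported_pc)
{
    assert(c->line_dirty);
    c->line_dirty = false;
    *reported_pc  = c->pc;
    return !bus_accepts(c->line_addr);
}
```

In this model, a cpu_store() to an address outside the window followed by
cpu_run(&c, 1000) and then cpu_writeback() reports a pc 4000 bytes past
the store, which is why the epc in such a dump says little about the
code that did the bad write.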
Ralf