On Fri, 23 Apr 2004, Ralf Baechle wrote:
> > > success report for the MC Bus Error handler :)
> > >
> > > Apr 19 23:17:32 resume kernel: MC Bus Error
> > > Apr 19 23:17:32 resume kernel: CPU error 0x380<RD PAR > @ 0x0f4c6308
> > > Apr 19 23:17:32 resume kernel: Instruction bus error, epc == 2accf310, ra == 2accf2c8
> > >
> > > I guess I have bad memory. The interesting point is that the machine
> > > continued to run for another 2 days. Shouldn't a memory error halt the
> > > machine?
> >
> > As it happened in user mode, I'd expect only the victim process to be
> > killed.
>
> The KSU bits are meaningless. On Indy, like most other MIPS systems, a
> bus error exception may be delayed. So the generic solution requires

I beg your pardon? AFAIK, bus errors are documented to be reported
precisely, and my past experience with the systems I use confirms this.
Otherwise the bits in <asm/paccess.h> wouldn't work, but they do. Of course
this is true for errors happening on read transactions (I have trouble
imagining a delayed read), but the semantics of the exception are defined
only for reads anyway. For other transactions a general-purpose interrupt
should be used (and normally is). Such an interrupt can indeed happen at
any time (but here it was an IBE, not an interrupt).
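
To illustrate why precise reporting matters for that kind of probing
helper, here is a rough user-space analogy. It is not the kernel code
from <asm/paccess.h>; the names probe_read(), on_fault() and
unreadable_page() are made up for this sketch, and it assumes a POSIX
system where a bad read raises SIGSEGV or SIGBUS synchronously. The
pattern only works because the fault is delivered while the guarded
load is still the current instruction. With a delayed error the probe
would already have returned.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical names throughout; a user-space analogy, not kernel code. */
static sigjmp_buf probe_env;

/* A synchronous (precise) fault arrives while the probing load is still
 * the current instruction, so we can unwind straight back to the probe. */
void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Try to read *addr. Returns 0 and stores the value in *out on success,
 * or -1 (leaving *out untouched) if the read faulted. */
int probe_read(const volatile int *addr, int *out)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int failed = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(probe_env, 1) == 0)
        *out = *addr;   /* the guarded read */
    else
        failed = 1;     /* reached only via siglongjmp() from on_fault() */

    /* Put the previous handlers back. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return failed ? -1 : 0;
}

/* Map one page with no access rights so that any read of it faults. */
int *unreadable_page(void)
{
    void *p = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    return p;
}
```

Calling probe_read() on an ordinary variable should return 0, while
calling it on the page from unreadable_page() should return -1 instead
of killing the process.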
> tracking down the actual user, something which in the current kernel is
> relatively easy due to rmap.

Well, that may be tough anyway -- imagine an uncorrectable memory error
on a DMA transaction. ;-)
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +