Hi,
this question is posted here in the hope that it will be picked up and
answered by some of the <*@*engr.sgi.com> gurus; I apologize to the other
members of this mailing list for annoying them with it as well ;-)
Is it safe to assume that memory bus errors (mc cpu_error_stat & 0x400) on
IP28 - due to the R10k's precise exception model - can be asynchronous only
when caused by an aborted (misspeculated) instruction?
The R10k manual, experience with spurious bus errors, and experiments with
"real" and speculated loads/stores all seem to suggest this.
Moreover, could it be enough to recognize the bus error as asynchronous
when the exception code in cp0_cause says neither "Instruction bus error
exception" (6) nor "Data bus..." (7), but "Interrupt" (0)? (i.e. without
analyzing the instruction at epc and the register contents)
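
To make the proposed check concrete, here is a minimal sketch in C. It assumes
the standard MIPS layout of cp0_cause (ExcCode in bits 6..2) and uses the
0x400 mask quoted above; the function and macro names are hypothetical and
not taken from any existing kernel source. Whether this test alone is
sufficient is exactly what is being asked here.

```c
#include <stdbool.h>
#include <stdint.h>

/* ExcCode field of cp0_cause: bits 6..2 (standard MIPS layout). */
#define CAUSE_EXCCODE_SHIFT 2
#define CAUSE_EXCCODE_MASK  0x1fu

#define EXCCODE_INT 0u /* Interrupt */
#define EXCCODE_IBE 6u /* Instruction bus error exception */
#define EXCCODE_DBE 7u /* Data bus error exception */

/* Bus error bit in mc cpu_error_stat, as quoted in the question. */
#define MC_CPU_ERROR_STAT_BUSERR 0x400u

/* Hypothetical helper: extract the exception code from cp0_cause. */
static unsigned int cause_exccode(uint32_t cause)
{
    return (cause >> CAUSE_EXCCODE_SHIFT) & CAUSE_EXCCODE_MASK;
}

/*
 * Hypothetical helper: returns true if the MC reports a bus error while
 * cp0_cause says "Interrupt" instead of IBE/DBE, i.e. the case the
 * question proposes to treat as a misspeculated (asynchronous) access.
 * No decoding of the instruction at epc or of register contents is done.
 */
static bool buserr_looks_asynchronous(uint32_t cpu_error_stat, uint32_t cause)
{
    if (!(cpu_error_stat & MC_CPU_ERROR_STAT_BUSERR))
        return false; /* no memory bus error pending at all */

    return cause_exccode(cause) == EXCCODE_INT;
}
```

A handler built on this would ignore the error when the helper returns true
and fall back to the full analysis (or a panic) otherwise.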
Rationale for this question: if a memory bus error can reliably be identified
as originating from a misspeculated memory access, it would be possible to get
rid of the myriad cache barriers before *loads* again (stores will remain
protected by cache barriers anyway). Spending a few thousand machine cycles on
analyzing a bus error every three days of uptime is clearly more efficient
than having a cache barrier in kernel code every seventeen instructions...
with kind regards
pf