linux-mips

Re: Malta crashes on the latest 2.4 kernel

To: "Kevin D. Kissell" <kevink@mips.com>
Subject: Re: Malta crashes on the latest 2.4 kernel
From: Ralf Baechle <ralf@oss.sgi.com>
Date: Thu, 11 Jul 2002 13:59:57 +0200
Cc: "H. J. Lu" <hjl@lucon.org>, Jun Sun <jsun@mvista.com>, linux-mips@oss.sgi.com
In-reply-to: <005c01c228a2$fb2bf450$10eca8c0@grendel>; from kevink@mips.com on Thu, Jul 11, 2002 at 08:19:55AM +0200
References: <3D2CBF73.50001@mvista.com> <20020710164900.A28911@lucon.org> <20020711043601.B3207@dea.linux-mips.net> <005c01c228a2$fb2bf450$10eca8c0@grendel>
Sender: owner-linux-mips@oss.sgi.com
User-agent: Mutt/1.2.5.1i
On Thu, Jul 11, 2002 at 08:19:55AM +0200, Kevin D. Kissell wrote:

> Excuse me, but I've seen this statement used by others in
> the past as an excuse for doing something silly or not doing
> something reasonable, and it generally hasn't washed.
> In what specific cases have the CP0 pipeline hazards 
> changed between minor revisions of any production
> MIPS CPU?  The *documentation* may have been
> corrected, but these hazards are fairly fundamental
> artifacts of the pipeline microarchitecture of a given
> processor.

Ancient TLB exception handler code was assuming out of order execution of
the instruction stream in cp0 based on the documentation in appendix H
of the R4400 manual, version 2.  I wrote that code for an R4400 version 5.0
and it was running fine on R4000 3.0 but somebody found it to break on
R4000 version 2.2.  At least those are the details as I remember them.  I
don't blame MIPS (well, probably SGI at that time ...) for not documenting
these details perfectly right for each and every R4[04]00 implementation.
The broken code was written extremely aggressively and eventually had to be
changed anyway for the sake of other processors.

> The CP0 hazard between a write of EntryHi
> and a subsequent TLBWI instruction is flagged
> in the MIPS32 spec and noted as being "typically" 
> 2 cycles.  I'm not going to spend the time going
> through my full set of users manuals, but a representative
> sampling shows this hazard as being specified for
> every R4xxx and R5xxx CPU I checked.  The fact
> that a given CPU *may* get away with it is no
> excuse for not protecting common code.

No argument about this one.  We definitely were lucky.

> I note that Ralf has, in fact, applied the fix to the
> OSS CVS repository.  I also note that "BARRIER"
> is still defined to be a string of 6 nops.  I would argue
> (again) that those really, really ought to be ssnops,
> and that if they *were* ssnops, one could probably
> have fewer of them.

I've applied it because I think the whole update_mmu_cache implementation
is ready for a reimplementation anyway.  On the performance side this isn't
going to have a measurable impact, as update_mmu_cache is only called once
per page fault.

BARRIER has been defined as 6 nops since it was written sometime during the
summer of '96.  At that time Linux didn't yet support any processor featuring
ssnop, so 6 nops certainly were too paranoid.  These days you're certainly
right, ssnops are the way to go, especially because they don't have any
negative impact on pre-ssnop implementations.
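
A minimal sketch, not taken from the kernel source, of why ssnop is safe on
older cores: in the MIPS32 encoding ssnop is simply sll $0, $0, 1, a shift
whose result is written to the zero register, so a CPU that does not treat it
specially executes it as an ordinary no-op.  The helper names below
(mips_sll, mips_nop, mips_ssnop) are hypothetical and chosen only for
illustration.

```c
#include <stdint.h>

/*
 * Encode "sll rd, rt, sa": SPECIAL opcode (0) in bits 31..26, rs = 0,
 * rt in bits 20..16, rd in bits 15..11, sa in bits 10..6, funct = 0.
 * Hypothetical helper, for illustration only.
 */
static uint32_t mips_sll(unsigned rd, unsigned rt, unsigned sa)
{
    return ((uint32_t)(rt & 31u) << 16) |
           ((uint32_t)(rd & 31u) << 11) |
           ((uint32_t)(sa & 31u) << 6);
}

/* nop is the all-zero word: sll $0, $0, 0 */
static uint32_t mips_nop(void)
{
    return mips_sll(0, 0, 0);
}

/* ssnop is sll $0, $0, 1: same instruction, shift amount of 1 */
static uint32_t mips_ssnop(void)
{
    return mips_sll(0, 0, 1);
}
```

Here mips_nop() yields 0x00000000 and mips_ssnop() yields 0x00000040; the two
words differ only in the shift-amount field.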

  Ralf

