
Re: mips32_flush_cache routine corrupts CP0_STATUS with gcc-2.96

To: "Maciej W. Rozycki" <>
Subject: Re: mips32_flush_cache routine corrupts CP0_STATUS with gcc-2.96
From: "Gleb O. Raiko" <>
Date: Thu, 11 Jul 2002 17:11:27 +0400
Organization: NIISI RAN
References: <>
"Maciej W. Rozycki" wrote:
> On Thu, 11 Jul 2002, Gleb O. Raiko wrote:
> > I don't wonder if other IDT CPUs also require this, including those that
> > conform MIPS32.
>  Well, for r3k it may seem somewhat justified as cache flushing requires
> cache isolation.  But the IDT manual for their whole family of processors
> claims the D-cache can function as an I-cache (when swapped; doesn't
> apply when not, obviously) and cache flushing can run from KSEG0.
>  See "IDT MIPS Microprocessor Family Software Reference Manual", chapter 5
> "Cache Management", section "Invalidation":
>  "To invalidate the cache in the R30xx:
> [...]
>  The invalidate routine is normally executed with its instructions
> cacheable.  This sounds like a lot of trouble; but in fact shouldnt
> require any extra steps to run cached. An invalidation routine in uncached
> space will run 4-10 times slower."

Aha, you also stepped on this rake. :-) The problem with IDT manuals is
that they frequently contradict themselves. You're right, the SW manual
allows cached flushes, but the hardware manuals for the family prohibit
this and state that flushes must be uncached.
(a hw manual for the family, the same chapter, the same section :-) )

It's not the only place where IDT manuals are wrong. For example, their
wbflush example suggests *(int*)KSEG0 instead of *(int*)KSEG1.
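
To illustrate the KSEG0/KSEG1 point, here is a minimal sketch in C. The
segment bases are the standard MIPS unmapped kernel segments; the helper
names and wbflush_sketch() are hypothetical, not taken from any IDT manual
or from the kernel. It assumes a 32-bit MIPS target and a board where an
uncached load is what drains the write buffer. A load through KSEG0 may
simply hit in the cache and never reach the bus, which is presumably why
the example should use KSEG1.

```c
#include <stdint.h>

/* Both windows alias the same low 512 MB of physical memory;
 * only the cache behaviour differs. */
#define KSEG0_BASE 0x80000000u  /* unmapped, cached   */
#define KSEG1_BASE 0xA0000000u  /* unmapped, uncached */
#define PHYS_MASK  0x1FFFFFFFu

/* Hypothetical helpers: convert between physical and segment addresses. */
static inline uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & PHYS_MASK;
}

static inline uint32_t phys_to_kseg0(uint32_t paddr)
{
    return KSEG0_BASE | (paddr & PHYS_MASK);
}

static inline uint32_t phys_to_kseg1(uint32_t paddr)
{
    return KSEG1_BASE | (paddr & PHYS_MASK);
}

/* True if the address lies in the uncached KSEG1 window. */
static inline int is_uncached_kseg(uint32_t vaddr)
{
    return (vaddr & 0xE0000000u) == KSEG1_BASE;
}

#ifdef __mips__
/* Hypothetical wbflush: an uncached read that must go out to the bus.
 * Assumes 32-bit pointers and readable memory at physical address 0. */
static inline void wbflush_sketch(void)
{
    (void)*(volatile int *)(uintptr_t)KSEG1_BASE;
}
#endif
```

The address helpers build on any host; only wbflush_sketch() is
target-specific, so it is guarded by __mips__.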

> > Basically, requirement of uncached run makes hadrware logic much simpler
> > and allows  to save silicon a bit.
>  Why?  I see no dependency.  What's the problem with interleaving cache
> fills and invalidations?

There are two possible optimizations:
1. (Requires only that the instruction that swaps caches run uncached)
        The CPU may skip implementing a double check of cache hit on loads.
        Scenario: an mtc0 swaps the caches after ensuring the next
        instructions are in the cache (pipelining here!); the swap occurs;
        the CPU must check again that the instructions are in the cache,
        because the same cache line in the data cache may have its valid
        bit set, and the CPU will get data instead of code.
2. (Requires that the whole routine run uncached)
        The CPU may skip the check of cache hit on loads from an isolated
        cache.
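
The scenario in item 1 can be sketched as a toy model in C. This is only
an illustration under simplifying assumptions (direct-mapped caches, one
word per line, a hit decision latched before the swap); it is not a
description of IDT's hardware, and every name in it is hypothetical. If
the swapping instruction runs uncached, there is no latched hit to trust,
so the recheck logic modelled here would not be needed.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINES     4
#define CODE_WORD 0x0C0DE001u  /* stand-in for an instruction word */
#define DATA_WORD 0xDEADBEEFu  /* stand-in for unrelated data */

struct line { bool valid; uint32_t tag; uint32_t word; };

struct cpu {
    struct line cache[2][LINES]; /* [0] physical I-cache, [1] physical D-cache */
    int swapped;                 /* models the cache-swap bit in Status */
    const uint32_t *mem;         /* backing memory, word-addressed */
};

/* The line an instruction fetch looks at, given the current swap state. */
static struct line *ifetch_line(struct cpu *c, uint32_t waddr)
{
    return &c->cache[c->swapped ? 1 : 0][waddr % LINES];
}

static bool ifetch_hit(struct cpu *c, uint32_t waddr)
{
    struct line *l = ifetch_line(c, waddr);
    return l->valid && l->tag == waddr / LINES;
}

/* Fetch one word. With recheck == false the CPU trusts latched_hit,
 * a hit decision made earlier in the pipeline. */
static uint32_t ifetch(struct cpu *c, uint32_t waddr,
                       bool latched_hit, bool recheck)
{
    bool hit = recheck ? ifetch_hit(c, waddr) : latched_hit;
    struct line *l = ifetch_line(c, waddr);

    if (!hit) {                  /* miss: refill the line from memory */
        l->valid = true;
        l->tag = waddr / LINES;
        l->word = c->mem[waddr];
    }
    return l->word;
}

/* Returns the word fetched as the first instruction after the swap. */
static uint32_t run_swap_scenario(bool recheck)
{
    static const uint32_t mem[8] = { 0, CODE_WORD, 0, 0, 0, 0, 0, 0 };
    struct cpu c = { 0 };
    bool latched;

    c.mem = mem;
    /* D-cache line 1 holds valid data for a different address (tag 1). */
    c.cache[1][1] = (struct line){ true, 1, DATA_WORD };

    ifetch(&c, 1, false, true);  /* bring the next instruction into the I-cache */
    latched = ifetch_hit(&c, 1); /* hit decision made before the swap */
    c.swapped = 1;               /* the mtc0 takes effect: caches swap */

    return ifetch(&c, 1, latched, recheck);
}
```

In this model run_swap_scenario(false) returns DATA_WORD (data fetched as
code), while run_swap_scenario(true) misses on the tag, refills, and
returns CODE_WORD.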

I don't know which optimization IDT made; perhaps number 3. But 1 is
really worth implementing.

