> > There are two possible optimizations:
> > 1. (Requires only that the instruction that swaps the caches runs
> > uncached.) The CPU may skip implementing a double check of the cache
> > hit on loads.
> > Scenario: an mtc0 swaps the caches while the following instructions
> > are ensured to be in the cache (pipelining here!); the swap occurs;
> > the CPU must check again that the instructions are in the cache,
> > because the same cache line in the data cache may have its valid bit
> > set, and the CPU will get data instead of code.
> I can't really see a problem here for proper implementations. The CPU
> may have fetched a few instructions beyond the mtc0 doing a cache swap.
They are loaded from memory into the I-cache, which sets the valid bit.
> It's OK since we didn't modify the code. As long as the swap doesn't
> complete, the CPU is using the real I-cache. Once it's completed, it uses
> the D-cache. Since the new cache is used in the normal mode of operation,
> now tag matches and line replacements occur here as if it was the real
> I-cache. No need to do any extra checks at any stage.
The CPU has to check the cache line at the given address again. The
D-cache may have the valid bit set for the cache line at the same
address. "Address" here means a location in the cache, not in memory.
A check at that address requires one extra tick, as opposed to checking
just the bit.
Please note that a CPU isn't a monolithic program but rather a set of
functional blocks, so a "proper implementation" may require additional
signals on wires and delays.
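
To make the scenario concrete, here is a toy software model of it, in C.
It is only a sketch under my own assumptions (direct-mapped caches, one
word per line, made-up addresses and names such as fetch_across_swap),
not a description of how any real CPU does it. The hit check happens
against the real I-cache, the swap lands, and then the line is read from
what used to be the D-cache, either trusting the earlier hit or checking
the tag again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINES 4u /* toy direct-mapped cache, one word per line (assumption) */

struct line  { bool valid; uint32_t tag; uint32_t word; };
struct cache { struct line l[LINES]; };

/* Location in the cache ("address" in the cache sense) and its tag. */
static uint32_t idx(uint32_t addr) { return (addr >> 2) % LINES; }
static uint32_t tag(uint32_t addr) { return (addr >> 2) / LINES; }

/* Full check: valid bit set and tag matches. */
static bool hit(const struct cache *c, uint32_t addr)
{
    const struct line *ln = &c->l[idx(addr)];
    return ln->valid && ln->tag == tag(addr);
}

static void fill(struct cache *c, uint32_t addr, uint32_t word)
{
    struct line *ln = &c->l[idx(addr)];
    ln->valid = true;
    ln->tag   = tag(addr);
    ln->word  = word;
}

/* Fetch one instruction while a cache swap completes between the hit
 * check and the read. With recheck == false the earlier hit result is
 * trusted; with recheck == true the tag is compared again after the swap. */
uint32_t fetch_across_swap(bool recheck)
{
    struct cache icache = {0}, dcache = {0};
    const uint32_t pc = 0x100, data_addr = 0x200;  /* hypothetical addresses */
    const uint32_t CODE = 0xC0DE, DATA = 0xDA7A;   /* stand-in contents */

    fill(&icache, pc, CODE);          /* instruction prefetched before the swap */
    fill(&dcache, data_addr, DATA);   /* unrelated data, same cache location */
    assert(idx(pc) == idx(data_addr));

    bool hit_before = hit(&icache, pc);   /* checked against the real I-cache */
    struct cache *fetch_cache = &dcache;  /* swap done: fetches use the old D-cache */

    bool use_line = recheck ? hit(fetch_cache, pc) : hit_before;
    if (!use_line)
        fill(fetch_cache, pc, CODE);      /* miss: refill from memory */
    return fetch_cache->l[idx(pc)].word;
}
```

In this model fetch_across_swap(false) hands back the data word and
fetch_across_swap(true) hands back the code word, which is the whole
point: the second check is what costs the extra tick.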
> It's possible they broke something, simply.
My guess is that they implemented No. 1, more or less.
Anybody from IDT here with a strong will to break the NDA? :-)