On Wed, Jan 10, 2001 at 09:08:08AM +0100, Carsten Langgaard wrote:
> You are absolutely right, it is implementation dependent.
> I just tend to use the mips32 implementation for my R4000s as well, and
> here, as Ralf mentioned, it improves performance.
> Actually, we have included a CPU option flag (MIPS_CPU_CACHE_CDEX) which
> tells us whether the CPU has the Create_Dirty_Exclusive CACHE operation
> available.
> So we should probably use it, now that it is here :-)
Homework for somebody with some time on their hands: we have a large number
of unrolled loops for all sorts of cache variations in r4xx0.c. Benchmark
whether they actually improve performance. I wouldn't be surprised if, due
to a large number of pipeline stalls in one of those routines, the whole
unrolling business doesn't buy us anything.
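
A rough sketch of how such a comparison could be set up, not the code from
r4xx0.c. Everything in it is an assumption for illustration: the names
(touch_line, sweep_rolled, sweep_unrolled, time_sweep_ns) are hypothetical,
the 32-byte line size is a guess, and a plain memory write stands in for the
CACHE instruction, which is privileged and would have to be timed from inside
the kernel to get a meaningful answer.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define LINE_SIZE 32 /* assumed cache line size in bytes */

/* Hypothetical stand-in for one per-line cache operation. */
static inline void touch_line(volatile unsigned char *p)
{
    *p = (unsigned char)(*p + 1);
}

/* Simple loop: one line per iteration. */
static void sweep_rolled(volatile unsigned char *buf, size_t size)
{
    for (size_t off = 0; off < size; off += LINE_SIZE)
        touch_line(buf + off);
}

/* Unrolled loop: eight lines per iteration, then a tail loop. */
static void sweep_unrolled(volatile unsigned char *buf, size_t size)
{
    size_t off = 0;

    for (; off + 8 * LINE_SIZE <= size; off += 8 * LINE_SIZE) {
        touch_line(buf + off + 0 * LINE_SIZE);
        touch_line(buf + off + 1 * LINE_SIZE);
        touch_line(buf + off + 2 * LINE_SIZE);
        touch_line(buf + off + 3 * LINE_SIZE);
        touch_line(buf + off + 4 * LINE_SIZE);
        touch_line(buf + off + 5 * LINE_SIZE);
        touch_line(buf + off + 6 * LINE_SIZE);
        touch_line(buf + off + 7 * LINE_SIZE);
    }
    for (; off < size; off += LINE_SIZE)
        touch_line(buf + off);
}

typedef void (*sweep_fn)(volatile unsigned char *, size_t);

/* Wall-clock nanoseconds spent running fn over buf reps times. */
static uint64_t time_sweep_ns(sweep_fn fn, volatile unsigned char *buf,
                              size_t size, int reps)
{
    struct timespec t0, t1;

    assert(reps > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++)
        fn(buf, size);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    int64_t ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
               + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return (uint64_t)ns;
}
```

Calling time_sweep_ns() once with sweep_rolled and once with sweep_unrolled
over the same buffer, with enough repetitions to drown out timer noise, gives
the two numbers to compare. In user space this only measures loop overhead
around a store, so it says nothing yet about pipeline stalls caused by the
real cache operations.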
Ralf