On Wed, Jul 14, 2004 at 05:35:19PM +0100, Dominic Sweetman wrote:
> If you use hit-type cache operations in a kernel routine, then you're
> safe. I can't envisage any circumstance in which Linux would try to
> invalidate kernel mainline code locations from the I-cache (well, you
> might be doing something fabulous with debugging the kernel, but
> that's not normal and you'd hardly expect to be able to support such
> an activity with standard cache management calls).
>
> So this problem can only arise on index-type I-cache invalidation. I
> claim that a running kernel on a MIPS CPU should only use index-type
> invalidation when it is necessary to invalidate the entire I-cache.
> (If you use index-type operations for a range which doesn't resolve to
> "the whole cache" then that should be fixed).
>
> That implies that a MIPS32-paranoid "invalidate-whole-I-cache" routine
> should:
>
> 1. Identify which indexes might alias to cache lines
>    containing the routine's own 'cache invalidate' instruction(s),
>    and thus hit the problem. There won't be that many of them.
>
> 2. Arrange to skip those indexes when zapping the cache, then do
>    something weird to invalidate that handful of lines. You could
>    do that by running uncached, but you could also do it just by using
>    some auxiliary routine which is known to be more than a cache line
>    but much less than a whole I-cache span distant, so can't possibly
>    alias to the same thing...
>
> This is fiddly, but not terribly difficult and should have a
> negligible performance impact.
>
> Does that make sense? Am I now, having named the solution,
> responsible for figuring out a patch (yeuch, I never wanted to be a
> kernel programmer again...).

You don't have to :-)  What became an architectural restriction for MIPS32
had already shown up earlier as an erratum for the TX49/H2 core.  This is
the solution which we currently have in the kernel code:

#define JUMP_TO_ALIGN(order) \
	__asm__ __volatile__( \
		"b\t1f\n\t" \
		".align\t" #order "\n\t" \
		"1:\n\t" \
		)
#define CACHE32_UNROLL32_ALIGN	JUMP_TO_ALIGN(10) /* 32 * 32 = 1024 */
#define CACHE32_UNROLL32_ALIGN2	JUMP_TO_ALIGN(11)

static inline void mips32_blast_icache32(void)
{
	unsigned long start = INDEX_BASE;
	unsigned long end = start + current_cpu_data.icache.waysize;
	unsigned long ws_inc = 1UL << current_cpu_data.icache.waybit;
	unsigned long ws_end = current_cpu_data.icache.ways <<
	                       current_cpu_data.icache.waybit;
	unsigned long ws, addr;

	CACHE32_UNROLL32_ALIGN2;
	/* I'm in even chunk.  blast odd chunks */
	for (ws = 0; ws < ws_end; ws += ws_inc)
		for (addr = start + 0x400; addr < end; addr += 0x400 * 2)
			cache32_unroll32(addr|ws,Index_Invalidate_I);
	CACHE32_UNROLL32_ALIGN;
	/* I'm in odd chunk.  blast even chunks */
	for (ws = 0; ws < ws_end; ws += ws_inc)
		for (addr = start; addr < end; addr += 0x400 * 2)
			cache32_unroll32(addr|ws,Index_Invalidate_I);
}

All it takes is using this for all MIPS32 / MIPS64 or maybe even all
processors and some tuning of constants to make this suitable for
all possible I-cache configurations.
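
To make the even/odd argument easier to check, here is a rough user-space
model of the two passes over a single way.  It is only a sketch and not
kernel code: the names chunk_parity, model_pass and model_blast_ok are made
up for illustration, and it assumes 32-byte lines, a way size that is a
multiple of 2 KB, and that each loop really does execute entirely inside
the 1 KB chunk the alignment macros put it in.

```c
#include <assert.h>

#define LINE_SIZE   32u                /* bytes per I-cache line (assumed) */
#define CHUNK_SIZE  (32u * LINE_SIZE)  /* 0x400, one cache32_unroll32() span */
#define MAX_LINES   2048u              /* model limit: 64 KB way / 32-byte lines */

/* 0 if addr lies in an even 1 KB chunk, 1 if it lies in an odd one. */
static unsigned chunk_parity(unsigned long addr)
{
	return (unsigned)((addr / CHUNK_SIZE) & 1u);
}

/*
 * Model one pass over a single way.  Starting at chunk first_chunk
 * (0 or 1) and stepping two chunks at a time, count every line touched
 * in hits[].  Returns how many touched lines fall in a chunk whose
 * parity equals code_parity, i.e. lines that could share an index with
 * the code running the loop.
 */
static unsigned model_pass(unsigned long waysize, unsigned first_chunk,
			   unsigned code_parity, unsigned char *hits)
{
	unsigned conflicts = 0;
	unsigned long addr, off;

	assert(waysize % (2 * CHUNK_SIZE) == 0);
	for (addr = (unsigned long)first_chunk * CHUNK_SIZE; addr < waysize;
	     addr += 2 * CHUNK_SIZE)
		for (off = 0; off < CHUNK_SIZE; off += LINE_SIZE) {
			hits[(addr + off) / LINE_SIZE]++;
			if (chunk_parity(addr + off) == code_parity)
				conflicts++;
		}
	return conflicts;
}

/*
 * Returns 1 if the two passes together touch every line of the way
 * exactly once and neither pass touches a line of its own parity.
 */
static int model_blast_ok(unsigned long waysize)
{
	unsigned char hits[MAX_LINES] = { 0 };
	unsigned long i, nlines = waysize / LINE_SIZE;
	unsigned conflicts;

	assert(nlines <= MAX_LINES);
	conflicts  = model_pass(waysize, 1, 0, hits); /* code in even chunk, blast odd */
	conflicts += model_pass(waysize, 0, 1, hits); /* code in odd chunk, blast even */
	for (i = 0; i < nlines; i++)
		if (hits[i] != 1)
			return 0;
	return conflicts == 0;
}
```

In this model, model_blast_ok(8192) and model_blast_ok(16384) both hold.
A way smaller than 2 KB trips the precondition in model_pass, and a
different line size changes the chunk size, which as far as I can tell is
the sort of case the constants would need tuning for.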
Ralf