Ralf Baechle <firstname.lastname@example.org> writes:
>> Then there's the language-lawyerly code I gave to Peter on gcc-patches@:
>> void foo (int x)
>> {
>>   int array[1];
>>   if (x)
>>     bar (array[0x1fff]);
>> }
>> This function is valid if x is never true, so we cannot assume that all
>> accesses off the stack and frame pointers are actually in-frame. You're
>> assuming either (i) the kernel doesn't use code like that or (ii) that
>> "garbage" addresses in the range [$sp - 0x8000, $sp + 0x7fff] will not
>> trigger the problem. I imagine both are reasonable assumptions, and I'm
>> perfectly happy for us to make them. But they're the kind of assumption
>> we need to state explicitly.
> Interesting test case. I've been thinking about it myself but in the end
> decided to believe Peter's analysis since he's banged his head against
> the wall about this problem for longer than I have ;-) I'm quite but not
> absolutely certain that this case cannot happen for real-world code, so I'd rather
> err on the side of caution.
> Peter & Thomas - we could make the stack thing bullet proof by vmallocing
> stacks and ensuring a sufficient virtual address gap exists around the stack
> such that the stack is the only addressable thing in the range of
> $sp +0x7fff / -0x8000?
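
To make the size of that gap concrete, here is a minimal sketch in C. It is
not kernel code: the function and macro names are hypothetical, and it
assumes $sp always stays within [stack_base, stack_base + stack_size] and
that only the base address of each access matters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of a 16-bit signed offset from $sp, as quoted above. */
#define REACH_BELOW 0x8000u
#define REACH_ABOVE 0x7fffu

/* Hypothetical helper: the stack occupies [stack_base, stack_base + stack_size)
   and another mapping occupies [map_start, map_end).  Return true if no
   $sp-relative address can land in that mapping.  Assumes stack_base is at
   least REACH_BELOW, so the subtraction cannot wrap.  */
static bool
mapping_out_of_reach (uint64_t stack_base, uint64_t stack_size,
                      uint64_t map_start, uint64_t map_end)
{
  uint64_t lo = stack_base - REACH_BELOW;              /* lowest reachable byte */
  uint64_t hi = stack_base + stack_size + REACH_ABOVE; /* highest reachable byte */
  return map_end <= lo || map_start > hi;
}
```

Under these assumptions the gap has to be at least 0x8000 bytes below the
stack and 0x7fff bytes above it.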
FWIW, my first cut at the option restrictions was based on what
the patch exempts (and doesn't exempt). We could instead get gcc
to only exempt accesses that it can prove are either to the current
function's stack frame or to its stack arguments. I.e. rather than
consider every $sp-based access to be safe, we'd instead do some
bounds checking on the value. (We could also use MEM_ATTRS to
pick up cases where a stack variable is accessed via something
other than the stack or frame pointers, as happens for large frames.)
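
As a rough illustration of the bounds check I have in mind (a sketch only:
the struct and function names are made up for this mail and are not GCC
internals, and it assumes the incoming stack arguments sit directly above
the frame):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical frame description, in bytes relative to $sp. */
struct frame_layout {
  int64_t frame_size;     /* [$sp, $sp + frame_size) is the current frame */
  int64_t stack_arg_size; /* incoming stack arguments, assumed to follow it */
};

/* Return true if an access of SIZE bytes at $sp + OFFSET lies wholly
   inside the frame or the incoming stack arguments.  */
static bool
sp_access_provably_safe (const struct frame_layout *f,
                         int64_t offset, int64_t size)
{
  int64_t limit = f->frame_size + f->stack_arg_size;
  return size > 0 && offset >= 0 && offset <= limit - size;
}
```

With a check like this, the array[0x1fff] access in the test case would not
be exempted, because its offset falls well outside any plausible frame.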
>> Peter's patch also treated accesses to constant integer and symbolic
>> addresses as safe. Again, this involves making assumptions about how
>> constant integer and symbolic addresses are used, and this is a much
>> less obvious assumption than the stack one.
> The latter assumption is also needed for -msym32 kernels, so it's well
> proven to be valid. The former holds, too.
>> Again, I understand that
>> it's a reasonable assumption to make in the linux context, but it's one
>> we need to pin down. E.g. there must be no run-time guarding of
>> target-specific constant integer IO-mapped addresses in cases where
>> those addresses might trigger the problem on other systems that the
>> same kernel image supports.
> In case of a hypothetical multi-platform kernel of which at least one needs
> the R10000 workarounds, all code would be uniformly compiled with the
> magic -mr10k-cache-barrier option and all source-level workarounds would
> be enabled.
Hmm. This probably shows I am misunderstanding the problem, but I was
thinking about the IO-mapped case. I thought one of the problems was
that if you had a cached speculative load or store to an access-sensitive
IO-mapped address, the IO-mapped device might "see" that access even if it
doesn't take place. Could you not have a situation where a KSEG0 or
XKSEG0 access is access-sensitive on one machine and not another?
The patch won't insert countermeasures before symbolic and constant
addresses, because it believes all such addresses to be safe.
I'm also a little worried that the compiler is free to make up accesses
that didn't exist in the original program, provided that those accesses
are never actually performed in cases where they'd be wrong. So how about:
Insert a cache barrier at the beginning of any sequentially-executed
series of instructions that contains a load or store. For the purposes
of this option, GCC can ignore loads and stores that it can prove
are an in-range access to:
(a) the current function's stack frame;
(b) an incoming stack argument;
(c) an object with a link-time-constant address; or
(d) a block of uncached memory.
It can also ignore sequences that are always immediately preceded by
an untaken branch-likely instruction.
Here, a ``sequentially-executed series'' is one in which calls,
jumps and branches occur only as the last instruction.
Like -mr10k-cache-barrier=load-store, but ignore all loads.
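
The two descriptions above can be sketched in C roughly as follows. This is
only an illustration of the wording, not the patch: the instruction
representation and all names are hypothetical, and the sketch ignores delay
slots, jump targets in the middle of a series, and the branch-likely
exemption.

```c
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_OTHER, INSN_LOAD, INSN_STORE, INSN_CONTROL };

/* Hypothetical instruction record.  INSN_CONTROL covers calls, jumps
   and branches.  */
struct insn {
  enum insn_kind kind;
  bool provably_safe; /* loads/stores only: access is provably one of the
                         exempt kinds listed above */
};

/* Set barrier_before[i] for the first instruction of every
   sequentially-executed series that contains an unsafe access.
   If IGNORE_LOADS is true, only stores count.  Returns the number
   of barriers needed.  */
static size_t
mark_cache_barriers (const struct insn *insns, size_t n,
                     bool ignore_loads, bool *barrier_before)
{
  size_t count = 0, start = 0;
  bool unsafe = false;

  for (size_t i = 0; i < n; i++)
    {
      barrier_before[i] = false;
      bool is_access = insns[i].kind == INSN_STORE
                       || (insns[i].kind == INSN_LOAD && !ignore_loads);
      if (is_access && !insns[i].provably_safe)
        unsafe = true;

      /* A call, jump or branch (or the end of the code) ends the series. */
      if (insns[i].kind == INSN_CONTROL || i + 1 == n)
        {
          if (unsafe)
            {
              barrier_before[start] = true;
              count++;
            }
          start = i + 1;
          unsafe = false;
        }
    }
  return count;
}
```

Passing ignore_loads as false corresponds to the first description, and
passing it as true corresponds to the second.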