peter fuerst <email@example.com> writes:
> could text like this help to pin the assumptions down (from
> "http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01446.html") ?
> What cases of $N can be exempted from this measure?
> - Stack-addresses and constant (static) addresses ("sd $M,symbol+n") will
> not be used for DMA, since DMA-buffers are allocated at runtime.
> - Uncached accesses will not be done speculatively, but they fall under the
> "constant"-case already or will not be recognized at compile-time.
> Besides the DMA-problem, depending on the mis-speculation path (up to four
> branches deep), one of the frequently reused multi-purpose registers $N
> will contain some "random" value, which may be a legal but invalid kernel-
> address (say a800000061234567), reaching the memory-controller...
> However, there are cases where a register $N's content is well defined, no
> matter what (mis-)speculation path took us to this instruction:
> - The stack-pointer points to the stack from kernel-initialization on.
> - Constant addresses ("symbol+n") are well defined "per se".
> (Luckily, legal-but-invalid doesn't occur in user mode, where no cache-
> barriers can be used. There we get either an address-error or a TLB-miss,
> leaving memory/bus untouched.)
Well, the explanation of the exceptions doesn't really address the
corner cases I was trying to draw attention to in the message you
replied to. What about top of the stack + X? Do we guarantee that
the code will never cause the compiler to generate a store to such
an address, even with an always-false guard? Or do we guarantee
that stores and loads to [top-of-stack, top-of-stack + 0x7fff] can
be speculated safely? Do we guarantee that every store and load to
a cached constant address in the kernel image will not result in
a harmful IO access on any target that the image supports?
Perhaps we should just turn this around slightly and instead say:
what must the compiler do, and when must it do it? The reasons why
aren't that important from the compiler's perspective. So if we can
just phrase it as:

  -mr10k-cache-barrier=load-store
      Insert a cache barrier at the beginning of any sequentially-executed
      series of instructions that contains a load or store.  For the purposes
      of this option, GCC can ignore loads and stores that it can prove:

        (a) access a region in the range [-0x8000 + bottom of stack frame,
            0x7fff + top of stack frame]; or
        (b) access a link-time-constant address.

      Here, a ``sequentially-executed series'' is one in which calls,
      jumps and branches occur only as the last instruction.

  -mr10k-cache-barrier=store
      Like -mr10k-cache-barrier=load-store, but ignore all loads.

And if you guys are willing to make sure that's safe, and change
the kernel whenever you find instances that it isn't safe, then
that should be enough. (Bear in mind that there's ongoing work
to do link-time optimisation in gcc, so translation-unit separation
is no real guarantee.)
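
To make the proposed wording concrete, here is a rough model of the rule in
Python.  It is only a sketch and not GCC code: the Insn class, the "sp" and
"const" tags, and the assumption that the bottom of the frame is its lowest
address (with offsets counted from there) are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Instructions that may only appear last in a sequentially-executed series.
CONTROL = {"call", "jump", "branch"}


@dataclass
class Insn:
    op: str                     # "load", "store", "call", "jump", "branch", "other"
    base: Optional[str] = None  # hypothetical tags: "sp", "const", or a register name
    offset: int = 0             # bytes from the bottom of the frame (only for "sp")


def needs_barrier(insn: Insn, mode: str, frame_size: int) -> bool:
    """True if this access cannot be ignored under the proposed rule."""
    if insn.op not in ("load", "store"):
        return False
    if insn.op == "load" and mode == "store":
        return False            # the store-only variant ignores all loads
    if insn.base == "const":
        return False            # (b) link-time-constant address
    if insn.base == "sp" and -0x8000 <= insn.offset <= frame_size + 0x7FFF:
        return False            # (a) provably within reach of the stack frame
    return True


def split_series(insns: List[Insn]) -> List[List[Insn]]:
    """Cut the stream after every call, jump or branch."""
    series, current = [], []
    for insn in insns:
        current.append(insn)
        if insn.op in CONTROL:
            series.append(current)
            current = []
    if current:
        series.append(current)
    return series


def barrier_positions(insns: List[Insn], mode: str = "load-store",
                      frame_size: int = 0) -> List[int]:
    """Indices of the instructions that a cache barrier must precede."""
    positions, index = [], 0
    for series in split_series(insns):
        if any(needs_barrier(i, mode, frame_size) for i in series):
            positions.append(index)  # barrier goes at the start of the series
        index += len(series)
    return positions


code = [
    Insn("store", "sp", 16),   # 0: inside the frame, ignored
    Insn("branch"),            # 1: ends the first series
    Insn("load", "$4"),        # 2: unknown address
    Insn("store", "const"),    # 3: link-time constant, ignored
    Insn("jump"),              # 4: ends the second series
    Insn("store", "$5"),       # 5: unknown address
]
print(barrier_positions(code, "load-store", frame_size=32))  # -> [2, 5]
print(barrier_positions(code, "store", frame_size=32))       # -> [5]
```

Note that the model trusts the "sp" and "const" tags blindly.  Whether such
accesses really are safe to speculate is exactly the guarantee the kernel
side would have to provide.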