On Fri, Oct 01, 2010 at 02:45:17PM -0700, David Daney wrote:
> In user space the rmb() must expand to a SYNC instruction. I am not
> sure what your version in the patch is doing with all those NOPs. That
> is not guaranteed to do anything.
That's a rather old version of the kernel rmb macro, I think. The NOPs
were there to enforce ordering of a mix of cached and uncached accesses
on the R4400 (not R4000), where according to my reading the manual leaves
it a bit unclear whether a SYNC is sufficient or whether the pipeline
also needs to be drained. See version 2 of the R4000/R4400 User's Manual.
> The instruction set specifications say that SYNC orders all loads and
> stores. This is a heavier operation than rmb() demands, but is the only
> universally available instruction that imposes ordering.
>
> For processors that do not support SYNC, the kernel will emulate it, so
> it is safe to use in userspace. I wouldn't worry about emulation
> overhead though, because processors that lack SYNC probably also lack
> performance counters, so are not as interesting from a perf-tool point
> of view.
Yes, just use SYNC. The only SYNC-less processors are the R2000/R3000
and a few other oddball processors, which have been totally uninteresting
for performance optimization for years.
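
For illustration, a minimal sketch of a SYNC-based userspace rmb(),
assuming GCC-style inline assembly. This is not the code from the patch:
the ".set mips2" wrapper is there so the assembler accepts SYNC when
building for a MIPS I target, and both the non-MIPS fallback and the
read_head() helper are hypothetical, added only so the snippet builds and
can be exercised elsewhere.

```c
/*
 * Hypothetical userspace read barrier; only the name rmb() comes from
 * the discussion above.
 */
#if defined(__mips__)
/*
 * SYNC orders all loads and stores, which is stronger than a read
 * barrier needs. ".set mips2" makes the assembler accept the
 * instruction for MIPS I builds; ".set push/pop" restores the previous
 * assembler settings. The "memory" clobber stops the compiler from
 * moving memory accesses across the barrier.
 */
#define rmb()					\
	__asm__ __volatile__(			\
		".set	push\n\t"		\
		".set	mips2\n\t"		\
		"sync\n\t"			\
		".set	pop"			\
		: : : "memory")
#else
/* Assumption: full-barrier builtin as a stand-in on non-MIPS hosts. */
#define rmb() __sync_synchronize()
#endif

/*
 * Hypothetical reader side: load the producer's head index first, then
 * issue the barrier so later data reads are ordered after that load.
 */
static unsigned long read_head(const volatile unsigned long *head)
{
	unsigned long h = *head;

	rmb();
	return h;
}
```

On a processor without SYNC the instruction would trap and be emulated
by the kernel, so the same binary should still run there, only slower.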
Ralf