On Thu, Apr 07, 2005 at 08:14:06AM -0400, Greg Weeks wrote:
> What's the performance hit for doing a pref on a cache line that is
> already pref'd?
A wasted instruction.
(More complicated on certain multi-issue in-order processors such as the
SB1 CPU core. Mentioning this for completeness; we shouldn't worry about
it here.)
> Does it turn into a nop, or do we get some horrible
> degenerate case? Are 64 bit processors always at least 32 byte cache
> line size?
The smallest D-cache line I know of is 16 bytes.
> I don't really expect anyone to know the answers right now. I
> expect I'll need to time code to tell. This makes generating them at run
> time look better and better.
Indeed. When we first started doing such things, some people felt it
might be really bad to debug, but in practice it's been a
relatively minor problem, so I guess the resistance against yet another
run-time generated group of functions is getting less.
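To illustrate the one-pref-per-line idea, here is a rough sketch in plain C
rather than the real assembler. Everything in it is an assumption made for
illustration: the name copy_with_prefetch is hypothetical, line_size is
assumed to have been probed at boot and passed in by the caller, and GCC's
__builtin_prefetch stands in for the pref instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy n bytes and issue exactly one prefetch per
 * source cache line, so no line is prefetched twice.  line_size is
 * assumed to come from a run-time probe; it falls back to 16 bytes,
 * the smallest D-cache line mentioned above, if the caller passes 0.
 */
void *copy_with_prefetch(void *dst, const void *src, size_t n,
                         size_t line_size)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);

    if (line_size == 0)
        line_size = 16;

    while (n >= line_size) {
        /* Hint for the next source line: read access, low temporal locality. */
        __builtin_prefetch(s + line_size, 0, 0);
        memcpy(d, s, line_size);
        d += line_size;
        s += line_size;
        n -= line_size;
    }

    /* Tail shorter than one line; no further prefetch is needed. */
    memcpy(d, s, n);
    return dst;
}
```

A call such as copy_with_prefetch(dst, src, len, 32) would step in 32-byte
lines. Whether one prefetch per line actually beats an occasional redundant
one would still have to be timed on the CPUs in question.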
One interesting issue to solve - memcpy, memmove and copy_user are combined
into a single big function, so the fixups for userspace accesses need to
be handled at runtime as well.
Ralf