Prefetching

From LinuxMIPS
Revision as of 17:03, 5 December 2005 by Ralf (Talk | contribs)


The kernel uses prefetch instructions in three ways:

  • Prefetching data structures, as done by the linked-list walking macros and similar code.
    Linux uses 'Prefetch for Load' (hint 0) if CONFIG_HAS_PREFETCH is defined for the configuration in question.
  • Prefetching in memcpy
    Linux uses prefetch hints 0 and 1, but currently only on cache-coherent platforms.
  • Prefetching in copy_page / clear_page
    These functions are generated at runtime, and Linux knows in detail which processors have a useful prefetch implementation:
    • R4000 and R4400:
      These processors have no prefetch instructions, but Linux uses the CreateDirtyExclusive cacheop to achieve the same effect as PrepareForStore.
    • R5000 and many variants
      These processors have a prefetch instruction, but it is implemented as a nop, so using it would only cost performance. Linux therefore does not use prefetching on them.
    • R10000, R12000, RM9000
      Linux uses 'LoadStreamed' (hint 4) and 'StoreStreamed' (hint 5). On the RM9000 this choice works around a processor bug in early revisions.
    • All others
      Linux uses LoadStreamed (hint 4) and PrepareForStore (hint 30).
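
The per-processor choices for copy_page / clear_page listed above can be summarised as a small decision table. The following C fragment is only an illustrative sketch, not the kernel's actual code: the enum names, the struct and the function select_page_pref() are hypothetical, and only the hint numbers (4, 5, 30) and the processor groupings are taken from the list above.

```c
#include <assert.h>

/* Hint numbers as given in the list above; PREF_NONE is a hypothetical
 * marker meaning "emit no prefetch instruction". */
enum pref_hint {
    PREF_NONE              = -1,
    PREF_LOAD_STREAMED     = 4,
    PREF_STORE_STREAMED    = 5,
    PREF_PREPARE_FOR_STORE = 30
};

/* Hypothetical processor identifiers for this sketch only. */
enum cpu_family {
    CPU_R4000, CPU_R4400, CPU_R5000,
    CPU_R10000, CPU_R12000, CPU_RM9000,
    CPU_OTHER
};

struct page_pref {
    int load_hint;    /* hint used for the source page */
    int store_hint;   /* hint used for the destination page */
    int use_cacheop;  /* 1 if a cacheop stands in for PrepareForStore */
};

/* Return the prefetch strategy the list above describes for a processor. */
static struct page_pref select_page_pref(enum cpu_family cpu)
{
    struct page_pref p = { PREF_NONE, PREF_NONE, 0 };

    assert(cpu <= CPU_OTHER);

    switch (cpu) {
    case CPU_R4000:
    case CPU_R4400:
        /* No prefetch instruction; a cacheop gives the same effect. */
        p.use_cacheop = 1;
        break;
    case CPU_R5000:
        /* Prefetch exists but is a nop, so nothing is emitted. */
        break;
    case CPU_R10000:
    case CPU_R12000:
    case CPU_RM9000:
        p.load_hint  = PREF_LOAD_STREAMED;
        p.store_hint = PREF_STORE_STREAMED;
        break;
    default:
        p.load_hint  = PREF_LOAD_STREAMED;
        p.store_hint = PREF_PREPARE_FOR_STORE;
        break;
    }
    return p;
}
```

Calling select_page_pref(CPU_R10000), for example, yields the hint pair 4 and 5, while any processor not listed explicitly falls through to hints 4 and 30.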

Userspace use of prefetching

The kernel has no control over the use of prefetch instructions in userspace, so it is up to applications to use prefetching in situations and on hardware where it can be expected to yield a performance gain.
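
As an illustration of what explicit prefetching in an application might look like, the sketch below walks a linked list and prefetches the next node while the current one is processed. It assumes GCC or a compatible compiler that provides __builtin_prefetch(); the struct node type and the helper names are hypothetical. Whether this helps at all depends on the processor and on how much work is done per node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node for this sketch. */
struct node {
    struct node *next;
    long value;
};

/* Link node n in front of head and return the new head. */
static struct node *list_push(struct node *head, struct node *n, long value)
{
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Sum all values, prefetching the following node on each step. */
static long sum_list(const struct node *head)
{
    long total = 0;

    for (const struct node *n = head; n != NULL; n = n->next) {
        /* Second argument 0 = prefetch for read, third argument 3 = keep
         * in cache. Prefetching a NULL pointer at the end of the list is
         * harmless because prefetches do not fault. */
        __builtin_prefetch(n->next, 0, 3);
        total += n->value;
    }
    return total;
}
```

With so little work per node the prefetch has hardly any time to complete before the data is needed, so a real application would measure before keeping it.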

GCC can generate prefetch instructions for loops that access large arrays when the option -fprefetch-loop-arrays is given. The option is disabled at -Os.
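
For comparison, a loop of the kind this option targets can also be prefetched by hand. The following is a minimal sketch assuming GCC's __builtin_prefetch(); the function name and the prefetch distance PREFETCH_AHEAD are hypothetical, and the distance would need tuning for a given processor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in elements; tune per processor. */
#define PREFETCH_AHEAD 16

/* Sum an array while prefetching elements a fixed distance ahead. */
static double sum_array(const double *a, size_t n)
{
    double total = 0.0;

    assert(a != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Prefetch for read (0) with no temporal locality (0), since
         * each element is used only once. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0);
        total += a[i];
    }
    return total;
}
```

The same loop without the __builtin_prefetch() call, compiled with something like gcc -O2 -fprefetch-loop-arrays, leaves the decision to the compiler instead.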