On Tue, Sep 23, 2003 at 12:35:44PM +0100, Dominic Sweetman wrote:
> As usual, I guess the first thing is to try doing it the standard way
> and then try to measure how much time is being spent in extra TLB misses
> generated by your application. Some MIPS CPUs have "performance
> counters" which might be able to count TLB misses, but you'll more
> likely have to instrument the TLB miss code.
> If it does turn out that TLB replacement is a big drain:
> Most MIPS CPU hardware allows you to map large chunks of memory with a
> single TLB entry: often up to 16Mbytes at a time. But I don't know
> how you'd persuade Linux how to do that.
As an indication of how effective large page size support can be for
applications, take a look at the two USENIX 98 papers titled "General
Purpose Operating System Support for Multiple Page Sizes" by SGI, about
IRIX, and "Implementation of Multiple Page Size support in HP-UX",
presented at the same conference. Given that we have what QED once
called the slowest TLB reload handler they've ever seen, the impact could
be even stronger than demonstrated in these two papers. The implementation
described has been condemned by Linus as stupid and unacceptable. I
expect a conceptually different optimization on MIPS late this year.
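To put rough numbers on what large pages buy, here is a
back-of-the-envelope sketch in Python. It assumes a 48-entry TLB as on
the R4000 (other CPUs differ) and relies on each r4k-style entry mapping
a pair of pages. The function names are made up for illustration and do
not correspond to anything in the kernel.

```python
# Rough TLB reach arithmetic for an r4k-style TLB.
# Assumptions (not from the original mail): 48 TLB entries, as on the
# R4000, and page sizes from 4 KiB to 16 MiB in powers of four.

PAGES_PER_ENTRY = 2  # each entry maps an even/odd pair of pages


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space the whole TLB can map at one time."""
    return entries * PAGES_PER_ENTRY * page_size


def entries_needed(working_set: int, page_size: int) -> int:
    """TLB entries needed to cover a working set, ignoring alignment."""
    span = PAGES_PER_ENTRY * page_size
    return -(-working_set // span)  # ceiling division


KIB = 1024
MIB = 1024 * KIB

for size in (4 * KIB, 16 * KIB, 64 * KIB, 256 * KIB, MIB, 4 * MIB, 16 * MIB):
    reach = tlb_reach(48, size)
    need = entries_needed(64 * MIB, size)
    print(f"page {size // KIB:>6} KiB: reach {reach // KIB:>8} KiB, "
          f"{need:>5} entries for a 64 MiB working set")
```

Under these assumptions the TLB reaches 384 KiB with 4 KiB pages and
1536 MiB with 16 MiB pages, so a 64 MiB working set that cannot possibly
fit with small pages needs just two entries with the largest ones.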
In any case the papers show how costly TLB exception handlers can be,
which is the reason why I yell at just about everybody who mentions the
phrase "wired tlb entries".
For the time being Linux has large page support for the kernel - read
KSEG0 / KSEGX. Another optimization is the use of the global bit for
all kernel mappings, and for 2.6, support for hugetlbfs on MIPS should
also be fairly easy.
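For anyone wondering why KSEG0 counts as "large page support": on
32-bit MIPS it is a fixed 512 MiB window that the CPU translates without
consulting the TLB at all. A small Python sketch of that translation
follows; it covers only the 32-bit KSEG0 window, and the helper names
are hypothetical.

```python
# KSEG0 on 32-bit MIPS: unmapped, cached window onto the low 512 MiB of
# physical memory. No TLB entry is used, so no TLB miss can occur here.
# Helper names are illustrative only.

KSEG0_BASE = 0x8000_0000
KSEG0_SIZE = 0x2000_0000  # 512 MiB


def in_kseg0(vaddr: int) -> bool:
    """True if a 32-bit virtual address falls inside KSEG0."""
    return KSEG0_BASE <= vaddr < KSEG0_BASE + KSEG0_SIZE


def kseg0_to_phys(vaddr: int) -> int:
    """Physical address for a KSEG0 virtual address (fixed offset)."""
    if not in_kseg0(vaddr):
        raise ValueError(f"{vaddr:#010x} is not a KSEG0 address")
    return vaddr - KSEG0_BASE


print(hex(kseg0_to_phys(0x8010_0000)))
```

Anything the kernel reaches this way costs nothing in TLB terms, which is
the sense in which it behaves like one enormous page.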
Btw, again and again the MIPS r4k-style TLBs are a bit of a pain because
each entry maps a pair of pages which share some of their attributes ...
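To illustrate the pairing, here is a small Python sketch of how an
address selects an entry and one half of its page pair. As far as I
recall the R4000 documentation, the pair shares the virtual page number,
ASID, page mask and (effectively) the global bit, while each half has
its own frame number and cache/dirty/valid bits. The names below are
illustrative, not kernel identifiers, and `pair_index` is a simplified
stand-in for the matched virtual page number rather than the literal
EntryHi field.

```python
# How an r4k-style TLB entry covers an even/odd pair of pages.
# Assumption: page sizes are 4 KiB * 4**n up to 16 MiB.

VALID_PAGE_SIZES = [4096 * 4**n for n in range(7)]  # 4 KiB .. 16 MiB


def page_mask(page_size: int) -> int:
    """Value for the CP0 PageMask register for a given page size."""
    if page_size not in VALID_PAGE_SIZES:
        raise ValueError(f"unsupported page size {page_size}")
    return (page_size - 4096) << 1


def split_vaddr(vaddr: int, page_size: int) -> tuple[int, int, int]:
    """Return (pair_index, odd, offset) for a virtual address.

    pair_index identifies the even/odd page pair, odd selects the
    half (0 = even page, 1 = odd page), offset is the byte in the page.
    """
    if page_size not in VALID_PAGE_SIZES:
        raise ValueError(f"unsupported page size {page_size}")
    shift = page_size.bit_length() - 1  # log2(page_size)
    pair_index = vaddr >> (shift + 1)
    odd = (vaddr >> shift) & 1
    offset = vaddr & (page_size - 1)
    return pair_index, odd, offset


# Two neighbouring 4 KiB pages land in the same entry, different halves.
print(split_vaddr(0x0040_2ABC, 4096))
print(split_vaddr(0x0040_3ABC, 4096))
print(hex(page_mask(16 * 1024 * 1024)))
```

The practical consequence is that a large page mapping always comes as
two adjacent pages of the same size, which is one reason mixing page
sizes is more awkward here than on a TLB with one page per entry.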