
Re: TLB Management , C library

To: (Didier Frick)
Subject: Re: TLB Management , C library
From: Systemkennung Linux <>
Date: Thu, 6 Jun 1996 05:34:33 +0200 (MET DST)
In-reply-to: <> from "Didier Frick" at Jun 6, 96 02:29:42 am
Hi all,

>       It works regardless of the format of the context register, so
>       maybe it will be useful for other currently unsupported 
>       MIPS processors.
>       I'll get back to optimize it later  (I know, they all say that :-))
>       Now the system boots a good deal further, unfortunately it
>       hangs on the kernel_thread call which is supposed to launch
>       init. Do you have any suggestions on what I should look for
>       to catch this one ?

If the system loops: check that the executable you're using is really
a MIPS ELF executable with the x bits set.  I uploaded a simple
hello, world program written in assembler a long time ago.  Try installing
that program as /bin/sh (remove your init program before trying this) and
then boot your system.  You should get "hello, world!\n" messages.

If the system just hangs: my first attempt would be to double check the
tlb exception handlers once again.

>     linux-mips> Which libc have you been using for your compilation
>     linux-mips> attempts?
>       I don't have it here, but if I remember well it was something
>       like 4.6.28, with two patches. We took the complete state
>       on the fnet server on May 26.

Ouch.  libc 4.6.27 is outdated.  Due to the massive changes between
versions 1.2.11 and 1.3.0 of the kernel, which broke binary compatibility,
the only remaining use of libc 4.6.27 is building libgcc.a for the
a.out targets (mips{el}-linux).

Please use the supplied binaries of GNU libc snapshot 960501.  These
are compiled for plain R3000 and contain no optimizations for the R4000
so they should work for you.  You might also try the binaries from
root-0.01.tar.gz or one of the other binary archives.

Is your system little or big endian?

>       In our case, we use the script a lot: our target boots from
>       a Flash ROM (currently the minimal compressed kernel takes around
>       170 Kb, with Minix, serial driver, plus a minimal set of features.
>       In the current state (kernel thread creation hangs when launching
>       init), it takes around 450Kb code and 205Kb data).
>       To do this, we need some linker customization which is best done
>       in a linker script.

If you need a special linker script you can put it into a new directory
in arch/mips/<target>/ld.script and add the required options for ld in

>       I'm not sure they were in the .sbss section, but there were
>       definitely some kernel variables overwritten during kernel_startup
>       until we moved the definition of _end.

I just checked again - the binutils (snapshot 960502) generate a .sbss
section for whatever reason, but the section is always empty.  Older
versions of the binutils generated non-empty .sbss sections (there
shouldn't be a .sbss section at all), hence the hack with _end.

>       If your problem with the binutils has anything to do with binary
>       format, we have a patch for the bfd lib to handle output addresses
>       correctly when creating a binary file. I think this patch has
>       already been made available by Robin Farine, but if anyone wants
>       it please ask.

I'm interested in taking a look at this patch.

>       About the state of our work:
>       We have planned to release a set of clean patches when the
>       system runs.
>       Until now, we're working in a real hurry and we absolutely 
>       don't have time to spend on anything else than getting the
>       system to boot and be usable.
>       If anybody can be content with our raw work files, with:
>       - Virtually uncommented changes
>       - Untested modifications
>       - Absolutely no file headers or legalese of any sort
>       - Scrambled indentation because we use the default mode
>         of emacs (no time to add emacs tags to source :-)).

Ceterum censo Emacs esse delendam  (Asterix readers should understand this :-)

>       - But if you know diff (or better, ediff), you should
>         be able to cope, maybe find some ideas or most likely
>         undig some bugs.

I'd like to get the R3000 diffs from you into my sources as soon as
possible so that I can help the DECstation people this way.

I've written a short text that describes the handling of the page tables
in more detail.  I hope this helps you people out there to understand this
part of the kernel better, because it's far more complex than e.g. on
Intel or m68k.  I'll also put this text into arch/mips/doc/ in the
kernel sources.


As opposed to other architectures like i386 or m68k, all MIPS CPUs
implement only the TLB itself and a small set of functions to maintain
it in hardware.  The actual maintenance of the TLB's contents is
implemented in software only.

The TLB has a relatively small number of entries.  This limits the
maximum address space that can be mapped by the TLB, using 4kb pages and
without consideration of wired entries, to a maximum of 512kb for the
R10000, 384kb for the R4000/4400 and 256kb for the R2000/R3000.  The
actual size of the mappable space is even smaller due to the wired entries.
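
As a rough cross-check of these figures, here is a small C sketch.  The
entry counts used in the usage note (64 for the R2000/R3000, 48 for the
R4000/4400, 64 for the R10000) and the assumption that each R4000-class
entry maps a pair of pages are additions made for illustration and are not
stated in the text above; tlb_reach_kb is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Hypothetical helper: address space in kb covered by a fully populated
 * TLB, ignoring wired entries.
 *
 *   entries          number of TLB entries (assumed values, see lead-in)
 *   pages_per_entry  1 on the R2000/R3000, 2 where an entry maps an
 *                    even/odd page pair (assumed for R4000-class CPUs)
 *   page_kb          page size in kb
 */
static unsigned long tlb_reach_kb(unsigned entries,
                                  unsigned pages_per_entry,
                                  unsigned page_kb)
{
    assert(entries > 0 && pages_per_entry > 0 && page_kb > 0);
    return (unsigned long)entries * pages_per_entry * page_kb;
}
```

Under those assumptions tlb_reach_kb(64, 1, 4), tlb_reach_kb(48, 2, 4) and
tlb_reach_kb(64, 2, 4) give 256, 384 and 512, matching the numbers above.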

Especially for processes with a huge working set of pages it is therefore
important to make the process of reloading entries into the TLB as
efficient as possible.  This means:

 - Choosing a data structure that can be handled as efficiently as
   possible.
 - The low level pagefault handling has to be implemented in an
   efficient way.

The Linux kernel itself implements three level page tables as a tree
structure.  Linux implementations that don't need three levels of page
tables can fold one level of the page tables so that effectively a two
level page table remains.  The exact size and content of the entries
is up to the implementation.

In contrast, the MIPS hardware architecture implies, through the data
provided in the c0_context/c0_xcontext registers, a simple array of
4 byte elements (for R2000/R3000/R6000) or 8 byte elements (for the
other, 64bit members of the CPU family).

The page tables are mapped to the address TLBMAP (which is usually
defined as 0xe4000000 in <asm/mipsconfig.h>).  The page which contains
the root of the page table of the current process, the "page directory",
is part of this mapping as well and is therefore mapped at
(TLBMAP + (TLBMAP >> (12-2))) (this is the value of the define TLB_ROOT,
which is defined as 0xe4390000).  That way the kernel itself can access
the page tables as a tree structure while the exception handlers can work
with maximum efficiency, accessing the page tables as a simple array.
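
The address arithmetic can be sketched in C as follows.  This is only an
illustration for 4kb pages and 4 byte ptes; pte_address() and tlb_root()
are hypothetical helper names, and the sketch ignores how c0_context
actually lays out its fields.

```c
#include <stdint.h>

#define TLBMAP     0xe4000000u  /* value quoted in the text */
#define PAGE_SHIFT 12           /* 4kb pages */
#define PTE_SHIFT  2            /* 4 byte ptes (R2000/R3000 style) */

/* Address, inside the TLBMAP window, of the pte that maps vaddr. */
static uint32_t pte_address(uint32_t vaddr)
{
    return TLBMAP + ((vaddr >> PAGE_SHIFT) << PTE_SHIFT);
}

/*
 * The page directory is the page of ptes that maps the TLBMAP window
 * itself, so its address is the pte address of TLBMAP.  Because the low
 * 12 bits of TLBMAP are zero this equals TLBMAP + (TLBMAP >> (12-2)).
 */
static uint32_t tlb_root(void)
{
    return pte_address(TLBMAP);
}
```

With TLBMAP at 0xe4000000, tlb_root() evaluates to 0xe4390000, the value
given for TLB_ROOT.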

The tlb refill handler itself is very simple.  For the R4x00 family it
has just 14 instructions, for the R4600 and derivatives it can be
optimized to 12 instructions, and even further for the R10000.  This
exception handler is very simple and fast and therefore doesn't do any
checking for errors or special cases.

It can therefore happen that the entry that the handler attempts to
reload isn't mapped via the page tables, which results in a double tlb
refill exception.  Because the EXL flag is set in c0_status, this
exception goes through the general exception vector and from there to
handle_tlbl.  Handle_tlbl is more complex than the first handler and is
called far less often.  It handles special cases and does some error
checking for debugging.  To be as efficient as possible, this second
handler still doesn't reenable interrupts, change to the kernel stack or
save registers, so only the two registers k0/k1 are available for use.
All of that is done only when do_page_fault() in arch/mips/mm/fault.c
is called.  In the normal case this handler just reloads the entry
mapping the pte table, which in turn contains the entries to be loaded
into the tlb.  Since the original fault address has been lost, this
exception handler cannot complete the job.  So it just returns to the
main program, which takes another exception via the first tlb refill
handler, reloads the originally missing entry into the TLB and continues
normal execution.
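
The control flow described above can be modelled with a toy simulation in
C.  This is not kernel code: the TLB is reduced to a small array of
virtual page numbers with round-robin replacement, every pte is assumed to
be valid, and all names (try_access, access_memory, the counters) are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLBMAP     0xe4000000u
#define PAGE_SHIFT 12
#define TLB_SLOTS  8            /* toy size, not a real CPU */

static uint32_t tlb[TLB_SLOTS]; /* virtual page numbers currently mapped */
static unsigned tlb_used;       /* number of insertions so far */
static unsigned refills;        /* entries into the first refill handler */
static unsigned double_faults;  /* entries into the second handler */

static bool tlb_hit(uint32_t vaddr)
{
    unsigned n = tlb_used < TLB_SLOTS ? tlb_used : TLB_SLOTS;
    for (unsigned i = 0; i < n; i++)
        if (tlb[i] == (vaddr >> PAGE_SHIFT))
            return true;
    return false;
}

static void tlb_insert(uint32_t vaddr)
{
    tlb[tlb_used++ % TLB_SLOTS] = vaddr >> PAGE_SHIFT;
}

static uint32_t pte_address(uint32_t vaddr)
{
    return TLBMAP + ((vaddr >> PAGE_SHIFT) << 2);
}

/* One attempt at the access; returns true once it completes. */
static bool try_access(uint32_t vaddr)
{
    if (tlb_hit(vaddr))
        return true;

    /* First handler: fetch the pte through the TLBMAP window. */
    refills++;
    uint32_t pte = pte_address(vaddr);
    if (!tlb_hit(pte)) {
        /* Double fault: the second handler maps the pte page only. */
        double_faults++;
        tlb_insert(pte);
        return false;           /* original fault address is lost */
    }
    tlb_insert(vaddr);
    return false;               /* faulting instruction is restarted */
}

/* Retries the access until it succeeds; returns the number of attempts. */
static unsigned access_memory(uint32_t vaddr)
{
    unsigned attempts = 1;
    while (!try_access(vaddr))
        attempts++;
    return attempts;
}
```

In this model the first access to a page whose pte page is not yet mapped
needs three attempts (double fault, plain refill, success), while a later
access to a neighbouring page covered by the same pte page needs only two.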

Another special feature of the Linux/MIPS page handling is the handling
of pages in non-existent branches of the page tables.  To avoid the
exception handlers having to handle this special case, the kernel
maps these ptes (page table entries) to invalid_pte_table.  This is a
4kb page full of invalid entries.  On an attempted access to such an
invalid page the kernel then reloads - possibly via a double fault -
this invalid entry into the tlb.  The CPU then takes a tlb invalid
exception, resulting in a call to do_page_fault(), which usually will
take the appropriate measures like sending SIGSEGV.
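
The idea can be sketched in C like this.  Only the name invalid_pte_table
comes from the text; the two-level layout with 1024 entries per level, the
PTE_VALID bit and the helper names are assumptions made for illustration
and do not reflect the real MIPS pte format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_TABLE 1024     /* 4kb page / 4 byte entries */
#define PTE_VALID      0x1u     /* hypothetical valid bit */

typedef uint32_t pte_t;

/* One page full of invalid entries (static storage, so all zero). */
static pte_t invalid_pte_table[PTRS_PER_TABLE];

/* Toy page directory: one pointer per pte table. */
static pte_t *page_directory[PTRS_PER_TABLE];

/* Point every branch at invalid_pte_table instead of leaving it NULL. */
static void pgd_init(void)
{
    for (size_t i = 0; i < PTRS_PER_TABLE; i++)
        page_directory[i] = invalid_pte_table;
}

/*
 * No check for a missing branch is needed here: a lookup in an absent
 * branch simply yields an invalid pte.
 */
static pte_t lookup_pte(uint32_t vaddr)
{
    return page_directory[vaddr >> 22][(vaddr >> 12) & (PTRS_PER_TABLE - 1)];
}

static bool pte_is_valid(pte_t pte)
{
    return (pte & PTE_VALID) != 0;
}
```

After pgd_init() every lookup returns an invalid pte, so the fast path
never has to test for a missing pte table; the invalid entry is loaded and
the slower path deals with the resulting fault.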

Downsides of this implementation are its complexity and the fact that
the faster handling of the majority of exceptions is bought at the
expense of having to handle page aliasing problems with the page tables
(which are accessed at TLBMAP and in KSEG1).  This is done using
uncached accesses, which are painfully slow, especially on older
machines with slow memory subsystems.  The implementation is done this
way because the original hardware which Linux/MIPS was intended for
had a blindingly fast memory interface.
