Re: Linux on the O2

To: "William J. Earl" <>
Subject: Re: Linux on the O2
Date: Fri, 5 Dec 1997 02:33:16 +0100
Cc: Greg Chesson <>, David Chatterton <>, Lige Hensley <>, Chris Carlson <>,
In-reply-to: <>; from William J. Earl on Thu, Dec 04, 1997 at 03:33:01PM -0800
References: <Pine.SGI.3.96.971204001929.20475A-100000@barramunda> <> <> <> <> <> <>
On Thu, Dec 04, 1997 at 03:33:01PM -0800, William J. Earl wrote:

>  > Indeed - and you're pointing to what I consider _the_ problem with
>  > current Linux kernels.  Linux uses the ``buddy system'' described by
>  > Donald Knuth's ``Algorithms And Data Structures'' to maintain its
>  > free pages.  This algorithm results in massive fragmentation even
>  > after a short uptime.
>      Are you saying that linux uses the buddy system for all of memory,
> or just for the kernel heap?  (I would be surprised if it were used
> for other than the kernel heap, although that is bad enough.)  

The buddy system is only used to maintain the pool of free pages.  The
buddy system has the advantage that it is very fast.  On top of this
lowest level we've got additional layers:

 - The ``slab allocator''.  It's basically what Jeff Bonwick from Sun
   describes in his USENIX paper from '94.  Miguel has the paper on his
   homepage.  The slab allocator has been added during the Linux 2.1.x
 - The ``simp allocator''.  Yet another memory allocator for cached objects
   that has been added during 2.1.x.  Pretty fast and still under
 - kmalloc() is the kernel equivalent of malloc(3).  It's mostly used
   for allocations smaller than a single page but can allocate up to
   128 KB.  In Linux 2.0 kmalloc() gets its pages directly from the
   free page pool.  In Linux 2.1.x kmalloc() is implemented on top of
   the slab allocator.
   Both implementations get their memory, directly or indirectly, from
   the pool of free pages (that's KSEG0 on MIPS) and therefore have to
   live with the advantages and disadvantages of the buddy system.
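
To make the fragmentation problem concrete, here is a minimal user-space
sketch of a buddy allocator.  It is not the kernel's code: the 16-page
pool, the free_map table and the buddy_* function names are all made up
for illustration.  It only shows the two core ideas, splitting a larger
block in halves on allocation and merging a freed block with its
``buddy'' (the block whose page number differs in exactly one bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ORDER 4                  /* hypothetical pool of 2^4 = 16 pages */
#define NPAGES    (1u << MAX_ORDER)

/* free_map[o][i] is true when the block of 2^o pages starting at page i
 * is free.  A real allocator would use free lists; this is only a sketch. */
static bool free_map[MAX_ORDER + 1][NPAGES];

void buddy_init(void)
{
    memset(free_map, 0, sizeof free_map);
    free_map[MAX_ORDER][0] = true;   /* the whole pool is one free block */
}

/* Allocate 2^order contiguous pages.  Returns the first page or -1. */
int buddy_alloc(unsigned order)
{
    assert(order <= MAX_ORDER);
    for (unsigned o = order; o <= MAX_ORDER; o++) {
        for (unsigned i = 0; i < NPAGES; i += 1u << o) {
            if (!free_map[o][i])
                continue;
            free_map[o][i] = false;
            /* Split down to the requested size; each upper half
             * becomes a free block of the next smaller order. */
            while (o > order) {
                o--;
                free_map[o][i + (1u << o)] = true;
            }
            return (int)i;
        }
    }
    return -1;
}

/* Free a block of 2^order pages, merging with its buddy while possible. */
void buddy_free(unsigned page, unsigned order)
{
    assert(order <= MAX_ORDER && page < NPAGES);
    assert((page & ((1u << order) - 1)) == 0);   /* must be aligned */
    while (order < MAX_ORDER) {
        unsigned buddy = page ^ (1u << order);
        if (!free_map[order][buddy])
            break;                   /* buddy in use: cannot merge */
        free_map[order][buddy] = false;
        page &= ~(1u << order);      /* merged block starts at the lower half */
        order++;
    }
    free_map[order][page] = true;
}

/* Total number of free pages over all orders. */
unsigned buddy_free_pages(void)
{
    unsigned total = 0;
    for (unsigned o = 0; o <= MAX_ORDER; o++)
        for (unsigned i = 0; i < NPAGES; i += 1u << o)
            if (free_map[o][i])
                total += 1u << o;
    return total;
}
```

If all 16 pages are allocated one by one and then every second page is
freed again, buddy_free_pages() reports half of the pool as free, yet
buddy_alloc(1) fails because no two free pages are buddies.  That is
the kind of fragmentation everything built on top of the free page
pool has to live with.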

>       One thing we do in IRIX is to always allocate kernel heap buffers
> of 1 page or larger as an integral set of pages, mapped into kernel
> virtual space, but not part of the main kernel heap (which is used only
> for smaller buffers).  Except for fragmentation of the kernel mapped
> space pool (a pool of address space, not real memory), this guarantees
> you can always get large buffers if pages are available.  Fragmentation
> of smaller buffers of course requires a better heap manager, although
> using zone allocation (where a zone has blocks all of the same size,
> and there are zones for most popular sizes) helps a lot and is also
> faster than using the regular heap manager.

Actually we also have this type of kernel virtual memory in Linux.  On
MIPS the address space >=KSEG2 is used for it, and the functions
vmalloc(9) and vfree(9) allocate and free it.  However, vmalloc() is
rarely used in the kernel.  One reason is that vmalloc() is slower than
the other types of memory allocation.  Furthermore, the primitive
PC-style DMA hardware often lacks the support needed to use vmalloc'ed
memory as a scatter/gather buffer.  Finally, accessing vmalloc'ed memory
may cause TLB reloads on some architectures, while accessing pages
allocated from the pool of free pages doesn't.  So accessing vmalloc'ed
memory may carry a performance penalty.
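
The tradeoff can be sketched in a few lines of user-space C.  This is
not how vmalloc() is implemented; vmap_pages(), physically_contiguous()
and the flat table standing in for the page tables are hypothetical
names chosen for this example.  The sketch builds a virtually
contiguous buffer out of whatever free page frames exist and then
checks whether the result happens to be physically contiguous, which
is what simple DMA hardware without scatter/gather would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Map npages virtual pages onto any free physical frames.
 * table[v] receives the frame backing virtual page v; used frames are
 * marked busy in frame_free[].  Returns 0 on success, or -1 (changing
 * nothing) when fewer than npages frames are free. */
int vmap_pages(bool *frame_free, int nframes, int *table, int npages)
{
    assert(nframes >= 0 && npages >= 0);

    int available = 0;
    for (int f = 0; f < nframes; f++)
        if (frame_free[f])
            available++;
    if (available < npages)
        return -1;

    int mapped = 0;
    for (int f = 0; f < nframes && mapped < npages; f++) {
        if (frame_free[f]) {
            frame_free[f] = false;
            table[mapped++] = f;     /* frames need not be adjacent */
        }
    }
    return 0;
}

/* True when the frames behind the mapping form one contiguous run. */
bool physically_contiguous(const int *table, int npages)
{
    for (int v = 1; v < npages; v++)
        if (table[v] != table[v - 1] + 1)
            return false;
    return true;
}
```

With only every second frame free, a 4-page mapping still succeeds as
long as four frames are free anywhere, but physically_contiguous()
returns false for it.  Every access also has to go through the table,
which plays the role of the page tables and the TLB in this sketch.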

>       We (the SGI people on the list) will have to talk with the
> general manager for the O2 platform.  I personally think that would be
> ok.  As I mentioned in my earlier message, a lot of the IRIX value
> added for O2 is in essentially generic facilities, not in the
> device-dependent drivers, and we already tell customers in high level
> terms about the overall architecture (such as the unified memory for
> graphics and I/O), so there is not much in the hardware documentation
> which needs to be treated as a trade secret.  Conceptually, the O2 graphics
> pipeline is fully described by the OpenGL reference (software) implementation
> and specification, so the hardware value added is in the internals
> of the implementation, not in the interface (which is essentially
> a large part of the OpenGL pipeline).

Well, sounds good.  You're making the distinction between the interface
and the implementation of the O2, something that the people I've been
talking to so far haven't done.  As a software guy, all I need is the
interface and maybe a peek into the internals for a better understanding
of the interface.

