
Re: help offered

To: <>
Subject: Re: help offered
From: "Greg Chesson" <>
Date: Wed, 25 Nov 1998 13:57:05 -0800
Cc: Ariel Faigon <>, Olivier Galibert <>,
In-reply-to: <> "Re: help offered" (Nov 25, 4:38pm)
References: <>
On Nov 25,  4:38pm, <> wrote:
> Subject: Re: help offered
> On Wed, 25 Nov 1998, Greg Chesson wrote:
> > But the memory subsystem is ccNUMA.  That means any channel in the system
> > can read/write any memory in the system.  With io buffers that comprise
> > multiple pages, and with the pages of the buffer located on several
> > memory controllers, multiple io channels can burst (in parallel) to the
> > "array" of pages that comprise the buffer.
>     Except some of this has to go through the CrayLink.  The memory you

There are 8 IO links in the example I gave, plus numerous CrayLinks -
I think it's 16 for the example.  The bandwidth definitely does not go
down a single link.
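The parallel-burst idea above can be put into a toy model.  This is not SGI code; the per-link and per-controller figures are assumptions loosely based on the numbers quoted elsewhere in this thread, and the point is only that aggregate bandwidth is limited by whichever resource saturates first:

```python
# Toy model of parallel DMA bursts in a ccNUMA machine: a multi-page I/O
# buffer is striped across several memory controllers, so several I/O
# links can burst into it concurrently.  All numbers are illustrative.

PER_LINK_BW_MB = 200    # assumed sustained bandwidth of one I/O link
PER_CTRL_BW_MB = 500    # assumed sustained bandwidth of one memory controller

def aggregate_burst_bw(n_links, n_controllers):
    """Aggregate MB/s when bursts proceed in parallel: bounded by
    whichever side (links or controllers) saturates first."""
    return min(n_links * PER_LINK_BW_MB, n_controllers * PER_CTRL_BW_MB)

# Eight links bursting into a buffer spread over eight controllers:
print(aggregate_burst_bw(8, 8))   # 1600 MB/s, link-limited in this model
```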

> are "bursting" to is not on the same node.  Therefore, if you have a
> dual-threaded application that runs over the data, at most the max
> bandwidth is 1.6GB/s (seeing as it's advantageous to spread your code to

The application in the example is single-threaded.
Lots of people just want a big pipe and a single file descriptor.

> two nodes and split the memory between them).  If your application can make
> use of all processors on that box, then you get the full bandwidth.  The

A single-threaded app can easily malloc pages from all the processor slots
on the box.  You can't do that in a cluster or a shared-nothing machine.
You can think of processor slots as just extra memory controllers.
For some applications we ship with "sparse" processor nodes for just
this purpose.
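The page-placement idea can be sketched abstractly.  This round-robin assignment is a simplification (real placement policies are richer), but it shows how a single-threaded allocation can draw an equal share of its pages from every node's memory controller:

```python
# Sketch of round-robin page placement: one thread's buffer is spread
# across all nodes, so "sparse" processor slots contribute their memory
# controllers even when their CPUs are idle.  Purely illustrative.

def place_pages(n_pages, n_nodes):
    """Assign each page of a buffer to a node, round-robin."""
    return [page % n_nodes for page in range(n_pages)]

placement = place_pages(16, 4)
# Every node ends up holding an equal share of the buffer:
counts = {node: placement.count(node) for node in range(4)}
print(counts)   # {0: 4, 1: 4, 2: 4, 3: 4}
```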

> most any single processor in that Origin can handle is 800MB/s and if it
> needs to get that data, eventually that data is shoveled through the
> CrayLink (and hopefully it gets migrated there).  Is there anything flawed
> with this reasoning?

The single processor limit is set by the memory controller bandwidth.
It can peak at over 600 MB/s, but 500 MB/s is a good number for sustained
random access ops.
>     I don't see why it cannot be done.  The page-cache/file system buffer
> cache are supposed to be merged.  If you mmap that data, you should just
> get a pte pointing to that area in the page cache.
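The merged page-cache behavior described in that quote can be demonstrated with a modern POSIX mmap; this is a present-day Python illustration, not IRIX code.  Writes through a file descriptor and loads/stores through a mapping of the same file both go through the page cache, so each view sees the other's changes:

```python
# Minimal demo: data written via write() is visible through an mmap of
# the same file, and stores through the mapping are visible via read(),
# because both paths share the same page-cache pages.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello page cache")   # 16 bytes via the fd
    m = mmap.mmap(fd, 16)               # shared mapping of those pages
    via_map = bytes(m[:5])              # read through the mapping
    m[:5] = b"HELLO"                    # store through the mapping...
    os.lseek(fd, 0, os.SEEK_SET)
    via_read = os.read(fd, 5)           # ...and it shows up via read()
    m.close()
finally:
    os.close(fd)
    os.unlink(path)

print(via_map, via_read)   # b'hello' b'HELLO'
```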


>     But that bandwidth isn't single node bandwidth.  No single node can do
> 4GB/s.  All nodes need to use their local memory to achieve max bandwidth.

We do make systems where single-node bandwidth is many gigabytes per second.
They're called vector supercomputers.

The max amount of memory on a motherboard is 4 GBytes, I think.
In order to get a bigger memory, more processors must be added.
Do you want to criticize that, too?

The "beauty" of the ccNUMA memory architecture is that, by using off-the-shelf
memory circuits, both bandwidth and capacity can be aggregated in a modular
way and still be mapped into a coherent virtual address space.
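That modular aggregation is simple arithmetic.  A back-of-the-envelope model, using the 4 GB-per-motherboard figure from above and an assumed 500 MB/s sustained per-controller bandwidth:

```python
# Each added node contributes both memory capacity and memory-controller
# bandwidth, while the whole stays one coherent address space.  The
# per-node figures are assumptions drawn from numbers in this thread.

NODE_CAPACITY_GB = 4   # max memory per motherboard (per the thread)
NODE_BW_MB = 500       # assumed sustained per-controller bandwidth

def system_totals(n_nodes):
    """Aggregate (capacity in GB, bandwidth in MB/s) for n nodes."""
    return NODE_CAPACITY_GB * n_nodes, NODE_BW_MB * n_nodes

cap, bw = system_totals(8)
print(cap, bw)   # 32 GB of memory, 4000 MB/s aggregate bandwidth
```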


Greg Chesson
