Re: mpp kernel interface

To: Andrew.Tridgell@anu.edu.au
Subject: Re: mpp kernel interface
From: alan@lxorguk.ukuu.org.uk (Alan Cox)
Date: Thu, 16 May 1996 18:19:16 +0100 (BST)
Cc: lm@gate1-neteng.engr.sgi.com, linux-mc@arvidsjaur.anu.edu.au, Linus.Torvalds@cs.Helsinki.FI, linux@neteng.engr.sgi.com, alan@cymru.net
In-reply-to: <199605161420.AAA24100@arvidsjaur.anu.edu.au> from "Andrew Tridgell" at May 17, 96 00:20:34 am
Sender: owner-linux@cthulhu.engr.sgi.com
> We use sockets to implement the stdin/stdout/stderr of parallel
> processes. The paralleld that launches parallel programs on each cell
> first creates 3 sockets back to the launching program, setting them up
> as file descriptors 0, 1 and 2. When it then does a fork()/exec() the
> parallel program inherits them.
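
The launch step quoted above can be sketched roughly as follows. This is a
minimal illustration, not paralleld's actual code: spawn_with_stdio() is a
hypothetical name, and the sketch assumes the three descriptors are already
connected and are all numbered above 2.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Fork a child whose stdin/stdout/stderr are the three given descriptors
 * (sockets back to the launcher in the scheme described above), then exec
 * argv[0]. Returns the child's pid, or -1 if fork() fails.
 * Assumption: in_fd, out_fd and err_fd are all greater than 2. */
pid_t spawn_with_stdio(int in_fd, int out_fd, int err_fd, char *const argv[])
{
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or fork() failure (-1) */

    /* Child: install the descriptors as 0, 1 and 2. */
    if (dup2(in_fd, 0) < 0 || dup2(out_fd, 1) < 0 || dup2(err_fd, 2) < 0)
        _exit(126);

    /* The originals are no longer needed once they have been duplicated. */
    close(in_fd);
    if (out_fd != in_fd)
        close(out_fd);
    if (err_fd != in_fd && err_fd != out_fd)
        close(err_fd);

    execvp(argv[0], argv);          /* the new program inherits 0, 1, 2 */
    _exit(127);                     /* only reached if exec failed */
}
```

The parent is expected to waitpid() on the returned pid itself.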

Two things strike me here. Firstly, if you are doing that kind of output
redirection across 192 cells you are going to need 192 logical connections
however you do it. Secondly, you really want your node-end library to be
a bit smarter and pass a tty check across the link, so you can use tty/pty
pairs if the real descriptor is a tty.
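
One way the launcher side of such a tty check might look is sketched below.
tty_mask() is a hypothetical helper, and the idea of packing the result into
a bitmask to send across the link is an assumption made for illustration;
allocating the pty on the node end is not shown.

```c
#include <assert.h>
#include <unistd.h>

/* Return a bitmask with bit i set when fds[i] refers to a terminal.
 * The launcher could send this mask to the node end, which would then
 * know which of the child's descriptors deserve a tty/pty pair rather
 * than a plain socket. */
unsigned tty_mask(const int fds[], int n)
{
    unsigned mask = 0;
    int i;

    assert(n >= 0 && n <= 16);
    for (i = 0; i < n; i++)
        if (isatty(fds[i]))         /* 0 for pipes, sockets, bad fds */
            mask |= 1u << i;
    return mask;
}
```

For the usual case the launcher would call it with fds = {0, 1, 2} before
setting up the connections.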

> parallel programs. If we had a remote fork() and/or remote exec()
> and also had a way for the file descriptors of remote forked processes
> to feed back into the parent cpu then it would be much better. 

MOSIX does this by trapping them at the VFS layer. Effectively each inode
and file handle has a host field and if the operation is remote you RPC.
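
The host-field idea can be illustrated with a toy dispatch routine. This is
only a sketch of the general pattern, not MOSIX code: struct rfile,
LOCAL_HOST, rfile_write() and the stub RPC hook are all hypothetical names,
and a real implementation would marshal the request over the interconnect
instead of calling a stub.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCAL_HOST 0                /* hypothetical id for "this node" */

struct rfile {
    int    host;                    /* node that really owns the file */
    char   buf[64];                 /* stand-in for the file contents */
    size_t len;
};

/* Hypothetical RPC hook type: forwards a write to the owning host. */
typedef long (*rpc_write_fn)(int host, struct rfile *f,
                             const char *data, size_t n);

static int rpc_calls;               /* counts forwarded operations */

/* Stand-in for the remote path; it only records that it was called. */
long stub_rpc_write(int host, struct rfile *f, const char *data, size_t n)
{
    (void)host; (void)f; (void)data;
    rpc_calls++;
    return (long)n;
}

static long local_write(struct rfile *f, const char *data, size_t n)
{
    if (n > sizeof f->buf - f->len)
        n = sizeof f->buf - f->len; /* truncate to the space left */
    memcpy(f->buf + f->len, data, n);
    f->len += n;
    return (long)n;
}

/* Check the host field: operate locally, or hand off to the RPC hook. */
long rfile_write(struct rfile *f, const char *data, size_t n,
                 rpc_write_fn rpc)
{
    assert(f != NULL);
    if (f->host == LOCAL_HOST)
        return local_write(f, data, n);
    assert(rpc != NULL);
    return rpc(f->host, f, data, n);
}
```

The caller never needs to know where the file lives; only the host field
decides which path is taken.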

> We'd probably also need to use a tree structure to feed the file
> descriptors (and paging for that matter) back up into the parent
> process. 1000 children all writing to one parent would not be pretty. 

It would be an interesting application of multicast groups to allow the parent
to roam as well. With 1000 children that's an even bigger scaling problem, and
for sending stuff to a large number of nodes (e.g. a loosely synchronized SIMD
job) it's going to be needed.
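
The tree-structured fan-in raised in the quoted text can be sketched with
simple index arithmetic. The numbering scheme here is an assumption for
illustration (cell 0 is the parent process, cells are laid out as an
implicit k-ary tree), and tree_parent() and hops_to_root() are hypothetical
names.

```c
#include <assert.h>

/* Cell that `cell` forwards its output to in an implicit k-ary tree
 * rooted at cell 0. Returns -1 for the root itself. No cell ever
 * receives from more than `fanout` children. */
int tree_parent(int cell, int fanout)
{
    assert(cell >= 0 && fanout > 0);
    return cell == 0 ? -1 : (cell - 1) / fanout;
}

/* Number of forwarding hops between `cell` and the root. */
int hops_to_root(int cell, int fanout)
{
    int hops = 0;

    while ((cell = tree_parent(cell, fanout)) >= 0)
        hops++;
    return hops;
}
```

With a fanout of 10, cell 999 reaches the root in 3 hops, so 1000 writers
cost the parent at most 10 direct connections at the price of a little
extra latency per hop.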

> The problem is really latency. Ethernet type systems have latencies
> which aren't much lower than the system clock tick interval. This
> means it often makes sense to do things in quite different ways to
> what we have to do.

Yes. The latency also means that attacking from two other angles is
interesting. Firstly, 10Mbit ethernet: latency is really no worse, we just
have to be more reluctant to bulk copy data. Secondly, combining it with
something like the TTL PAPERS device for the fast sync stuff (it's a parallel
port synchronization system that costs about $60 to build and has about a 3uS
overhead). Very limited, but it might solve some of our problems on
ethernet-linked boxes.

Alan
