On Tue, 22 Jan 2002, Tommy S. Christensen wrote:
> Well, why not use the stack?
> I am not quite familiar with the requirements on this "thread register",
> but couldn't something like this be made to work:
> #define TID *((sp & ~(STACK_SIZE-1)) + STACK_SIZE - TID_OFFSET)
Last time I looked at how pthreads worked, it did use the stack pointer to
decide what the TID is. It got rather ugly because the stack on thread 0
was not under program control, so it had all sorts of unknown properties.
But that could be fixed with kernel support, I think.
The only reason I can think of to have a *fast* thread-local variable is
to implement thread-local storage. This is a good thing for glibc and
multi-threaded programs. The ultimate implementation would probably be to
have gcc know about it (if ia64 has dedicated hardware, it is not
unimaginable, and other compilers do implement this):
extern int errno __attribute__((thread_local));
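The attribute spelling above is only a guess at the syntax. As a rough
illustration of what compiler-supported thread-local storage buys you, here
is a minimal sketch that assumes a GCC-compatible compiler with the __thread
storage class and POSIX threads. The names my_errno, worker and errno_demo
are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One copy per thread; __thread is a GCC extension (assumed available). */
static __thread int my_errno;

/* Runs in the second thread: writes its own copy and reports what it saw. */
static void *worker(void *arg)
{
    int *seen = arg;
    my_errno = 42;
    *seen = my_errno;
    return NULL;
}

/* Returns 1 if the two threads really had separate copies of my_errno. */
static int errno_demo(void)
{
    pthread_t t;
    int seen_in_worker = 0;
    int rc;

    my_errno = 7;                       /* main thread's copy */
    rc = pthread_create(&t, NULL, worker, &seen_in_worker);
    assert(rc == 0);
    pthread_join(t, NULL);

    /* The worker's store must not have touched the main thread's copy. */
    return my_errno == 7 && seen_in_worker == 42;
}
```

How the compiler finds each thread's copy (a dedicated register, a segment
register, or something derived from the stack pointer) is exactly the part
that is up for discussion here.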
On i386 this has often been done using fs/gs to point to a block of RAM.
However, I expect you could probably also base the thread-local RAM on the
top/bottom of the stack, which means each procedure can compute the
(constant!) base in a couple of instructions. The runtime can know how
much to set aside before it begins executing the new thread. Aligning SP
can be done in a kernel-independent way for tid 0.
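The stack-based scheme can be sketched roughly as follows. This is only an
illustration under assumptions of mine: STACK_SIZE, TLS_SIZE, struct
tls_block and the helper names are all invented, stacks are a power-of-two
size and aligned to that size, and the thread-local block sits at the top of
each stack. The "stacks" here are plain heap allocations, so nothing
actually runs on them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed values, not taken from any real ABI. */
#define STACK_SIZE (64u * 1024u)  /* must be a power of two */
#define TLS_SIZE   256u           /* bytes reserved at the top of each stack */

struct tls_block {
    int tid;
    int err;                      /* stand-in for a per-thread errno */
};

/* Map any address inside a thread's stack to that stack's TLS block:
 * one mask and one add of a constant. */
static struct tls_block *tls_from_sp(uintptr_t sp)
{
    uintptr_t base = sp & ~((uintptr_t)STACK_SIZE - 1);  /* bottom of stack */
    return (struct tls_block *)(base + STACK_SIZE - TLS_SIZE);
}

/* What the runtime would do before starting a thread: allocate an aligned
 * stack and fill in the TLS block. On a downward-growing stack the initial
 * SP would then be set just below the block. */
static void *stack_create(int tid)
{
    void *stack = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (stack != NULL) {
        struct tls_block *tls = tls_from_sp((uintptr_t)stack);
        tls->tid = tid;
        tls->err = 0;
    }
    return stack;
}

/* Returns 1 if arbitrary "stack pointer" values inside each stack resolve
 * to the TLS block belonging to that stack. */
static int tls_demo(void)
{
    unsigned char *s0 = stack_create(0);
    unsigned char *s1 = stack_create(1);
    int a, b;

    assert(s0 != NULL && s1 != NULL);
    a = tls_from_sp((uintptr_t)(s0 + 100))->tid;
    b = tls_from_sp((uintptr_t)(s1 + STACK_SIZE - TLS_SIZE - 8))->tid;
    free(s0);
    free(s1);
    return a == 0 && b == 1;
}
```

The catch is the one already mentioned for thread 0: the mask only works if
every stack, including the initial one, is aligned to STACK_SIZE.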
I don't know if this is worse than making the TLB handler slower to free
up k0/k1; it depends entirely on how many functions will be using thread