On Tue, 1 Jun 2004, Kevin D. Kissell wrote:
> > > Now that gcc 3.4 has incompatible ABI changes (on o32 mostly affecting
> > > mipsel) I've been discussing with Thiemo if I'd be the right point to
> > > take this ABI change as a possibility to additionally reserve a TLS
> > > register.
> > > He suggested $24 (t8) another discussed possibility would be $27 (k1)
> > > which is already abused by the PS/2 folks for ll/sc emulation.
> > > Another possibility would be to reserve such a register only in the
> > > n32/n64 ABIs and let o32 stay without __thread and TLS forever.
For Linux the n32/n64 ABIs can be considered to be in the initial stage
of deployment, so backwards compatibility is a non-issue. Whatever is
found to be the best solution may be accepted. So the problem of defining
a TLS pointer exists for the o32 ABI only and, given the existence of the
MIPS32 ISA and its implementations, ignoring the issue won't only affect
ancient (but still alive) hardware.
> > Sigh, we've been through this really often enough. Reserving a register
> > comes at a price so my approach was to implement a fast path in the
> > exception code. I've benchmarked that long time ago; it had less than
> > half the overhead than normal syscall and such a function would be subject
> > to normal code optimizations so calls should be few only. Alpha already
> > does something similar using their PAL code.
It seems a reasonable balance, IMO.
> The overhead relative to a normal syscall is much less interesting
> to measure than the overhead relative to having the pointer already
> in a register - after all, half of a whole lot of instructions is still a
> lot of instructions.
The interesting factor is how much software really needs threading.
AFAIK, the majority does not -- I can count the threaded software I know
of (but do not necessarily use) on the fingers of one hand. That does not
mean there are no niches that make use of that approach extensively --
they could see a benefit, but why penalize the rest?
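
For reference, here is what is at stake at the source level. This is only
a minimal sketch, assuming GCC's __thread extension and POSIX threads; the
names tls_counter, worker and run_two_workers are made up for illustration
and come from no existing code. Every access to tls_counter has to locate
the calling thread's private copy first, and how cheaply that can be done
(reserved register vs. a fast trap) is what this thread is about.

```c
#include <assert.h>
#include <pthread.h>

/* One private copy of this variable exists per thread (GCC __thread). */
static __thread int tls_counter = 0;

/* Hypothetical argument block for the illustration below. */
struct worker_arg {
    int increments;   /* how many times the thread bumps its own copy */
    int result;       /* the thread's final private value */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    int i;

    for (i = 0; i < arg->increments; i++)
        tls_counter++;        /* access goes through the thread pointer */
    arg->result = tls_counter;
    return NULL;
}

/*
 * Hypothetical helper: run two workers and return 1 if each one saw only
 * its own increments and the caller's copy was left untouched.
 */
int run_two_workers(int n1, int n2)
{
    struct worker_arg a = { n1, -1 };
    struct worker_arg b = { n2, -1 };
    pthread_t ta, tb;
    int rc;

    rc = pthread_create(&ta, NULL, worker, &a);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, worker, &b);
    assert(rc == 0);
    rc = pthread_join(ta, NULL);
    assert(rc == 0);
    rc = pthread_join(tb, NULL);
    assert(rc == 0);

    return a.result == n1 && b.result == n2 && tls_counter == 0;
}
```

Built with -pthread, a call such as run_two_workers(3, 5) should return 1,
since neither worker sees the other's increments or the caller's copy.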
> As some, but perhaps not all of you know, MIPS is working on
> multithreaded extensions to the instruction set architecture and
> the hardware, which include the ability to create and destroy
> parallel threads of execution without OS intervention in the
> "expected" case (and yes, I have thought about how Linux
> could support this, but I'm not gonna go into that here).
> In such a framework, it would not be acceptable to do a
> system call to get a TLS value.
Well, this is exactly a good argument against having a TLS pointer in a
general-purpose register. If someone needs the fastest threading
possible, then let them use the right hardware (with the threading ASE)
or accept the inefficiency of hardware that predates the concept of
threading.
> I don't yet have an opinion as to whether we need to retrofit
> things so that user-level multithreading is compatible with o32,
> but I would comment that if we go for a TLS register, k1 may
> not be a very good option. The LL/SC emulation trick with k1
> works by virtue of k1 being *destroyed* by exceptions - it doesn't
> change its status as a register reserved for kernel use.
Actually, the trick has never found its way into the mainline, and perhaps
it's best to keep it out, as messing with the k registers is inherently
fragile. Of course, this applies to the possibility of using one of them
for TLS, too. ;-)
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+ e-mail: email@example.com, PGP key available +