I'm working on linuxce ( http://www.linuxce.org/ ) for the NEC Vr41xx MIPS
processors ( http://www.ltc.com/linux-mips/ ). We're targeting a range of
devices, from 32M subnotebooks with large secondary storage (128M
CompactFlash, 340M IBM Microdrive) to Pilot-sized devices with maybe 8M of
RAM and no file store besides ramdisk. We have working systems with PPP,
NFS, framebuffer support, and even some Microwindows demos.

Code size is pretty important for this project. If you don't think this
constraint is real or interesting, you can stop reading now. :-)

Some of the Vr41xx chips support the MIPS16 16-bit instruction encoding,
which promises 30-40% increases in code density. I spent a long time
experimenting with the egcs support for MIPS16, and that range appears to be
accurate. But both gcc (mips.md, mips.c) and binutils (GOT relocations, new
assembler expressions) would have to be significantly modified to support
MIPS ABI position-independent code for MIPS16. Also,

I decided to back up a little. What would it take to get
non-position-independent code working? Absolute objects (built with
gcc -mno-abicalls) seemed like a prereq for reusing any existing MIPS16
support. Spending a lot of time on MIPS16 wouldn't be useful for people
with devices with older Vr41xx CPUs, or any of the Philips 39xx devices (in
the Philips Nino handhelds). And -mno-abicalls code is significantly
smaller. For example, busybox, a single executable that acts as
ls/cat/more/mknod/etc depending on how it's invoked, went from 181k with
abicalls to 103k with no-abicalls. glibc 2.0.7 libc.a (with appropriate ld
invocations to throw out relocations) went from ~940k to ~560k.
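
As a quick check on those figures, here is a small Python sketch that
computes the relative savings. The sizes are the ones quoted above (the
libc.a numbers are approximate); reduction_percent is just an illustrative
helper name, not part of any tool.

```python
def reduction_percent(before_kb, after_kb):
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (before_kb - after_kb) / before_kb


# (abicalls size, no-abicalls size) in kilobytes, as quoted in the text.
sizes = {
    "busybox": (181, 103),
    "glibc 2.0.7 libc.a": (940, 560),
}

for name, (abicalls_kb, no_abicalls_kb) in sizes.items():
    pct = reduction_percent(abicalls_kb, no_abicalls_kb)
    print(f"{name}: {abicalls_kb}k -> {no_abicalls_kb}k ({pct:.0f}% smaller)")
```

Both cases come out a little over 40% smaller.
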
It would be a nice starting point if no-abicalls executables could use
standard ABI shared libraries. This looked like it required a new libgcc,
but that was easy to do with multilib support. Some executables started
working. I found this hard to believe until I noticed that the PLT stubs
generated by ld in the main program that jumped to the shared libc managed
to set up some registers properly. In particular, they always jumped
through the $t9 register, which PIC code uses in the function prologue to get
access to the function's GOT.

dhrystone was especially interesting because performance dropped by a factor
of four. I hypothesized that this could have been the result of dynamic
symbols in the main program's GOT not being properly updated, so that each
call resolved the symbol from scratch. I hacked up ld
to generate PLT stubs that forcibly loaded the gp register with the main
program's GOT. This didn't seem to fix the performance problem, and I
stopped poking at that.
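
To make that hypothesis concrete, here is a toy Python model of lazy
binding. It is not real dynamic linker code and it does not show the
confirmed cause of the slowdown (the ld hack did not fix it); it only
illustrates why a GOT entry that never gets updated would make every call
take the slow path. LazyGot, update_got, and count_resolves are all
hypothetical names invented for this sketch.

```python
class LazyGot:
    """Toy model of a GOT with lazy binding (not real dynamic linker code)."""

    def __init__(self, library, update_got=True):
        self.library = library        # name -> callable, stands in for the shared libc
        self.update_got = update_got  # False models the suspected bug
        self.got = {}                 # name -> already-resolved callable
        self.resolver_calls = 0       # how often the slow path ran

    def _resolve(self, name):
        # Slow path: a full symbol lookup.
        self.resolver_calls += 1
        return self.library[name]

    def call(self, name, *args):
        target = self.got.get(name)
        if target is None:
            target = self._resolve(name)
            if self.update_got:
                self.got[name] = target  # later calls take the fast path
        return target(*args)


def count_resolves(update_got, n_calls=1000):
    """Call one symbol n_calls times and report how often it was resolved."""
    got = LazyGot({"strlen": len}, update_got=update_got)
    for _ in range(n_calls):
        got.call("strlen", "dhrystone")
    return got.resolver_calls


print(count_resolves(update_got=True))   # resolver runs only on the first call
print(count_resolves(update_got=False))  # resolver runs on every call
```
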
Applications that use stdio (besides stdin/out/err) were still segfaulting.
I imagine this could be the result of either a) pointers to data in the main
program not being resolved properly, or b) embedded pointers to static libc
functions that did not show up in the main program's PLT. I can't really
explore this without gdb, so I'm blocking on getting access to a big enough
mipsel machine to run gdb on.

So right now intercall between abicalls and no-abicalls code is a problem.
What about a totally no-abicalls, absolute object system? Dynamic linking
goes out the door, sure, but Linux survived without it for a few years on

Obviously I could drop glibc and get newlib and graft Linux system calls onto
it. That sounds like it would create endless porting problems for standard
Linux utilities. Better to use glibc in some form.

The obvious way to support an absolute glibc shared library is to link
libc.a at some fixed address and just mmap it into each executable at that
address. This is exactly the scheme libc 2 through early libc 4 used on
Linux x86. Besides DLL symbol export issues, the biggest losses with this are
that a) hunks of the address space have to be handed out to shared library
owners by a central authority, b) version bumps of the shared library are a
pain even when jump tables and data reordering tools are used to try to keep
symbols at the same address, which means that c) it's *hard* to build and
maintain shared libraries.
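
Points (a) and (b) can be sketched with a toy address registry in Python.
This is only an illustration of the bookkeeping a central authority would
have to do; AddressRegistry, its methods, and all of the addresses and sizes
below are hypothetical and not taken from any real libc 4 tooling.

```python
class AddressRegistry:
    """Toy central registry of fixed load addresses for shared libraries."""

    def __init__(self):
        self.ranges = []  # (start, end, name), with end exclusive

    def conflict(self, start, size, ignore=None):
        """Return the name of a library overlapping the range, or None."""
        end = start + size
        for other_start, other_end, name in self.ranges:
            if name == ignore:
                continue
            if start < other_end and other_start < end:
                return name
        return None

    def reserve(self, name, start, size):
        """Record a range for name, refusing any overlap with an earlier one."""
        clash = self.conflict(start, size)
        if clash is not None:
            raise ValueError(f"{name} overlaps {clash}")
        self.ranges.append((start, start + size, name))


registry = AddressRegistry()
# Hypothetical addresses and sizes, chosen only for illustration.
registry.reserve("libc", 0x60000000, 0x100000)
registry.reserve("libm", 0x60100000, 0x40000)

# A version bump that grows libc past its slot collides with its neighbour.
print(registry.conflict(0x60000000, 0x120000, ignore="libc"))
```

Every library owner has to go through something like reserve(), and a
library that outgrows its slot forces either a move (breaking every
executable linked against the old address) or a renegotiation with its
neighbours.
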
For Linux on these small devices it still might make sense. Many of the
applications and libraries used on a handheld are not going to be able to
share much code with the mainline desktop/server Linux apps. Versioning
isn't as much of an issue when you can crosscompile all of the apps and
libraries on an 8M initial ramdisk in an hour, and relink in a matter of
minutes. And the code size reduction is compelling---even if we end up not
sharing a few libraries because of the difficulty in doing so, we probably
still come out ahead because of that huge drop in code size.

I don't have an absolute shared glibc binary, although I do have an absolute
libc.a. I'm working on a few tools to automate the build. Do glibc people
have ideas on this? I've noticed that "is using the ELF object format" is
often conflated with "is using SVR4 ABI ELF shared libraries" in the source.

I'm also looking for help on alternate ABIs for MIPS that could help with
code size problems; perhaps the biggest problem is the caller/callee
save nature of the $gp register, but I'm hoping hardcore MIPS people
understand these issues much better than I do.