PIC code

From LinuxMIPS
Revision as of 16:11, 19 January 2008 by Skim (Talk | contribs)


A quick PIC howto

All userspace code in Linux is PIC code. It is currently not possible to mix non-PIC object files and PIC object files when linking. Therefore you need to generate PIC objects.

Any PIC function is invoked with its own address in register $t9, that is, register $25. Using the address in $t9 the callee can compute the address of the GOT (global offset table). In assembler this looks like:

function:
        .set    noreorder
        .cpload $25
        .set    reorder

.cpload computes the address of the GOT. The assembler requires .cpload to be used in a noreorder section, which is what the two .set pseudo-ops are for. Also note that .cpload is only needed for functions which reference global data or addresses. That is, the assembler equivalent of

int add(int a, int b)
{
        return a + b;
}

doesn't need to do a .cpload but

int var;

int *get_ptr(void)
{
        return &var;
}

would need to do a .cpload.
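The two cases above can be put into one compilable file, as a minimal sketch. The third function, say_hello, and the initial value of var are assumptions added for illustration; they show that calling an external function also goes through the GOT. On a MIPS o32 toolchain, inspecting the output of gcc -S should show GOT setup in get_ptr and say_hello but not in add.

```c
#include <stdio.h>

int var = 5;    /* global data, reached through the GOT in PIC code */

/* Pure register arithmetic: no global data or addresses are referenced,
 * so this function does not need $gp. */
int add(int a, int b)
{
    return a + b;
}

/* Takes the address of a global, so this function needs $gp. */
int *get_ptr(void)
{
    return &var;
}

/* Hypothetical extra case: the address of printf is also looked up
 * through the GOT, so this function needs $gp as well. */
int say_hello(void)
{
    return printf("Hello world\n");
}
```

Note that get_ptr returns int *, since it hands back the address of var rather than its value.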

Next there is the .cprestore operation. The global pointer lives in the $gp register (aka $28). Every PIC function sets up $gp for itself, so a function we call may leave a different value there, and $gp has to be reloaded after each call before it is used again. The assembler does that automatically when .cprestore is used: it stores $gp in the stack frame and reloads it from there after every jal. .cprestore takes one argument, the offset into the stack frame where $gp is stored. Putting all this together we get:

foo:    .set    noreorder
        .cpload $25
        .set    reorder

        subu    $29, $29, 32
        .cprestore 16

But if you assemble this, the assembler will complain about a missing .frame operator, and about missing .ent and .end. Once we provide those we get:

foo:    .ent    foo
        .frame  $29, 32, $31
        .set    noreorder
        .cpload $25
        .set    reorder

        subu    $29, $29, 32
        .cprestore 16
...
        .end    foo

The .ent and .end directives only mark the beginning and end of a function's code. The arguments of .frame are, in order: the frame pointer, which usually is just the stack pointer $29; the size of the stack frame, here 32; and the return address register, $31.

So let's complete this into a full hello world program:

        .globl  main
main:   .ent    main
        .frame  $29, 32, $31
        .set    noreorder
        .cpload $25
        .set    reorder

        subu    $29, $29, 32
        .cprestore 16
        la      $4, hello
        jal     printf

        addiu   $29, $29, 32
        jr      $31
        .end    main

        .data
hello:  .asciz  "Hello world\n"

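Note that the assembler code itself hasn't changed much; we only had to throw in a handful of pseudo-ops. The -KPIC option also needs to be passed to the assembler, but gcc already does that by default. Be aware that the example is kept minimal: jal overwrites $31, so a function that really has to return to its caller must save $31 in its stack frame before the call and reload it before the final jr. The big difference to non-PIC code becomes visible when disassembling with objdump -d --reloc: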

00000000 <main>:
   0:   3c1c0000        lui     gp,0x0
                        0: R_MIPS_HI16  _gp_disp
   4:   279c0000        addiu   gp,gp,0
                        4: R_MIPS_LO16  _gp_disp
   8:   0399e021        addu    gp,gp,t9
   c:   27bdffe0        addiu   sp,sp,-32
  10:   afbc0010        sw      gp,16(sp)
  14:   8f840000        lw      a0,0(gp)
                        14: R_MIPS_GOT16        .data
  18:   00000000        nop
  1c:   24840000        addiu   a0,a0,0
                        1c: R_MIPS_LO16 .data
  20:   8f990000        lw      t9,0(gp)
                        20: R_MIPS_CALL16       printf
  24:   00000000        nop
  28:   0320f809        jalr    t9
  2c:   00000000        nop
  30:   8fbc0010        lw      gp,16(sp)
  34:   03e00008        jr      ra
  38:   27bd0020        addiu   sp,sp,32
  3c:   00000000        nop

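As you can see, la and jal are macro instructions, and they have been expanded into machine instructions in a rather different way than for non-PIC code. la loads an address from the GOT through $gp (R_MIPS_GOT16) and then adds the low part (R_MIPS_LO16). jal loads the target address from the GOT into $t9 (R_MIPS_CALL16), calls it with jalr, and afterwards reloads $gp from the stack slot given to .cprestore.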
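The lw/addiu pair that la expands to in the listing above can be modelled with a little arithmetic. This is only a sketch, assuming the usual o32 convention that for a local symbol the GOT slot holds the address rounded to a 64 KiB boundary and the addiu adds the sign-extended low 16 bits. The helper names got_page, lo16 and la_result are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: the value the GOT slot is assumed to hold for a
 * local symbol, i.e. the address rounded to the nearest 64 KiB boundary. */
uint32_t got_page(uint32_t addr)
{
    return (addr + 0x8000u) & 0xffff0000u;
}

/* Hypothetical helper: the sign-extended low 16 bits that addiu adds. */
int32_t lo16(uint32_t addr)
{
    int32_t lo = (int32_t)(addr & 0xffffu);
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* Models "lw a0, slot(gp)" followed by "addiu a0, a0, lo16(sym)". */
uint32_t la_result(uint32_t addr)
{
    return got_page(addr) + (uint32_t)lo16(addr);
}
```

The rounding in got_page is what makes the sum come out right when the low half has its top bit set and therefore sign-extends to a negative number.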

If you wonder why the overhead: PIC code can easily be relocated without modifying, and therefore without copying, the code itself. Only the GOT has to be set up for each load address. That can save huge amounts of memory compared to the non-PIC code model when a binary is loaded multiple times.
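The idea can be illustrated with a toy model in C. This is not real loader code: struct toy_got, toy_print and toy_main are hypothetical names, and the "GOT" is just a small table of addresses. The point is that toy_main contains no absolute address, so the same code can serve several tables.

```c
#include <stdio.h>

typedef int (*print_fn)(const char *);

/* Toy stand-in for a GOT: one slot per address the code needs. */
struct toy_got {
    const char *msg;    /* slot for the address of the string */
    print_fn    print;  /* slot for the address of the callee */
};

/* Stand-in for printf; returns the number of characters written. */
int toy_print(const char *s)
{
    return printf("%s", s);
}

/* Every address is fetched through gp, never embedded in the code. */
int toy_main(const struct toy_got *gp)
{
    const char *a0 = gp->msg;    /* like: lw a0, slot(gp) */
    print_fn    t9 = gp->print;  /* like: lw t9, slot(gp) */
    return t9(a0);               /* like: jalr t9 */
}
```

Calling toy_main with two different tables runs the same code against different addresses, which is roughly what happens when one PIC binary is mapped into several processes.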

See also

Though SGI has migrated away from MIPS processors for their systems, their IRIX 5 documentation is a wealth of information on MIPS programming. Note that IRIX used SGI's proprietary toolchain, so not everything in these documents is directly applicable to Linux/MIPS.