P32 Linux ABI

From LinuxMIPS

This page describes the p32 Linux ABI for nanoMIPS. It is still subject to change (until it is upstream).

System Call Register ABI

The p32 system call ABI has a few notable differences from the o32/n32/n64 ABIs, and some similarities:

  • The return value is put in $4 (a0) instead of $2 (v0) to match the normal p32 calling conventions
  • Errors are encoded with negative return values rather than a positive error code in the return value and a 1 in a separate output register $7 (a3)
  • System call numbers use generic numbering (see below) unlike the other MIPS ABIs which each have custom numbering
  • Arguments are passed in $4-$9 (a0-a5) like n32/n64
  • 64-bit arguments are split and aligned on an even register like o32
  • Saved registers $16-$23 (s0-s7), $28-$31 (gp, sp, s8/fp, ra) are preserved
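
The effect of splitting 64-bit arguments across an aligned register pair can be sketched with a small slot counter. This is only an illustration: count_arg_slots, fits_in_arg_regs and the width arrays are hypothetical names, and the limit of six argument registers is taken from the Argument 1-6 rows of the comparison table.

```c
#include <stddef.h>

/* Argument registers available to a p32 system call (Argument 1-6 rows). */
#define P32_ARG_REGS 6

/*
 * Hypothetical helper: given the width in bits (32 or 64) of each argument,
 * return how many 32-bit register slots the argument list occupies when
 * every 64-bit value is split across an even/odd register pair.
 */
static size_t count_arg_slots(const int *widths, size_t nargs)
{
    size_t slot = 0;

    for (size_t i = 0; i < nargs; i++) {
        if (widths[i] == 64) {
            slot += slot & 1;   /* skip a slot so the pair starts on an even one */
            slot += 2;          /* two halves of the 64-bit value */
        } else {
            slot += 1;
        }
    }
    return slot;
}

/* Non-zero if the argument list fits in the argument registers. */
static int fits_in_arg_regs(const int *widths, size_t nargs)
{
    return count_arg_slots(widths, nargs) <= P32_ARG_REGS;
}

/* sync_file_range(fd, offset, nbytes, flags): the 64-bit offset forces a gap. */
static const int sync_file_range_widths[]  = { 32, 64, 64, 32 };
/* sync_file_range2(fd, flags, offset, nbytes): flags fills the gap. */
static const int sync_file_range2_widths[] = { 32, 32, 64, 64 };
```

With these inputs the first ordering needs seven slots and the second needs six, which is why the packed argument order matters on an ABI with only six argument registers.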
System Call Register ABI Comparison

Direction | o32 | n32 | n64 | p32 | p32 Comments
System call number in | $2 (v0) | $2 (v0) | $2 (v0) | $2 (t4) | o32/n32/n64 also use $2 (v0) for error code / return value
System call numbering | custom numbering | custom numbering | custom numbering | generic numbering (asm-generic/unistd.h) | See below
System call number range | 4000-4999 | 5000-5999 | 6000-6999 | 0 - __NR_syscalls |
Argument 1 in | $4 (a0) | $4 (a0) | $4 (a0) | $4 (a0) |
Argument 2 in | $5 (a1) | $5 (a1) | $5 (a1) | $5 (a1) |
Argument 3 in | $6 (a2) | $6 (a2) | $6 (a2) | $6 (a2) |
Argument 4 in | $7 (a3) | $7 (a3) | $7 (a3) | $7 (a3) | o32/n32/n64 also use $7 (a3) for error condition output
Argument 5 in | $29 (sp) + 16 | $8 (a4) | $8 (a4) | $8 (a4) |
Argument 6 in | $29 (sp) + 20 | $9 (a5) | $9 (a5) | $9 (a5) |
Argument 7 in | $29 (sp) + 24 | N/A | N/A | N/A | o32 uses this when split 64-bit values need aligning on an even argument (sync_file_range and fadvise64_64). p32 uses packed versions of these two syscalls to avoid needing a 7th argument register (see below).
Argument 8 in | $29 (sp) + 28 | N/A | N/A | N/A | o32 needs this for __NR_syscall for syscalls using the 7th arg
Instruction | SYSCALL | SYSCALL | SYSCALL | SYSCALL[32] only (not SYSCALL[16] or SYSCALL) | See the instruction encoding notes under this table
Error condition out | $7 (a3) != 0 | $7 (a3) != 0 | $7 (a3) != 0 | $4 (a0) >= (unsigned)-4095 | Slightly more standard and simplifies syscall restart. See the error condition note under this table
Error code out | $2 (v0) | $2 (v0) | $2 (v0) | -$4 (a0) | Changed to a0 to match p32 calling conventions
Return value out | $2 (v0) | $2 (v0) | $2 (v0) | $4 (a0) |
Clobbered | $1 (at), $3 (v1), $8-$15 (t0-t7), $24-$25 (t8-t9), [hi, lo] | $1 (at), $3 (v1), $10-$11 (a6-a7), $12-$15 (t4-t7), $24-$25 (t8-t9), [hi, lo] | $1 (at), $3 (v1), $10-$11 (a6-a7), $12-$15 (t4-t7), $24-$25 (t8-t9), [hi, lo] | $1 (at), $3 (t5), $10-$11 (a6-a7), $12-$15 (t0-t3), $24-$25 (t8-t9) | All other non clobbered/input/output registers are preserved

Instruction encoding notes (p32):

Supporting both encodings would add a little overhead, since the kernel would have to check which encoding was used (via BadInstr) to be able to step over the instruction correctly.

The code operand is implicitly 0, which can be encoded with either nanoMIPS encoding (SYSCALL[16] or SYSCALL[32]). However, there could be future nanoMIPS hardware which only supports the 32-bit subset of nanoMIPS instructions, i.e. naturally aligned with no 16-bit or 48-bit instructions.

A switch would cause the tools to emit only the 32-bit subset of instructions, so that SYSCALL is encoded with SYSCALL[32] (which may become the default encoding for SYSCALL on most assemblers anyway).

So for maximum compatibility we should allow (and for best performance require) the use of the SYSCALL[32] encoding.

If SYSCALL[16] were ever used, the PC may get incremented by 4 instead of 2, skipping the next instruction or landing mid-instruction.

There are currently 292 syscalls in the generic list (as of statx), though a few are arch specific placeholders and others will be unimplemented due to being superseded by newer calls, so the likely space cost in libc (2 extra bytes per SYSCALL[32] instruction) will be around half a kilobyte.

Compatibility wise (sharing code with previous MIPS ABIs), it's another difference to be abstracted in the assembly by using the SYSCALL32 mnemonic instead of SYSCALL. However, it is generally confined to C libraries, so it is acceptable.

Error condition note (p32):

The ptrace man page says to clear errno beforehand to tell whether an error has occurred. It appears userland clears errno for specific peek operations.
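
The two error conventions in the table above can be contrasted with a short C sketch. The helper names (p32_is_error, p32_syscall_result, o32_syscall_result) and the P32_MAX_ERRNO macro are hypothetical; only the register rules ($4 (a0) >= (unsigned)-4095 with the error code in -$4 for p32, $7 (a3) != 0 with the error code in $2 for the older ABIs) come from this page.

```c
#include <errno.h>

/* Largest error number encoded in a return value (the 4095 in the table). */
#define P32_MAX_ERRNO 4095UL

/* p32: raw a0 values in [-4095, -1] are errors, everything else is a result. */
static int p32_is_error(unsigned long a0)
{
    return a0 >= (unsigned long)-P32_MAX_ERRNO;
}

/* Hypothetical libc-side helper: turn the raw a0 value into the -1/errno pair. */
static long p32_syscall_result(unsigned long a0)
{
    if (p32_is_error(a0)) {
        errno = (int)-(long)a0;     /* error code is -$4 (a0) */
        return -1;
    }
    return (long)a0;
}

/* o32/n32/n64 for comparison: a3 flags the error, v0 holds a positive errno. */
static long o32_syscall_result(unsigned long v0, unsigned long a3)
{
    if (a3 != 0) {
        errno = (int)v0;
        return -1;
    }
    return (long)v0;
}
```

Note that the p32 check needs only the one output register, which is part of why it simplifies syscall restart handling.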

System Calls

p32 will use minimal generic system call numbering, as found in asm-generic/unistd.h.

Deprecated System Calls

Redundant system calls (which can be implemented using newer system calls) are deprecated and will not be implemented.

Deprecated system calls
Deprecated System Calls | Implement Using System Call | Comments
fork, vfork | __NR_clone |
open | __NR_openat | System calls which take file paths use a *at variant which allows the directory the path is relative to to be specified in an argument. Generally speaking you can implement syscall with syscallat by passing: dirfd = AT_FDCWD
mknod | __NR_mknodat | dirfd = AT_FDCWD
mkdir | __NR_mkdirat | dirfd = AT_FDCWD
readlink | __NR_readlinkat | dirfd = AT_FDCWD
symlink | __NR_symlinkat | newdirfd = AT_FDCWD
link | __NR_linkat | olddirfd = AT_FDCWD, newdirfd = AT_FDCWD, flags = 0
unlink | __NR_unlinkat | dirfd = AT_FDCWD, flags = 0
rmdir | __NR_unlinkat | dirfd = AT_FDCWD, flags = AT_REMOVEDIR
chmod | __NR_fchmodat | dirfd = AT_FDCWD, flags = 0. Note that fchmod still exists, and apparently can't be implemented using AT_EMPTY_PATH (yet)
access | __NR_faccessat |
chown | __NR_fchownat | dirfd = AT_FDCWD, flags = 0
lchown | __NR_fchownat | dirfd = AT_FDCWD, flags = AT_SYMLINK_NOFOLLOW
fchown | __NR_fchownat | dirfd = fd, pathname = "", flags = AT_EMPTY_PATH
utimes, futimesat | __NR_utimensat | These can be implemented using newer versions (in a transitive fashion to implement both using utimensat):

    Older system call | Newer system call | Notes
    utimes | futimesat | dirfd = AT_FDCWD. futimesat is superseded by utimensat, see below.
    futimesat | utimensat | utimensat supersedes futimesat with nanosecond precision and an extra flags argument. Convert times from struct timeval to struct timespec (nanosecond precision). flags = 0
rename, renameat | __NR_renameat2 | renameat2 supersedes renameat with an extra flags argument. Rename functions can be implemented using the newer rename functions (in a transitive fashion to implement them all using renameat2):

    Older system call | Newer system call | Notes
    rename | renameat | olddirfd = AT_FDCWD, newdirfd = AT_FDCWD
    renameat | renameat2 | flags = 0
stat, lstat, fstat (old, use struct __old_kernel_stat); newstat, newlstat, newfstat (newer, use struct stat); stat64, lstat64, fstat64 (32-bit expansions, use struct stat64); newfstatat (newest *at version, 64-bit, struct stat); fstatat64 (newest *at version, 32-bit, struct stat64) | __NR_statx (use struct statx) | See stat and statx man pages.

    Data may need copying between struct types if implementing one userland function using another syscall (e.g. to implement stat() with statx).

    Stat functions can be implemented using newer stat functions (in a transitive fashion to implement them all using statx):

    Older system call | Newer system call | Notes
    stat | newstat | Possibly convert the newer struct stat to the userland struct stat (but ideally libc would expose the newer struct stat to applications anyway rather than the older struct __old_kernel_stat, since the older system calls aren't exposed in the kernel API).
    lstat | newlstat | As for stat
    fstat | newfstat | As for stat

    Older system call | Newer system call | Notes
    newstat | newfstatat | dirfd = AT_FDCWD, flags = 0
    stat64 | fstatat64 | dirfd = AT_FDCWD, flags = 0
    newlstat | newfstatat | dirfd = AT_FDCWD, flags = AT_SYMLINK_NOFOLLOW
    lstat64 | fstatat64 | dirfd = AT_FDCWD, flags = AT_SYMLINK_NOFOLLOW
    newfstat | newfstatat | dirfd = fd, pathname = "", flags = AT_EMPTY_PATH (since v2.6.39)
    fstat64 | fstatat64 | dirfd = fd, pathname = "", flags = AT_EMPTY_PATH (since v2.6.39)

    Older system call | Newer system call | Notes
    newfstatat | statx | OR AT_STATX_SYNC_AS_STAT into flags, mask = STATX_BASIC_STATS. Convert struct statx to userland struct stat or struct stat64.
    fstatat64 | statx | As for newfstatat
sync_file_range | __NR_sync_file_range2 | 64-bit arguments. sync_file_range2 arguments are in a different order to pack more tightly on a 32-bit ABI
getrlimit, setrlimit | prlimit64 | Applied [PATCH 03/20] asm-generic: Drop getrlimit and setrlimit syscalls from default list

    Ideally without getrlimit or setrlimit system calls, libc would expose a struct rlimit matching struct rlimit64, and RLIM_INFINITY matching RLIM64_INFINITY. This makes the implementation of these functions simpler:

    Older system call | Newer system call | Notes
    getrlimit | prlimit64 | pid = 0, new_limit = NULL, old_limit = pointer to a struct rlimit64. Convert output struct rlimit64 to struct rlimit if necessary. Be careful to limit it to a maximum of RLIM_INFINITY if necessary (i.e. if rlim_t and RLIM_INFINITY are smaller than rlim64_t and RLIM64_INFINITY).
    setrlimit | prlimit64 | Convert input struct rlimit to struct rlimit64 if necessary. Be careful to special case RLIM_INFINITY so it converts to RLIM64_INFINITY if necessary (i.e. if rlim_t and RLIM_INFINITY are smaller than rlim64_t and RLIM64_INFINITY). pid = 0, new_limit = pointer to a filled out struct rlimit64, old_limit = NULL

    Note that p32 will use asm-generic/resource.h constants, since o32/n32/n64 ABIs reorder resource numbers 5 to 9 compared to asm-generic:

    Constant | o32/n32/n64 value | p32 (asm-generic) value
    RLIMIT_NOFILE | 5 | 7
    RLIMIT_AS | 6 | 9
    RLIMIT_RSS | 7 | 5
    RLIMIT_NPROC | 8 | 6
    RLIMIT_MEMLOCK | 9 | 8

    prlimit64 uses RLIM64_INFINITY to represent non-limited resources, so RLIM_INFINITY is unused by the p32 kernel API.
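
The two "be careful" cases in the getrlimit/setrlimit notes can be sketched as a pair of conversion helpers. This assumes a libc whose rlim_t is only 32 bits wide, which is exactly the situation the notes recommend avoiding; the type and macro names (rlim32_t, RLIM32_INFINITY, K_RLIM64_INFINITY) are hypothetical stand-ins so the sketch does not clash with the real sys/resource.h definitions.

```c
#include <stdint.h>

/* Hypothetical narrow userland limit type and its "infinity" value. */
typedef uint32_t rlim32_t;
#define RLIM32_INFINITY   ((rlim32_t)~0u)

/* Stand-in for the kernel's RLIM64_INFINITY (all bits set). */
#define K_RLIM64_INFINITY (~(uint64_t)0)

/*
 * getrlimit direction: after prlimit64(pid = 0, resource, new_limit = NULL,
 * old_limit), clamp each 64-bit value. Anything that does not fit,
 * including RLIM64_INFINITY itself, saturates to the narrow infinity.
 */
static rlim32_t rlim64_to_rlim32(uint64_t v)
{
    return v >= RLIM32_INFINITY ? RLIM32_INFINITY : (rlim32_t)v;
}

/*
 * setrlimit direction: before prlimit64(pid = 0, resource, new_limit,
 * old_limit = NULL), widen each value and special case infinity so that
 * "unlimited" stays unlimited.
 */
static uint64_t rlim32_to_rlim64(rlim32_t v)
{
    return v == RLIM32_INFINITY ? K_RLIM64_INFINITY : (uint64_t)v;
}
```

If libc instead exposes a 64-bit rlim_t with RLIM_INFINITY equal to RLIM64_INFINITY, both helpers reduce to plain copies, which is the simplification the notes argue for.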

Generic System Call Notes

Some p32 system call specifics:

Generic system call specifics
System Call | Implementation | Num Args (arg regs) | a0 | a1 | a2 | a3 | a4 | a5 | Comments
__NR_clone | | 5 | ulong clone_flags | ulong newsp | int *parent_tidptr | ulong tls | int *child_tidptr | N/A | Like o32/n32/n64, p32 clone uses CONFIG_CLONE_BACKWARDS argument ordering
__NR_pipe | sys_pipe2 | 2 | int pipefd[2] | int flags | N/A | N/A | N/A | N/A | o32/n32/n64 uses a custom calling convention, returning the two pipes in v0 and v1 registers. p32 will use standard calling convention, writing the file descriptors to user memory via pipefd pointer
__NR_mmap | sys_mmap2_4koff | 6 | ulong addr | ulong len | ulong prot | ulong flags | ulong fd | ulong pgoff | Like o32/n32/n64, p32 pgoff is in units of 4096 bytes, regardless of current page size
__NR_sync_file_range2 | sys_sync_file_range2 | 4 (6) | int fd | uint flags | loff_t offset | (offset, cont.) | loff_t nbytes | (nbytes, cont.) | Unlike sync_file_range, flags before offset to pack more tightly
__NR_fadvise64_64 | sys_fadvise64_64_2 | 4 (6) | int fd | int advice | loff_t offset | (offset, cont.) | loff_t len | (len, cont.) | Unlike standard fadvise64_64, advice before offset to pack more tightly
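
Because pgoff is counted in 4096-byte units regardless of the page size, a libc mmap() wrapper has to convert the byte offset before making the call. A minimal sketch of that conversion follows; mmap_offset_to_pgoff and MMAP2_UNIT_SHIFT are hypothetical names, and the choice of error codes is an assumption rather than something this page specifies.

```c
#include <errno.h>
#include <stdint.h>

#define MMAP2_UNIT_SHIFT 12   /* pgoff counts units of 4096 bytes */

/*
 * Hypothetical helper: convert a 64-bit byte offset into the pgoff argument.
 * Returns 0 on success or a negative error number.
 */
static int mmap_offset_to_pgoff(uint64_t offset, uint32_t *pgoff)
{
    if (offset & ((UINT64_C(1) << MMAP2_UNIT_SHIFT) - 1))
        return -EINVAL;       /* offset is not a multiple of 4096 */
    if ((offset >> MMAP2_UNIT_SHIFT) > UINT32_MAX)
        return -EOVERFLOW;    /* pgoff would not fit in a 32-bit register */

    *pgoff = (uint32_t)(offset >> MMAP2_UNIT_SHIFT);
    return 0;
}
```

With a 32-bit pgoff this scheme can address file offsets just short of 2^44 bytes (16 TiB), while still needing only one register for the offset.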

MIPS Specific System Calls

MIPS system call specifics:

MIPS system call specifics
System Call | Num Args | a0 | a1 | a2 | p32 | Comments
__NR_set_thread_area | 1 | ulong addr | N/A | N/A | preserved | Still going to have and use UserLocal register, so it makes sense to keep this.
__NR_sysmips | 3 | long cmd | long arg1 | long arg2 | removed | All sub-operations can be safely removed, see below
    (sub-operation) | 3 | MIPS_ATOMIC_SET | addr | new | | Use LL/SC instructions
    (sub-operation) | 2 | MIPS_FIXADE | flags | N/A | | MIPSr6 allows unaligned accesses so this is redundant
    (sub-operation) | 1 | FLUSH_CACHE | N/A | N/A | | SYNCI globalisation is architecturally required
__NR_cacheflush | 3 | ulong addr | ulong bytes | uint cache | removed | Use SYNCI instruction
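
From portable C, the removed operations would normally be reached through the compiler rather than written as LL/SC or SYNCI by hand. The sketch below assumes a C11 compiler with the GCC/Clang __builtin___clear_cache builtin; that a given nanoMIPS toolchain lowers these to LL/SC and SYNCI sequences is an assumption, not something this page states, and the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/*
 * Stand-in for sysmips(MIPS_ATOMIC_SET, addr, new): atomically store a new
 * value and return the previous one.
 */
static int atomic_set(atomic_int *addr, int new_value)
{
    return atomic_exchange(addr, new_value);
}

/*
 * Stand-in for cacheflush(addr, bytes, cache): make freshly written
 * instructions in [addr, addr + bytes) visible to instruction fetch.
 */
static void flush_icache(void *addr, size_t bytes)
{
    __builtin___clear_cache((char *)addr, (char *)addr + bytes);
}
```

On targets with coherent instruction caches the builtin may compile to nothing, so the same source stays usable across architectures.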

Error Numbers

The MIPS o32/n32/n64 ABIs use completely different error codes (defined in asm/errno.h) from the normal Linux error codes (e.g. asm-generic/errno.h), I think for IRIX compatibility.

p32 switches to using the generic error numbers to match other architectures.

Signal Numbers

p32 switches to generic signal numbers. The MIPS o32/n32/n64 ABIs use their own numbering, including a non-standard SIGEMT, with their own version of struct sigaction.

Using asm-generic/signal.h on p32 has the following effects:

  • Use generic signal numbers, which differ from o32/n32/n64 (presumably for IRIX compatibility), and don't have a SIGEMT (which I think is EMulation Trap).
  • The number of signals is 64 (signals 1..64, including 32 real time signals), unlike o32/n32/n64, which had 128 signals (1..128) and this caused some difficulties. This affects the size of sigset_t, and system calls like sigaction, which must pass a sigset_t size of 8 on p32 instead of 16.
  • sigaction constants changed to match asm-generic:
    • SA_NOCLDWAIT = 0x2 on p32 rather than 0x10000 on o32/n32/n64.
    • SA_SIGINFO = 0x4 on p32 rather than 0x8 on o32/n32/n64.
  • sigprocmask constants changed to match asm-generic:
    • SIG_BLOCK = 0 on p32 rather than 1 on o32/n32/n64.
    • SIG_UNBLOCK = 1 on p32 rather than 2 on o32/n32/n64.
    • SIG_SETMASK = 2 on p32 rather than 3 on o32/n32/n64.
  • struct sigaction has handler before flags (and obviously different sized mask)
  • struct sigaltstack has flags before size (no need for IRIX compatibility)
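
The numeric differences listed above can be collected in a small C sketch, for example for code that previously hard-coded the o32 values. The P32_/O32_ prefixed names and the two helpers are hypothetical; the values are the ones quoted in the list.

```c
#include <stddef.h>

/* Signal counts and sigprocmask "how" values quoted in the list above. */
enum { P32_NSIG = 64, O32_NSIG = 128 };
enum { P32_SIG_BLOCK = 0, P32_SIG_UNBLOCK = 1, P32_SIG_SETMASK = 2 };
enum { O32_SIG_BLOCK = 1, O32_SIG_UNBLOCK = 2, O32_SIG_SETMASK = 3 };

/* Size in bytes of the kernel signal mask: one bit per signal. */
static size_t kernel_sigset_size(unsigned nsig)
{
    return nsig / 8;
}

/* Map an o32 "how" value to its p32 equivalent, or -1 if it is not valid. */
static int how_o32_to_p32(int how)
{
    switch (how) {
    case O32_SIG_BLOCK:   return P32_SIG_BLOCK;
    case O32_SIG_UNBLOCK: return P32_SIG_UNBLOCK;
    case O32_SIG_SETMASK: return P32_SIG_SETMASK;
    default:              return -1;
    }
}
```

kernel_sigset_size gives the 8 versus 16 byte sizes mentioned above, which is the value a raw rt_sigaction or rt_sigprocmask call passes as its size argument.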

The p32 struct sigcontext will be reduced, with optional ASE context (DSP, FPU and MSA) represented as extended context entries (o32, n32 and n64 only represent the upper MSA register state using extended context entries):

  • sigcontext no longer has sc_fpregs or sc_fpu_csr (FPU). See fpu_extcontext.
  • sigcontext no longer has sc_mdhi or sc_mdlo (r2 lo/hi registers, now belong to DSP). See dsp_extcontext.
  • sigcontext no longer has sc_{hi,lo}{1,2,3} or sc_dsp (DSP). See dsp_extcontext.
  • msa_extcontext no longer has wr[32], since the full vector registers will be accessible in fpu_extcontext

Using more generic siginfo.h on p32 has the following effects:

  • struct siginfo
    • si_errno before si_code
    • no _irix_sigchld
    • should we define __ARCH_SI_TRAPNO to get _sigfault::_trapno? o32/n32/n64 managed without it...
  • SI_ASYNCIO = -4 on p32 rather than -2 on o32/n32/n64
  • SI_TIMER = -2 on p32 rather than -3 on o32/n32/n64
  • SI_MESGQ = -3 on p32 rather than -4 on o32/n32/n64

Other Misc Changes

  • No struct stat and struct stat64 structures (asm-generic/stat.h), since the fstat64 and fstatat64 syscalls are deprecated in favour of statx (see above).
  • Generic struct statfs and struct statfs64 structures (asm-generic/statfs.h)
    • These are likely to be deprecated in favour of a future statx like system call anyway
  • Generic fcntl constants (asm-generic/fcntl.h), rather than custom ones
  • Generic resource constants (asm-generic/resource.h), rather than custom ones (see prlimit64 system call notes in table above)

Other Possible Changes

  • Generic ioctl constants (asm-generic/ioctl.h, asm-generic/ioctls.h)
  • ptrace interface likely to be cleaned up
  • Generic mman.h constants?
  • Generic msgbuf.h struct?
  • Generic poll.h?
  • Generic sembuf.h?
  • Generic shmbuf.h?
  • Generic siginfo.h?
  • Generic socket.h?
  • Generic sockios.h?
  • Generic termbits.h?
  • Generic termios.h?
  • Any ucontext changes desired?