To: Matthew Fortune <>
Subject: RE: [PATCH RFC v2 24/70] MIPS: asm: spinlock: Replace sub instruction with addiu
From: "Maciej W. Rozycki" <>
Date: Tue, 10 Feb 2015 16:17:02 +0000 (GMT)
Cc: Markos Chandras <>,, "" <>
List-id: linux-mips <>
List-software: Ecartis version 1.0.0
User-agent: Alpine 2.11 (LFD 23 2013-08-11)
On Tue, 20 Jan 2015, Matthew Fortune wrote:

> > >  What this shows really is a GAS bug fix for the SUB macro is needed
> > > similar to what I suggested in 12/70 for ADDI (from the situation I
> > > infer there is some real work to do in GAS in this area; adding Matthew
> > > as a recipient to raise his awareness) so that it does not expand to
> > > ADDI where the architecture or processor selected do not support it.
> > > Instead a longer sequence involving SUB has to be produced.
> The assembler is at least consistent at the moment as the 'sub' macro is
> disabled for R6. I am very keen to stop carrying around historic baggage
> where it is not necessary. R6 is one place we can do that and deal with
> any code changes that are required.

 I have yet to be convinced it is merely historic baggage.  Maybe it's a 
matter of habits I got into, but I find the presence of these macros a way 
to make the MIPS assembly language actually usable for handcoding.  There 
are several reasons for this.

 One is that the limited range of immediates in machine instructions makes
it necessary to use different instruction sequences for different immediate
input arguments.  Given this source code instruction:

        li      $2, foo

for different values of `foo' you'll get different machine code:

    foo         code
    0x1234      addiu $2, $0, 0x1234
    0x89ab      ori $2, $0, 0x89ab
0x89ab0000      lui $2, 0x89ab
0x89ab1234      lui $2, 0x89ab; addiu $2, $2, 0x1234

Now if `foo' is some sort of an externally supplied constant (e.g. set 
with a `configure' script or whatever), then without the macros you'd have 
to pessimise code, or clutter it with #ifdef's.
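
 The selection shown in the table above can be modelled in a few lines of
C.  This is only a rough sketch of the choice for 32-bit constants, not how
GAS is actually implemented; the names `li_kind' and `li_expansion' are made
up for illustration, and the last case merely says "LUI plus one more
instruction" without committing to which instruction follows.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only; not taken from GAS. */
enum li_kind {
    LI_ADDIU,     /* fits a sign-extended 16-bit immediate */
    LI_ORI,       /* fits a zero-extended 16-bit immediate */
    LI_LUI,       /* low 16 bits are all zero */
    LI_LUI_PAIR   /* LUI followed by one more instruction */
};

/* Classify a 32-bit constant by the shape of the `li' expansion. */
static enum li_kind li_expansion(uint32_t imm)
{
    /* -32768..32767 when the value is viewed as a signed 32-bit number. */
    if (imm <= 0x7fffu || imm >= 0xffff8000u)
        return LI_ADDIU;
    /* 0x8000..0xffff: no sign extension wanted, so OR it in. */
    if (imm <= 0xffffu)
        return LI_ORI;
    /* Only the upper half is populated. */
    if ((imm & 0xffffu) == 0)
        return LI_LUI;
    /* Both halves populated: two machine instructions are needed. */
    return LI_LUI_PAIR;
}
```

 With the macro the assembler makes this choice for you; without it the
equivalent decision has to be made by hand in the source for each possible
value of `foo'.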

 Another is to abstract ABI dependencies.  Again, given this source code 
instruction:

        lw      $2, foo

for different ABIs you'll get different code:

    ABI         code
o32/non-PIC     lui $2, %hi(foo); lw $2, %lo(foo)($2)
o32/PIC/extern  lw $2, %got(foo)($28); lw $2, 0($2)
o32/PIC/local   lw $2, %got(foo)($28); addiu $2, $2, %lo(foo); lw $2, 0($2)
n64/non-PIC     lui $1, %highest(foo); lui $2, %hi(foo);
                addiu $1, $1, %higher(foo); dsll32 $1, $1, 0;
                daddu $1, $1, $2; lw $2, %lo(foo)($1)
n64/PIC/extern  ld $2, %got_disp(foo)($28); lw $2, 0($2)

You'd have to conditionalise it all too.

 And there are more cases that macros address, e.g. to make the complete 
set of arithmetic conditions available for branches (with the use of SLT 
and SLTU instructions), extra operations (e.g. NOT as a shorthand for NOR), 
three-argument trapping MULOU, DIVU, REMU operations (especially 
interesting to note in the context of r6; why MODU wasn't consistently 
called REMU for portability escapes me), etc.

 All this makes assembly language programming easier and more like 
programming in CISC assembly languages, e.g. this x86 assembly-language 
instruction:

        addl    $foo, %eax

will do the right thing for any value of `foo' and the assembler will also 
pick the shortest instruction encoding available.  As a result when 
writing code you can focus on the problem you're trying to solve rather 
than getting distracted by ABI peculiarities or the asymmetry of the 
machine instruction set.  It is also easier to follow when studying code 
written by someone else.

 Of course all this does not matter for compiler-generated code.  Which is 
also the reason why the MIPS16 assembly language has never included a 
complementing set of these macros -- it was only meant to be used in 
compiler-generated code and never for handcoding.  And for handcoded 
assembly if you are concerned about source code instructions expanding 
into multiple machine instructions, then you can always stick `.set 
nomacro' at the top of your source code.

> > >                   __asm__ __volatile__(
> > >                   "1:     ll      %1, %2  # arch_read_unlock      \n"
> > >                   "       sub     %1, %3                          \n"
> > >                   "       sc      %1, %0                          \n"
> > >                   : "=" GCC_OFF12_ASM() (rw->lock), "=&r" (tmp)
> > >                   : GCC_OFF12_ASM() (rw->lock), GCC_ADDI_ASM() (1)
> > >                   : "memory");
> > >
> > > (untested, but should work) so that there's still a single instruction
> > > only in the LL/SC loop and consequently no increased lock contention
> > > risk.
> (Note this asm block does not appear to need to clobber memory either as
> the effects on memory are correctly stated in the constraints).

 The `memory' clobber serves the purpose of an optimisation barrier here, 
it's not about the memory accesses happening within the asm itself.
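
 To illustrate the barrier role on its own, here is a minimal sketch 
assuming a GCC-compatible compiler.  The asm body is empty, so no machine 
instruction is emitted; the `memory' clobber alone tells the compiler not to 
cache memory values in registers across the statement or move memory 
accesses over it.  The names `shared_flag', `compiler_barrier' and 
`publish_then_read' are hypothetical.

```c
/* Hypothetical example variable; stands in for any shared state. */
static int shared_flag;

/* Empty asm: emits nothing, but the "memory" clobber makes the
   compiler treat all memory as potentially read and written here. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

static int publish_then_read(int value)
{
    shared_flag = value;   /* store must be issued before the barrier */
    compiler_barrier();
    return shared_flag;    /* reloaded from memory after the barrier */
}
```

 Note this only constrains the compiler; it is not a hardware memory 
barrier.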

> > >  As a side note, this could be cleaned up to use a "+" input/output
> > > constraint; such a clean-up will be welcome -- although to be complete,
> > > a review of all the asms will be required (this may bump up the GCC
> > > version requirement though, ISTR bugs in this area).
> I believe some of these asm blocks using ll/sc already have '+' in the
> constraints for the memory location so perhaps that is either already
> a problem or not an issue.

 I just don't remember offhand if the use of `+' was in platform or in 
shared code.  If the latter, then let's just switch; if the former, we 
need to be careful.

 IIRC some versions of GCC complained and failed compilation if the list 
of constraints associated with `+' did not allow a register alternative, 
such as by including the `r' constraint.  Which of course would be completely 
pointless here, and actually harmful.  Furthermore IIRC it had been a 
deliberate decision made by GCC maintainers who were unaware of some use 
cases for inline asms.  The decision was then discussed and GCC 
maintainers persuaded to change it; it can likely be tracked down in a 
mailing list archive somewhere.
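
 For reference, the form in question looks roughly like the sketch below: a 
`+' operand whose only alternative is memory, with no `r' in the list.  It 
assumes a GCC-compatible compiler recent enough to accept it, uses an empty 
asm body so that it is not tied to any architecture, and the function name 
`bump' is made up for illustration.

```c
/* Hypothetical helper; the asm body is empty on purpose. */
static unsigned int bump(unsigned int *lock)
{
    *lock += 1;
    /* "+m": *lock is both read and written by the asm, and it may
       only be a memory operand -- no register alternative offered. */
    __asm__ __volatile__("" : "+m" (*lock));
    return *lock;
}
```

 In a real LL/SC sequence the operand has to stay in memory, which is why a 
register alternative would be harmful there.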

