Re: [patch] MIPS/gcc: Revert removal of DImode shifts for 32-bit targets

To: Richard Henderson <rth@redhat.com>
Subject: Re: [patch] MIPS/gcc: Revert removal of DImode shifts for 32-bit targets
From: Richard Sandiford <rsandifo@redhat.com>
Date: Tue, 31 Aug 2004 20:51:20 +0100
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>, Nigel Stephens <nigel@mips.com>, gcc-patches@gcc.gnu.org, linux-mips@linux-mips.org
In-reply-to: <20040810232020.GA21922@redhat.com> (Richard Henderson's message of "Tue, 10 Aug 2004 16:20:20 -0700")
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <Pine.LNX.4.58L.0407261325470.3873@blysk.ds.pg.gda.pl> <410E9E25.7080104@mips.com> <87acxcbxfl.fsf@redhat.com> <410F5964.3010109@mips.com> <876580bm2e.fsf@redhat.com> <410F60DF.9020400@mips.com> <Pine.LNX.4.58L.0408042123030.31930@blysk.ds.pg.gda.pl> <87r7qiwz54.fsf@redhat.com> <20040809220838.GE16493@redhat.com> <87zn5336h7.fsf@redhat.com> <20040810232020.GA21922@redhat.com>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)
Richard Henderson <rth@redhat.com> writes:
> Patch seems ok then.  We'd have to add a new macro/target flag
> to handle non-truncating shifts -- we've got cases:
>
>   (1) Large shift shifts out all bits (ARM)
>   (2) Large shifts trap (VAX)
>   (3) Shift count truncated to 31, always, which means QI/HI
>       shifts yield undefined results with large shifts.  (i386)

I'm not sure whether (2) really affects things much.  By default, the code
is supposed to do exactly what libgcc2.c would do, i.e.:

  - the double-word shift guarantees no particular behaviour for
    shift counts outside the range [0, BITS_PER_WORD * 2)

  - for counts inside that range, the code will only use well-defined
    word-mode shifts.
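
As an illustration of that contract (this is not the libgcc2.c code, and
the names dw_shift_left and WORD_BITS are made up for the example), a
doubleword left shift on a 32-bit target could be modelled like this,
with every word shift count kept inside [0, WORD_BITS):

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32  /* stands in for BITS_PER_WORD; an assumption */

/* Hypothetical model: shift the 64-bit value HI:LO left by COUNT using
   only 32-bit shifts whose counts lie in [0, WORD_BITS).  */
static uint64_t
dw_shift_left (uint32_t hi, uint32_t lo, unsigned count)
{
  uint32_t out_hi, out_lo;

  /* Counts outside [0, 2 * WORD_BITS) have no guaranteed behaviour.  */
  assert (count < 2 * WORD_BITS);

  if (count >= WORD_BITS)
    {
      /* Superword case: the high word comes entirely from LO.  */
      out_hi = lo << (count - WORD_BITS);
      out_lo = 0;
    }
  else
    {
      /* Subword case: shift by 1 and then by WORD_BITS - 1 - COUNT so
         that no single shift count ever reaches WORD_BITS.  */
      uint32_t carries = (lo >> 1) >> (WORD_BITS - 1 - count);
      out_hi = (hi << count) | carries;
      out_lo = lo << count;
    }
  return ((uint64_t) out_hi << 32) | out_lo;
}
```

The split shift in the subword case is what keeps a count of zero from
turning into a word shift by 32.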

In the patch I posted originally, SHIFT_COUNT_TRUNCATED changed the
default behaviour in two ways:

  a) it guaranteed that the doubleword shift would truncate the shift count.

  b) it enabled some extra optimisations, particularly in the conditional
     move case.

As you say, using S_C_T was a bit limited, especially since it requires a
particular behaviour for unrelated things like ZERO_EXTRACT.  So, to deal
with (1) and (3) from your list, the patch below adds a new target hook:

int TARGET_SHIFT_TRUNCATION_MASK (enum machine_mode MODE)

     This function describes how the standard shift patterns for MODE
     deal with shifts by negative amounts or by more than the width of
     the mode.  *Note shift patterns::.

     On many machines, the shift patterns will apply a mask M to the
     shift count, meaning that a fixed-width shift of X by Y is
     equivalent to an arbitrary-width shift of X by Y & M.  If this is
     true for mode MODE, the function should return M, otherwise it
     should return 0.  A return value of 0 indicates that no particular
     behavior is guaranteed.

     Note that, unlike `SHIFT_COUNT_TRUNCATED', this function does
     _not_ apply to general shift rtxes; it applies only to instructions
     that are generated by the named shift patterns.

     The default implementation of this function returns
     `GET_MODE_BITSIZE (MODE) - 1' if `SHIFT_COUNT_TRUNCATED' and 0
     otherwise.  This definition is always safe, but if
     `SHIFT_COUNT_TRUNCATED' is false, and some shift patterns
     nevertheless truncate the shift count, you may get better code by
     overriding it.
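
To make the "fixed-width shift of X by Y is an arbitrary-width shift of
X by Y & M" wording concrete, here is a small C model of a 32-bit left
shift pattern.  The function name masked_shl32 is hypothetical, and the
mask values 31 and 255 used afterwards are only examples of what a
target might return:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit left-shift pattern whose hook value is
   MASK.  The "arbitrary-width" shift produces 0 once the effective
   count reaches the width of the mode.  */
static uint32_t
masked_shl32 (uint32_t x, unsigned y, unsigned mask)
{
  unsigned eff = y & mask;

  /* A mask of 0 means "no particular behaviour is guaranteed", so
     there is nothing to model in that case.  */
  assert (mask != 0);
  return eff < 32 ? x << eff : 0;
}
```

With a mask of 31 a count of 33 behaves like a count of 1; with a mask
of 255 the same count shifts every bit out and the result is 0.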

Thus the optimisations from (b) that used to be conditional on S_C_T
are now conditional on:

    TARGET_SHIFT_TRUNCATION_MASK (word_mode) == BITS_PER_WORD - 1

The truncation behaviour from (a) is guaranteed if:

    TARGET_SHIFT_TRUNCATION_MASK (double_word_mode) == BITS_PER_WORD * 2 - 1

although the optimisation only handles the latter if the former is also true.
In other cases, it punts unless T_S_T_M returns 0 for double_word_mode.

The point of the new hook is that it allows us to optimise case (1) from
your list.  If:

    TARGET_SHIFT_TRUNCATION_MASK (word_mode) >= BITS_PER_WORD * 2 - 1

then (because out-of-range shift counts are undefined for the doubleword
shifts) we can use:

    outof_target = (shift outof_input op1)

as per config/arm/lib1funcs.asm.
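
A rough C model of that case, for a left shift, might look like the
following.  It is only a sketch: shl, shr and dw_shl_case1 are made-up
names, and the mask of 255 is an assumption chosen to mimic a target
whose word shifts look at the low 8 bits of the count.

```c
#include <assert.h>
#include <stdint.h>

/* Word shifts that use the low 8 bits of the count and shift out all
   bits for effective counts >= 32 (the mask of 255 is an assumption).  */
static uint32_t
shl (uint32_t x, unsigned y)
{
  y &= 255;
  return y < 32 ? x << y : 0;
}

static uint32_t
shr (uint32_t x, unsigned y)
{
  y &= 255;
  return y < 32 ? x >> y : 0;
}

/* Shift HI:LO left by COUNT.  For a left shift the "out of" word is the
   low word and the "into" word is the high word.  */
static uint64_t
dw_shl_case1 (uint32_t hi, uint32_t lo, unsigned count)
{
  /* Out-of-range doubleword counts remain undefined.  */
  assert (count < 64);

  /* outof_target = (shift outof_input op1), with no condition needed:
     counts in [32, 64) simply give 0.  */
  uint32_t out_lo = shl (lo, count);

  /* into_target still needs the subword/superword choice.  Note that
     shr (lo, 32) is 0 here, so COUNT == 0 is handled too.  */
  uint32_t out_hi = (count < 32
                     ? shl (hi, count) | shr (lo, 32 - count)
                     : shl (lo, count - 32));

  return ((uint64_t) out_hi << 32) | out_lo;
}
```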

One potential drawback of all this is that it generates rtxes that
rely on the behaviour of shifts by more than the word width.  At the
moment, simplify-rtx.c will fold such shifts using whatever the host
compiler thinks is suitable.  E.g.:

    case ASHIFT:
      if (arg1 < 0)
        return 0;

      if (SHIFT_COUNT_TRUNCATED)
        arg1 %= width;

      val = ((unsigned HOST_WIDE_INT) arg0) << arg1;
      break;

which accepts any positive arg1, even if !SHIFT_COUNT_TRUNCATED.

This seems pretty dubious anyway.  What if a define_expand in the backend
uses shifts to implement a complex named pattern?  I'd have thought the
backend would be free to use target-specific knowledge about what that
shift does with out-of-range values.  And if we are later able to
constant-fold the result, the code above might not do what the
target machine would do.

The patch therefore refuses to optimise out-of-range counts unless
SHIFT_COUNT_TRUNCATED.  It also fixes the following sign-extension code:

      /* Bootstrap compiler may not have sign extended the right shift.
         Manually extend the sign to insure bootstrap cc matches gcc.  */
      if (arg0s < 0 && arg1 > 0)
        val |= ((HOST_WIDE_INT) -1) << (HOST_BITS_PER_WIDE_INT - arg1);

which isn't right for modes whose width is != HOST_BITS_PER_WIDE_INT.
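
The difference is easiest to see with a narrow mode.  The sketch below
is a stand-alone model, not the simplify-rtx.c code: fold_ashiftrt is a
hypothetical helper, uint64_t stands in for the host wide integer, and
the value is passed zero-extended as in the folding code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift of a WIDTH-bit value held
   zero-extended in a 64-bit host integer.  The caller must ensure
   0 < WIDTH <= 64 and 0 <= COUNT < WIDTH.  Bits above WIDTH in the
   result are copies of the sign bit; a caller would truncate them.  */
static uint64_t
fold_ashiftrt (uint64_t arg0, int count, int width)
{
  assert (width > 0 && width <= 64 && count >= 0 && count < width);

  int negative = (arg0 >> (width - 1)) & 1;
  uint64_t val = arg0 >> count;

  /* Fill from bit WIDTH - COUNT upwards.  Using 64 - COUNT here
     instead would leave bits [WIDTH - COUNT, WIDTH) clear whenever
     WIDTH is smaller than 64.  */
  if (negative && count > 0)
    val |= ~(uint64_t) 0 << (width - count);
  return val;
}
```

For an 8-bit value 0x80 shifted right by 1, the low byte of the result
is 0xc0; filling from bit 63 instead would leave it as 0x40.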

As an example:

    unsigned long long f (unsigned long long x, int y) { return x << y; }

is now implemented thus for arm-elf-gcc -O2 -fno-schedule-insns:

        mov     r1, r1, asl r2
        rsb     r3, r2, #32
        orr     r1, r1, r0, lsr r3
        subs    ip, r2, #32
        movpl   r1, r0, asl ip
        mov     r0, r0, asl r2
        bx      lr

(without -fno-schedule-insns, the scheduler will increase register
pressure and force some call-saved registers live.)  This sequence
is at least the same length as the hand-coded lib1funcs.asm version,
so I hope it's better than the call we get now.

For mips64-elf-gcc -O2 -march=rm9000 -mabi=32, we get:

        nor     $7,$0,$6
        srl     $8,$5,1
        sll     $2,$4,$6
        srl     $8,$8,$7
        sll     $3,$5,$6
        or      $2,$8,$2
        andi    $6,$6,0x20
        movn    $2,$3,$6
        j       $31
        movn    $3,$0,$6

which is nicely superscalar, and the same length as Maciej's
hand-coded version.
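
For readers who prefer C to assembly, the sequence above can be modelled
roughly as follows.  This is a sketch under two assumptions: that the
high word arrives in $4 and the low word in $5 (and likewise $2/$3 for
the result), and that the variable shifts use only the low five bits of
the count.  The name mips_style_shl is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Rough model of the instruction sequence above; each statement is
   annotated with the instruction it stands for.  */
static uint64_t
mips_style_shl (uint32_t hi, uint32_t lo, unsigned y)
{
  /* With counts truncated to five bits, ~y works as 31 - y.  */
  assert ((~y & 31) == 31 - (y & 31));

  uint32_t carries = (lo >> 1) >> (~y & 31);     /* nor, srl, srl */
  uint32_t r_hi = (hi << (y & 31)) | carries;    /* sll, or */
  uint32_t r_lo = lo << (y & 31);                /* sll */

  if (y & 32)                                    /* andi */
    {
      r_hi = r_lo;                               /* movn $2,$3,$6 */
      r_lo = 0;                                  /* movn $3,$0,$6 */
    }
  return ((uint64_t) r_hi << 32) | r_lo;
}
```

Because only bits 0 to 5 of the count are ever examined, the doubleword
shift truncates its count to BITS_PER_WORD * 2 - 1, which is behaviour
(a) from earlier in the message.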

As before, the patch will fall back on jumps if conditional moves
aren't available.

Bootstrapped & regression tested on i686-pc-linux-gnu and mips-sgi-irix6.5.
Also tested on arm-elf (default language set, test pattern arm-elf{,-mthumb}).
OK to install?

Richard


        * doc/md.texi (shift patterns): New anchor.  Add reference to
        TARGET_SHIFT_TRUNCATION_MASK.
        * doc/tm.texi (TARGET_SHIFT_TRUNCATION_MASK): Document.
        * target.h (shift_truncation_mask): New target hook.
        * targhooks.h (default_shift_truncation_mask): Declare.
        * targhooks.c (default_shift_truncation_mask): Define.
        * target-def.h (TARGET_SHIFT_TRUNCATION_MASK): Define.
        (TARGET_INITIALIZER): Include it.
        * simplify-rtx.c (simplify_binary_operation): Combine ASHIFT, ASHIFTRT
        and LSHIFTRT cases.  Truncate arg1 if SHIFT_COUNT_TRUNCATED, otherwise
        reject all out-of-range values.  Fix sign-extension code for modes
        whose width is smaller than HOST_BITS_PER_WIDE_INT.
        * optabs.c (simplify_expand_binop, force_expand_binop): New functions.
        (expand_superword_shift, expand_subword_shift): Likewise.
        (expand_doubleword_shift_condmove, expand_doubleword_shift): Likewise.
        (expand_binop): Use them to implement double-word shifts.
        * config/arm/arm.c (arm_shift_truncation_mask): New function.
        (TARGET_SHIFT_TRUNCATION_MASK): Define.

Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.108
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.108 md.texi
*** doc/md.texi 23 Aug 2004 05:55:46 -0000      1.108
--- doc/md.texi 31 Aug 2004 18:44:55 -0000
*************** quotient or remainder and generate the a
*** 2884,2896 ****
  @item @samp{udivmod@var{m}4}
  Similar, but does unsigned division.
  
  @cindex @code{ashl@var{m}3} instruction pattern
  @item @samp{ashl@var{m}3}
  Arithmetic-shift operand 1 left by a number of bits specified by operand
  2, and store the result in operand 0.  Here @var{m} is the mode of
  operand 0 and operand 1; operand 2's mode is specified by the
  instruction pattern, and the compiler will convert the operand to that
! mode before generating the instruction.
  
  @cindex @code{ashr@var{m}3} instruction pattern
  @cindex @code{lshr@var{m}3} instruction pattern
--- 2884,2899 ----
  @item @samp{udivmod@var{m}4}
  Similar, but does unsigned division.
  
+ @anchor{shift patterns}
  @cindex @code{ashl@var{m}3} instruction pattern
  @item @samp{ashl@var{m}3}
  Arithmetic-shift operand 1 left by a number of bits specified by operand
  2, and store the result in operand 0.  Here @var{m} is the mode of
  operand 0 and operand 1; operand 2's mode is specified by the
  instruction pattern, and the compiler will convert the operand to that
! mode before generating the instruction.  The meaning of out-of-range shift
! counts can optionally be specified by @code{TARGET_SHIFT_TRUNCATION_MASK}.
! @xref{TARGET_SHIFT_TRUNCATION_MASK}.
  
  @cindex @code{ashr@var{m}3} instruction pattern
  @cindex @code{lshr@var{m}3} instruction pattern
Index: doc/tm.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.360
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.360 tm.texi
*** doc/tm.texi 29 Aug 2004 22:10:44 -0000      1.360
--- doc/tm.texi 31 Aug 2004 18:45:09 -0000
*************** the implied truncation of the shift inst
*** 8731,8736 ****
--- 8731,8761 ----
  You need not define this macro if it would always have the value of zero.
  @end defmac
  
+ @anchor{TARGET_SHIFT_TRUNCATION_MASK}
+ @deftypefn {Target Hook} int TARGET_SHIFT_TRUNCATION_MASK (enum machine_mode @var{mode})
+ This function describes how the standard shift patterns for @var{mode}
+ deal with shifts by negative amounts or by more than the width of the mode.
+ @xref{shift patterns}.
+ 
+ On many machines, the shift patterns will apply a mask @var{m} to the
+ shift count, meaning that a fixed-width shift of @var{x} by @var{y} is
+ equivalent to an arbitrary-width shift of @var{x} by @var{y & m}.  If
+ this is true for mode @var{mode}, the function should return @var{m},
+ otherwise it should return 0.  A return value of 0 indicates that no
+ particular behavior is guaranteed.
+ 
+ Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does
+ @emph{not} apply to general shift rtxes; it applies only to instructions
+ that are generated by the named shift patterns.
+ 
+ The default implementation of this function returns
+ @code{GET_MODE_BITSIZE (@var{mode}) - 1} if @code{SHIFT_COUNT_TRUNCATED}
+ and 0 otherwise.  This definition is always safe, but if
+ @code{SHIFT_COUNT_TRUNCATED} is false, and some shift patterns
+ nevertheless truncate the shift count, you may get better code
+ by overriding it.
+ @end deftypefn
+ 
  @defmac TRULY_NOOP_TRUNCATION (@var{outprec}, @var{inprec})
  A C expression which is nonzero if on this machine it is safe to
  ``convert'' an integer of @var{inprec} bits to one of @var{outprec}
Index: target.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target.h,v
retrieving revision 1.109
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.109 target.h
*** target.h    26 Aug 2004 00:24:34 -0000      1.109
--- target.h    31 Aug 2004 18:45:10 -0000
*************** struct gcc_target
*** 378,383 ****
--- 378,387 ----
    /* Undo the effects of encode_section_info on the symbol string.  */
    const char * (* strip_name_encoding) (const char *);
  
+   /* If shift optabs for MODE are known to always truncate the shift count,
+      return the mask that they apply.  Return 0 otherwise.  */
+   unsigned HOST_WIDE_INT (* shift_truncation_mask) (enum machine_mode mode);
+ 
    /* True if MODE is valid for a pointer in __attribute__((mode("MODE"))).  */
    bool (* valid_pointer_mode) (enum machine_mode mode);
  
Index: targhooks.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/targhooks.h,v
retrieving revision 2.18
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r2.18 targhooks.h
*** targhooks.h 26 Aug 2004 00:24:34 -0000      2.18
--- targhooks.h 31 Aug 2004 18:45:10 -0000
*************** extern bool hook_bool_CUMULATIVE_ARGS_fa
*** 32,37 ****
--- 32,39 ----
  extern bool default_pretend_outgoing_varargs_named (CUMULATIVE_ARGS *);
  
  extern enum machine_mode default_eh_return_filter_mode (void);
+ extern unsigned HOST_WIDE_INT default_shift_truncation_mask
+   (enum machine_mode);
  
  extern bool hook_bool_CUMULATIVE_ARGS_true (CUMULATIVE_ARGS *);
  extern tree default_cxx_guard_type (void);
Index: targhooks.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/targhooks.c,v
retrieving revision 2.27
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r2.27 targhooks.c
*** targhooks.c 26 Aug 2004 00:24:34 -0000      2.27
--- targhooks.c 31 Aug 2004 18:45:10 -0000
*************** default_eh_return_filter_mode (void)
*** 135,140 ****
--- 135,148 ----
    return word_mode;
  }
  
+ /* The default implementation of TARGET_SHIFT_TRUNCATION_MASK.  */
+ 
+ unsigned HOST_WIDE_INT
+ default_shift_truncation_mask (enum machine_mode mode)
+ {
+   return SHIFT_COUNT_TRUNCATED ? GET_MODE_BITSIZE (mode) - 1 : 0;
+ }
+ 
  /* Generic hook that takes a CUMULATIVE_ARGS pointer and returns true.  */
  
  bool
Index: target-def.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target-def.h,v
retrieving revision 1.98
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.98 target-def.h
*** target-def.h        26 Aug 2004 00:24:33 -0000      1.98
--- target-def.h        31 Aug 2004 18:45:11 -0000
*************** #define TARGET_STRIP_NAME_ENCODING defau
*** 301,306 ****
--- 301,310 ----
  #define TARGET_BINDS_LOCAL_P default_binds_local_p
  #endif
  
+ #ifndef TARGET_SHIFT_TRUNCATION_MASK
+ #define TARGET_SHIFT_TRUNCATION_MASK default_shift_truncation_mask
+ #endif
+ 
  #ifndef TARGET_VALID_POINTER_MODE
  #define TARGET_VALID_POINTER_MODE default_valid_pointer_mode
  #endif
*************** #define TARGET_INITIALIZER                      \
*** 478,483 ****
--- 482,488 ----
    TARGET_BINDS_LOCAL_P,                               \
    TARGET_ENCODE_SECTION_INFO,                 \
    TARGET_STRIP_NAME_ENCODING,                 \
+   TARGET_SHIFT_TRUNCATION_MASK,                       \
    TARGET_VALID_POINTER_MODE,                    \
    TARGET_SCALAR_MODE_SUPPORTED_P,             \
    TARGET_VECTOR_MODE_SUPPORTED_P,               \
Index: simplify-rtx.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.202
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.202 simplify-rtx.c
*** simplify-rtx.c      27 Jul 2004 19:09:32 -0000      1.202
--- simplify-rtx.c      31 Aug 2004 18:45:15 -0000
*************** simplify_binary_operation (enum rtx_code
*** 2343,2383 ****
        break;
  
      case LSHIFTRT:
-       /* If shift count is undefined, don't fold it; let the machine do
-        what it wants.  But truncate it if the machine will do that.  */
-       if (arg1 < 0)
-       return 0;
- 
-       if (SHIFT_COUNT_TRUNCATED)
-       arg1 %= width;
- 
-       val = ((unsigned HOST_WIDE_INT) arg0) >> arg1;
-       break;
- 
      case ASHIFT:
-       if (arg1 < 0)
-       return 0;
- 
-       if (SHIFT_COUNT_TRUNCATED)
-       arg1 %= width;
- 
-       val = ((unsigned HOST_WIDE_INT) arg0) << arg1;
-       break;
- 
      case ASHIFTRT:
!       if (arg1 < 0)
!       return 0;
! 
        if (SHIFT_COUNT_TRUNCATED)
!       arg1 %= width;
! 
!       val = arg0s >> arg1;
! 
!       /* Bootstrap compiler may not have sign extended the right shift.
!        Manually extend the sign to insure bootstrap cc matches gcc.  */
!       if (arg0s < 0 && arg1 > 0)
!       val |= ((HOST_WIDE_INT) -1) << (HOST_BITS_PER_WIDE_INT - arg1);
  
        break;
  
      case ROTATERT:
--- 2343,2368 ----
        break;
  
      case LSHIFTRT:
      case ASHIFT:
      case ASHIFTRT:
!       /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure the
!        value is in range.  We can't return any old value for out-of-range
!        arguments because either the middle-end (via shift_truncation_mask)
!        or the back-end might be relying on target-specific knowledge.
!        Nor can we rely on shift_truncation_mask, since the shift might
!        not be part of an ashlM3, lshrM3 or ashrM3 instruction.  */
        if (SHIFT_COUNT_TRUNCATED)
!       arg1 = (unsigned HOST_WIDE_INT) arg1 % width;
!       else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode))
!       return 0;
  
+       val = (code == ASHIFT
+            ? ((unsigned HOST_WIDE_INT) arg0) << arg1
+            : ((unsigned HOST_WIDE_INT) arg0) >> arg1);
+ 
+       /* Sign-extend the result for arithmetic right shifts.  */
+       if (code == ASHIFTRT && arg0s < 0 && arg1 > 0)
+       val |= ((HOST_WIDE_INT) -1) << (width - arg1);
        break;
  
      case ROTATERT:
Index: optabs.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/optabs.c,v
retrieving revision 1.235
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.235 optabs.c
*** optabs.c    19 Aug 2004 22:24:54 -0000      1.235
--- optabs.c    31 Aug 2004 18:45:20 -0000
*************** optab_for_tree_code (enum tree_code code
*** 709,715 ****
--- 709,1064 ----
        return NULL;
      }
  }
+ 
+ /* Like expand_binop, but return a constant rtx if the result can be
+    calculated at compile time.  The arguments and return value are
+    otherwise the same as for expand_binop.  */
+ 
+ static rtx
+ simplify_expand_binop (enum machine_mode mode, optab binoptab,
+                      rtx op0, rtx op1, rtx target, int unsignedp,
+                      enum optab_methods methods)
+ {
+   if (CONSTANT_P (op0) && CONSTANT_P (op1))
+     return simplify_gen_binary (binoptab->code, mode, op0, op1);
+   else
+     return expand_binop (mode, binoptab, op0, op1, target, unsignedp, methods);
+ }
+ 
+ /* Like simplify_expand_binop, but always put the result in TARGET.
+    Return true if the expansion succeeded.  */
+ 
+ static bool
+ force_expand_binop (enum machine_mode mode, optab binoptab,
+                   rtx op0, rtx op1, rtx target, int unsignedp,
+                   enum optab_methods methods)
+ {
+   rtx x = simplify_expand_binop (mode, binoptab, op0, op1,
+                                target, unsignedp, methods);
+   if (x == 0)
+     return false;
+   if (x != target)
+     emit_move_insn (target, x);
+   return true;
+ }
+ 
+ /* This subroutine of expand_doubleword_shift handles the cases in which
+    the effective shift value is >= BITS_PER_WORD.  The arguments and return
+    value are the same as for the parent routine, except that SUPERWORD_OP1
+    is the shift count to use when shifting OUTOF_INPUT into INTO_TARGET.
+    INTO_TARGET may be null if the caller has decided to calculate it.  */
+ 
+ static bool
+ expand_superword_shift (optab binoptab, rtx outof_input, rtx superword_op1,
+                       rtx outof_target, rtx into_target,
+                       int unsignedp, enum optab_methods methods)
+ {
+   if (into_target != 0)
+     if (!force_expand_binop (word_mode, binoptab, outof_input, superword_op1,
+                            into_target, unsignedp, methods))
+       return false;
+ 
+   if (outof_target != 0)
+     {
+       /* For a signed right shift, we must fill OUTOF_TARGET with copies
+        of the sign bit, otherwise we must fill it with zeros.  */
+       if (binoptab != ashr_optab)
+       emit_move_insn (outof_target, CONST0_RTX (word_mode));
+       else
+       if (!force_expand_binop (word_mode, binoptab,
+                                outof_input, GEN_INT (BITS_PER_WORD - 1),
+                                outof_target, unsignedp, methods))
+         return false;
+     }
+   return true;
+ }
+ 
+ /* This subroutine of expand_doubleword_shift handles the cases in which
+    the effective shift value is < BITS_PER_WORD.  The arguments and return
+    value are the same as for the parent routine.  */
+ 
+ static bool
+ expand_subword_shift (enum machine_mode op1_mode, optab binoptab,
+                     rtx outof_input, rtx into_input, rtx op1,
+                     rtx outof_target, rtx into_target,
+                     int unsignedp, enum optab_methods methods,
+                     unsigned HOST_WIDE_INT shift_mask)
+ {
+   optab reverse_unsigned_shift, unsigned_shift;
+   rtx tmp, carries;
+ 
+   reverse_unsigned_shift = (binoptab == ashl_optab ? lshr_optab : ashl_optab);
+   unsigned_shift = (binoptab == ashl_optab ? ashl_optab : lshr_optab);
+ 
+   /* The low OP1 bits of INTO_TARGET come from the high bits of OUTOF_INPUT.
+      We therefore need to shift OUTOF_INPUT by (BITS_PER_WORD - OP1) bits in
+      the opposite direction to BINOPTAB.  */
+   if (CONSTANT_P (op1) || shift_mask >= BITS_PER_WORD)
+     {
+       carries = outof_input;
+       tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+       tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
+                                  0, true, methods);
+     }
+   else
+     {
+       /* We must avoid shifting by BITS_PER_WORD bits since that is either
+        the same as a zero shift (if shift_mask == BITS_PER_WORD - 1) or
+        has unknown behaviour.  Do a single shift first, then shift by the
+        remainder.  It's OK to use ~OP1 as the remainder if shift counts
+        are truncated to the mode size.  */
+       carries = expand_binop (word_mode, reverse_unsigned_shift,
+                             outof_input, const1_rtx, 0, unsignedp, methods);
+       if (shift_mask == BITS_PER_WORD - 1)
+       {
+         tmp = immed_double_const (-1, -1, op1_mode);
+         tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp,
+                                      0, true, methods);
+       }
+       else
+       {
+         tmp = immed_double_const (BITS_PER_WORD - 1, 0, op1_mode);
+         tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
+                                      0, true, methods);
+       }
+     }
+   if (tmp == 0 || carries == 0)
+     return false;
+   carries = expand_binop (word_mode, reverse_unsigned_shift,
+                         carries, tmp, 0, unsignedp, methods);
+   if (carries == 0)
+     return false;
+ 
+   /* Shift INTO_INPUT logically by OP1.  This is the last use of INTO_INPUT
+      so the result can go directly into INTO_TARGET if convenient.  */
+   tmp = expand_binop (word_mode, unsigned_shift, into_input, op1,
+                     into_target, unsignedp, methods);
+   if (tmp == 0)
+     return false;
+ 
+   /* Now OR in the bits carried over from OUTOF_INPUT.  */
+   if (!force_expand_binop (word_mode, ior_optab, tmp, carries,
+                          into_target, unsignedp, methods))
+     return false;
+ 
+   /* Use a standard word_mode shift for the out-of half.  */
+   if (outof_target != 0)
+     if (!force_expand_binop (word_mode, binoptab, outof_input, op1,
+                            outof_target, unsignedp, methods))
+       return false;
+ 
+   return true;
+ }
  
+ 
+ #ifdef HAVE_conditional_move
+ /* Try implementing expand_doubleword_shift using conditional moves.
+    The shift is by < BITS_PER_WORD if (CMP_CODE CMP1 CMP2) is true,
+    otherwise it is by >= BITS_PER_WORD.  SUBWORD_OP1 and SUPERWORD_OP1
+    are the shift counts to use in the former and latter case.  All other
+    arguments are the same as the parent routine.  */
+ 
+ static bool
+ expand_doubleword_shift_condmove (enum machine_mode op1_mode, optab binoptab,
+                                 enum rtx_code cmp_code, rtx cmp1, rtx cmp2,
+                                 rtx outof_input, rtx into_input,
+                                 rtx subword_op1, rtx superword_op1,
+                                 rtx outof_target, rtx into_target,
+                                 int unsignedp, enum optab_methods methods,
+                                 unsigned HOST_WIDE_INT shift_mask)
+ {
+   rtx outof_superword, into_superword;
+ 
+   /* Put the superword version of the output into OUTOF_SUPERWORD and
+      INTO_SUPERWORD.  */
+   outof_superword = outof_target != 0 ? gen_reg_rtx (word_mode) : 0;
+   if (outof_target != 0 && subword_op1 == superword_op1)
+     {
+       /* The value INTO_TARGET >> SUBWORD_OP1, which we later store in
+        OUTOF_TARGET, is the same as the value of INTO_SUPERWORD.  */
+       into_superword = outof_target;
+       if (!expand_superword_shift (binoptab, outof_input, superword_op1,
+                                  outof_superword, 0, unsignedp, methods))
+       return false;
+     }
+   else
+     {
+       into_superword = gen_reg_rtx (word_mode);
+       if (!expand_superword_shift (binoptab, outof_input, superword_op1,
+                                  outof_superword, into_superword,
+                                  unsignedp, methods))
+       return false;
+     }
+ 
+   /* Put the subword version directly in OUTOF_TARGET and INTO_TARGET.  */
+   if (!expand_subword_shift (op1_mode, binoptab,
+                            outof_input, into_input, subword_op1,
+                            outof_target, into_target,
+                            unsignedp, methods, shift_mask))
+     return false;
+ 
+   /* Select between them.  Do the INTO half first because INTO_SUPERWORD
+      might be the current value of OUTOF_TARGET.  */
+   if (!emit_conditional_move (into_target, cmp_code, cmp1, cmp2, op1_mode,
+                             into_target, into_superword, word_mode, false))
+     return false;
+ 
+   if (outof_target != 0)
+     if (!emit_conditional_move (outof_target, cmp_code, cmp1, cmp2, op1_mode,
+                               outof_target, outof_superword,
+                               word_mode, false))
+       return false;
+ 
+   return true;
+ }
+ #endif
+ 
+ /* Expand a doubleword shift (ashl, ashr or lshr) using word-mode shifts.
+    OUTOF_INPUT and INTO_INPUT are the two word-sized halves of the first
+    input operand; the shift moves bits in the direction OUTOF_INPUT->
+    INTO_TARGET.  OUTOF_TARGET and INTO_TARGET are the equivalent words
+    of the target.  OP1 is the shift count and OP1_MODE is its mode.
+    If OP1 is constant, it will have been truncated as appropriate
+    and is known to be nonzero.
+ 
+    If SHIFT_MASK is zero, the result of word shifts is undefined when the
+    shift count is outside the range [0, BITS_PER_WORD).  This routine must
+    avoid generating such shifts for OP1s in the range [0, BITS_PER_WORD * 2).
+ 
+    If SHIFT_MASK is nonzero, all word-mode shift counts are effectively
+    masked by it and shifts in the range [BITS_PER_WORD, SHIFT_MASK) will
+    fill with zeros or sign bits as appropriate.
+ 
+    If SHIFT_MASK is BITS_PER_WORD - 1, this routine will synthesise
+    a doubleword shift whose equivalent mask is BITS_PER_WORD * 2 - 1.
+    Doing this preserves semantics required by SHIFT_COUNT_TRUNCATED.
+    In all other cases, shifts by values outside [0, BITS_PER_WORD * 2)
+    are undefined.
+ 
+    BINOPTAB, UNSIGNEDP and METHODS are as for expand_binop.  This function
+    may not use INTO_INPUT after modifying INTO_TARGET, and similarly for
+    OUTOF_INPUT and OUTOF_TARGET.  OUTOF_TARGET can be null if the parent
+    function wants to calculate it itself.
+ 
+    Return true if the shift could be successfully synthesized.  */
+ 
+ static bool
+ expand_doubleword_shift (enum machine_mode op1_mode, optab binoptab,
+                        rtx outof_input, rtx into_input, rtx op1,
+                        rtx outof_target, rtx into_target,
+                        int unsignedp, enum optab_methods methods,
+                        unsigned HOST_WIDE_INT shift_mask)
+ {
+   rtx superword_op1, tmp, cmp1, cmp2;
+   rtx subword_label, done_label;
+   enum rtx_code cmp_code;
+ 
+   /* See if word-mode shifts by BITS_PER_WORD...BITS_PER_WORD * 2 - 1 will
+      fill the result with sign or zero bits as appropriate.  If so, the value
+      of OUTOF_TARGET will always be (SHIFT OUTOF_INPUT OP1).  Recursively call
+      this routine to calculate INTO_TARGET (which depends on both OUTOF_INPUT
+      and INTO_INPUT), then emit code to set up OUTOF_TARGET.
+ 
+      This isn't worthwhile for constant shifts since the optimizers will
+      cope better with in-range shift counts.  */
+   if (shift_mask >= BITS_PER_WORD
+       && outof_target != 0
+       && !CONSTANT_P (op1))
+     {
+       if (!expand_doubleword_shift (op1_mode, binoptab,
+                                   outof_input, into_input, op1,
+                                   0, into_target,
+                                   unsignedp, methods, shift_mask))
+       return false;
+       if (!force_expand_binop (word_mode, binoptab, outof_input, op1,
+                              outof_target, unsignedp, methods))
+       return false;
+       return true;
+     }
+ 
+   /* Set CMP_CODE, CMP1 and CMP2 so that the rtx (CMP_CODE CMP1 CMP2)
+      is true when the effective shift value is less than BITS_PER_WORD.
+      Set SUPERWORD_OP1 to the shift count that should be used to shift
+      OUTOF_INPUT into INTO_TARGET when the condition is false.  */
+   tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+   if (!CONSTANT_P (op1) && shift_mask == BITS_PER_WORD - 1)
+     {
+       /* Set CMP1 to OP1 & BITS_PER_WORD.  The result is zero iff OP1
+        is a subword shift count.  */
+       cmp1 = simplify_expand_binop (op1_mode, and_optab, op1, tmp,
+                                   0, true, methods);
+       cmp2 = CONST0_RTX (op1_mode);
+       cmp_code = EQ;
+       superword_op1 = op1;
+     }
+   else
+     {
+       /* Set CMP1 to OP1 - BITS_PER_WORD.  */
+       cmp1 = simplify_expand_binop (op1_mode, sub_optab, op1, tmp,
+                                   0, true, methods);
+       cmp2 = CONST0_RTX (op1_mode);
+       cmp_code = LT;
+       superword_op1 = cmp1;
+     }
+   if (cmp1 == 0)
+     return false;
+ 
+   /* If we can compute the condition at compile time, pick the
+      appropriate subroutine.  */
+   tmp = simplify_relational_operation (cmp_code, SImode, op1_mode, cmp1, cmp2);
+   if (tmp != 0 && GET_CODE (tmp) == CONST_INT)
+     {
+       if (tmp == const0_rtx)
+       return expand_superword_shift (binoptab, outof_input, superword_op1,
+                                      outof_target, into_target,
+                                      unsignedp, methods);
+       else
+       return expand_subword_shift (op1_mode, binoptab,
+                                    outof_input, into_input, op1,
+                                    outof_target, into_target,
+                                    unsignedp, methods, shift_mask);
+     }
+ 
+ #ifdef HAVE_conditional_move
+   /* Try using conditional moves to generate straight-line code.  */
+   {
+     rtx start = get_last_insn ();
+     if (expand_doubleword_shift_condmove (op1_mode, binoptab,
+                                         cmp_code, cmp1, cmp2,
+                                         outof_input, into_input,
+                                         op1, superword_op1,
+                                         outof_target, into_target,
+                                         unsignedp, methods, shift_mask))
+       return true;
+     delete_insns_since (start);
+   }
+ #endif
+ 
+   /* As a last resort, use branches to select the correct alternative.  */
+   subword_label = gen_label_rtx ();
+   done_label = gen_label_rtx ();
+ 
+   do_compare_rtx_and_jump (cmp1, cmp2, cmp_code, false, op1_mode,
+                          0, 0, subword_label);
+ 
+   if (!expand_superword_shift (binoptab, outof_input, superword_op1,
+                              outof_target, into_target,
+                              unsignedp, methods))
+     return false;
+ 
+   emit_jump_insn (gen_jump (done_label));
+   emit_barrier ();
+   emit_label (subword_label);
+ 
+   if (!expand_subword_shift (op1_mode, binoptab,
+                            outof_input, into_input, op1,
+                            outof_target, into_target,
+                            unsignedp, methods, shift_mask))
+     return false;
+ 
+   emit_label (done_label);
+   return true;
+ }
  
  /* Wrapper around expand_binop which takes an rtx code to specify
     the operation to perform, not an optab pointer.  All other
*************** expand_binop (enum machine_mode mode, op
*** 1035,1152 ****
    if ((binoptab == lshr_optab || binoptab == ashl_optab
         || binoptab == ashr_optab)
        && class == MODE_INT
!       && GET_CODE (op1) == CONST_INT
        && GET_MODE_SIZE (mode) == 2 * UNITS_PER_WORD
        && binoptab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing
        && ashl_optab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing
        && lshr_optab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing)
      {
!       rtx insns, inter, equiv_value;
!       rtx into_target, outof_target;
!       rtx into_input, outof_input;
!       int shift_count, left_shift, outof_word;
! 
!       /* If TARGET is the same as one of the operands, the REG_EQUAL note
!        won't be accurate, so use a new target.  */
!       if (target == 0 || target == op0 || target == op1)
!       target = gen_reg_rtx (mode);
! 
!       start_sequence ();
! 
!       shift_count = INTVAL (op1);
! 
!       /* OUTOF_* is the word we are shifting bits away from, and
!        INTO_* is the word that we are shifting bits towards, thus
!        they differ depending on the direction of the shift and
!        WORDS_BIG_ENDIAN.  */
! 
!       left_shift = binoptab == ashl_optab;
!       outof_word = left_shift ^ ! WORDS_BIG_ENDIAN;
! 
!       outof_target = operand_subword (target, outof_word, 1, mode);
!       into_target = operand_subword (target, 1 - outof_word, 1, mode);
! 
!       outof_input = operand_subword_force (op0, outof_word, mode);
!       into_input = operand_subword_force (op0, 1 - outof_word, mode);
! 
!       if (shift_count >= BITS_PER_WORD)
!       {
!         inter = expand_binop (word_mode, binoptab,
!                              outof_input,
!                              GEN_INT (shift_count - BITS_PER_WORD),
!                              into_target, unsignedp, next_methods);
! 
!         if (inter != 0 && inter != into_target)
!           emit_move_insn (into_target, inter);
! 
!         /* For a signed right shift, we must fill the word we are shifting
!            out of with copies of the sign bit.  Otherwise it is zeroed.  */
!         if (inter != 0 && binoptab != ashr_optab)
!           inter = CONST0_RTX (word_mode);
!         else if (inter != 0)
!           inter = expand_binop (word_mode, binoptab,
!                                 outof_input,
!                                 GEN_INT (BITS_PER_WORD - 1),
!                                 outof_target, unsignedp, next_methods);
! 
!         if (inter != 0 && inter != outof_target)
!           emit_move_insn (outof_target, inter);
!       }
!       else
!       {
!         rtx carries;
!         optab reverse_unsigned_shift, unsigned_shift;
! 
!         /* For a shift of less then BITS_PER_WORD, to compute the carry,
!            we must do a logical shift in the opposite direction of the
!            desired shift.  */
! 
!         reverse_unsigned_shift = (left_shift ? lshr_optab : ashl_optab);
! 
!         /* For a shift of less than BITS_PER_WORD, to compute the word
!            shifted towards, we need to unsigned shift the orig value of
!            that word.  */
! 
!         unsigned_shift = (left_shift ? ashl_optab : lshr_optab);
! 
!         carries = expand_binop (word_mode, reverse_unsigned_shift,
!                                 outof_input,
!                                 GEN_INT (BITS_PER_WORD - shift_count),
!                                 0, unsignedp, next_methods);
! 
!         if (carries == 0)
!           inter = 0;
!         else
!           inter = expand_binop (word_mode, unsigned_shift, into_input,
!                                 op1, 0, unsignedp, next_methods);
! 
!         if (inter != 0)
!           inter = expand_binop (word_mode, ior_optab, carries, inter,
!                                 into_target, unsignedp, next_methods);
! 
!         if (inter != 0 && inter != into_target)
!           emit_move_insn (into_target, inter);
! 
!         if (inter != 0)
!           inter = expand_binop (word_mode, binoptab, outof_input,
!                                 op1, outof_target, unsignedp, next_methods);
  
!         if (inter != 0 && inter != outof_target)
!           emit_move_insn (outof_target, inter);
!       }
! 
!       insns = get_insns ();
!       end_sequence ();
! 
!       if (inter != 0)
!       {
!         if (binoptab->code != UNKNOWN)
!           equiv_value = gen_rtx_fmt_ee (binoptab->code, mode, op0, op1);
!         else
!           equiv_value = 0;
  
!         emit_no_conflict_block (insns, target, op0, op1, equiv_value);
!         return target;
        }
      }
  
--- 1384,1454 ----
    if ((binoptab == lshr_optab || binoptab == ashl_optab
         || binoptab == ashr_optab)
        && class == MODE_INT
!       && (GET_CODE (op1) == CONST_INT || !optimize_size)
        && GET_MODE_SIZE (mode) == 2 * UNITS_PER_WORD
        && binoptab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing
        && ashl_optab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing
        && lshr_optab->handlers[(int) word_mode].insn_code != CODE_FOR_nothing)
      {
!       unsigned HOST_WIDE_INT shift_mask, double_shift_mask;
!       enum machine_mode op1_mode;
  
!       double_shift_mask = targetm.shift_truncation_mask (mode);
!       shift_mask = targetm.shift_truncation_mask (word_mode);
!       op1_mode = GET_MODE (op1) != VOIDmode ? GET_MODE (op1) : word_mode;
! 
!       /* Apply the truncation to constant shifts.  */
!       if (double_shift_mask > 0 && GET_CODE (op1) == CONST_INT)
!       op1 = GEN_INT (INTVAL (op1) & double_shift_mask);
! 
!       if (op1 == CONST0_RTX (op1_mode))
!       return op0;
! 
!       /* Make sure that this is a combination that expand_doubleword_shift
!        can handle.  See the comments there for details.  */
!       if (double_shift_mask == 0
!         || (shift_mask == BITS_PER_WORD - 1
!             && double_shift_mask == BITS_PER_WORD * 2 - 1))
!       {
!         rtx insns, equiv_value;
!         rtx into_target, outof_target;
!         rtx into_input, outof_input;
!         int left_shift, outof_word;
! 
!         /* If TARGET is the same as one of the operands, the REG_EQUAL note
!            won't be accurate, so use a new target.  */
!         if (target == 0 || target == op0 || target == op1)
!           target = gen_reg_rtx (mode);
! 
!         start_sequence ();
! 
!         /* OUTOF_* is the word we are shifting bits away from, and
!            INTO_* is the word that we are shifting bits towards, thus
!            they differ depending on the direction of the shift and
!            WORDS_BIG_ENDIAN.  */
! 
!         left_shift = binoptab == ashl_optab;
!         outof_word = left_shift ^ ! WORDS_BIG_ENDIAN;
! 
!         outof_target = operand_subword (target, outof_word, 1, mode);
!         into_target = operand_subword (target, 1 - outof_word, 1, mode);
! 
!         outof_input = operand_subword_force (op0, outof_word, mode);
!         into_input = operand_subword_force (op0, 1 - outof_word, mode);
! 
!         if (expand_doubleword_shift (op1_mode, binoptab,
!                                      outof_input, into_input, op1,
!                                      outof_target, into_target,
!                                      unsignedp, methods, shift_mask))
!           {
!             insns = get_insns ();
!             end_sequence ();
  
!             equiv_value = gen_rtx_fmt_ee (binoptab->code, mode, op0, op1);
!             emit_no_conflict_block (insns, target, op0, op1, equiv_value);
!             return target;
!           }
!         end_sequence ();
        }
      }
  
Index: config/arm/arm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.c,v
retrieving revision 1.399
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.399 arm.c
*** config/arm/arm.c    25 Aug 2004 09:52:09 -0000      1.399
--- config/arm/arm.c    31 Aug 2004 18:45:34 -0000
*************** static tree arm_get_cookie_size (tree);
*** 173,179 ****
  static bool arm_cookie_has_size (void);
  static bool arm_cxx_cdtor_returns_this (void);
  static void arm_init_libfuncs (void);
! 
  
  /* Initialize the GCC target structure.  */
  #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
--- 173,179 ----
  static bool arm_cookie_has_size (void);
  static bool arm_cxx_cdtor_returns_this (void);
  static void arm_init_libfuncs (void);
! static unsigned HOST_WIDE_INT arm_shift_truncation_mask (enum machine_mode);
  
  /* Initialize the GCC target structure.  */
  #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
*************** #define TARGET_RTX_COSTS arm_slowmul_rtx
*** 246,251 ****
--- 246,253 ----
  #undef  TARGET_ADDRESS_COST
  #define TARGET_ADDRESS_COST arm_address_cost
  
+ #undef TARGET_SHIFT_TRUNCATION_MASK
+ #define TARGET_SHIFT_TRUNCATION_MASK arm_shift_truncation_mask
  #undef TARGET_VECTOR_MODE_SUPPORTED_P
  #define TARGET_VECTOR_MODE_SUPPORTED_P arm_vector_mode_supported_p
  
*************** arm_vector_mode_supported_p (enum machin
*** 14307,14309 ****
--- 14309,14322 ----
  
    return false;
  }
+ 
+ /* Implement TARGET_SHIFT_TRUNCATION_MASK.  SImode shifts use normal
+    ARM insns and therefore guarantee that the shift count is modulo 256.
+    DImode shifts (those implemented by libgcc1.asm or by optabs.c)
+    guarantee no particular behavior for out-of-range counts.  */
+ 
+ static unsigned HOST_WIDE_INT
+ arm_shift_truncation_mask (enum machine_mode mode)
+ {
+   return mode == SImode ? 255 : 0;
+ }
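
For readers who want the arithmetic behind the subword/superword split without the RTL plumbing, here is a rough stand-alone C sketch of a doubleword left shift built only from in-range word shifts. It is not part of the patch. The names (dword, shl_doubleword, is_subword_by_mask, is_subword_by_sub) are made up for illustration, a 32-bit word is assumed, and the count is assumed to lie in [0, 2 * BITS_PER_WORD). The explicit zero-count test is a simplification of this sketch, needed because a C shift by the full word width is undefined.

```c
#include <stdint.h>

#define BITS_PER_WORD 32   /* assumption: 32-bit words */

/* Hypothetical two-word value: HI and LO halves of a 64-bit integer.  */
typedef struct { uint32_t hi, lo; } dword;

/* The two condition forms used to pick the subword case.  Both assume
   COUNT is in [0, 2 * BITS_PER_WORD).  */
static int
is_subword_by_mask (int count)
{
  return (count & BITS_PER_WORD) == 0;   /* OP1 & BITS_PER_WORD == 0 */
}

static int
is_subword_by_sub (int count)
{
  return count - BITS_PER_WORD < 0;      /* OP1 - BITS_PER_WORD < 0 */
}

/* Left shift of (HI, LO) by COUNT using only word shifts whose counts
   are in [0, BITS_PER_WORD).  For a left shift, LO is the word bits are
   shifted out of and HI is the word they are shifted into.  */
static dword
shl_doubleword (uint32_t hi, uint32_t lo, int count)
{
  dword r;

  if (!is_subword_by_sub (count))
    {
      /* Superword shift: LO moves into HI, LO is zeroed.  */
      r.hi = lo << (count - BITS_PER_WORD);
      r.lo = 0;
    }
  else if (count == 0)
    {
      /* Sketch-only special case: avoid a shift by BITS_PER_WORD.  */
      r.hi = hi;
      r.lo = lo;
    }
  else
    {
      /* Subword shift: the bits leaving LO are carried into HI.  */
      uint32_t carries = lo >> (BITS_PER_WORD - count);
      r.hi = (hi << count) | carries;
      r.lo = lo << count;
    }
  return r;
}
```

For example, shl_doubleword (0, 1, 32) takes the superword path and gives hi == 1, lo == 0, while shl_doubleword (1, 0x80000000, 1) takes the subword path and carries the top bit of LO into HI.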
