Thought that patch went unnoticed and almost forgot about it myself...
Nice to see that it didn't slip through, and thanks for taking the time
About your comments:
>>>>> "Ralf" == Ralf Baechle <firstname.lastname@example.org> writes:
> On Sat, Aug 20, 2011 at 12:46:25PM +0200, David Kuehling wrote:
>> this patch adds a MIPS assembler version of the twofish cipher
>> algorithm. x86(_64) had an assembler version of twofish for some
>> time now, giving it an "unfair" advantage against the not so common
> Lots of trailing whitespace in that patch. scripts/checkpatch.pl
> would have warned about those …
> +#if __mips64 +# define HAVE_64BIT +#endif
> +#if _MIPS_SZPTR == 32 +# define PTRADD addu
> PTR_ADDU from <asm/asm.h> does the same thing as PTRADDU so the entire
> ifdefery above can go away.
Good point, that'll neatly clean the patch up.
> Finally the use of register names starting with $ is a bit obscure.
> Kernel code needs to build for N64 and O32 but of those only N64 has
> registers called $ta0 .. $ta3 which are the equivalent to registers $8
> .. $11.
> And in O32 registers $t0 ... $t3 also aliases for $8 .. $11. I
> haven't fully analyzed the code to ensure that there is no register
> conflict arising from that.
Didn't find another way to make the code work with O32 and N64.
$ta0..$ta3 can replace registers $t4..$t7 that are missing on N64 to
make room for additional argument registers $a4..$a7. This works as
long as $a4..$a7 aren't used as well.
Looking at asm/regdef.h I see that #define ta0 etc. is missing for O32,
making that trick impossible. Maybe we should add them there as well?
> I was surprised that gas assembles the code at all. $ta0 .. $ta3 are
> N32 / N64 register names and consider gas permitting the use of these
> registers in O32 a bug - but see my other posting to the binutils list
> for that.
Yes, that feature may be obscure, had to grep through binutil sources to
find out about it...
> Improvment suggestion for le32_fromto_cpu - MIPS 32/64 R2 CPUs can use
> the wsbh instruction to faster endianess swapping:
I completely missed the wsbh opcode. Knew about rotr, but given that
even the recent Loongson-2f doesn't support it, I was reluctant to add
code for it (plus i won't be able to benchmark it anyways).
> This code fragment is from <asm/swab.h>.
> + /* if we turned this into 64-bit ops, we get endianess issues on +
> big-endian mips, plus alignment problems */
> Some CPUs (Cavium Octeon) handle unaligned loads in hardware, on
> others a combination of LDL / LDR and SDL / SDR could be used to
> handle the unaligned loads and DSBH / DSHD (see __arch_swab64 in
> swab.h) could be used on MIPS64 R2 CPUs to handle the alignment
> issues. Or a rotate - have to think about it.
There is an alignmask field in the crypto_alg struct, that might prevent
alignment issues. Didn't have an in-depth look at how that works
though. Still there would be endianess issues, given that twofish is
designed to work on words of 32-bit.
I think the load/store code of the crypto routine does not have much
influence on performance, as it is outside the main loop (including the
endianess conversion). There is a rotation operation within the loop,
that could be sped up using rotr opcode, giving maybe 2% performance
gain. Not sure whether that's worth another round of #ifdefs and twice
What'd be the best way to check for rot opcode support? Use
CONFIG_CPU_MIPSR2? Check gcc define __mips>=32 && __mips_isa_rev>=2
(grepping through kernel asm files, I don't find any asm sources that
I'm currently a little swamped with work, btw, might take me a few days
to prepare an updated patch.
GnuPG public key: http://dvdkhlng.users.sourceforge.net/dk.gpg
Fingerprint: B17A DC95 D293 657B 4205 D016 7DEF 5323 C174 7D40
Description: PGP signature