Re: [RFC] Optimize swab operations on mips_r2 cpu

To: Franck <>
Subject: Re: [RFC] Optimize swab operations on mips_r2 cpu
From: Nigel Stephens <>
Date: Thu, 26 Jan 2006 20:25:09 +0000
Cc: "Kevin D. Kissell" <>,
In-reply-to: <>
Organization: MIPS Technologies
References: <> <> <> <> <> <> <005101c6228c$6ebfb0a0$10eca8c0@grendel> <> <> <> <>
User-agent: Debian Thunderbird 1.0.2 (X11/20050817)

Franck wrote:

2006/1/26, Nigel Stephens <>:
1) Using -march=4ksd reduces the cost of a multiply by 1 cycle
(from 5 to 4 cycles), so a few more constant multiplications, previously
expanded into a sequence of shifts, adds and subs, may now be replaced
by a shorter sequence of "li" and "mul" instructions.

Is it really specific to the 4ksd CPU? Could this behaviour be triggered
by other options?

Yes. With -Os the compiler uses the instruction cost of a mul (1) instead of its cycle cost (4), so it is even more likely to replace the expanded shift/add sequence with a mul.
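
As a rough illustration of the trade-off described above, here is a minimal C sketch. It is not GCC's actual cost model. The function names and the `prefer_mul` rule are hypothetical, and the rule assumes the constant can be loaded with a single "li" costing 1 unit.

```c
#include <assert.h>
#include <stdint.h>

/* x * 10 expanded into shifts and adds: 10 = 8 + 2 (two shifts, one add). */
static uint32_t times10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* The same product written so the compiler is free to emit "li" + "mul". */
static uint32_t times10_mul(uint32_t x)
{
    return x * 10u;
}

/*
 * Hypothetical, simplified cost check (an assumption, not GCC code):
 * prefer the multiply when loading the constant (assumed 1 unit for "li")
 * plus the multiply itself is cheaper than the expanded sequence.
 */
static int prefer_mul(int expanded_cost, int mul_cost)
{
    assert(expanded_cost > 0 && mul_cost > 0);
    return 1 + mul_cost < expanded_cost;
}
```

Under this simplified rule, the three-operation expansion of x * 10 is kept when a mul costs 4 (prefer_mul(3, 4) is 0) but replaced when it costs 1, as with -Os (prefer_mul(3, 1) is 1). Likewise an expansion of cost 6 is kept with a 5-cycle multiply and replaced with a 4-cycle one.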

  text    data     bss     dec     hex filename
2099642  110784   81956 2292382  22fa9e vmlinux-4ksd
2136269  110784   81956 2329009  2389b1 vmlinux-mips32r2
1953086  110784   81956 2145826  20be22 vmlinux-4ksd-Os
1954489  110784   81956 2147229  20c39d vmlinux-mips32r2-Os

I now have to check that your first and second points don't hurt
overall speed too much, although I don't know how to measure that.
If they don't, I could safely use -march=mips32r2 -Os.

You could, but why not stick with -march=4ksd if that's your CPU of choice? It appears to result in marginally smaller code even when using -Os (1953086 vs. 1954489 bytes of text), and should have (slightly) better performance than a generic mips32r2 kernel.
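
The size figures quoted in this message can be cross-checked with a few lines of C. This is only a sketch: the numbers are copied from the table above, while the struct and function names are made up for illustration.

```c
/* One row of `size` output; values are copied from the table in this message. */
struct size_row {
    const char *filename;
    long text, data, bss;
};

static const struct size_row rows[] = {
    { "vmlinux-4ksd",        2099642, 110784, 81956 },
    { "vmlinux-mips32r2",    2136269, 110784, 81956 },
    { "vmlinux-4ksd-Os",     1953086, 110784, 81956 },
    { "vmlinux-mips32r2-Os", 1954489, 110784, 81956 },
};

/* Total image size, i.e. the "dec" column: text + data + bss. */
static long total_size(const struct size_row *r)
{
    return r->text + r->data + r->bss;
}

/* Text bytes saved by build `a` relative to build `b`. */
static long text_saving(const struct size_row *a, const struct size_row *b)
{
    return b->text - a->text;
}
```

With these rows, -march=4ksd saves 36627 bytes of text over -march=mips32r2 without -Os, and 1403 bytes with -Os.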
