| To: | Franck <vagabon.xyz@gmail.com> |
|---|---|
| Subject: | Re: [RFC] Optimize swab operations on mips_r2 cpu |
| From: | Nigel Stephens <nigel@mips.com> |
| Date: | Thu, 26 Jan 2006 20:25:09 +0000 |
| Cc: | "Kevin D. Kissell" <kevink@mips.com>, linux-mips@linux-mips.org |
| In-reply-to: | <cda58cb80601261002w6eb02249k@mail.gmail.com> |
| Organization: | MIPS Technologies |
| Original-recipient: | rfc822;linux-mips@linux-mips.org |
| References: | <cda58cb80601250136p5ee350e6g@mail.gmail.com> <cda58cb80601250632r3e8f7b9en@mail.gmail.com> <20060125150404.GF3454@linux-mips.org> <cda58cb80601251003m6ba4379w@mail.gmail.com> <43D7C050.5090607@mips.com> <cda58cb80601260702wf781e70l@mail.gmail.com> <005101c6228c$6ebfb0a0$10eca8c0@grendel> <43D8F000.9010106@mips.com> <cda58cb80601260831i61167787g@mail.gmail.com> <43D8FF16.40107@mips.com> <cda58cb80601261002w6eb02249k@mail.gmail.com> |
| Sender: | linux-mips-bounce@linux-mips.org |
| User-agent: | Debian Thunderbird 1.0.2 (X11/20050817) |
Franck wrote:

> 2006/1/26, Nigel Stephens <nigel@mips.com>:
>
>> 1) Using -march=4ksd reduces the cost of a multiply by 1 instruction
>> (from 5 to 4 cycles), so a few more constant multiplications,
>> previously expanded into a sequence of shifts, adds and subs, may now
>> be replaced by a shorter sequence of "li" and "mul" instructions.
>
> Is it really specific to the 4ksd CPU? Could this behaviour be
> triggered by other options?

Yes, when you use -Os the compiler uses the instruction cost (1) of a mul, instead of the cycle cost (4), so it will be even more likely to replace the expanded shift/add sequence by a mul.

>        text    data     bss      dec     hex filename
>     2099642  110784   81956  2292382  22fa9e vmlinux-4ksd
>     2136269  110784   81956  2329009  2389b1 vmlinux-mips32r2
>     1953086  110784   81956  2145826  20be22 vmlinux-4ksd-Os
>     1954489  110784   81956  2147229  20c39d vmlinux-mips32r2-Os
>
> I now have to check that your first and second points don't have too
> much bad impact on the overall speed, although I don't know how to
> measure that... But if so, I could safely use the -march=mips32r2 -Os
> options.

You could, but why not stick with -march=4ksd if that's your CPU of choice? It appears to result in marginally smaller code even when using -Os, and should have (slightly) better performance than a generic mips32r2 kernel.

Nigel
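
For readers following the multiply-cost point, here is a minimal sketch in C of the two forms being compared: a plain constant multiplication, and the shift/add/sub expansion a compiler may substitute for it. The function names and the constants 10 and 7 are illustrative assumptions, not taken from the kernel source. Which form the compiler actually emits depends on its version, the -march cost tables and whether -Os is in effect.

```c
#include <assert.h>
#include <stdint.h>

/* Plain constant multiplications: the compiler decides whether to emit
 * "li" + "mul" or an expanded shift/add/sub sequence. */
uint32_t mul_by_10(uint32_t x) { return x * 10u; }
uint32_t mul_by_7(uint32_t x)  { return x * 7u; }

/* Hand-written expansions of the same products (hypothetical helpers):
 * 10*x = 8*x + 2*x and 7*x = 8*x - x.
 * Unsigned arithmetic wraps modulo 2^32, so these match the
 * multiplications for every input, including on overflow. */
uint32_t mul_by_10_shift_add(uint32_t x) { return (x << 3) + (x << 1); }
uint32_t mul_by_7_shift_sub(uint32_t x)  { return (x << 3) - x; }

/* Check the two forms against each other over a sample of inputs,
 * including values large enough to wrap. Returns 1 on success. */
int check_equivalence(void)
{
    for (uint32_t i = 0; i < 100000u; i++) {
        uint32_t x = i * 42949u; /* spreads the samples across the 32-bit range */
        assert(mul_by_10(x) == mul_by_10_shift_add(x));
        assert(mul_by_7(x) == mul_by_7_shift_sub(x));
    }
    return 1;
}
```

Compiling the first two functions to assembly (gcc -S) once with -march=4ksd and once with -march=mips32r2, with and without -Os, is one way to see which sequence is chosen in each case.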