-march=mips32r2 is to allow the compiler to generate branch-likely
instructions -- they're deprecated for generic mips32 code but carry no
penalty on the 4K core. It will also cause the compiler's "4kc" pipeline
description to be used for instruction scheduling, instead of the
default "24kc", but that should only change the order of instructions.
Do you mean that the code will run faster when using -march=4ksd?
Yes, though the difference is likely to be small. The -march=4ksd option
also enables the SmartMIPS ASE, but you've already done that explicitly,
so it shouldn't really make a significant difference to the code size.
Yes, but I do have a significant difference. :(
Then you'll have to have a look at the resulting disassembled code and
figure out what's changed. :)
Thinking about this in more detail:
1) Using -march=4ksd reduces the cost of a multiply by 1 cycle
(from 5 to 4 cycles), so a few more constant multiplications, previously
expanded into a sequence of shifts, adds and subs, may now be replaced
by a shorter sequence of "li" and "mul" instructions.
2) Enabling branch-likely may allow some instructions to be moved into a
branch delay slot which previously couldn't be -- but usually these are
duplicates of the code at the original branch target, so they have little
effect on overall code size.
3) Using -march=mips32r2 with -O1 and above (but not -Os) enables 64-bit
alignment of functions and frequently-used branch targets (e.g. loop
headers), whereas -march=4ksc will not do that. This will add some
"nops" to the code.
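
To make points 1 and 3 a bit more concrete, here is a rough C sketch. The
function names and the constant 10 are made up for illustration; what the
compiler actually emits depends on the GCC version and its cost tables, so
treat this as a picture of the trade-off rather than real compiler output.
The padding helper assumes 4-byte-aligned addresses and 4-byte nops.

```c
#include <stdint.h>

/* Point 1: x * 10 written as shifts and adds, the kind of expansion a
   compiler may choose when it considers a multiply expensive. */
uint32_t mul10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);   /* 8x + 2x */
}

/* The same multiplication left as a plain multiply; with a cheaper
   multiply cost the compiler may prefer "li" + "mul" for this. */
uint32_t mul10_mul(uint32_t x)
{
    return x * 10u;
}

/* Point 3: number of 4-byte nops needed to pad a 4-byte-aligned address
   up to the next 8-byte (64-bit) boundary. */
unsigned nops_to_align8(uint32_t addr)
{
    return ((8u - (addr & 7u)) & 7u) / 4u;
}
```

Both multiply variants return the same value for any input; only the
instruction sequence differs. With these assumptions each aligned function
or branch target costs at most one extra nop.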