On Sat, Nov 01, 2008 at 08:33:03PM +0000, Maciej W. Rozycki wrote:
> > There are two ways we could handle this:
> >
> > - Make -mfix-r10000 require -mbranch-likely. (It mustn't _imply_
> > -mbranch-likely. It should simply check that -mbranch-likely is
> > already in effect.)
> >
> > - Make -mfix-r10000 insert nops when -mbranch-likely is not in effect.
>
> If I recall right, there is something special about the pipeline in this
> context making the branch-likely instructions the only ones that work.
> Which would make the option you proposed first the only viable. I am not
> absolutely sure and I have no reference handy. Perhaps Ralf or someone at
> linux-mips will know?

There are two possible workarounds. The one which IRIX and the Linux
kernel are using is based on the branch-likely instruction. It works
because R10000 family processors have a fairly cheesy branch prediction
for branch-likely (unlike all MIPS32 and MIPS64 processors I know of!)
which predicts branch-likely instructions as always taken. So if an SC
instruction succeeds, the loop closure branch of the usual LL/SC loop
will be mispredicted and the pipeline restarted.
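
To make the branch-likely variant concrete, here is a rough sketch of such an LL/SC loop as GCC inline assembly. The function name is made up for illustration and this is not the actual IRIX or kernel code; it assumes a MIPS target whose ISA level accepts ll/sc/beqzl and the assembler's default reorder mode, which fills the branch delay slot with a nop. On any other host it falls back to a compiler builtin so the sketch still builds and runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (the name is an assumption, not taken from IRIX or
 * the kernel): atomically add inc to *p and return the new value. */
int r10k_atomic_add_return(volatile int *p, int inc)
{
    assert(p != NULL);
#if defined(__mips__)
    int old, tmp;
    __asm__ __volatile__(
        "1:  ll     %0, %2      \n"  /* old = *p (load linked)            */
        "    addu   %1, %0, %3  \n"  /* tmp = old + inc                   */
        "    sc     %1, %2      \n"  /* try the store; tmp = 0 on failure */
        "    beqzl  %1, 1b      \n"  /* retry on failure, branch-likely   */
        : "=&r"(old), "=&r"(tmp), "+m"(*p)
        : "Ir"(inc)
        : "memory");
    return old + inc;
#else
    /* Non-MIPS hosts: plain GCC/Clang atomic builtin, no workaround. */
    return __atomic_add_fetch(p, inc, __ATOMIC_SEQ_CST);
#endif
}
```

When the SC succeeds, the beqzl falls through, which the R10000 predicted wrongly, so the pipeline is restarted on the way out of the loop.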

The alternative is to put enough NOPs (up to 28) after the loop closure
branch to avoid a sequence of 4 problematic instructions being active in
the pipeline at the same time.
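
As a sketch of what the padding variant might look like, the loop below uses an ordinary beqz followed by a block of NOPs emitted with the GNU assembler's .rept directive. The function and macro names are invented for illustration, and the count of 28 is simply the worst-case figure quoted above, not a measured value. Again this assumes a MIPS target and default reorder mode; other hosts take the builtin fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: pad with the worst-case count mentioned above. */
#define R10K_PAD_NOPS 28
#define R10K_STR_(x) #x
#define R10K_STR(x)  R10K_STR_(x)

/* Hypothetical helper: atomically add inc to *p and return the new value,
 * padding with NOPs after the loop branch instead of using beqzl. */
int r10k_atomic_add_return_padded(volatile int *p, int inc)
{
    assert(p != NULL);
#if defined(__mips__)
    int old, tmp;
    __asm__ __volatile__(
        "1:  ll     %0, %2      \n"  /* old = *p (load linked)            */
        "    addu   %1, %0, %3  \n"  /* tmp = old + inc                   */
        "    sc     %1, %2      \n"  /* try the store; tmp = 0 on failure */
        "    beqz   %1, 1b      \n"  /* ordinary branch back on failure   */
        "    .rept  " R10K_STR(R10K_PAD_NOPS) "\n"
        "    nop                \n"  /* padding after the loop branch     */
        "    .endr              \n"
        : "=&r"(old), "=&r"(tmp), "+m"(*p)
        : "Ir"(inc)
        : "memory");
    return old + inc;
#else
    /* Non-MIPS hosts: plain GCC/Clang atomic builtin, no workaround. */
    return __atomic_add_fetch(p, inc, __ATOMIC_SEQ_CST);
#endif
}
```

The obvious cost is code size: every LL/SC loop grows by the full padding unless something can prove that fewer NOPs are enough.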

SSNOP won't cut it, btw. SSNOPs don't have any influence on the predecode
and reordering buffers, even assuming the R10000 actually honors SSNOP.
Implementing the special treatment of SSNOP (which is encoded as
SLL $0, $0, 1) just doesn't make sense for an R10000 calibre processor.
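
For reference, the encoding point can be checked with a few lines of C. The helper below is a hypothetical illustration that builds the instruction word for "sll rd, rt, sa" from the standard R-type field layout (opcode SPECIAL = 0, rs = 0, function SLL = 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode the MIPS instruction "sll rd, rt, sa".
 * The opcode, rs and function fields are all zero for SLL, so only
 * rt (bits 20..16), rd (bits 15..11) and sa (bits 10..6) contribute. */
uint32_t mips_encode_sll(unsigned rd, unsigned rt, unsigned sa)
{
    assert(rd < 32 && rt < 32 && sa < 32);
    return ((uint32_t)rt << 16) | ((uint32_t)rd << 11) | ((uint32_t)sa << 6);
}
```

With this, mips_encode_sll(0, 0, 0) gives 0x00000000, the plain NOP, and mips_encode_sll(0, 0, 1) gives 0x00000040, the SSNOP. The two differ only in the shift amount field, so a core that does not special-case that value sees just another shift into $0.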

Is gcc capable of guaranteeing a certain minimum number of instructions
between one LL and another LL instruction? Then this knowledge could be
used to avoid the branch likely or cut down the padding NOPs.

Ralf