Big loud bell began ringing. The RM7000 fetches and decodes multiple
instructions in one go. And just like the E9000 cores it does
throw an exception if it doesn't like one of the opcodes even if that
doesn't actually get executed. The kernel has a workaround for this
PMC-Sierra peculiarity (I call it a bug) but it's only being activated
for E9000 platforms.
We have had similar problems with the shell on an RM7000-based system. It
seems the reason given above is only half of the problem; the other half
is that Linux handles the RM7000 cache hierarchy incorrectly. One visible
effect is userspace errors in signal delivery trampolines.
Let's imagine we deliver a signal to an application: we write the signal
trampoline instructions to the stack, write back (and invalidate) the
corresponding dcache line, and invalidate the corresponding icache line.
That's all, and we think that we can safely execute the trampoline, but
this is wrong on the RM7000! Our trampoline is now in the scache, and
everything seems to be OK, but after some number of loads/stores the
corresponding scache line can be moved to the dcache, replaced in the
scache by other data, and not written to memory. This is a feature of the
RM7000 caches: its dcache is not a subset of its scache. You can find a
possible scenario of a similar (but not identical) cache line transfer in
the RM7000 manual (7.1.5 Orphaned Cache Lines). After that, it is possible
that when the signal trampoline is executed, the icache fetches the old
memory contents instead of the instructions written. If we want to execute
instructions written by the CPU, we must not only write back the
corresponding dcache lines, but also write back the corresponding scache
lines afterwards. The error is very sensitive to kernel/user code and
data arrangement; it can be visible with one kernel configuration and
irreproducible with another.
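To make the sequence above concrete, here is a rough toy model in C. It is
only a sketch under strong assumptions: one address, one line per cache, and
a "later_traffic" step that stands in for the load/store activity that moves
the line from the scache back to the dcache. All names (struct model,
run_scenario, later_traffic, OLD_WORD, NEW_WORD) are made up for
illustration; this is not kernel code and not an accurate RM7000 model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: a single address with one line in each cache. */
#define OLD_WORD 0x00000000u /* stale contents already in memory */
#define NEW_WORD 0x12345678u /* arbitrary stand-in for a trampoline word */

struct line {
    bool valid;
    bool dirty;
    uint32_t data;
};

struct model {
    uint32_t memory;
    struct line dcache;
    struct line scache;
    struct line icache;
};

/* CPU store: the line becomes dirty in the dcache only. */
static void cpu_store(struct model *m, uint32_t word)
{
    m->dcache = (struct line){ .valid = true, .dirty = true, .data = word };
}

/* dcache writeback-invalidate: dirty data goes to the scache, not memory. */
static void dcache_writeback_inv(struct model *m)
{
    if (m->dcache.valid && m->dcache.dirty)
        m->scache = m->dcache;
    m->dcache.valid = false;
}

/* scache writeback-invalidate: dirty data finally reaches memory. */
static void scache_writeback_inv(struct model *m)
{
    if (m->scache.valid && m->scache.dirty)
        m->memory = m->scache.data;
    m->scache.valid = false;
}

static void icache_inv(struct model *m)
{
    m->icache.valid = false;
}

/*
 * Assumed stand-in for later loads/stores: the line migrates from the
 * scache to the dcache and its scache slot is reused without a writeback.
 */
static void later_traffic(struct model *m)
{
    if (m->scache.valid) {
        m->dcache = m->scache;
        m->scache.valid = false;
    }
}

/* Instruction fetch: refills from the scache or memory, never the dcache. */
static uint32_t ifetch(struct model *m)
{
    if (!m->icache.valid) {
        uint32_t word = m->scache.valid ? m->scache.data : m->memory;
        m->icache = (struct line){ .valid = true, .dirty = false, .data = word };
    }
    return m->icache.data;
}

/* Write the "trampoline", flush, let other traffic run, then fetch it. */
uint32_t run_scenario(bool flush_scache)
{
    struct model m = { .memory = OLD_WORD };

    cpu_store(&m, NEW_WORD);
    dcache_writeback_inv(&m);
    assert(!m.dcache.valid);      /* the dcache copy is gone at this point */
    if (flush_scache)
        scache_writeback_inv(&m); /* the extra step argued for above */
    icache_inv(&m);
    later_traffic(&m);
    return ifetch(&m);
}
```

In this model run_scenario(false) returns OLD_WORD (the stale memory
contents), while run_scenario(true) returns NEW_WORD, because the scache
writeback puts the data in memory before the line can be orphaned.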
The problem affects not only the signal trampoline flush to memory, but
most cases of icache invalidation in the kernel.
Sergey Rogozhkin.