> > (Software cache coherency) It is possible,
> > but tricky, and at times unavoidably inefficient to build a
> > software-coherent SMP system. I have not heard of anyone
> > doing so with MIPS/Linux.
> How would it be possible? Any reference to the previous implementations?
A lot of work on software-coherent schemes was done in the
mid-to-late 1980s. Check the ASPLOS and ISCA proceedings
from that period for references. In essence, such schemes involve
identifying the critical regions at risk, placing barriers around
those regions, and following an explicit cache flush/purge protocol
on entry and exit. You can think of the more common MP "TLB
shootdown" protocols as a variant of a software cache coherence scheme.
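
To make the flush/purge idea concrete, here is a minimal single-threaded
simulation in C. It is only a sketch under stated assumptions, not code from
any of the 1980s systems: the two "CPUs", the one-line "caches", and every
name in it (cpu_read, cache_writeback, region_enter, and so on) are
hypothetical. The lock that would guard the critical region is assumed and
not modeled, since the simulation runs the CPUs one after the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINE_WORDS 4  /* words per simulated cache line (arbitrary) */
#define NUM_CPUS   2  /* simulated CPUs (arbitrary) */

/* Simulated main memory: a single shared line. */
static int shared_mem[LINE_WORDS];

/* Private, non-snooping copy of that line held by each simulated CPU. */
struct cache_line {
    int  data[LINE_WORDS];
    bool valid;
    bool dirty;
};
static struct cache_line cache[NUM_CPUS];

/* Read through the private cache, filling it from memory on a miss. */
static int cpu_read(int cpu, int idx)
{
    struct cache_line *c = &cache[cpu];
    assert(cpu >= 0 && cpu < NUM_CPUS && idx >= 0 && idx < LINE_WORDS);
    if (!c->valid) {
        memcpy(c->data, shared_mem, sizeof c->data);
        c->valid = true;
        c->dirty = false;
    }
    return c->data[idx];
}

/* Write-back cache: the store only touches the private copy. */
static void cpu_write(int cpu, int idx, int value)
{
    (void)cpu_read(cpu, idx);          /* allocate the line on a write miss */
    cache[cpu].data[idx] = value;
    cache[cpu].dirty = true;
}

/* "Flush": push a dirty line back to memory. */
static void cache_writeback(int cpu)
{
    struct cache_line *c = &cache[cpu];
    if (c->valid && c->dirty) {
        memcpy(shared_mem, c->data, sizeof c->data);
        c->dirty = false;
    }
}

/* "Purge": drop the private copy so the next read refetches from memory. */
static void cache_invalidate(int cpu)
{
    cache[cpu].valid = false;
    cache[cpu].dirty = false;
}

/* Protocol hooks: purge after taking the (assumed) lock, flush before
 * releasing it. */
static void region_enter(int cpu) { cache_invalidate(cpu); }
static void region_exit(int cpu)  { cache_writeback(cpu); }

static void reset(void)
{
    memset(shared_mem, 0, sizeof shared_mem);
    memset(cache, 0, sizeof cache);
}

/* No protocol: CPU 1 keeps returning the stale value it cached earlier. */
int demo_without_protocol(void)
{
    reset();
    (void)cpu_read(1, 0);              /* CPU 1 caches the line (value 0) */
    cpu_write(0, 0, 42);               /* CPU 0 updates only its own copy */
    return cpu_read(1, 0);
}

/* With the protocol: CPU 1 sees the value CPU 0 wrote. */
int demo_with_protocol(void)
{
    int v;
    reset();
    region_enter(1); (void)cpu_read(1, 0); region_exit(1);
    region_enter(0); cpu_write(0, 0, 42);  region_exit(0);
    region_enter(1); v = cpu_read(1, 0);   region_exit(1);
    return v;
}
```

In this model demo_without_protocol() returns the stale 0, while
demo_with_protocol() returns 42. The cost is visible too: every entry to a
critical region throws away the cached line, whether or not anyone else
changed it, which is one source of the inefficiency mentioned above.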
> I imagine you would need at least some kind of atomic operation (like
> working reliably (which itself may require cache coherency).
MIPS ll/sc, as defined and implemented, does require hardware
coherency support for correct multiprocessor operation. But one
can, in principle, construct a software-coherent SMP system even
in the absence of such a primitive - many of the implementations
of software coherent SMPs used software coherence precisely
because they were based on simple switch/crossbar interconnects
where snooping was not possible.
> Also, any such
> scheme should not require massive change in the programming.
Whether programs need to change depends on the coherency
and consistency models assumed by the program. Certainly
a naive multithreaded program that assumes an SGI-like model
could not be dropped onto a software-coherent MP system without
recompilation with specialized compilers at a minimum, and
more likely not without recoding. On the other hand, if one's objective
is to run multiple, independent programs on different CPUs in
an SMP system, only the OS should need to change. It has to
deal with the coherence issues for shared user pages
and shared kernel data structures, and it has to ensure that any
multithreaded application that is not explicitly set up to handle
software cache coherency has its threads bound to the same
CPU and caches (which defeats some of the point of having a
multithreaded program, I know, but...).
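
For illustration only, this is roughly what "bound to the same CPU" looks
like from user space on a current Linux/glibc system, using
sched_setaffinity(). That interface is my assumption for the example and is
not something the discussion above relies on; in the scenario described, the
kernel would enforce the binding itself. The helper names
(first_allowed_cpu, bind_to_cpu, allowed_cpu_count) are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for sched_setaffinity() and the CPU_* macros */
#endif
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered CPU the caller may run on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
int bind_to_cpu(int cpu)
{
    cpu_set_t set;
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);  /* pid 0 = caller */
}

/* Number of CPUs in the caller's affinity mask, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

Calling bind_to_cpu(first_allowed_cpu()) before creating any threads is
enough for the whole program, because threads created afterwards inherit
the caller's affinity mask. All threads then share one CPU and one set of
caches, so no software coherence protocol is needed between them.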