Many of us are aware of a hole in the current TLB flushing code that
could cause two processes to use the same ASID on an SMP machine.
Actually there are several problems:
1) get_new_mmu_context() and the following set_entryhi, etc. are
not called atomically in switch_mm() and activate_mm(). If
an IPI arrives in between and requests a local TLB flush, bad
things happen.
2) if local_flush_tlb_range() and local_flush_tlb_mm() are
called from an IPI, they may call get_new_mmu_context() which
can bump up the ASID generation number while the current active_mm
is totally unaware of it. Bad things will happen later.
3) during the time window after schedule() calls switch_mm() but
before switch_to(), current->active_mm may be valid but does not
really mean "current->active_mm" anymore. This is because
the "current" process will soon become "prev". The real active_mm
is actually "next->active_mm". Because of this, it is not
enough for those two IPI'ed flushing routines to just check
against current->active_mm. Long story short: bad
things will happen.
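
To make problem 2 concrete, here is a minimal user-space model, not
kernel code. It assumes an 8-bit hardware ASID with the generation
kept in the upper bits of the context value; the struct names, the
starting value 0x1fe and simulate_ipi_generation_bump() are all
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ASID_MASK          0xffu   /* assumed: 8-bit hardware ASID */
#define ASID_FIRST_VERSION 0x100u  /* assumed: first generation value */

struct mm  { uint32_t context; };  /* generation bits | hardware ASID */
struct cpu { uint32_t asid_cache;  /* last context value handed out */
             uint32_t entryhi_asid; /* ASID currently loaded in hardware */ };

/* Model of the allocator: hand out the next ASID, start a new
 * generation when the hardware ASID bits wrap around. */
static void get_new_mmu_context(struct mm *mm, struct cpu *cpu)
{
    uint32_t asid = ++cpu->asid_cache;

    if ((asid & ASID_MASK) == 0) {
        /* New generation: real code would flush the whole TLB here. */
        if (asid == 0)
            asid = cpu->asid_cache = ASID_FIRST_VERSION;
    }
    mm->context = asid;
}

/* A context is stale when its generation differs from the CPU's. */
static int context_is_stale(const struct mm *mm, const struct cpu *cpu)
{
    return ((mm->context ^ cpu->asid_cache) & ~ASID_MASK) != 0;
}

/* Returns 1 if a later mm ends up with the hardware ASID that the
 * running mm still has loaded from the previous generation. */
static int simulate_ipi_generation_bump(void)
{
    struct cpu cpu = { .asid_cache = 0x1fe, .entryhi_asid = 0 };
    struct mm a = { 0 }, b = { 0 }, c = { 0 };

    /* switch_mm(): A becomes the running mm with hardware ASID 0xff. */
    get_new_mmu_context(&a, &cpu);
    cpu.entryhi_asid = a.context & ASID_MASK;

    /* IPI: flushing some other mm B allocates a context for it and
     * bumps the generation. Nobody reloads EntryHi for A. */
    get_new_mmu_context(&b, &cpu);
    assert(context_is_stale(&a, &cpu));

    /* Allocations in the new generation eventually reuse A's ASID. */
    do {
        get_new_mmu_context(&c, &cpu);
    } while ((c.context & ASID_MASK) != cpu.entryhi_asid);

    return c.context != a.context &&
           (c.context & ASID_MASK) == (a.context & ASID_MASK);
}
```

In this model A keeps running with hardware ASID 0xff after the
generation bump, so any TLB entries it creates carry a tag that a
different mm is later given.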
It turns out that other arches have similar problems and have solved
them in various ways. Unfortunately I like none of them.
Here is one I am pretty happy with. It is very small and efficient.
And conceptually it is clean too. We basically keep the semantics
of ->mm and ->active_mm unchanged and only introduce a new bit
to mark which mm is the true owner of mmu hardware on a cpu.
The only downside is that the cpu_vm_mask variable no longer really
means "mask for blocking IPI" in this approach. It actually
indicates whether current->active_mm is really active or not.
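
The ownership-bit idea can be sketched roughly as follows. This is
a user-space model under stated assumptions, not the patch itself:
the sk_* names and the struct layout are hypothetical, the ASID is
assumed to be 8 bits wide, and sk_switch_mm() is assumed to run with
interrupts disabled so that an IPI cannot land between its steps.

```c
#include <assert.h>
#include <stdint.h>

#define SK_NR_CPUS       2
#define SK_ASID_MASK     0xffu   /* assumed: 8-bit hardware ASID */
#define SK_FIRST_VERSION 0x100u

struct sk_mm {
    uint32_t context[SK_NR_CPUS]; /* per-CPU generation|ASID, 0 = none */
    unsigned long cpu_vm_mask;    /* bit n set: mm owns the MMU on CPU n */
};

struct sk_cpu {
    uint32_t asid_cache;          /* last context value handed out */
    uint32_t entryhi;             /* ASID currently loaded in hardware */
};

static void sk_new_context(struct sk_mm *mm, struct sk_cpu *c, int cpu)
{
    uint32_t asid = ++c->asid_cache;

    if (asid == 0)                /* counter wrapped completely */
        asid = c->asid_cache = SK_FIRST_VERSION;
    mm->context[cpu] = asid;
}

/* Assumed to be called with interrupts off on this CPU. */
static void sk_switch_mm(struct sk_mm *prev, struct sk_mm *next,
                         struct sk_cpu *c, int cpu)
{
    /* Allocate a context if next has none or its generation is old. */
    if (((next->context[cpu] ^ c->asid_cache) & ~SK_ASID_MASK) != 0)
        sk_new_context(next, c, cpu);
    c->entryhi = next->context[cpu] & SK_ASID_MASK;

    /* Hand MMU ownership on this CPU from prev to next. */
    prev->cpu_vm_mask &= ~(1UL << cpu);
    next->cpu_vm_mask |= 1UL << cpu;
}

/* Flush of one mm, as it might be called from an IPI handler. */
static void sk_flush_tlb_mm(struct sk_mm *mm, struct sk_cpu *c, int cpu)
{
    if (mm->context[cpu] == 0)
        return;                   /* never ran here, nothing to flush */

    if (mm->cpu_vm_mask & (1UL << cpu)) {
        /* True owner: take a fresh ASID and reload the hardware. */
        sk_new_context(mm, c, cpu);
        c->entryhi = mm->context[cpu] & SK_ASID_MASK;
    } else {
        /* Not the owner: just invalidate; the next sk_switch_mm()
         * to this mm will allocate a new context. */
        mm->context[cpu] = 0;
    }
}
```

The point of the sketch is that the flush routine consults the
ownership bit rather than current->active_mm, so it gives the same
answer inside the switch_mm()/switch_to() window.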
Tested and passed the notorious fork/malloc test.
Let me know what you think.
Jun