
Re: dcache aliasing problem on fork

To: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Subject: Re: dcache aliasing problem on fork
From: Jun Sun <jsun@junsun.net>
Date: Fri, 4 Feb 2005 09:44:10 -0800
Cc: linux-mips@linux-mips.org, ralf@linux-mips.org
In-reply-to: <20050204.183813.132760959.nemoto@toshiba-tops.co.jp>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <20050204.183813.132760959.nemoto@toshiba-tops.co.jp>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mutt/1.4.1i
On Fri, Feb 04, 2005 at 06:38:13PM +0900, Atsushi Nemoto wrote:
> There is a dcache aliasing problem on a preemptible kernel (or an SMP
> kernel, perhaps) when a multi-threaded program calls fork().
> 
> 1. Suppose a process contains two threads (T1 and T2).  Thread T1
>    calls fork(), so dup_mmap() is called in T1's context.
> 
> static inline int dup_mmap(struct mm_struct * mm, struct mm_struct * oldmm)
> {
>       ...
>       flush_cache_mm(current->mm);
>       /* A */
>       ...
>       (write-protect all Copy-On-Write pages)
>       ...
>       /* B */
>       flush_tlb_mm(current->mm);
>       ...
> }
> 
> 2. When preemption happens between A and B (or on an SMP kernel),
>    thread T2 can run and modify data on COW pages without a page fault
>    (the modified data will stay in the cache).
> 
> 3. Some time after fork() has completed, thread T2 may take a page
>    fault on a write-protected COW page.
> 
> 4. The data of the COW page will then be copied to a newly allocated
>    physical page (copy_cow_page()), which reads the data via the kernel
>    mapping.  The kernel mapping can have a different 'color' from the
>    user-space mapping of thread T2 (dcache aliasing).  Therefore
>    copy_cow_page() will copy stale data, and the modified data in the
>    cache will be lost.
> 
> 
> How should we fix this problem?  Any ideas?
> 
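
To make the 'color' in step 4 concrete, here is a small sketch of how two
virtual addresses that map the same physical page can land in different
cache lines.  The geometry (4 KiB pages, 16 KiB per dcache way) and the
names cache_color() and mappings_alias() are assumptions picked for
illustration, not values or helpers taken from any particular CPU or
kernel tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical geometry, chosen only for illustration. */
#define PAGE_SHIFT       12u                          /* 4 KiB pages */
#define DCACHE_WAY_SIZE  (16u * 1024u)                /* 16 KiB per way */
#define NUM_COLORS       (DCACHE_WAY_SIZE >> PAGE_SHIFT)

static_assert((NUM_COLORS & (NUM_COLORS - 1u)) == 0,
              "the colour mask below needs a power-of-two colour count");

/*
 * In a virtually indexed cache whose way size exceeds the page size,
 * the index uses virtual address bits above the page offset.  Those
 * bits are the 'color' of the mapping.
 */
static unsigned int cache_color(uintptr_t vaddr)
{
    return (unsigned int)((vaddr >> PAGE_SHIFT) & (NUM_COLORS - 1u));
}

/*
 * Two mappings of the same physical page can hold inconsistent cached
 * copies only if their colours differ.
 */
static bool mappings_alias(uintptr_t va1, uintptr_t va2)
{
    return cache_color(va1) != cache_color(va2);
}
```

With these assumed numbers, a user address of 0x00401000 has color 1 and
a kernel address of 0x80004000 has color 0, so a copy through the kernel
mapping would not see dirty lines written through the user mapping.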

It seems to me a naive solution is to introduce a spinlock to make all
three operations atomic.  You would flush the TLB first and make the
relevant TLB fault handling synchronize on this spinlock as well.

In theory that should fix the problem, but the spinlock might be held
for too long in dup_mmap().
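
A rough user-space model of that ordering, only to show the idea.  This is
not kernel code: struct mm_model, fork_lock and the model_*() functions
are all hypothetical names, and the "TLB" and "cache" are just flags.  A
real fault handler would spin on the lock; the model reports
WRITE_BLOCKED instead so it can be exercised from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum write_result { WRITE_DONE, WRITE_BLOCKED, WRITE_COW_FAULT };

/* Hypothetical stand-in for the state shared by T1 and T2. */
struct mm_model {
    atomic_flag fork_lock;   /* the proposed spinlock (assumption) */
    bool tlb_valid;          /* T2 still has a writable TLB entry */
    bool cache_dirty;        /* dirty lines under the user mapping */
    bool write_protected;    /* COW pages are write-protected */
};

#define MM_MODEL_INIT { ATOMIC_FLAG_INIT, true, false, false }

/* Fork side, part 1: take the lock, then flush the TLB first. */
static void model_fork_begin(struct mm_model *mm)
{
    while (atomic_flag_test_and_set_explicit(&mm->fork_lock,
                                             memory_order_acquire))
        ;                    /* spin until the lock is free */
    mm->tlb_valid = false;   /* every later access must take a TLB fault */
}

/* Fork side, part 2: flush the cache, write-protect, drop the lock. */
static void model_fork_finish(struct mm_model *mm)
{
    assert(!mm->tlb_valid);  /* model_fork_begin() must have run */
    mm->cache_dirty = false;
    mm->write_protected = true;
    atomic_flag_clear_explicit(&mm->fork_lock, memory_order_release);
}

/* T2 side: a store to a COW page. */
static enum write_result model_user_write(struct mm_model *mm)
{
    enum write_result res;

    if (mm->tlb_valid) {     /* TLB hit: the store goes straight to cache */
        mm->cache_dirty = true;
        return WRITE_DONE;
    }

    /* TLB miss: the fault handler has to synchronize on the same lock. */
    if (atomic_flag_test_and_set_explicit(&mm->fork_lock,
                                          memory_order_acquire))
        return WRITE_BLOCKED;    /* fork is between begin and finish */

    if (mm->write_protected) {
        res = WRITE_COW_FAULT;   /* normal COW path, no stale dirty lines */
    } else {
        mm->tlb_valid = true;    /* refill the TLB and complete the store */
        mm->cache_dirty = true;
        res = WRITE_DONE;
    }
    atomic_flag_clear_explicit(&mm->fork_lock, memory_order_release);
    return res;
}
```

In this model a write attempted between model_fork_begin() and
model_fork_finish() cannot dirty the cache, which is the property we
want.  The cost is also visible: the lock covers the whole write-protect
pass.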

BTW, is this problem real or hypothetical?

Jun
