linux-mips

Re: RM7k cache_flush_sigtramp

To: Fuxin Zhang <fxzhang@ict.ac.cn>
Subject: Re: RM7k cache_flush_sigtramp
From: Ralf Baechle <ralf@linux-mips.org>
Date: Wed, 6 Aug 2003 16:45:14 +0200
Cc: Adam Kiepul <Adam_Kiepul@pmc-sierra.com>, linux-mips@linux-mips.org
In-reply-to: <3F30FA1E.3000002@ict.ac.cn>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <9DFF23E1E33391449FDC324526D1F259017DF091@SJC1EXM02> <3F30DFB7.8030304@ict.ac.cn> <20030806115531.GA12161@linux-mips.org> <3F30FA1E.3000002@ict.ac.cn>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mutt/1.4.1i
On Wed, Aug 06, 2003 at 08:52:46PM +0800, Fuxin Zhang wrote:

> I am not sure. It is the standard X distribution from debian-woody. It is
> fairly easy to reproduce: just move the mouse around and click here and
> there, and it will die. I will check this later, but I think such a giant
> as the X server won't fork frequently.

The scenario I was describing is simply how we originally discovered the
bug.  Supposedly that was fixed, but your register dump and disassembly
show the exact fingerprint of that old problem, so I thought I should
describe it in the hope that it helps you.

> If the new process touches the COW page first, shouldn't it get a new page
> and leave the original page for the parent?
> If so, the parent should be able to see the trampoline content from the
> icache anyway (either L2 or memory should have the value), though the
> child may not?

RM7000 has a physically indexed cache.  That means that if the copy of the
page wasn't explicitly or implicitly written back to L2, whichever process
ends up with the copy of the page might fetch stale instructions from
memory - boom.
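
The failure mode described above can be sketched with a toy model. This is
not kernel code and not an accurate model of the RM7000: it assumes a
write-back D-cache, a separate I-cache that refills only from a backing
store standing in for L2/memory, and one cache slot per line. All names
(toy_machine, toy_store, toy_flush_for_exec, toy_fetch, run_trampoline)
and the sizes are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy geometry; the values are arbitrary and not taken from the RM7000. */
enum { LINE_WORDS = 8, NUM_LINES = 4, NUM_WORDS = LINE_WORDS * NUM_LINES };

#define STALE_WORD      0xdeadbeefu  /* old contents of the page */
#define TRAMPOLINE_WORD 0x0000000cu  /* MIPS "syscall", used only as a marker */

struct toy_machine {
    uint32_t backing[NUM_WORDS];  /* stands in for L2/memory */
    uint32_t dcache[NUM_WORDS];   /* write-back data cache */
    uint32_t icache[NUM_WORDS];   /* instruction cache */
    int dvalid[NUM_LINES];
    int ddirty[NUM_LINES];
    int ivalid[NUM_LINES];
};

static void copy_line(uint32_t *dst, const uint32_t *src, int line)
{
    memcpy(&dst[line * LINE_WORDS], &src[line * LINE_WORDS],
           LINE_WORDS * sizeof(uint32_t));
}

static void toy_init(struct toy_machine *m)
{
    memset(m, 0, sizeof(*m));
    for (int i = 0; i < NUM_WORDS; i++)
        m->backing[i] = STALE_WORD;
}

/* A store lands in the D-cache only; the backing store is not updated. */
static void toy_store(struct toy_machine *m, int word, uint32_t value)
{
    assert(word >= 0 && word < NUM_WORDS);
    int line = word / LINE_WORDS;
    if (!m->dvalid[line]) {
        copy_line(m->dcache, m->backing, line);
        m->dvalid[line] = 1;
    }
    m->dcache[word] = value;
    m->ddirty[line] = 1;
}

/* What a working flush has to do: write the dirty D-cache line back and
 * invalidate the matching I-cache line. */
static void toy_flush_for_exec(struct toy_machine *m, int word)
{
    assert(word >= 0 && word < NUM_WORDS);
    int line = word / LINE_WORDS;
    if (m->dvalid[line] && m->ddirty[line]) {
        copy_line(m->backing, m->dcache, line);
        m->ddirty[line] = 0;
    }
    m->ivalid[line] = 0;
}

/* An instruction fetch refills from the backing store, never the D-cache. */
static uint32_t toy_fetch(struct toy_machine *m, int word)
{
    assert(word >= 0 && word < NUM_WORDS);
    int line = word / LINE_WORDS;
    if (!m->ivalid[line]) {
        copy_line(m->icache, m->backing, line);
        m->ivalid[line] = 1;
    }
    return m->icache[word];
}

/* Write a "trampoline" word, optionally flush, then fetch it as code. */
uint32_t run_trampoline(int do_flush)
{
    struct toy_machine m;
    toy_init(&m);
    toy_store(&m, 0, TRAMPOLINE_WORD);
    if (do_flush)
        toy_flush_for_exec(&m, 0);
    return toy_fetch(&m, 0);
}
```

In this model run_trampoline(0) returns STALE_WORD, because the new word
never leaves the D-cache, while run_trampoline(1) returns TRAMPOLINE_WORD.
A flush routine that silently does nothing behaves like the do_flush == 0
case.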

> >  not been flushed properly in the previous step, thereby failing to
> >  execute the trampoline - crash.
> >
> RM7000 has 16k 4-way set-associative primary caches, which are supposed to
> have no cache aliasing problem.

The described scenario is not an aliasing problem; it's the case where the
copy of the COW page hasn't been flushed properly at all.  When we isolated
the bug, it turned out that neither flush_page_to_ram() nor
flush_cache_page() was flushing the cache.  I suspect your case must be
something fairly similar.

  Ralf
