
Re: Performance bug in c-r4k.c cache handling code

To: "Maciej W. Rozycki" <macro@linux-mips.org>
Subject: Re: Performance bug in c-r4k.c cache handling code
From: Dominic Sweetman <dom@mips.com>
Date: Tue, 20 Sep 2005 10:09:20 +0100
Cc: Thiemo Seufer <ths@networkno.de>, linux-mips@linux-mips.org
In-reply-to: <Pine.LNX.4.61L.0509191733180.5551@blysk.ds.pg.gda.pl>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <20050919154056.GG3386@hattusa.textio> <Pine.LNX.4.61L.0509191733180.5551@blysk.ds.pg.gda.pl>
Sender: linux-mips-bounce@linux-mips.org

> > I found a performance bug in c-r4k.c:r4k_dma_cache_inv, where a
> > Hit_Writeback_Inv instead of Hit_Invalidate is done.

The MIPS64 spec (which is really all there is to set standards in this
area) regards Hit_Invalidate as optional.  So it would be nice not to
use it.  CPUs have no standard "configuration" register you can read
to establish which cacheops work, so to identify capable CPUs you must
use a table of CPU attributes indexed by the CPU ID, which encourages
the crime of building software which can't possibly run on a new CPU...
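
To make the table approach concrete, here is a minimal sketch in C.
Everything in it is an assumption for illustration: the struct, the
flag name CACHEOP_HIT_INVALIDATE_D, the function
cpu_has_hit_invalidate(), and the ID values, which are placeholders
and not real processor IDs.  It is not the kernel's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit: data-cache Hit_Invalidate is usable. */
#define CACHEOP_HIT_INVALIDATE_D (1u << 0)

/* One table row: which CPU IDs it matches and what they can do. */
struct cpu_cache_caps {
    uint32_t id_mask;   /* bits of the CPU ID that are compared */
    uint32_t id_match;  /* required value of those bits */
    unsigned int caps;  /* capability flags for matching CPUs */
};

/* Placeholder entries only; the IDs are made up for this sketch. */
static const struct cpu_cache_caps cpu_caps_table[] = {
    { 0xff00u, 0x0100u, CACHEOP_HIT_INVALIDATE_D }, /* "CPU A" */
    { 0xff00u, 0x0200u, 0 },                        /* "CPU B" */
};

/* Look the CPU up by ID; CPUs not in the table get the safe answer. */
static bool cpu_has_hit_invalidate(uint32_t cpu_id)
{
    size_t n = sizeof(cpu_caps_table) / sizeof(cpu_caps_table[0]);

    for (size_t i = 0; i < n; i++) {
        const struct cpu_cache_caps *e = &cpu_caps_table[i];

        if ((cpu_id & e->id_mask) == e->id_match)
            return (e->caps & CACHEOP_HIT_INVALIDATE_D) != 0;
    }
    return false; /* unknown CPU: assume the optional op is absent */
}
```

Defaulting to "false" for an unlisted ID at least lets a new CPU run,
using Hit_Writeback_Invalidate, until someone adds a table entry.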

So long as the buffer is in fact clean, then in most implementations a
Hit_Writeback_Invalidate will be just as efficient.

Moreover, CPUs always "post" writes to some extent, so a small
percentage of dirty lines can be handled without any great overhead.
So a significant advantage can only occur when the buffer you want to
invalidate (prior to DMA-in) was fairly recently densely written by
the CPU; and this is only safe when all that data can be guaranteed to
now be of no importance to anyone.

Randomly and retrospectively discarding writes could generate some
very interesting bugs, or (indeed) usually hide some very interesting
bugs.  It's the kind of thing one would like to avoid!
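
The difference between the two operations can be shown with a toy
model of a single cache line, sketched in C below.  This is only a
model under simplifying assumptions (one word per line, no write
buffer); struct toy_line and both function names are invented for the
example and do not correspond to any kernel or hardware interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one cache line backed by one word of memory. */
struct toy_line {
    bool valid;        /* line currently holds this address */
    bool dirty;        /* cached value is newer than memory */
    uint32_t cached;   /* value held in the cache */
    uint32_t *memory;  /* backing memory word */
};

/* Model of Hit_Writeback_Invalidate; returns memory writes performed. */
static int toy_hit_writeback_inv(struct toy_line *l)
{
    int writes = 0;

    if (l->valid && l->dirty) {
        *l->memory = l->cached; /* dirty data reaches memory first */
        writes = 1;
    }
    l->valid = false;
    l->dirty = false;
    return writes;
}

/* Model of Hit_Invalidate; never writes, so dirty data is dropped. */
static int toy_hit_invalidate(struct toy_line *l)
{
    l->valid = false;
    l->dirty = false;
    return 0;
}
```

For a clean line both functions do the same work (no memory write),
which is the "just as efficient" case.  For a dirty line the first
costs one write and the second silently loses the CPU's store, which
is where the interesting bugs would come from.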

I suppose this arises where DMA data subsequently gets decorated by
the CPU, then handed on to some other layer, and then the buffer is
freed...?

> FYI, for R4k DECstations the need to flush the cache for newly allocated 
> skbs reduces throughput of FDDI reception by about a half (!), down from 
> about 90Mbps (that's for the /260)...

How did you measure the high throughput?  Have you got a
machine with DMA-coherency you can turn on and off?

--
Dominic

