RE: memcpy and prefetch

To: "Ralf Baechle" <>, "Atsushi Nemoto" <>
Subject: RE: memcpy and prefetch
From: "David VomLehn (dvomlehn)" <>
Date: Thu, 29 Jan 2009 22:39:37 -0500
Cc: <>, "Michael Sundius -X (msundius - Yoh Services LLC at Cisco)" <>, <>, <>
> The idea here is that we have two issues with prefetching:
>  o Prefetching beyond the end of the source or destination range on a
>    non-coherent range might bring back stale values from a DMA I/O
>    buffer resulting in data corruption.  Hardware DMA coherency will
>    avoid this issue.
>  o IP27 has full blown hardware coherency.  Historically 
>    was not able to cope with something of the complexity of IP27, so
>    there was a separate CONFIG_DMA_IP27 and the broken logic expression
>    was meant to treat CONFIG_DMA_COHERENT and CONFIG_DMA_IP27 the same
>    as for prefetching.
>  o Prefetching beyond the end of physical memory can cause exceptions on
>    some systems.  The Malta has this problem.
> Thus no prefetching on Malta or non-coherent systems.
>   Ralf

It seems to me that we could avoid the first and third problems with a
memcpy that never prefetches past the end of the buffer. The thinking is
that while we are reading or writing a memory region, we really shouldn't
be doing DMA to or from that location, so prefetches that stay inside the
region should not pull in stale data. This would probably be slightly
suboptimal, performance-wise, for those systems that do have DMA
coherence, so we could have two mutually exclusive versions, selected by
the CONFIG_DMA_COHERENT flag. For those of us without DMA coherence, it
would probably give our memcpy performance a bit of a kick in the pants
over using no prefetch at all.
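
To make the idea concrete, here is a rough C sketch of a copy loop whose
prefetch hints never leave the buffer. It is an illustration only, not
the kernel's assembly memcpy: the function name, CACHE_LINE and
PREFETCH_AHEAD are made up for the example, the values are guesses that
would need tuning per CPU, and __builtin_prefetch is the GCC/Clang
builtin rather than anything MIPS-specific.

```c
#include <stddef.h>

/* Assumed values for illustration; real ones depend on the CPU. */
#define CACHE_LINE     32  /* bytes per cache line (assumption) */
#define PREFETCH_AHEAD 4   /* how many lines ahead to prefetch (assumption) */

/*
 * Hypothetical bounded-prefetch copy.  A prefetch hint is issued only
 * when the hinted address is still inside the buffer, so the only cache
 * lines touched are ones the copy itself will read or write anyway.
 */
void *memcpy_bounded_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t ahead = (size_t)CACHE_LINE * PREFETCH_AHEAD;
    size_t off = 0;

    /* Main loop: runs only while off + ahead < n, i.e. the prefetch
     * target is a valid offset into both buffers. */
    while (n - off > ahead) {
        __builtin_prefetch(s + off + ahead, 0, 0); /* hint: will read  */
        __builtin_prefetch(d + off + ahead, 1, 0); /* hint: will write */
        for (size_t i = 0; i < CACHE_LINE; i++)
            d[off + i] = s[off + i];
        off += CACHE_LINE;
    }

    /* Tail: the last `ahead` bytes (or fewer) are copied with no
     * prefetch at all, which is where the small performance cost
     * relative to an unbounded prefetch would come from. */
    for (; off < n; off++)
        d[off] = s[off];

    return dst;
}
```

A coherent-DMA build could keep the existing unbounded prefetch and skip
the bounds check, with the choice between the two made at build time.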

If this makes sense, we might be able to sign up to do the work. Anyone
have a good, caching-aware memcpy test?
David VomLehn