| To: | "Ralf Baechle" <ralf@linux-mips.org>, "Atsushi Nemoto" <anemo@mba.ocn.ne.jp> |
|---|---|
| Subject: | RE: memcpy and prefetch |
| From: | "David VomLehn (dvomlehn)" <dvomlehn@cisco.com> |
| Date: | Thu, 29 Jan 2009 22:39:37 -0500 |
| Cc: | <ddaney@caviumnetworks.com>, "Michael Sundius -X (msundius - Yoh Services LLC at Cisco)" <msundius@cisco.com>, <linux-mips@linux-mips.org>, <msundius@sundius.com> |
| In-reply-to: | <20090129155854.GC29521@linux-mips.org> |
| References: | <20090128103753.GC2234@linux-mips.org> <20090129.002850.118974677.anemo@mba.ocn.ne.jp> <20090128183047.GA1691@linux-mips.org> <20090129.213613.128618730.anemo@mba.ocn.ne.jp> <20090129155854.GC29521@linux-mips.org> |

> The idea here is that we have two issues with prefetching:
>
>  o Prefetching beyond the end of the source or destination range on a
>    in-coherent range might bring back stale values from a DMA I/O
>    buffer resulting in data corruption.  Hardware DMA coherency will
>    avoid this issue.
>
>  o IP27 has full blown hardware coherency.  Historically
>    CONFIG_DMA_COHERENT was not able to cope with something of the
>    complexity of IP27, so there was a separate CONFIG_DMA_IP27 and the
>    broken logic expression was meant to treat CONFIG_DMA_COHERENT and
>    CONFIG_DMA_IP27 the same as for prefetching.
>
>  o Prefetching beyond the end of physical memory can cause exceptions on
>    some systems.  The Malta has this problem.
>
> Thus no prefetching on Malta or non-coherent systems.
>
>   Ralf

It seems to me as though we could avoid the first and third problems with a memcpy that doesn't prefetch past the end of the buffer, the thought being that if we are reading or writing a memory region, we really shouldn't be doing DMA to or from that location. This would probably be slightly suboptimal, performance-wise, for those systems that do have DMA coherence.

It seems as though we could have two mutually exclusive versions, selectable via the CONFIG_DMA_COHERENT flag. For those of us without DMA coherence, it would probably give our memcpy performance a bit of a kick in the pants over using no prefetch at all.

If this makes sense, we might be able to sign up to do the work. Anyone have a good, caching-aware memcpy test?

--
David VomLehn, dvomlehn@cisco.com

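To make the proposal above concrete, here is a rough C sketch of a copy loop that only prefetches addresses still inside the buffer unless CONFIG_DMA_COHERENT is defined. It is an illustration only, not the kernel's memcpy (which is hand-written MIPS assembly). The name `memcpy_bounded_prefetch`, the 32-byte `CACHE_LINE` and the `PREFETCH_AHEAD` distance are assumptions made up for this example, and `__builtin_prefetch` is a GCC/Clang builtin standing in for the MIPS `pref` instruction.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE     32  /* assumed line size; the real value is CPU-specific */
#define PREFETCH_AHEAD 4   /* assumed prefetch distance, in cache lines */

/*
 * Hypothetical sketch: copy n bytes one cache line at a time, prefetching
 * PREFETCH_AHEAD lines ahead of the current position.
 */
void *memcpy_bounded_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t ahead = (size_t)CACHE_LINE * PREFETCH_AHEAD;
    size_t i = 0;

    while (i < n) {
        size_t left = n - i;
        size_t chunk = left < CACHE_LINE ? left : CACHE_LINE;

#ifdef CONFIG_DMA_COHERENT
        /*
         * Coherent variant: overshooting the buffer cannot pull in stale
         * DMA data, so prefetch unconditionally. This does nothing about
         * prefetches that run past the end of physical memory.
         */
        __builtin_prefetch((const void *)((uintptr_t)s + i + ahead), 0, 0);
        __builtin_prefetch((const void *)((uintptr_t)d + i + ahead), 1, 0);
#else
        /*
         * Non-coherent variant: prefetch only when the target address is
         * still inside the buffer, so no line beyond the last byte that
         * the copy itself will touch is ever fetched.
         */
        if (left > ahead) {
            __builtin_prefetch(s + i + ahead, 0, 0);
            __builtin_prefetch(d + i + ahead, 1, 0);
        }
#endif
        /* Plain byte copy of the current chunk; a real version would use
         * aligned word loads and stores. */
        for (size_t j = 0; j < chunk; j++)
            d[i + j] = s[i + j];
        i += chunk;
    }
    return dst;
}
```

In the non-coherent build the last PREFETCH_AHEAD lines of each copy are simply not prefetched. A build with CONFIG_DMA_COHERENT defined instead pays no bounds check per line but keeps the overshoot.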