
Re: memcpy and prefetch

To: David Daney <>
Subject: Re: memcpy and prefetch
From: Michael Sundius <>
Date: Wed, 28 Jan 2009 11:28:10 -0800
Cc: "VomLehn, David" <>
User-agent: Thunderbird (X11/20080501)
David Daney wrote:
Michael Sundius wrote:
I know this topic has been discussed before, so excuse me if I am being redundant. I saw lots of talk in the archives, but I don't know if a solution was ever arrived at. So:

What is the current state of the use of prefetch in memcpy()? It seems that
it is #undef-ed if CONFIG_DMA_COHERENT is not turned on.

Is this still because memcpy() does not check to prevent a prefetch of
addresses beyond the end of the buffer?

If so, why was the search for a solution abandoned?

Also, has anyone out there written a memcpy that does use prefetch
intelligently (for mips32, that is)?

The Cavium OCTEON port overrides the default memcpy and does use prefetch. It was recently merged (2.6.29-rc2). Look at octeon-memcpy.S

I have thought that memcpy could be generated by mm/page.c as copy_page and clear_page are.

David Daney

Thanks! That's really useful. I have a few questions, though:

1) So you made this function explicitly for the OCTEON, and that is because you know the cache line is 128 bytes long
on the OCTEON. Is that right?

2) It seems as though you always prefetch the first cache line. What happens if the memcpy is less than one cache line long?
Wouldn't you risk prefetching beyond the end of the buffer?

3) Why do you only do the "pref 0 offset(src)" and not a prefetch for the destination?

4) On line 244 you check whether len is less than 128, while in the other checks you compare against (offset)+1. Why would you not do the prefetch if len was exactly 256 bytes (or 128 in the case of line 196)?
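
To make the concern in questions 2 and 4 concrete, here is a minimal C sketch of the kind of bounds check I have in mind. It is not the OCTEON code and not the kernel's memcpy: the name copy_with_guarded_prefetch and the CACHE_LINE constant are made up for illustration, the 128-byte line size is just the figure from question 1, and __builtin_prefetch is the GCC/Clang builtin standing in for a "pref 0" on the source.

```c
#include <stddef.h>

/* Assumed cache-line size (the 128-byte figure discussed for OCTEON). */
#define CACHE_LINE 128

/*
 * Hypothetical illustration only: copy len bytes from src to dst,
 * prefetching the next source line only when that address still lies
 * inside the source buffer.
 */
void *copy_with_guarded_prefetch(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const unsigned char *end = s + len;   /* one past the last source byte */

    while (len >= CACHE_LINE) {
        /*
         * s + CACHE_LINE is the first byte of the next chunk. Prefetch it
         * only if at least one more byte of the buffer lives there, so no
         * prefetch is issued for an address at or beyond the end.
         */
        if (s + CACHE_LINE < end)
            __builtin_prefetch(s + CACHE_LINE, 0);

        for (size_t i = 0; i < CACHE_LINE; i++)
            d[i] = s[i];

        d += CACHE_LINE;
        s += CACHE_LINE;
        len -= CACHE_LINE;
    }

    /* Tail shorter than one line: plain byte copy, no prefetch at all. */
    while (len--)
        *d++ = *s++;

    return dst;
}
```

With this check a copy of exactly 128 bytes issues no prefetch, and a copy of exactly 256 bytes issues one (for the second line), which is the boundary behaviour question 4 is asking about. A real implementation would use word-sized loads and handle alignment; the sketch only shows where the comparison sits.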

