
Re: [PATCH] MIPS: lib: Optimize partial checksum ops using prefetching.

To: "Steven J. Hill" <Steven.Hill@imgtec.com>
Subject: Re: [PATCH] MIPS: lib: Optimize partial checksum ops using prefetching.
From: Ralf Baechle <ralf@linux-mips.org>
Date: Tue, 21 Jan 2014 21:49:38 +0100
Cc: linux-mips@linux-mips.org
In-reply-to: <1390321122-25634-1-git-send-email-Steven.Hill@imgtec.com>
On Tue, Jan 21, 2014 at 10:18:42AM -0600, Steven J. Hill wrote:

> From: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
> 
> Use the PREF instruction to optimize partial checksum operations.

Prefetch operations may cause obscure bus error exceptions on some systems,
such as Malta, for example when prefetching beyond the end of memory.
Prefetching may also pull memory regions that are currently undergoing a
DMA transfer back into the cache.

This pretty much means that pref is only safe to use on cache-coherent
systems.
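
To illustrate the first concern only, here is a minimal sketch in plain C
of a checksum loop that never issues a prefetch for an address past the end
of its buffer.  It is not the code from the patch: the function name, the
look-ahead distance and the chunk size are hypothetical, and it relies on
the GCC/Clang builtin __builtin_prefetch.  Staying inside the buffer does
nothing about the DMA concern.

```c
#include <stddef.h>
#include <stdint.h>

#define PREF_DISTANCE 128 /* hypothetical look-ahead in bytes, not tuned */
#define CHUNK 32          /* hypothetical step; must be even */

/* Hypothetical helper: 16-bit one's-complement sum over big-endian words,
 * folded to 16 bits (not inverted). */
static uint32_t sum16_bounded_pref(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i = 0;

    while (i < len) {
        size_t end = (len - i > CHUNK) ? i + CHUNK : len;

        /* Prefetch only if the target address is still inside buf. */
        if (len - i > PREF_DISTANCE)
            __builtin_prefetch(buf + i + PREF_DISTANCE, 0, 0);

        for (; i + 1 < end; i += 2)
            sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

        if (i < end) { /* odd trailing byte, padded with zero */
            sum += (uint32_t)buf[i] << 8;
            i++;
        }
    }

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint32_t)sum;
}
```

The bounds check costs a compare and a branch per chunk, which is itself
overhead that has to be measured rather than assumed away.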

Those are the very same reasons that make pref a headache for memcpy.

Performance tuning is another can of worms.  On those platforms on which
I've benchmarked code with and without pref, it was very hard to predict
whether pref was actually an advantage.  If data that is not going to be
used is prefetched, pref wastes an issue slot, wastes instruction bandwidth
and in the end makes things slower.  If data is not prefetched early
enough, same kind of issue.  And in the end PREF and MT were invented
to solve the same kind of fundamental problem: memory is slow and slower
on embedded.  For both solutions the results are extremely dependent on
workload.
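
Since the outcome is hard to predict, the only reliable approach is to time
both variants on the actual target and workload.  A rough sketch of such a
comparison follows; every name in it is hypothetical, the byte sum is only a
stand-in for the real checksum code, and clock() is far too coarse for
serious measurements.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*sum_fn)(const uint8_t *buf, size_t len);

/* Stand-in workload without prefetching. */
static uint32_t bytesum_plain(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/* Same workload with a prefetch every 64 bytes, 256 bytes ahead
 * (both numbers are untuned guesses), kept inside the buffer. */
static uint32_t bytesum_pref(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++) {
        if (i % 64 == 0 && len - i > 256)
            __builtin_prefetch(buf + i + 256, 0, 0);
        s += buf[i];
    }
    return s;
}

/* Run fn reps times; return CPU seconds and the accumulated result. */
static double time_variant(sum_fn fn, const uint8_t *buf, size_t len,
                           int reps, uint32_t *out)
{
    volatile uint32_t sink = 0; /* keeps the calls from being optimized out */
    clock_t t0 = clock();

    for (int r = 0; r < reps; r++)
        sink += fn(buf, len);

    clock_t t1 = clock();
    *out = sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_variant() once with bytesum_plain and once with bytesum_pref
over buffers of different sizes, both cache-resident and not, shows how much
the answer moves with the workload.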

Cheers,

  Ralf
