linux-mips

Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf

To: Ralf Baechle <ralf@linux-mips.org>
Subject: Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
From: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Date: Sat, 02 Oct 2010 10:54:04 +0800
Cc: David Daney <ddaney@caviumnetworks.com>, linux-mips@linux-mips.org, a.p.zijlstra@chello.nl, paulus@samba.org, mingo@elte.hu, acme@redhat.com
In-reply-to: <20101002015947.GB9360@linux-mips.org>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <4CA4920C.30401@gmail.com> <4CA6566D.2050003@caviumnetworks.com> <20101002015947.GB9360@linux-mips.org>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4


Thanks guys. So let's turn the patch into the following?

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
---
 tools/perf/perf.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 6fb379b..cd05284 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -73,6 +73,20 @@
 #define cpu_relax()    asm volatile("":::"memory")
 #endif

+#ifdef __mips__
+#include "../../arch/mips/include/asm/unistd.h"
+#define rmb()          asm volatile(                                   \
+                               ".set      push\n\t"                    \
+                               ".set      noreorder\n\t"               \
+                               ".set      mips2\n\t"                   \
+                               "sync\n\t"                              \
+                               ".set      pop"                         \
+                               : /* no output */                       \
+                               : /* no input */                        \
+                               : "memory")
+#define cpu_relax()    asm volatile("" ::: "memory")
+#endif
+
 #include <time.h>
 #include <unistd.h>
 #include <sys/types.h>


On 2010-10-2 9:59, Ralf Baechle wrote:
> On Fri, Oct 01, 2010 at 02:45:17PM -0700, David Daney wrote:
>
> > In user space the rmb() must expand to a SYNC instruction.  I am not
> > sure what your version in the patch is doing with all those NOPs.  That
> > is not guaranteed to do anything.
>
> That's a rather old version of the kernel rmb macro I think.  The NOPs
> were there to enforce ordering of a mix of cached and uncached accesses
> on the R4400 (not R4000) where, according to my reading, the manual
> leaves it a bit unclear whether a SYNC is sufficient or whether the
> pipeline needs to be drained in addition.  See version 2 of the
> R4000/R4400 User's Manual.
>
> > The instruction set specifications say that SYNC orders all loads and
> > stores.  This is a heavier operation than rmb() demands, but it is the
> > only universally available instruction that imposes ordering.
> >
> > For processors that do not support SYNC, the kernel will emulate it, so
> > it is safe to use in userspace.  I wouldn't worry about emulation
> > overhead though, because processors that lack SYNC probably also lack
> > performance counters, so are not as interesting from a perf-tool point
> > of view.
>
> Yes, just use SYNC.  SYNC-less processors would only be R2000/R3000
> processors and a few other oddball processors, which have been totally
> uninteresting for performance optimization for years.
>
>   Ralf
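
For illustration only, here is a minimal sketch of why user space wants
rmb() to be a real SYNC: a consumer of a shared ring buffer loads the
producer's head index, issues rmb(), and only then loads the data behind
that index. The names demo_ring and demo_ring_read are hypothetical and
are not part of tools/perf; the non-MIPS fallback to the GCC/Clang
builtin __sync_synchronize() is an assumption added so the sketch builds
on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read barrier. On MIPS this mirrors the SYNC-based macro from the patch;
 * elsewhere a full compiler-builtin barrier stands in (assumption).
 */
#ifdef __mips__
#define rmb()	asm volatile(".set push\n\t"		\
			     ".set noreorder\n\t"	\
			     ".set mips2\n\t"		\
			     "sync\n\t"			\
			     ".set pop"			\
			     : /* no output */		\
			     : /* no input */		\
			     : "memory")
#else
#define rmb()	__sync_synchronize()
#endif

/* Hypothetical stand-in for a shared control word plus data area. */
struct demo_ring {
	volatile uint64_t data_head;	/* advanced by the producer */
	uint64_t data_tail;		/* advanced by the consumer */
	unsigned char data[64];		/* size must be a power of two */
};

/* Copy out up to len unread bytes; returns the number of bytes copied. */
static size_t demo_ring_read(struct demo_ring *r, unsigned char *out,
			     size_t len)
{
	const uint64_t mask = sizeof(r->data) - 1;
	uint64_t head, tail = r->data_tail;
	size_t n = 0;

	/* The index masking below only works for power-of-two sizes. */
	assert((sizeof(r->data) & mask) == 0);

	head = r->data_head;
	rmb();	/* the head load must complete before the data loads */

	while (tail != head && n < len)
		out[n++] = r->data[tail++ & mask];

	r->data_tail = tail;
	return n;
}
```

Without the barrier, a weakly ordered CPU could satisfy the data loads
before the head load and return stale bytes. SYNC is stronger than this
pattern strictly needs, but as noted above it is the only ordering
instruction that is universally available.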
