linux-mips

Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf

To: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Subject: Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
From: David Daney <ddaney@caviumnetworks.com>
Date: Fri, 01 Oct 2010 14:45:17 -0700
Cc: linux-mips@linux-mips.org, ralf@linux-mips.org, a.p.zijlstra@chello.nl, paulus@samba.org, mingo@elte.hu, acme@redhat.com
In-reply-to: <4CA4920C.30401@gmail.com>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <4CA4920C.30401@gmail.com>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.11) Gecko/20100720 Fedora/3.0.6-1.fc12 Thunderbird/3.0.6

On 09/30/2010 06:35 AM, Deng-Cheng Zhu wrote:
> 
> 
> (Directing this patch to Perf-events maintainers for review.)
> 
> With the kernel facility of Linux performance counters, we want the user
> level tool tools/perf to be cross compiled for MIPS platform. To do this,
> we need to include unistd.h, add rmb() and cpu_relax() in perf.h.
> 
> Your review comments are especially required for the definition of rmb():
> In perf.h, we need to have a proper rmb() for _all_ MIPS platforms. And
> we don't have CONFIG_* things for use in here. Looking at barrier.h,
> rmb() goes into barrier() and __sync() for CAVIUM OCTEON and other CPUs,
> respectively. What's more, __sync() has different versions as well.
> Referring to BARRIER() in dump_tlb.c, I propose the "common" definition
> for perf tool rmb() in this patch. Do you have any comments?
> 


In fact I do.

In user space, rmb() must expand to a SYNC instruction.  I am not
sure what the version in your patch is doing with all those NOPs.  A
run of NOPs is not guaranteed to order anything.

The instruction set specifications say that SYNC orders all loads and
stores.  This is a heavier operation than rmb() demands, but it is the
only universally available instruction that imposes ordering.
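
As a rough illustration of the point above, a SYNC-based rmb() for MIPS
user space could look something like the sketch below.  It is only a
sketch, not the definition that was merged: the non-MIPS fallback and
the read_published() helper (with its head/slots parameters) are
assumptions added so the example builds and can be exercised on any
host.

```c
#include <stdint.h>

/* Sketch only: rmb() for MIPS user space, expanding to SYNC. */
#if defined(__mips__)
#define rmb()	asm volatile(			\
		".set	push\n\t"		\
		".set	mips2\n\t"		\
		"sync\n\t"			\
		".set	pop"			\
		: /* no output */		\
		: /* no input */		\
		: "memory")
#else
/* Assumption: full-barrier fallback so the sketch builds on other hosts. */
#define rmb()	__sync_synchronize()
#endif

/*
 * Hypothetical helper: load an index published by a producer, then
 * load the slot it refers to.  The barrier keeps the index load
 * ordered before the data load.
 */
static inline uint64_t read_published(const volatile uint64_t *head,
				      const uint64_t *slots)
{
	uint64_t h = *head;	/* load the index first */

	rmb();			/* order the index load before the data load */
	return slots[h];
}
```

The "memory" clobber additionally stops the compiler from moving loads
across the barrier; the SYNC itself handles the hardware ordering.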

For processors that do not support SYNC, the kernel will emulate it, so
it is safe to use in userspace.  I wouldn't worry about emulation
overhead though, because processors that lack SYNC probably also lack
performance counters, so they are not as interesting from a perf-tool
point of view.

David Daney


> In addition, for testing the kernel part code I sent several days
> ago, I was using the "particular" rmb() version for 24K/34K/74K cores:
> 
> #define rmb()           asm volatile(                           \
>                                  ".set   push\n\t"               \
>                                  ".set   noreorder\n\t"          \
>                                  ".set   mips2\n\t"              \
>                                  "sync\n\t"                      \
>                                  ".set   pop"                    \
>                                  : /* no output */               \
>                                  : /* no input */                \
>                                  : "memory")
> 
> This is the definition of __sync() for CONFIG_CPU_HAS_SYNC.
> 
> 
> Thanks,
> 
> Deng-Cheng
> 
> Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
> ---
>   tools/perf/perf.h |   12 ++++++++++++
>   1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index 6fb379b..cd05284 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -73,6 +73,18 @@
>   #define cpu_relax() asm volatile("":::"memory")
>   #endif
> 
> +#ifdef __mips__
> +#include "../../arch/mips/include/asm/unistd.h"
> +#define rmb()                asm volatile(                           \
> +                             ".set   noreorder\n\t"                  \
> +                             "nop;nop;nop;nop;nop;nop;nop\n\t"       \
> +                             ".set   reorder"                        \
> +                             : /* no output */                       \
> +                             : /* no input */                        \
> +                             : "memory")
> +#define cpu_relax()  asm volatile("" ::: "memory")
> +#endif
> +
>   #include <time.h>
>   #include <unistd.h>
>   #include <sys/types.h>
> 
> 

