OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. OProfile is released under the GNU GPL.
It consists of a kernel driver and a daemon for collecting sample data, and several post-profiling tools for turning data into information.
OProfile leverages the hardware performance counters of the CPU to enable profiling of a wide variety of interesting statistics, which can also be used for basic time-spent profiling. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel, shared libraries, and applications.
OProfile is currently in alpha status; however, it has proven stable over a large number of differing configurations and is being used on machines ranging from laptops to 16-way NUMA-Q boxes. As always, there is no warranty.
Linux/MIPS support for OProfile was added in Linux 2.6.10. Older MIPS kernels are not supported, and because of bugs and limitations in the existing OProfile patches for these kernels, adding such support is not considered a sensible project.
CPUs supported by the kernel
- Legacy processors R10000, R12000, R14000, R16000, RM9000
- MIPS32 processors 24K, 34K in non-SMTC mode
- MIPS64 processors 5K, 20K, 25K, SB1
The 4K series is not supported because these cores do not have performance counters, only a performance monitoring interface to which a SOC designer can attach his own counter logic, and there is no standard whatsoever for how to implement this. One SOC known to make use of this interface is the ATI Xilleon.
At the time of this writing, OProfile support is available only in the Sourceforge OProfile CVS repository; released versions either lack MIPS support or are unusable due to bugs. The R10000, R12000, RM7000, RM9000, SB1 / SB1A, VR5432 and VR5500 processors are supported by the CVS version of the userspace tools; support for further processors is in the queue.
Restrictions on R10000, R12000, R14000, R16000 mixed configurations
On SGI systems it is legal and relatively common to mix different types and speeds of processors in a single system. These processors have some significant differences in their counter implementations:
- R10000 processors only have 2 counters
- R12000, R14000, R16000 processors have 4 counters
- The R10000 processor versions 2.x count the virtual coherency exception as event 14 on counter 0. Version 3.x processors change that to ALU/FPU completion cycles.
- R12000, R14000 and R16000 events differ significantly from those supported by R10000 processors.
As a result, OProfile can only properly support the common subset of what all processors in a system provide. Among the events of interest, this is:
|Counters||Symbolic event name||Description|
|0, 1||CYCLES||Clock cycles|
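The counter side of this common-subset rule can be illustrated with a small Python sketch. It is not OProfile code; the table `COUNTERS_PER_CPU` and the function `common_counters` are hypothetical names introduced here, and only the counter counts themselves come from the list above. The sketch deliberately ignores event semantics, which differ between processor types as described.

```python
# Number of performance counters per processor type, as listed above.
# The dictionary and function names are illustrative, not part of OProfile.
COUNTERS_PER_CPU = {"R10000": 2, "R12000": 4, "R14000": 4, "R16000": 4}


def common_counters(cpu_types):
    """Return the counter indices that exist on every processor type given."""
    if not cpu_types:
        return []
    # The processor with the fewest counters limits the whole system.
    fewest = min(COUNTERS_PER_CPU[cpu] for cpu in cpu_types)
    return list(range(fewest))
```

For a system mixing R10000 and R12000 processors, `common_counters(["R10000", "R12000"])` yields counters 0 and 1 only, which matches the counters listed for CYCLES in the table.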
There are further usable events (5 - 15 on counter 0, 2 - 14 on counter 1), some of which have different symbolic names in OProfile on R10000 v3.x and on R12000 and later processors but identical functionality, and are therefore usable on mixed configurations. Where these restrictions on mixed configurations are a problem, restricting the configuration to a subset of identical processors may be attempted as a workaround.
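As a rough aid for the workaround, the following Python sketch picks the largest group of identical processors from an inventory of the system. The inventory format (CPU number mapped to a model and revision pair), the revision strings and the function name `largest_identical_group` are all assumptions made for illustration; how the system is then actually restricted to those processors is outside the scope of this sketch.

```python
from collections import defaultdict


def largest_identical_group(cpus):
    """Return ((model, revision), [cpu numbers]) for the largest set of
    identical processors, or None if the inventory is empty.

    `cpus` is a hypothetical inventory: {cpu_number: (model, revision)}.
    """
    groups = defaultdict(list)
    for number, kind in cpus.items():
        groups[kind].append(number)
    if not groups:
        return None
    # On a tie, the group whose first member appears earliest in `cpus` wins.
    kind, members = max(groups.items(), key=lambda item: len(item[1]))
    return kind, sorted(members)


# Example inventory with made-up revision labels.
inventory = {
    0: ("R10000", "rev-a"),
    1: ("R12000", "rev-b"),
    2: ("R12000", "rev-b"),
    3: ("R12000", "rev-b"),
}
```

With this inventory, `largest_identical_group(inventory)` selects the three R12000 processors, so profiling restricted to CPUs 1 to 3 would no longer be limited to the common subset shared with the R10000.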