Profiling C++ apps with oprofile

If you are hitting CPU-related performance problems, there is a very powerful tool that can help with diagnostics: oprofile. Below I'm summarizing some hints for efficient oprofile usage (not all of them are obvious). First of all, some basic commands:

  • opcontrol: starts/stops statistics collection
  • opreport: reporting tool

For applications that use shared libraries you have to add the "--separate=lib" switch to the opcontrol call:

opcontrol --start --separate=lib --vmlinux /boot/vmlinux
opcontrol --stop

Without this option you will see only the activity of the main binary (without the time spent in shared libraries).

Looking at raw opreport output is not very useful on its own. Sometimes a "light" method calls a "heavy" method, and you won't be able to locate the caller from the report (its own CPU usage is very low). But sometimes it is the caller that has to be located and modified to fix the performance problem. That's why "call graphs" were added to oprofile. To get reports that use the stack contents at sampling time, you have to initialize oprofile properly:

opcontrol --start --callgraph=10 --vmlinux /boot/vmlinux
opcontrol --stop
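The "light caller, heavy callee" situation can be illustrated with a minimal C++ sketch. The names blendPixels() and repaintAll() are hypothetical and do not come from any real code base. In a flat report nearly all samples would land in blendPixels(), although the actual fix may belong in repaintAll(), which calls it more often than necessary.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical names, for illustration only.

// "Heavy" callee: does the real work, so a flat report attributes
// nearly all samples to this function.
std::uint64_t blendPixels(const std::vector<std::uint32_t>& pixels) {
    std::uint64_t acc = 0;
    for (std::uint32_t p : pixels) {
        acc += (p & 0xFFu) * 3u + ((p >> 8) & 0xFFu);
    }
    return acc;
}

// "Light" caller: almost no work of its own, but it invokes the heavy
// callee once per repaint request, even when nothing has changed.
std::uint64_t repaintAll(const std::vector<std::uint32_t>& pixels,
                         int requests) {
    std::uint64_t total = 0;
    for (int i = 0; i < requests; ++i) {
        total += blendPixels(pixels);
    }
    return total;
}
```

Calling repaintAll() with a large buffer from main() and profiling the binary should show blendPixels() at the top of the flat report. Only the call graph reveals that repaintAll() is the place where the number of calls can be reduced.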

Then, besides the raw data, you can see the code locations found on the stack (the callers, and the code called from the function that eats the most CPU):

  9        16.0714 widgets::ActionItemWidget::isFocused() const
  11       19.6429  CoreApplication          _fini
  24       42.8571 QPainter::drawPixmap(int, int, QPixmap const&)
4413     13.6452       /usr/lib/
  4413     100.000       /usr/lib/ [self]

As you can see, there's one non-indented line. This is our CPU-heavy method. Above it we see its callers (with a percentage assigned; oprofile is a statistical profiler). We can guess that the drawPixmap() calls should probably be optimised here.

If you are profiling embedded systems, you can do the reporting on a host machine. The architecture can be totally different, but the oprofile versions must match (there might be differences in the internal oprofile format). I run reports with the following command:

opreport --image-path=/usr/local/sh4/some-arch/rootfs \

As you can see, you have to point to the location of the binaries and to /var/lib/oprofile (both are on a rootfs mounted over NFS in my case). Using the host for reports is much faster (and you avoid problems with the limited memory of the embedded device).
