If you are hitting CPU-related performance problems, there is a very powerful tool that can help with diagnostics: oprofile. Below I summarize some hints for using oprofile efficiently (not all of them are obvious). First, the basic commands:
- opcontrol: starts/stops statistics collection
- opreport: reporting tool
For applications that use shared libraries you have to add the "--separate=lib" switch to the opcontrol call:
opcontrol --start --separate=lib --vmlinux /boot/vmlinux
opcontrol --stop
Without this option the report for your application covers only the main binary; time spent in shared libraries is not attributed to it.
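To make the start/stop sequence above repeatable, it can be wrapped in a small shell function. This is only a sketch: `profile_cmd`, `run` and the `OP_DRY_RUN` variable are names invented for this example, and the `--reset` call (which clears samples from the current session) is my addition to the two commands shown above. Remember that opcontrol has to be run as root.

```shell
#!/bin/sh
# Sketch only: profile_cmd, run and OP_DRY_RUN are names made up for this example.

# Run a command, or only print it when OP_DRY_RUN=1 (handy for checking the sequence).
run() {
    if [ "${OP_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# profile_cmd VMLINUX COMMAND [ARGS...]
profile_cmd() {
    vmlinux=$1
    shift
    run opcontrol --reset      # drop samples left over from the current session
    run opcontrol --start --separate=lib --vmlinux "$vmlinux"
    run "$@"                   # the workload being measured
    run opcontrol --stop
}

# Example (assumed paths): profile_cmd /boot/vmlinux ./CoreApplication
```

With OP_DRY_RUN=1 the function only prints the commands it would run, so you can check the sequence on a machine without oprofile installed.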
Looking at raw opreport output is often not enough. Sometimes a "light" method calls a "heavy" one, and you cannot find the caller in the report because its own CPU usage is very low. Yet it is often the caller that has to be located and modified to fix the performance problem. That's why call graphs were added to oprofile. To get reports based on the stack contents at sampling time, you have to start oprofile with the --callgraph option, which takes the maximum stack depth to record:
opcontrol --start --callgraph=10 --vmlinux /boot/vmlinux
opcontrol --stop
Then, besides the raw data, the report (opreport run with --callgraph) shows the code locations found on the stack: the callers, and the code called from the function that eats the most CPU:
  9        16.0714  libCommonWidgets.so.1.0.0 widgets::ActionItemWidget::isFocused() const
  11       19.6429  CoreApplication           _fini
  24       42.8571  libCommonWidgets.so.1.0.0 QPainter::drawPixmap(int, int, QPixmap const&)
4413     13.6452  libQtGuiE.so.4.7.1        /usr/lib/libQtGuiE.so.4.7.1
  4413     100.000  libQtGuiE.so.4.7.1        /usr/lib/libQtGuiE.so.4.7.1 [self]
As you can see, there is one non-indented line: this is our CPU-heavy method. We also see its callers, each with a percentage assigned (oprofile is a statistical profiler, so these are shares of samples). From this we can guess that drawPixmap() is probably what should be optimised here.
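For long reports it can help to pull out just the callers of each entry. Below is a rough sketch of such a filter; `list_callers` is a name I made up, and it assumes the report layout shown above: callers are indented lines before the non-indented entry, columns are samples, percentage, image and symbol, and entries are separated by lines of dashes. Other opreport options may change the columns, so treat it as a starting point.

```shell
#!/bin/sh
# Sketch only: list_callers is a made-up helper, not part of oprofile.
# Reads a call-graph listing on stdin and prints each entry with its callers.
list_callers() {
    awk '
        # Strip the "samples  percent  image" columns and return the symbol text.
        function symbol(line) {
            sub(/^[ \t]*[0-9]+[ \t]+[0-9.]+[ \t]+[^ \t]+[ \t]+/, "", line)
            return line
        }
        /^-+$/ { n = 0; seen = 0; next }        # separator: a new entry starts
        /^[ \t]+[0-9]/ {                         # indented line
            if (!seen) { n++; pct[n] = $2; sym[n] = symbol($0) }  # before the entry: caller
            next                                 # after the entry: callee or [self], skipped
        }
        /^[0-9]/ {                               # non-indented line: the entry itself
            seen = 1
            printf "%s (%s samples)\n", symbol($0), $1
            for (i = 1; i <= n; i++) printf "  %s%% %s\n", pct[i], sym[i]
            n = 0
        }
    '
}

# Example: opreport --callgraph | list_callers
```

Header lines of the report are ignored because they start neither with a digit nor with whitespace followed by a digit.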
If you are profiling embedded systems, you can do the reporting on a host machine. The architecture can be completely different, but the oprofile versions must match (the internal oprofile format may differ between versions). I run reports with the following command:
opreport --image-path=/usr/local/sh4/some-arch/rootfs \
    --session-dir=/usr/local/sh4/some-arch/rootfs/var/lib/oprofile
As you can see, you have to point to the location of the binaries and to /var/lib/oprofile (in my case both are on a rootfs mounted over NFS). Running reports on the host is much faster, and you avoid problems with the limited memory of the embedded device.
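Since both options are derived from the same rootfs path, the host-side call can be wrapped in a small helper. Again just a sketch under my own assumptions: `host_report` and the `OPREPORT` override variable are invented names, and the helper assumes the samples live in var/lib/oprofile under the rootfs, as in my setup.

```shell
#!/bin/sh
# Sketch only: host_report and OPREPORT are made-up names for this example.
# host_report ROOTFS [extra opreport options...]
host_report() {
    rootfs=$1
    shift
    session="$rootfs/var/lib/oprofile"
    # Fail early if the rootfs is not mounted or holds no samples directory.
    if [ ! -d "$session" ]; then
        echo "no oprofile session dir: $session" >&2
        return 1
    fi
    # OPREPORT lets you point at a specific opreport binary (versions must match the target).
    ${OPREPORT:-opreport} \
        --image-path="$rootfs" \
        --session-dir="$session" \
        "$@"
}

# Example: host_report /usr/local/sh4/some-arch/rootfs --callgraph
```

Any extra arguments are passed straight through to opreport, so the same helper works for flat and call-graph reports.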