Credit: IBM
Using gensyms, trace, and dbx commands in tprof online and offline
modes
tprof
is a command line utility which provides features to identify and
analyze performance bottlenecks. The tprof
utility runs a given
workload using the IBM® AIX® trace utility and specifies the time spent in
each process, and within each process, the time spent in each subroutine. This
article explains the way in which users can translate the unresolved address in a
tprof
profile report to meaningful function addresses using gensyms
and dbx commands.
tprof modes
tprof
command can run in the following modes:
Real-time or online mode
when tprof
is run with the -x
flag without -A
,
tprof
runs in the real-time mode. In this
mode, tprof
starts the AIX trace utility in the background, and
processes the trace data as it gets generated.
When the program being profiled ends, tprof
collects the symbol
information and generates the profile report. In this mode, there are no additional
files generated by tprof
.
Note: Symbols are the addresses of the variables and functions in an
object file.
Automated offline
when tprof
is run with the -x
flag and the -A
flag,
tprof
runs in the automated offline mode. In this mode,
tprof
starts the the AIX trace utility and logs the trace data into
a file. After the trace data is collected, symbol information is also collected and
written into the symbol file. tprof
reads the data from the log file
and processes the data to generate the report. In this mode, tprof
generates .syms and .trc files. These files can be
used for post processing.
Postprocessing
In this mode, tprof
post processes the .trc and
.syms files. The .trc and .syms files can be generated when
tprof is run in an automated offline mode or can be generated using the
trace
and gensyms
command. tprof
generates the profile report by post processing the required files.
Address
resolution
tprof
reads the symbols from the binary files to resolve the IAR into
human-readable function names. In case of automated offline mode, this information
is saved in a .syms file. In some cases, tprof
fails
to resolve the IAR (address) into human-readable function names, for example, when
the binary path is not exported, tprof
fails to resolve the address,
and instead, displays the address in hex values in the report.
The following table provides the methods that can help in resolving the hex values in
the generated report.
Table 1. tprof modes
Mode | Method |
---|---|
Real-time or online | Using the dbx command |
Automated offline | Using the dbx and gensyms commands |
Post-processing | Using the dbx and gensyms command |
The following prerequisites are expected to resolve the symbol
information:
- Knowledge of
dbx
- Knowledge of
generating the.lst
files during compilation
Sample code of the test binary
void testfunc() { for(int i = 0; i < 10000000; i++); } void testfunc1() { for(int i = 0; i < 10000000; i++); } int main() { while(1) { testfunc(); testfunc1(); } }
As part of the test, the sample code is compiled and is placed in
/home/tproftest
.
The binary is
ran from the same location.
tprof is run from a different location, say
/home/tproftest/reports
.
Real-time / Online mode address resolution using
dbx
Run the tprof
command using the following parameters:
tprof -ukes -x sleep 30
Experts for sleep.prof
(report generated by
tprof
)
Summary section
Process ======= |
Freq ==== |
Total ===== |
Kernel ====== |
User ==== |
Shared ====== |
Other ===== |
./ryes | 2 | 72.77 | 0.10 | 72.68 | 0.00 | 0.00 |
From the report, we can understand that there are two instances of ryes
processes running. The combined processor utilization for two instances is about
72.77 % for samples collected for 30 seconds. Out of 72.77%, 0.10% is of kernel and
72.77% is of user space.
Per process report
Process section
Process ======= |
PID ==== |
TID ===== |
Total ===== |
Kernel ====== |
User ==== |
Shared ====== |
Other ===== |
./ryes | 11141484 | 30736841 | 36.47 | 0.00 | 36.47 | 0.00 | 0.00 |
./ryes | 7406066 | 17760629 | 36.31 | 0.10 | 36.21 | 0.00 | 0.00 |
User space report
User Process ============= ./ryes 72.68 |
% ====== 72.68 |
|
Profile: ./ryes Total % For All Processes (./ryes) = 72.68 |
||
Subroutine ========== <0x1000039C> <0x10000400> <0x100003FC> <0x100003B0> <0x100003A0> <0x1000040C> <0x100003A8> <0x100003AC> <0x100003A4> <0x10000404> <0x10000408> |
% ====== 34.67 18.08 17.74 0.94 0.79 0.17 0.09 0.07 0.07 0.05 0.01 |
Source ====== |
In this report, it is seen that none of the addresses got resolved for the
ryes
process, and users can use
dbx
and the .lst file to
resolve the address.
Note: |
---|
• These steps will help only when the symbols are not stripped. • As a sample, only two IARs are considered. |
Perform the following steps to resolve addresses using dbx
:
- Run the following command as a debugee using dbx.
# dbx ./ryes Type 'help' for help. reading symbolic information ...warning: no source compiled with -g.
- Get the address information using the
listi
subcommand in
dbx
.
(dbx) listi 0x1000039C 0x1000039c (testfunc+0x1c) 80610044 lwz r3,0x44(r1) (dbx) listi 0x100003FC 0x100003fc (testfunc1+0x1c) 80610044 lwz r3,0x44(r1)
dbx
shows that
0x100003fc
resolves to
testfunc
with offset
0x1c
and
0x100003fc
resolves to
testfunc1
with offset
0x1c
. - Identify the line numbers in the source file for the given address using the
.lst filesExcerpts of the .lst file
Source file section
>>>>> SOURCE SECTION <<<<< 1 | 2 | void testfunc() { 3 | for(int i = 0; i < 10000000; i++); 4 | 5 | } 6 | 7 | 8 | void testfunc1() { 9 | for(int i = 0; i < 10000000; i++); 10 | 11 | } 12 | 13 | int main() { 14 | while(1) { 15 | 16 | testfunc(); 17 | 18 | testfunc1(); 19 | } 20 | 21 | 22 | 23 | }
Function testfunc
| 000000 PDEF testfunc 2| PROC 0| 000000 stwu 9421FFB0 1 ST4U gr1,#stack(gr1,-80)=gr1 0| CL.13: 3| 000004 addi 38600000 1 LI gr3=0 3| 000008 stw 90610044 1 ST4A i(gr1,68)=gr3 3| 00000C addis 3C800099 1 LIU gr4=153 3| 000010 addi 38849680 1 AI gr4=gr4,-27008 3| 000014 cmpw 7C032000 1 C4 cr0=gr3,gr4 3| 000018 bc 40800020 1 BF CL.17,cr0,0x1/lt, 3| CL.16: 3| 00001C lwz 80610044 1 L4A gr3=i(gr1,68) 3| 000020 addi 38630001 2 AI gr3=gr3,1 3| 000024 stw 90610044 1 ST4A i(gr1,68)=gr3 3| 000028 addis 3C800099 1 LIU gr4=153 3| 00002C addi 38849680 1 AI gr4=gr4,-27008 3| 000030 cmpw 7C032000 1 C4 cr0=gr3,gr4 3| 000034 bc 4180FFE8 1 BT CL.16,cr0,0x1/lt, 3| CL.17: 5| 000038 b 48000004 0 B CL.18,-1 5| CL.14: 5| CL.18: 5| 00003C addi 38210050 1 AI gr1=gr1,80 5| 000040 bclr 4E800020 0 BA lr
Function testfunc1
| 000000 PDEF testfunc1 8| PROC 0| 000060 stwu 9421FFB0 1 ST4U gr1,#stack(gr1,-80)=gr1 0| CL.7: 9| 000064 addi 38600000 1 LI gr3=0 9| 000068 stw 90610044 1 ST4A i(gr1,68)=gr3 9| 00006C addis 3C800099 1 LIU gr4=153 9| 000070 addi 38849680 1 AI gr4=gr4,-27008 9| 000074 cmpw 7C032000 1 C4 cr0=gr3,gr4 9| 000078 bc 40800020 1 BF CL.11,cr0,0x1/lt, 9| CL.10: 9| 00007C lwz 80610044 1 L4A gr3=i(gr1,68) 9| 000080 addi 38630001 2 AI gr3=gr3,1 9| 000084 stw 90610044 1 ST4A i(gr1,68)=gr3 9| 000088 addis 3C800099 1 LIU gr4=153 9| 00008C addi 38849680 1 AI gr4=gr4,-27008
Using the listi
subcommand in dbx
, get the line number of
the instruction from the .lst file.
- Get the offset of the instruction using listi subcommand in dbx.
(dbx) listi 0x1000039C 0x1000039c (testfunc+0x1c) 80610044 lwz r3,0x44(r1)
From the output of the
listi
subcommand
(testfunc+0x1c)
, you can see that the offset of the
instruction is0x1c
. This offset is used to identify the line
number of the code in the source file. - Get the start address of the function from the .lst file
From the excerpts of
Function testfunc of the .lst file, you can notice that the start address of
testfunc
is
000000
.| 000000 PDEF testfunc 2| PROC 0| 000000 stwu 9421FFB0 1 ST4U gr1,#stack(gr1,-80)=gr1 0| CL.13: 3| 000004 addi 38600000 1 LI gr3=0
- Get the line number.
By adding the offset of the instruction to the start
address of the function, we get0x1c
which maps to the following content in the .lst file.3| 00001C lwz 80610044 1 L4A gr3=i(gr1,68)
In the above code “3” refers to the line number in the source section of the .lst file.
Similarly,
0x100003FC
maps to testfunc1, line number 9 of the
source file.9| 00007C lwz 80610044 1 L4A gr3=i(gr1,68)
Resolve symbols in post processing
The -A
option of tprof
generates .trc and
.syms files in addition to the .prof file. By
regenerating the missing symbols using gensyms
for the
given binary you can resolve unresolved symbols into meaningful function names.
Perform the following steps to resolve symbols using
gensyms
:
- Collect symbols for the
ryes
process using
gensyms
.
gensyms -k /unix -s -b ./ryes > ryes.syms
- Run
tprof
to post process the trace files.
tprof -ukes -r ryes
Excerpts of report file
User Process
=============
./ryes
Profile: ./ryes%
=====
72.68Total % For All Processes (./ryes) = 72.68 Subroutine
==========
.testfunc
.testfunc1%
======
36.62
36.05Source
======
ryes.c
ryes.c
Collecting traces for tprof post processing
Perform the following steps to collect trace file required for post processing mode
in tprof.
- Run the
trace
command.
/usr/bin/trace -ad -M -L 402366873 -T 500000 -j 00A,001,002,003,38F,005,006,134,210,139,5A2,5A5,465,234,5D8 -o ryes.trc
Alternatively, run the following command to get the same results.
/usr/bin/trace -ad -M -L 402366873 -T 500000 -J tprof -o ryes.trc
- Turn on trace collection using the following command:
trcon
- Run the workload.
- Stop trace collection using the following command.
trcstop
- Generate symbols for the process using the following command.
gensym -k -s -b ./ryes > ryes.syms
- Post process the trace file using the following command.
tprof -ukes -r ryes
Refer to IBM Knowledge Center for information on the following commands:
Downloadable resources
Credit: IBM