Tkprof: A Worked Example

David Clement

September 2004

Oracle versions 7-8


Users often report that "the system is slow," or some similarly vague statement. They know what they mean, but they don't know how to explain it. An efficient way to identify the performance problem is session profiling with TkProf.

The first step is to identify where the trace files will be written. This will usually be the default value. However, it is a configurable Oracle parameter, so it might be different. If you have trouble finding where the trace files live, the way to verify it is to run SQL*Plus or TOAD and enter the query,

                select name, value 
                from v$parameter 
                where name = 'user_dump_dest';
            
Under Windows NT, you will generally see something like '%RDBMS73%/trace'. Find the folder and open it so that you can keep an eye on the trace files; you don't want them to fill up the disk.

The next step is to start the performance trace for the session you are concerned about. This is easiest if you use TOAD. In the TOAD toolbar, click on View. In the drop-down menu, click on "Kill/Trace Session." Read the program names in the middle of the grid to find the right session. Click on it, and then click on the "Start Trace" button. A message box pops up to tell you that the trace has started.

If you do not have TOAD or a similar tool, you can always use SQL*Plus to start a session trace. Use the following query to identify the session ID and serial number, for a program called "PetSvc" in this example.

                select sid, serial#, program 
                from v$session 
                where program like '%PETSVC%';
            
Then
                execute dbms_system.set_sql_trace_in_session (sid, serial#, true)
            
where the sid and serial# are the values that you just looked up in v$session.

If you want to trace your current session in SQL*Plus, you can use

                alter session set sql_trace=true;
            

Monitor the trace. If you look at the trace folder, you will see that there is a new file in it, and that this file is growing. Let the trace file grow for a while. When its size is one megabyte or greater, you probably have enough data for a profile.

Now stop the trace. If you are using TOAD, just click on the "Stop Trace" button above the session grid. A message box pops up to tell you that the trace has stopped.

If you are using SQL*Plus, use this command:

                execute dbms_system.set_sql_trace_in_session (sid, serial#, false)
            
where, again, the sid and serial# are the values that you looked up.

To stop tracing your current session, use

                alter session set sql_trace=false;
            

The next step is to use TkProf to generate the performance report. TkProf is a command-line tool. The format of the command is <tkprof executable name> <trace file name> <output file name> There are a lot of command-line options, which this explanation will ignore.

In the Windows world, the executable name generally includes the major version numbers, so it will be tkprof73.exe or something similar. For example,

                C:\> tkprof73 d:\orant\rdbms73\trace\ora_9064.trc d:\orant\rdbms73\trace\tkprof.out
            
The file names will depend both on the directories in the file system and on where it is convenient for you to put the output file.

Now read the report. It starts with a header that explains what it is measuring (count, CPU, elapsed, disk, query, current, rows). It ends with two blocks of totals for non-recursive SQL and recursive SQL statements, where "recursive" denotes a statement that Oracle generated in order to process another statement. Between the header and the totals, there are profiles of the performance of individual SQL statements.

You are trying to find the SQL statements that took the most time. For example, imagine that in the totals for all non-recursive SQL statements you see that "disk," that is, physical blocks read, during the fetch phase were 204191. Then, looking through the body of the report, you find a "select" statement whose disk reads during the fetch phase were 204110. This is a very high percentage of the total of physical blocks read (about 99.98 to be precise). I've really seen exactly those numbers, by the way.

If you can identify the SQL statements that have taken the most time, you will often find that they have something obvious in common: for example, they all apply to the same table.

You now have strong reasons to believe that you've identified the performance problems.