Introduction
I’ve written a number of articles around the web on the subject of using strace to get information about the IO in your application. But the hard thing is how to take seemingly endless lines in the output file and extract useful information. I learned at Panasas that you can take that strace data and extract useful IO information. For example you can extract a great deal of statistical information such as how much time was spent on IO, how much data was read, how much data was written, the IO rates, what files were opened, how many lseek’s or llseek’s were used, etc. So I wrote my own strace analyzer to take strace output files and produce something that is reasonably useful.
In the HPC world many of the applications use MPI which means that you get an individual strace file for each MPI process. But many times we want to produce an “application-level” view of the strace. So there is a new tool to take the summary of the strace for each process and combine them to give you the total view of things.
When I started the project I used it as an excuse for learning Perl. Perl has some great tools for this type of thing but this was the only application where I used Perl. So I decided to rewrite everything in Python, which I know better. Plus Python has some tools that I want to use for longer term development (GUI, plotting, etc.).
The Python version of the strace analyzer that uses the elapsed time capability of the strace function. You should consider this to be the current version of the strace analyzer. It has plotting capability via matplotlib and can produce an uncompressed pickle output of the major data structures that will be used with additional tools.
The analyzer is released under GPL v2. All the supporting tools are also released under GPL v2 as well.
Getting Started
To get started with the strace analyzer and other tools, you first need to create an strace file of your application. Then you can process it through the tools. So let’s start there – how to create an strace file.
How to create an strace file
Creating the strace file that is appropriate for the strace analyzer involves just a few options. The required options are:
strace -T -ttt -o strace.out <application>
The “-T” option cause strace to print out the elapsed time at the end of the line in the form, <0.000197>. The option “-ttt” causes microsecond timing but does so in seconds since the epoch (makes life easier for applications that run a long time and for MPI applications). The “-o” option specifies the output file. In this case the output file is called “strace.out” but it can be called anything you like.
If you want to reduce the amount of output then you can try the following:
strace -T -ttt -e trace=open,close,read,write,lseek,llseek,_llseek,fsync,fstat,fstat64,... -o strace.out
which specifies the specific functions you want strace to track. You can add almost any IO function you want to the “trace=…” line. The options that the strace analyzer current recognizes are,
- read
- write
- open
- close
- lseek
- llseek
- __llseek
- stat
- stat64
- fstat
- chmod
- access
- rename
- mkdir
- getdents
- fcntl
- unlink
- fseek
- rewind
- ftell
- fgetpos
- fsetpos
- fclose
- fsync
- creat
- readdir
- opendir
- rewinddir
- scandir
- seekdir
- telldir
- flock
- lockf
Other IO operations will be added later (asynchronous IO, fread, fwrite, etc.).
Simple Example:
To get a feel for how things work, let’s run a simple example. Since I’m on the older side, I will write a simple Fortran program that opens a file, writes to it, and closes the file. The Fortran code is below:
Fortran code goes here.
Now that you have an strace file let’s run it through the strace analyzer.
Strace_analyzer Options:
There are several options for the strace analyzer. You an get a listing of the options by using the help option. For example,
$ ./strace_analyzer_ng_0.03.pl -help
Usage: strace-analyze [OPTION]… [FILE]
Analyzes strace output for IO functions. It creates statistics
on IO functions and performance of the read and write functions.
The strace file should have been run with ‘strace -tt [PROGRAM]
-d Debug – echo the input lines to stdout and provide debugging information
-debug same as -d
–d same as -d
–debug same as -d
-h Produces help output (this message)
-help same as -h
–h same as -h
–help same as -h
-f [FILE] input a list of file sizes for examining IO functions
The format of the file is:
1024
8192
…
32768
1048576
-file same as -f
–f same as -f
–file same as -f
-dat Produce .dat files for plotting
–dat same as -dat
-s Create simulator code – file is simulator.c
-sim same as -s
–s same as -s
–sim same as -s
-short Short version of output
–short same as -short
The normal options that I use are “-short -dat”. It produces much less output than usual and also produces data files fit for plotting using something like Grace.
Example output:
The output from the strace analyzer can either be very long or relatively short. It also produces some comma delimited files (csv) as well as the data files that can be used for plotting.
By default you will get two files: strace.log and whatever file you pipe the output to (stdout). So for example,
strace -short -dat < file.strace > strace.out
will produce strace.log and strace.out files. Here is a quick example of the strace.out file:
Analysis Output
===============
Number of Lines in strace file: 7072
—————-
— Time Stats —
—————-
Elapsed Time for run: 0.526419 (secs)
Total IO Time: 0.121526 (secs)
Total IO Time Counter: 7025
Percentage of Total Time = 23.085406%
———————-
— IO Command Count —
———————-
Command Count
==============================
stat64 1
open 6
close 7
access 6
lseek 4000
read 4
write 2998
———————-
— Write Statistics —
———————-
— File sizes for write() function —
IO Size Range Number of Functions
=======================================================
( 1) 0KB < < 1KB 1998
( 2) 1KB < < 8KB 0
( 3) 8KB < < 32KB 1000
( 4) 32KB < < 128KB 0
( 5) 128KB < < 256KB 0
( 6) 256KB < < 512KB 0
( 7) 512KB < < 1000KB 0
( 8) 1000KB < < 10MB 0
( 9) 10MB < < 100MB 0
( 10) 100MB < < 1GB 0
( 11) 1GB < < 10GB 0
( 12) 10GB < < 100GB 0
( 13) 100GB < < 1TB 0
( 14) 1TB < < 10TB 0
— WRITE SUMMARY —
Total number of Bytes written = 8,011,996 (8.011996 MB)
Number of Write function calls = 2,998
Average (mean) Bytes per call = 2,672.4469646431 (bytes) (0.0026724469646431 MB)
Standard Deviation: 3,771.86653967916 (bytes) (0.00377186653967916 MB)
Mean Absolute Deviation: 3,922.7267533689 (bytes) (0.0039227267533689 MB)
Median Bytes per call = 4 (bytes) (4e-06 MB)
Median Absolute Deviation: 2,668.4469646431 (bytes) (0.0026684469646431 MB)
Time for slowest write function (secs) = 0.000439
Line location in file: 4777
———————
— Read Statistics —
———————
— File sizes for read function() —
IO Size Range Number of Functions
=======================================================
( 1) 0KB < < 1KB 2002
( 2) 1KB < < 8KB 0
( 3) 8KB < < 32KB 1000
( 4) 32KB < < 128KB0
( 5) 128KB < < 256KB 0
( 6) 256KB < < 512KB 0
( 7) 512KB < < 1000KB0
( 8) 1000KB < < 10MB 0
( 9) 10MB < < 100MB 0
( 10) 100MB < < 1GB 0
( 11) 1GB < < 10GB0
( 12) 10GB < < 100GB 0
( 13) 100GB < < 1TB 0
( 14) 1TB < < 10TB 0
— READ SUMMARY —
Total number of Bytes read = 2,048 (0.002048 MB)
Number of Read function calls = 4
Average (mean) Bytes per call = 512 (bytes) (0.000512 MB)
Standard Deviation: 0 (bytes) (0 MB)
Mean Absolute Deviation: 512 (bytes) (0.000512 MB)
Median Bytes per call = 512 (bytes) (0.000512 MB)
Median Absolute Deviation: 0 (bytes) (0 MB)
Time for slowest read function (secs) = 0.000439
Line location in file: 4777
———————-
— Close Statistics —
———————-
— CLOSE SUMMARY —
Total number of close function calls = 7
Average time for close function calls (secs) = 0.000010
Maximum Time for close function (secs) = 0.000015
Line location in file: 7069
———————
— Open Statistics —
———————
— OPEN SUMMARY —
Total number of open function calls = 6
Average time for open function calls (secs) = 0.000000
Maximum Time for open function (secs) = 0.000042
Line location in file: 66
————————————
— lseek unit activity Statistics —
————————————
unit Number of lseeks
========================
3 4000
———————
— IOPS Statistics —
———————
Maximum Write IOPS = 2998 occured at 1 seconds
Write IOPS = 2998
Maximum Read IOPS = 4 occured at 1 seconds
Read IOPS = 4
Maximum Total IOPS = 7021 occured at 1 seconds
Total IOPS = 3512
*****************
Final IOPS report
Write IOPS = 2998
Read IOPS = 4
Total IOPS = 3512
*****************
— File Statistics (per file basis) —
Filename Read Bytes Bytes/sec Write Bytes Bytes/sec
=========================================================================================================================================================================
junk.out 2,048 0 8,011,996 0
DONE
Here are example of what you could expect to see if you plot the data files. The first plot is of Read IOPS vs. Time for the simple example (really simple). It is the IOPS for all files although the analyzer will produce data files for each file opened.