UW AMath High Performance Scientific Computing
 
AMath 483/583 Class Notes
 
Spring Quarter, 2011

Timing and profiling computer codes

Unless otherwise stated, all timings of computer experiments shown on these pages were performed on an Apple MacBook Pro laptop with 2.26 GHz Intel dual core processors and 4 GB of memory (see Binary/metric prefixes for computer size, speed, etc.).

Unix time command

There is a built-in command time that can be used to time other commands. Just type time command at the prompt, e.g.:

$ cd $CLASSHG/codes/python
$ time python testappend.py

real    0m0.946s
user    0m0.812s
sys     0m0.127s

This executes the command python testappend.py and then prints information about the time required to execute, where, roughly speaking, real is the wall-clock time, user is the time spent executing the user’s program, and sys is the time spent on system tasks required by the program.

There may be a big difference between the real time and the sum of the other two times if the computer is simultaneously executing many other tasks and your program is only getting some of its attention.

Python time module

The code in testappend.py is designed to illustrate that creating a long array by repeatedly appending elements to a list is very slow. It consists of two functions test1 and test2 that do this in two different ways. From the output of the Unix time command we can’t tell how much time was spent in each. To do so, we will rewrite the code using the Python time module.

For general information about this module see the time module documentation. The module contains many functions, for example:

>>> time.time()
1270909028.53543

returns the number of seconds since “The Epoch”, a particular date in the past (usually midnight GMT on January 1, 1970). If you’re trying to find out the current time, e.g. to print a timestamp along with your computed results, try:

>>> time.asctime()
'Sat Apr 10 07:18:21 2010'

For timing programs, one can call time.time() twice, once before executing some commands, and once afterwards. The difference will be the total elapsed time between the two calls (roughly the same as what is reported as the real time by the Unix time command).
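A minimal sketch of this approach (the statement being timed here is just an arbitrary computation, chosen for illustration):

```python
import time

t0 = time.time()           # wall-clock time before the work
total = sum(range(10**6))  # some work to be timed
t1 = time.time()           # wall-clock time after the work

print("Elapsed time: %10.8f seconds" % (t1 - t0))
```

The difference t1 - t0 includes any time the computer spent on other tasks while this code was running.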

As noted above, the elapsed time isn’t necessarily all spent in executing your program since the computer may be busy doing other things as well. For timing codes, the clock function is often better:

>>> time.clock()
0.030932999999999999

This tells the number of CPU seconds that have been used by your Python process since it was started. So the difference in time between two calls will be roughly equal to the CPU time used by the commands in between.

Here is another version of the testappend.py code that uses the Python time.clock function to compute the time spent in each function:
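The listing of testappend2.py did not survive in this copy of the notes. The following sketch is reconstructed from the output shown below and from the names tstart_cpu, t_cpu, test1, test2, and testall used elsewhere on this page; the actual file may differ in details. (time.clock was the natural choice when these notes were written; it was removed in Python 3.8, so the sketch falls back to time.process_time when necessary.)

```python
import time
import numpy as np

# time.clock was removed in Python 3.8; fall back to process_time there
clock = getattr(time, "clock", time.process_time)

n = 1000000

def test1():
    # build x by appending to a list, then converting to a NumPy array
    tstart_cpu = clock()
    x = []
    for i in range(n):
        x.append(i / float(n - 1))
    x = np.array(x)
    t_cpu = clock() - tstart_cpu
    print("CPU time spent in test1:   %10.8f seconds" % t_cpu)
    return x

def test2():
    # create the same array directly with np.linspace
    tstart_cpu = clock()
    x = np.linspace(0.0, 1.0, n)
    t_cpu = clock() - tstart_cpu
    print("CPU time spent in test2:   %10.8f seconds" % t_cpu)
    return x

def testall():
    tstart_cpu = clock()
    test1()
    test2()
    t_cpu = clock() - tstart_cpu
    print("CPU time spent in testall:   %10.8f seconds" % t_cpu)

if __name__ == "__main__":
    testall()
```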

Note a couple things about this code:

  • The variables tstart_cpu and t_cpu are local to each function, see Python Namespaces for more about local and global variables.
  • The last few lines of the code employ a device frequently seen in Python code that might be either executed as a script or imported as a module, see Writing scripts for ease of importing.

Executing this code at the Unix command line gives:

$ python testappend2.py

CPU time spent in test1:   0.74862000 seconds
CPU time spent in test2:   0.02950800 seconds
CPU time spent in testall:   0.78093800 seconds

Now we can see that the vast majority of the time is spent in test1, even though the two functions create exactly the same NumPy array x. This is because modifying a list by appending an item to it requires Python to obtain more memory and perhaps copy the entire list to a new place in memory.

Note that the time spent in testall is roughly the sum of the time spent in the two functions it called. To find out how much time is spent in a function excluding the time spent in functions it calls, see Python cProfile module below.

Let’s compare the numbers reported by Python and Unix:

$ time python testappend2.py

CPU time spent in test1:   0.75025600 seconds
CPU time spent in test2:   0.02906000 seconds
CPU time spent in testall:   0.78216400 seconds

real    0m0.968s
user    0m0.822s
sys     0m0.130s

The user time reported by Unix is comparable to the total CPU time reported by Python.

Note: Running a test several times will generally give different results each time. It’s often best to compute an average over many tries.

The times will also generally depend on what other programs might be running on the computer at the same time. See also Unix top command.
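Averaging over several tries, as suggested above, can be done with a simple loop. A sketch, in which the function work being timed is hypothetical, and time.process_time stands in for time.clock on newer Pythons:

```python
import time

def work():
    # hypothetical function whose CPU time we want to measure
    return sum(i * i for i in range(100000))

ntrials = 10
times = []
for _ in range(ntrials):
    t0 = time.process_time()        # CPU time before this trial
    work()
    times.append(time.process_time() - t0)

print("average CPU time over %d trials: %10.8f seconds"
      % (ntrials, sum(times) / ntrials))
```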

Python cProfile module

Another useful tool is the Python module cProfile, which can provide information on how much time is spent in each function, not only the ones you write but also functions such as append:

  >>> import cProfile
  >>> cProfile.run('import testappend2; testappend2.testall()')
  CPU time spent in test1:   0.99215300 seconds
  CPU time spent in test2:   0.02913500 seconds
  CPU time spent in testall:   1.02416500 seconds

       1000015 function calls in 1.032 CPU seconds

 Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    0.000    0.000    1.032    1.032 <string>:1(<module>)
      1    0.028    0.028    0.029    0.029 function_base.py:35(linspace)
      1    0.671    0.671    1.000    1.000 testappend2.py:19(test1)
      1    0.000    0.000    0.029    0.029 testappend2.py:29(test2)
      1    0.003    0.003    1.032    1.032 testappend2.py:35(testall)
1000000    0.145    0.000    0.145    0.000 {method 'append' of 'list' objects}
      1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      1    0.002    0.002    0.002    0.002 {numpy.core.multiarray.arange}
      1    0.168    0.168    0.168    0.168 {numpy.core.multiarray.array}
      1    0.017    0.017    0.017    0.017 {range}
      6    0.000    0.000    0.000    0.000 {time.clock}

After executing the command given by the string passed to cProfile.run, a profile of the program is reported, showing that the majority of time (0.671 seconds) was spent in test1 as expected.

We also see that even just converting the list into an array with np.array took 0.145 seconds, whereas the np.linspace command that created the desired x from scratch only took 0.028 seconds.

The cumtime column shows the cumulative time spent in a function, including time spent in functions it calls, whereas tottime only shows the time actually spent executing things in the function itself. So, for example, the line for testall shows that only 0.003 seconds was spent executing commands in testall, but 1.032 seconds were spent in testall because of all the time spent in test1, which was called from testall.

Note that the total CPU time is somewhat greater than it was when we ran the code without profiling, since a lot of work is done keeping track of how much time is spent in each function. The results of profiling are primarily useful for showing where the bottlenecks are and which function calls are using most of the time, not for getting an accurate estimate of how much time is used.

See http://docs.python.org/library/profile.html for more information on profiling.

Timing Fortran codes

To appear.

Profiling Fortran codes

To appear.