Performance and Debugging Tool – Totalview
These instructions will help users invoke Totalview to debug codes on a Research Computing server in just a couple of steps.
Please note that Totalview is a commercial product. Only 64 licenses are available on the Research Computing cluster, so users will need to debug on 64 or fewer cores.
The Totalview GUI is an X window, so users must make sure that X windows can be displayed on their local computer. Please read the following general instructions for setting up a graphical display on your local computer, depending on your operating system.
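Once the display has been set up, a quick check along the following lines can confirm that the login session has somewhere to draw X windows before Totalview is started. This is only a sketch: the function name check_display is made up for illustration, and the suggestion to use ssh -X assumes you connect to the server with OpenSSH.

```shell
# Report whether this shell session has an X display to draw on.
# check_display is a hypothetical helper name, not part of Totalview.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; X windows cannot be shown from this session"
        return 1
    fi
    echo "DISPLAY is set to ${DISPLAY}"
}

# Assumes an OpenSSH client; other terminal programs enable X forwarding differently.
check_display || echo "Reconnect with X forwarding enabled (for example, ssh -X) and try again"
```

A set DISPLAY variable does not guarantee that forwarding works end to end, but an unset one is a reliable sign that the Totalview window will not appear.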
Add Totalview to your environment, depending on the Research Computing server you use:
module add totalview
module initadd totalview
Totalview will work on serial code compiled with any of our compilers. For parallel code, certain implementations of MPI must be used. On KillDevil, use MVAPICH.
As with any debugger, you will need to compile your code with the “-g” option. This includes the symbol table information in the executable, which means that the tool can display your source code rather than obscure machine code.
When you submit your job, add the “tv” esub application to your job submission line. For example, depending on the Research Computing server you use:
-a "mvapich tv"
Using Totalview with a serial job is very straightforward. No esub is required; simply invoke it as
Now your job will start up under the control of Totalview. Remember that the Totalview GUI is an X window, so you should make sure that X windows can be displayed on your computer.
When Totalview comes up, it will be running the pam process, which is the LSF process that will start your job.
You should have a Totalview window with four panes in it (see image below). The pane on the upper left is the stack trace, and you can click on it to view the levels of your code. At the very lowest levels (top of the box) you will see low-level system calls or perhaps MPI calls, depending upon where the code was when you halted it. You should also see your program, presumably the main routine where your program starts.
When you click on a routine, you should see its source in the source pane in the middle of the Totalview window. You should also see variables and their values in the stack frame in the upper right pane. You can also double-click on variables or hold the mouse over them to view their values.
to start your job running.
Next, a dialog window will come up asking if you want to stop the job now. Click
Let the job run for a few seconds.
You should see P new processes in the Totalview root window, where P is the number of cores you requested (see image below). After this, click
to stop the job.
You might find it useful to add a sleep call to the beginning of your program if you want to stop at the very beginning, but this is not required.
You are now on your way to using Totalview! The best way to learn about this tool and its many cool features is to play around with it for a while and experiment. There are also documentation and tutorials online to guide you; see the links below.
- Totalview home page
- Totalview Documentation and Training
- Totalview Tutorial from Lawrence Livermore National Laboratory