PerformanceReports - MPI profiling summary


AllineaTools PerformanceReports profiles your MPI application and produces a summary that indicates where the bottleneck in your resource usage lies. The summary shows the time spent in CPU computation, message-passing (MPI) communication, and input/output (I/O) operations.

Load the PerformanceReports module, allinea/4.2-PR:

% module unload allinea
% module load allinea/4.2-PR 

Replace mpiexec in your job script with perf-report mpiexec, and do the same for mpirun if you use it. For example, if you start your application with

mpiexec ./ccsm.exe >&! ccsm.log

replace it with

perf-report mpiexec ./ccsm.exe >&! ccsm.log
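
The `>&!` redirection in the line above is csh/tcsh syntax. In a bash job script the same effect (stdout and stderr written to one file, overwriting it) can be obtained with `>| file 2>&1`. A minimal sketch, assuming a small helper function named run_logged, which is hypothetical and not part of AllineaTools:

```shell
#!/bin/bash
# run_logged is a hypothetical helper, not part of AllineaTools.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local log=$1
    shift
    # ">|" overwrites the log even when "set -o noclobber" is active,
    # and "2>&1" sends stderr to the same file, like csh's ">&!".
    "$@" >| "$log" 2>&1
}

# Example launch (commented out because it needs the allinea module):
# run_logged ccsm.log perf-report mpiexec ./ccsm.exe
```

The helper returns the exit status of the command it runs, so the job script can still test whether the launch succeeded.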


The environment variable MAP_MPI_WRAPPER must be set depending on which MPI library and compiler were used to build your MPI application.

If you use the Intel MPI module (impi), set

% export MAP_MPI_WRAPPER=/share/opt/allinea/wrapper/

(in the bash shell).

If you use openmpi/1.7.5 with the Intel ifort and icc compilers, set

% MAP_MPI_WRAPPER=/share/opt/allinea/wrapper/
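
In bash, a plain `VAR=value` assignment is visible only to the current shell; the variable reaches child processes such as perf-report only once it has been exported. A minimal sketch of a guard that a job script could run before the launch line; the function name require_wrapper is hypothetical, and the check only confirms that the variable is non-empty, not that its value is correct for your MPI library:

```shell
#!/bin/bash
# require_wrapper is a hypothetical helper, not part of AllineaTools.
# It fails if MAP_MPI_WRAPPER is unset or empty; otherwise it exports
# the variable and prints its value.
require_wrapper() {
    if [ -z "${MAP_MPI_WRAPPER:-}" ]; then
        echo "MAP_MPI_WRAPPER is not set" >&2
        return 1
    fi
    export MAP_MPI_WRAPPER
    echo "MAP_MPI_WRAPPER=$MAP_MPI_WRAPPER"
}

# Typical use in a job script, before the perf-report line:
# require_wrapper || exit 1
```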

Submit your job as a usual batch job:

Sample job script:

#!/bin/bash
### sample jobscript ####
#BSUB -n 16
#BSUB -R "span[ptile=16]"
#BSUB -q general
#BSUB -J
#BSUB -o %J.stdout
#BSUB -e %J.stderr
#BSUB -W 0:05
#
module purge
module load intel
module load impi/
module unload allinea/4.2.1
module load allinea/4.2-PR
#
export MAP_MPI_WRAPPER=/share/opt/allinea/wrapper/
#
cd $HOME/MPItests
ProgName=$HOME/MPItests/helloworld.exe
# bash syntax: write stdout and stderr to the same log file
perf-report mpiexec $ProgName > log.sayhello 2>&1
#
exit 0

% bsub < ./

After the job finishes, two files in your working directory contain the PerformanceReports output, one in HTML format (*.html) and one in plain text (*.txt), so you can examine the resource usage summary.
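
If the working directory already holds other *.html or *.txt files, the two new report files can be hard to spot. One possible approach, sketched here under the assumption that you create a marker file just before the perf-report line (the marker name and the function new_reports are hypothetical), is to list only the files written after that marker:

```shell
#!/bin/bash
# new_reports is a hypothetical helper, not part of AllineaTools.
# Usage: new_reports MARKER [DIR]
# Lists *.html and *.txt files in DIR (default: current directory)
# that were modified more recently than the MARKER file.
new_reports() {
    local marker=$1
    local dir=${2:-.}
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.html' -o -name '*.txt' \) \
        -newer "$marker" | sort
}

# In the job script, before the perf-report line:
#   touch .perf_marker
# After the job has finished:
#   new_reports .perf_marker
```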

For more details on PerformanceReports profiling, see the User's Guide (PerformanceReports), AllineaTools Version 4.2-PR.

This User's Guide is also located in the AllineaTools PerformanceReports installation directory on the pegasus2 cluster:



Example of PerformanceReports summaries

In the first example, most of the time is spent in actual computational tasks:


In the second example below, MPI communications take 47.5% of the total time, comparable to the time spent for actual computational tasks:



User's Guide (PerformanceReports) AllineaTools Version 4.2-PR
