Message Passing Interface (MPI) is a standardized communication interface for parallel programming. It allows applications to run in parallel across a number of separate computers connected by a network.


Basic Features of MPI:

Message passing programs generally run the same code on multiple processors, which then communicate with one another through library calls that fall into a few general categories (a brief sketch follows the list):

  • Calls to initialize, manage, and terminate communications.
  • Calls to communicate between two individual processes (point-to-point).
  • Calls to communicate among a group of processes (collective).
  • Calls to create custom datatypes.
  • Rich extended functionality; see the extended training materials linked here.

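To make these categories concrete, here is a minimal sketch (not taken from the CCR examples or the MPI test suite) that touches the first three in one short C program: it initializes and terminates MPI, passes a message between two ranks point-to-point, and performs a collective reduction. Custom datatype calls (for example MPI_Type_contiguous) are omitted for brevity. It can be built with any of the compiler wrappers described in the next section, e.g. mpiicc or mpicc.

 /* categories.c - hypothetical illustration, not part of the MPI test suite */
 #include <mpi.h>
 #include <stdio.h>

 int main(int argc, char *argv[])
 {
     int rank, size, token = 0, sum = 0;

     MPI_Init(&argc, &argv);                  /* initialize communications   */
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* manage: which process am I? */
     MPI_Comm_size(MPI_COMM_WORLD, &size);    /* manage: how many processes? */

     if (size > 1) {                          /* point-to-point: rank 0 -> 1 */
         if (rank == 0) {
             token = 42;
             MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
         } else if (rank == 1) {
             MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
         }
     }

     /* collective: sum every process's rank onto rank 0 */
     MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
     if (rank == 0)
         printf("sum of ranks = %d, token = %d\n", sum, token);

     MPI_Finalize();                          /* terminate communications    */
     return 0;
 }
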
Implementations of MPI:

There are several different implementations of MPI available on the UB CCR cluster.

  • Intel MPI (Recommended)
    • This implementation has multi-network support (TCP/IP, InfiniBand, Myrinet, etc.) - by default the best available network is tried first.
    • Compiler "wrappers" around both Intel's compiler suite (mpiifort, mpiicc, mpiicpc) and the GNU compilers (mpif90, mpicc, mpicxx)
    • Show all current versions of Intel-MPI: module avail intel-mpi
    • Intel MPI website
  • MVAPICH2
    • This implementation runs over InfiniBand.
    • Show all versions of MVAPICH2: module avail mvapich2
    • MVAPICH2 website
  • MPICH 2 - A portable implementation of the Message Passing Interface (MPI) standard created by Argonne National Laboratory.
    • MPICH is built specifically for a given combination of network interface and compiler. The UB CCR cluster has three internal networks: Gigabit Ethernet, QLogic InfiniBand, and Mellanox InfiniBand. The compilers are GNU, Intel, and PGI.
    • NOTE: The MPICH 1 (MPICH) implementation is now deprecated on the CCR cluster. Instead, please use Intel MPI.
    • MPICH 2 website
  • OPENMPI - An open source implementation of MPI that is developed and maintained by a consortium made up of researchers from academia and industry.
    • This implementation is network aware, so it will automatically select the network interface.
    • OPENMPI is built specifically for a particular compiler.
    • Show all the current versions of OPENMPI: module avail openmpi
    • OPENMPI website

Using Intel MPI:

  • Load the Intel compiler and Intel-MPI modules to set the paths to the compilers and MPI wrappers.
  • See Compiling Code for examples of compiling C and Fortran codes without MPI.
 [user@rush mpi-stuff]$ module load intel/14.0
 [user@rush mpi-stuff]$ module load intel-mpi/4.1.3




  • Create a nodefile: 
    • In these examples, the nodefile contains one entry per process slot, all naming the front-end machine (rush).


 [user@rush mpi-stuff]$ cat nodefile
 rush
 rush
 rush
 rush
 [user@rush mpi-stuff]$




Compiling with MPI, C:


  • Code: cpi from the MPI test suite; a minimal sketch of a similar program follows the run example below.
    • view cpi code
  • Compilation:


 [user@rush mpi-stuff]$ mpiicc -o cpi cpi.c




  • Running the code:


[user@rush mpi-stuff]$ mpiexec.hydra -n 2 ./cpi
Process 0 of 2 on k07n14
Process 1 of 2 on k07n14
pi is approximately 3.1415926544231247, Error is 0.0000000008333316
wall clock time = 0.000111
[user@rush mpi-stuff]$
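
For reference, here is a minimal sketch of a cpi-style pi calculation - midpoint-rule integration of 4/(1+x^2), using MPI_Bcast to distribute the interval count and MPI_Reduce to combine the partial sums. It approximates what the test-suite cpi program does, but it is not the exact source linked above.

 /* cpi_sketch.c - hypothetical stand-in for the test-suite cpi program */
 #include <mpi.h>
 #include <stdio.h>
 #include <math.h>

 #define PI25DT 3.141592653589793238462643

 int main(int argc, char *argv[])
 {
     int rank, size, namelen, i, n = 10000;          /* n = number of intervals */
     double h, x, mypi = 0.0, pi = 0.0, t0, t1;
     char name[MPI_MAX_PROCESSOR_NAME];

     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &size);
     MPI_Get_processor_name(name, &namelen);
     printf("Process %d of %d on %s\n", rank, size, name);

     t0 = MPI_Wtime();
     MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* all ranks get n */

     h = 1.0 / (double)n;
     for (i = rank + 1; i <= n; i += size) {         /* each rank sums every size-th strip */
         x = h * ((double)i - 0.5);
         mypi += 4.0 / (1.0 + x * x);
     }
     mypi *= h;

     /* combine the partial sums onto rank 0 */
     MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
     t1 = MPI_Wtime();

     if (rank == 0) {
         printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT));
         printf("wall clock time = %f\n", t1 - t0);
     }

     MPI_Finalize();
     return 0;
 }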




Compiling with MPI, Fortran:


  • Code: fpi from the MPI test suite, modified so that it does not run interactively.
    • view fpi code
  • Compilation:


 [user@rush mpi-stuff]$ mpif77 -o fpi fpi.f




  • Running the code:


[user@rush mpi-stuff]$ mpiexec.hydra -n 2 ./fpi
 Process  1 of  2 is alive
 Process  0 of  2 is alive
 pi is approximately: 3.1415926569231196  Error is: 0.0000000033333265
[user@rush mpi-stuff]$



Tutorials on MPI and Related Topics:

  • Running Jobs on the Cluster
  • Sample SLURM Scripts
    • See also /util/academic/slurm-scripts on the UB CCR front-end
  • Parallel Computing

Cluster Resources:

The UB CCR clusters provide extensive resources for parallel computing.

  • Learn more about the academic cluster (rush)
  • Learn more about the industry cluster (presto)

Running Interactively:



Using the front-end login machines (rush - academic or presto - industry):

  • If you wish to run an MPI code interactively on the front-end machines, you can simply launch it with mpirun. Please keep such runs to a few processes and a short duration - otherwise, use the batch system:



[user@rush mpi-stuff]$ module load desired_version_of_mpi
[user@rush mpi-stuff]$ mpirun -np 2 ./a.out 
(the "-np 2" argument requests 2 MPI processes)




Using a compute node:

  • Machines within the cluster are available for interactive use through the batch scheduler. Note: Depending on the selected MPI module, mpirun may not be the appropriate task launcher in an interactive SLURM environment. See the individual MPI pages for details.


[user@rush mpi-stuff]$ fisbatch --ntasks=2 --time=01:00:00
(wait a bit for a node to be allocated)
[user@compute-node mpi-stuff]$ module load desired_version_of_mpi
[user@compute-node mpi-stuff]$ mpirun -np 2 ./a.out 

Learn more about batch computing