Check out these two virtual workshops on:
Parallel Processing with MPI
Parallel Processing with OpenMP

Message Passing Interface (MPI) is a communication protocol for parallel programming. MPI is specifically used to allow applications to run in parallel across a number of separate computers connected by a network.

Basic Features of MPI:

Message passing programs generally run the same code on multiple processors. The processes communicate with one another through library calls, which fall into a few general categories:

  • Calls to initialize, manage, and terminate communications.
  • Calls to communicate between two individual processes (point-to-point).
  • Calls to communicate among a group of processes (collective).
  • Calls to create custom datatypes.
  • Rich extended functionality; see the extended training materials.

Implementations of MPI:

There are several different implementations of MPI available on the UB CCR clusters.

  • Intel MPI (Recommended)
    • This implementation has multi-network support (TCP/IP, InfiniBand, Myrinet, etc.); by default, the best network is tried first.
    • Provides compiler "wrappers" around both Intel's compiler suite (mpiifort, mpiicc, mpiicpc) and the GNU compilers (mpif90, mpicc, mpicxx).
    • Show all current versions of Intel-MPI: module avail intel-mpi
    • Intel MPI website
  • MVAPICH2
    • This implementation runs over InfiniBand.
    • Show all versions of MVAPICH2: module avail mvapich2
    • MVAPICH2 website
  • MPICH 2 - A portable implementation of the standard Message Passing Interface (MPI), created by Argonne National Laboratory.
    • MPICH is built specifically for a given combination of network interface and compiler. The UB CCR clusters have three internal networks: Gigabit Ethernet, QLogic InfiniBand, and Mellanox InfiniBand. The compilers are GNU, Intel, and PGI.
    • NOTE: The MPICH 1 (MPICH) implementation is now deprecated on the CCR cluster. Instead, please use Intel MPI.
    • MPICH 2 website
  • OPENMPI - An open source implementation of MPI that is developed and maintained by a consortium made up of researchers from academia and industry.
    • This implementation is network aware, so it will automatically select the network interface.
    • OPENMPI is built specifically for a particular compiler.
    • Show all the current versions of OPENMPI: module avail openmpi
    • OPENMPI website

Using MPI with GNU Compilers:

  • Use the default Intel-MPI module to set paths to the compiler. 
  • See Compiling Code for examples of compiling C and Fortran codes without MPI.
 [user@vortex mpi-stuff]$ module load intel/14.0
 [user@vortex mpi-stuff]$ module load intel-mpi/4.1.3

  • Create a nodefile: 
    • In these examples the nodefile contains entries for the front-end machine (vortex).

 [user@vortex mpi-stuff]$ cat nodefile
 vortex
 vortex
 vortex
 vortex
 [user@vortex mpi-stuff]$

Compiling with MPI, C:

  • Code: cpi from MPI test suite. 
    • view cpi code
  • Compilation:

 [user@vortex mpi-stuff]$ mpiicc -o cpi cpi.c

  • Running the code:

[user@vortex mpi-stuff]$ mpiexec.hydra -n 2 ./cpi
Process 0 of 2 on k07n14
Process 1 of 2 on k07n14
pi is approximately 3.1415926544231247, Error is 0.0000000008333316
wall clock time = 0.000111
[user@vortex mpi-stuff]$

Compiling with MPI, Fortran:

  • Code: fpi from MPI test suite, modified to not run interactively. 
    • view fpi code
  • Compilation:

 [user@vortex mpi-stuff]$ mpif77 -o fpi fpi.f

  • Running the code:

[user@vortex mpi-stuff]$ mpiexec.hydra -n 2 ./fpi
 Process  1 of  2 is alive
 Process  0 of  2 is alive
 pi is approximately: 3.1415926569231196  Error is: 0.0000000033333265
[user@vortex mpi-stuff]$

Tutorials on MPI and Related Topics

Running Jobs on the Cluster

Sample SLURM Scripts

  • See also /util/academic/slurm-scripts on the UB CCR front-end

Parallel Computing

Cluster Resources:

The UB CCR clusters provide extensive resources for parallel computing. 

Learn more about the UB-HPC cluster

Learn more about the faculty cluster

Running Interactively:

Using the front-end login machines:

  • To run an MPI code interactively on the front-end machines, launch it with mpirun. Please do this only for a few processes and a short duration; otherwise use the batch system:

[user@vortex mpi-stuff]$ module load desired_version_of_mpi
[user@vortex mpi-stuff]$ mpirun -np 2 ./a.out
(the "-np 2" argument launches 2 processes)

Using a compute node:

  • Machines within the cluster are available for interactive use through the batch scheduler. Note: Depending on the selected MPI module, mpirun may not be the appropriate task launcher when in an interactive SLURM environment.  See the individual MPI pages for details.

[user@vortex mpi-stuff]$ salloc --partition=debug --qos=debug --ntasks=2 --time=01:00:00
(wait a bit for node to be allocated)
[user@compute-node mpi-stuff]$ module load desired_version_of_mpi
[user@compute-node mpi-stuff]$ mpirun -np 2 ./a.out

Learn more about batch computing