MPI Guide

From Storrs HPC Wiki
Revision as of 17:03, 22 June 2012 by Stc07008 (talk | contribs)

What is MPI?

MPI stands for Message Passing Interface, a standard for communication between processes in parallel programs. It suits the Multiple Instruction, Multiple Data (MIMD) model, in which each processor runs its own instruction stream on its own data. MPI can give a considerable speedup to programs that can be broken into sections that run simultaneously on different processors.

The Basics of MPI Programming

There are a few essentials to running a program using MPI. First, the MPI header must always be included. In C this is #include <mpi.h>, and in Fortran it is include 'mpif.h'. MPI_Init and MPI_Finalize are simple yet required functions: call MPI_Init before any other MPI function and MPI_Finalize after the last one, so that they sit at the beginning and end of your parallel code.

Communicators are the primary unit used in MPI programming. A communicator is a group of processes that can send data to one another. MPI_Comm_size and MPI_Comm_rank are two useful functions when dealing with communicators. The former returns the number of processes in the communicator, and the latter returns the rank of the calling process, an integer between 0 and size - 1 that identifies it within the communicator. Here is a basic MPI program in which every process prints its rank and the size of the communicator:

/*This is helloworld.c*/
#include <stdio.h>

/*Remember you must include this header*/
#include <mpi.h>

int main (int argc, char * argv[])
{
        int rank, size;

        /*Initialize the parallel section*/
        MPI_Init(&argc, &argv);

        /*This section will get the rank and size of each process and
        print a statement for each one.  MPI_COMM_WORLD is a predefined
        communicator that contains all processes*/
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("I am %d of %d\n", rank, size);

        /*Close the parallel section*/
        MPI_Finalize();
        return 0;
}

To compile and run this, you must load your MPI and GCC environment. All MPI modules are listed under mpi/{Name of MPI Implementation}. Then run the following commands:

mpicc helloworld.c -o hello
bsub -n 24 -o output.txt mpirun -np 24 ./hello

The first line compiles the program using mpicc, a wrapper that invokes the C compiler with the MPI include and library paths. The second line submits a job with a 24 CPU core reservation to the default queue through the LSF scheduler. (Note that if these resources are unavailable, the job will wait in the scheduler's queue until they become available.) The actual job being submitted is "mpirun -np 24 ./hello", and its output is written to output.txt.

Before trying anything else run this program and see what the output looks like!

Send and Receive in MPI

MPI_SEND and MPI_RECV are the most important functions to understand for sending messages in MPI (in C they are spelled MPI_Send and MPI_Recv). Using these functions, you can send data of the specified MPI types. These types consist of the MPI equivalents of the basic types of the language you are programming in (MPI_INT, MPI_DOUBLE, etc.), arrays of these types, or structures built from these types. MPI_SEND and MPI_RECV take arguments as follows:

MPI_SEND(start, count, datatype, dest, tag, comm)

  • start is the starting address of your data
  • count is the number of objects you are sending
  • datatype is the datatype as specified above
  • dest is the rank of the receiver
  • tag is an integer that identifies the message, so that the receiver can tell different messages apart
  • comm is the communicator that the message is being sent in

MPI_RECV(start, count, datatype, source, tag, comm, status)

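  • The arguments mirror those of MPI_SEND, with three differences: source is the rank of the sender rather than the receiver, count is the maximum number of objects the receive buffer can hold, and the additional status argument returns information about the message that was actually received.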

Additional Information

The six functions mentioned above (MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_SEND and MPI_RECV) are enough to start programming with MPI and can be very powerful tools. For further information on using MPI, please refer to the following links:

An Introduction to MPI, William Gropp and Ewing Lusk from Argonne National Laboratory

MPI C examples

Another tutorial by Blaise Barney from Lawrence Livermore National Laboratory

MPI Implementation Specific Information

There are many different implementations of MPI. Information on a few of them is listed below:

MPICH2 and Hydra