MPICH2 Guide

MPICH (MPI Chameleon)
Website: http://www.mpich.org
Source: Git
Category: Library
Help: documentation, mailing list


MPICH2 jobs through Slurm

MPICH2 is located in /apps/mpich2/1.4.1p1-ics/ and can be loaded through the modules system:

module load intelics/2012.0.032 mpi/mpich2/1.4.1p1-ics

To automatically load mpich2 on login, execute the following:

module initadd intelics/2012.0.032 mpi/mpich2/1.4.1p1-ics
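
To confirm the toolchain is available after loading the modules, a quick sanity check like the following should work (the exact install paths shown in the comments are illustrative):

module list      # should list intelics/2012.0.032 and mpi/mpich2/1.4.1p1-ics
which mpicxx     # should resolve to a compiler wrapper under /apps/mpich2/1.4.1p1-ics/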

Example code 1

#include <iostream>
#include <unistd.h>   // gethostname()
#include "mpi.h"
using namespace std;

// Placeholder busy-work so the non-root ranks have something to do.
void funkywork() {
    for (int i = 0; i < 100; i++) {
        // sleep(1);
    }
}

int main(int argc, char* argv[]) {
    int mytid, numprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);   // total number of ranks
    MPI_Comm_rank(MPI_COMM_WORLD, &mytid);      // this process's rank id
    char name[100];
    gethostname(name, sizeof(name));            // node this rank is running on
    if (mytid > 0) {
        funkywork();
    }
    cout << "Hello, " << mytid << " and " << name << " say hi in a  C++ statement \n";
    MPI_Finalize();
    return 0;
}

RECOMMENDED: Submitting jobs by compiling with PMI under Slurm

To compile the example code against Slurm's PMI library, save it as main.cpp and build it with "-L/gpfs/gpfs1/slurm/lib -lpmi" added to the compile line:

mpicxx -L/gpfs/gpfs1/slurm/lib -lpmi -lm -O2 -Wall main.cpp -o mpich2srun
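
To check that the binary really linked against Slurm's PMI library (assuming it was linked dynamically and the library path is visible at run time), something along these lines can be used:

ldd mpich2srun | grep -i pmi     # should show a libpmi entry, typically resolved from /gpfs/gpfs1/slurm/lib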

Then write the batch script slurm.sh as:

#!/bin/bash
#SBATCH -n 12 

srun ./mpich2srun

Then, submit the job via

sbatch slurm.sh

Or run it with interactive I/O:

salloc -n 12 srun ./mpich2srun

Standard output and any errors encountered during execution are written to slurm-<SLURM_JOB_ID>.out in the working directory. You can monitor the execution of the job with

sjobs
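
If the sjobs wrapper is not available in your environment, the stock Slurm commands give the same information; a minimal sketch:

squeue -u $USER                      # list your pending and running jobs
tail -f slurm-<SLURM_JOB_ID>.out     # follow the job's output as it is written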

The output of the example should look something like this

Hello, 0 and cn24 say hi in a  C++ statement
Hello, 3 and cn24 say hi in a  C++ statement
Hello, 5 and cn24 say hi in a  C++ statement
Hello, 1 and cn24 say hi in a  C++ statement
Hello, 6 and cn24 say hi in a  C++ statement
Hello, 9 and cn24 say hi in a  C++ statement
Hello, 10 and cn24 say hi in a  C++ statement
Hello, 4 and cn24 say hi in a  C++ statement
Hello, 7 and cn24 say hi in a  C++ statement
Hello, 8 and cn24 say hi in a  C++ statement
Hello, 11 and cn24 say hi in a  C++ statement
Hello, 2 and cn24 say hi in a  C++ statement

Submit Jobs Using Hydra

You can also submit MPI jobs through Slurm with the Hydra process manager. However, we RECOMMEND compiling against Slurm's PMI and submitting with srun as described above, because launching through the Hydra process manager is much slower than using PMI.

To compile your code using MPICH2, you need to use Intel ICS (for now, at least):

module load intelics/2012.0.032 mpi/mpich2/1.4.1p1-ics

To compile the example code, save it as main.cpp and execute the following

mpicxx -lm -O2 -Wall main.cpp -o mpich2test

The following will submit a job running Example code 1 to Slurm using the Hydra process manager. The -prepend-rank option prefixes each line of output with the rank that produced it, -iface ib0 sends MPI traffic over the InfiniBand interface, and -rmk slurm tells Hydra to read the node list from the Slurm allocation:

$ cat slurm.sh
#!/bin/bash
#SBATCH -n 12

mpirun -prepend-rank -iface ib0 -rmk slurm ./mpich2test

$ sbatch slurm.sh

Or run it with interactive I/O:

salloc -n 12 mpirun -prepend-rank -iface ib0 -rmk slurm ./mpich2test
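
To spread the ranks over more than one node under Hydra, you can ask salloc for an explicit node count; a variant of the interactive command above, assuming two nodes are free:

salloc -N 2 -n 12 mpirun -prepend-rank -iface ib0 -rmk slurm ./mpich2test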

Example code 2

This example illustrates message exchange between nodes through MPI: rank 0 sends a roughly 1 MB message to every other rank, and both sides time the transfer.

#include <stdio.h>
#include <unistd.h>
#include "mpi.h"
#define MAX 1024000

int main (int argc, char *argv[]) {
  char message[MAX] = "Hellooo";
  char hostname[129];
  int rank, size, i, tag, count, mlen;
  double t0, time, ticks;
  MPI_Status status;

  /* Pad the message out to MAX bytes so a large buffer is sent. */
  for (count = 7; count < MAX; count++) {
     message[count] = 'c';
  }
  message[MAX-1] = '\0';

  MPI_Init (&argc, &argv);                 /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);   /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);   /* get number of processes */

  tag = 100;
  t0 = 0.0;
  time = 0.0;
  ticks = MPI_Wtick();                     /* clock resolution in seconds per tick */
  mlen = sizeof(message);

  printf("Start---> rank: %d size: %d Clk Resolution is %f sec/tick\n", rank, size, ticks);

  if (gethostname(hostname, sizeof(hostname)) < 0) {
    printf ("gethostname failed \n");
  } else {
    printf("running on %s\n", hostname);
  }

  if (rank == 0) {
    /* Rank 0 sends the large message to every other rank and times each send. */
    for (i = 1; i < size; i++) {
        t0 = MPI_Wtime();
        MPI_Send (message, mlen, MPI_CHAR, i, tag, MPI_COMM_WORLD);
        time = MPI_Wtime() - t0;
        printf ("End--> Snd hostname: %s Length: %d bytes  Snd Time: %.4f sec. i: %d \n", hostname, mlen, time, i);
    }
  } else {
    /* All other ranks receive the message from rank 0 and time the receive. */
    printf ("Rcv  rank: %d \n", rank);
    t0 = MPI_Wtime();
    MPI_Recv (message, mlen, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
    time = MPI_Wtime() - t0;
    printf("End--> Rcv hostname: %s Length: %d bytes  Rcv Time: %.4f sec.  node: %d   %.12s\n", hostname, mlen, time, rank, message);
  }

  MPI_Finalize ();
  return(0);
}

Save it as big_message_mpi.c and compile it with

mpicxx -L/gpfs/gpfs1/slurm/lib -lpmi -O2 -Wall big_message_mpi.c -o mpich_big_msg -lm

You can reuse the batch script from Example 1 to submit it to Slurm. Because this example exchanges fairly large messages between ranks, please don't run it at scale; change the script from Example 1 so that it requests only a handful of tasks (say 2):

#!/bin/bash
#SBATCH -n 2 

srun ./mpich_big_msg
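
Note that with -n 2 alone both ranks may land on the same node, as in the sample output below where both report cn21. A variant that forces the message across the interconnect, assuming the usual Slurm task-distribution options are available:

#!/bin/bash
#SBATCH -n 2
#SBATCH --ntasks-per-node=1

srun ./mpich_big_msg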

The output of the example should look something like this

Start---> rank: 0 size: 2 Clk Resolution is 0.000001 sec/tick
running on cn21
End--> Snd hostname: cn21 Length: 1024000 bytes  Snd Time: 0.0004 sec. i: 1
Start---> rank: 1 size: 2 Clk Resolution is 0.000001 sec/tick
running on cn21
Rcv  rank: 1
End--> Rcv hostname: cn21 Length: 1024000 bytes  Rcv Time: 0.0004 sec.  node: 1   Helloooccccc