Python Guide

From Storrs HPC Wiki
Python
Author: Various
Website: https://www.python.org/
Source: GitHub mirror
Category: General, Commandline utility
Help: documentation


Loading the Python module

Whether you plan to run serial or MPI jobs, you must first load a Python module. List the available versions with:

module avail python

For example, to load Python 3.4.3 you would then run:

module load python/3.4.3

If you need Python 2.7 for compatibility, load the 2.7 version that module avail lists (2.7.6 at the time of writing):

module load python/2.7.6
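
After loading a module, it can be worth confirming which interpreter is actually picked up. The following is a minimal sketch, not part of the cluster setup: the helper name describe_interpreter is hypothetical, and it relies only on the standard sys module, so it should run on both Python 2.7 and Python 3.

```python
import sys


def describe_interpreter():
    """Return a short string naming the running Python version and its path."""
    # sys.version_info holds (major, minor, micro, ...) for this interpreter.
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # sys.executable is the path of the interpreter binary that is running.
    return "Python {} at {}".format(version, sys.executable)


print(describe_interpreter())
```

If the reported version does not match the module you loaded, another Python installation is probably earlier on your PATH.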

Submitting Jobs

Serial

Create a toy Python example script my_program.py:

print("Hello world")

Create the SLURM submission script submit.sh:

#!/bin/bash
#SBATCH -n 1
# Run the script on the single requested task
python my_program.py

MPI

Please read Laurent Duchesne's excellent step-by-step guide for parallelizing your Python code using multiple processors and MPI.

On our cluster, mpi4py has been compiled against OpenMPI 1.10.1, so to run MPI Python programs you must load that module as well:

module load python/3.4.3 mpi/openmpi/1.10.1-gcc

Create the test MPI example file described in Laurent's guide above, using the same name, mpi.py:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

print("I am rank", rank, "of", size)

Create the SLURM submission script submit.sh:

#!/bin/bash
#SBATCH -n 4
# Launch one Python process per requested task
mpirun python mpi.py

You should get output similar to:

I am rank 3 of 4
I am rank 0 of 4
I am rank 1 of 4
I am rank 2 of 4
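
Once each process knows its rank and the total size, a common next step is to divide a list of work items among the ranks. The following is only a sketch under stated assumptions: the helper name items_for_rank is hypothetical, the round-robin split is one of several possible schemes, and rank and size are hard-coded so the example runs without mpi4py. In a real MPI program they would come from comm.Get_rank() and comm.Get_size() as in mpi.py.

```python
def items_for_rank(items, rank, size):
    """Return the items one rank should process, assigned round-robin."""
    # Rank r takes items r, r + size, r + 2 * size, and so on.
    return items[rank::size]


# Stand-in values: with mpi4py these would be comm.Get_rank() and
# comm.Get_size(); here every rank is simulated in a single process.
work = list(range(10))
size = 4
for rank in range(size):
    print("rank", rank, "gets", items_for_rank(work, rank, size))
```

Every item is assigned to exactly one rank, and the per-rank workloads differ by at most one item.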

Craig Finch has a more practical example for high throughput MPI on GitHub.

Installing Python libraries

Global package install

Please submit a ticket listing the packages you would like installed and the Python version, and the administrators will install them for you.

Local package install

You can install Python packages into your home directory using:

pip install --user <name of your package>
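
To see where packages installed with --user end up for the interpreter you have loaded, you can ask the standard site module. This is a small sketch, not a cluster-specific tool; the directory it prints depends on your Python version and home directory.

```python
import site

# Directory where "pip install --user" places packages for this interpreter.
user_site = site.getusersitepackages()
print("User site-packages:", user_site)

# True means this directory is searched on import; False or None means
# the interpreter is currently ignoring it (for example inside a virtualenv).
print("User site enabled:", site.ENABLE_USER_SITE)
```

Each Python version has its own user directory, so a package installed under one loaded module is not automatically visible to another.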

Summary

Current Supported Python Versions

2.7.6, 2.7.10, 3.4.3, 3.5.2

Check Installed Packages

To see whether a package is installed, you can list the site-packages directories of every Python version:

# Check which Python versions have numpy installed
ls -d /apps2/python/*/lib/python*/site-packages/numpy

If you already have a Python module loaded, you can also list every installed package and its version with:

pip list
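
The same check can be made from inside Python, which is handy at the top of a job script. The following is a minimal sketch assuming Python 3.4 or later, where importlib.util.find_spec is available; the helper name is_installed is hypothetical, and it is intended for top-level module names such as numpy rather than dotted submodule names.

```python
import importlib.util


def is_installed(name):
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when no module with that name is on sys.path.
    return importlib.util.find_spec(name) is not None


for candidate in ("json", "numpy"):
    print(candidate, "available:", is_installed(candidate))
```

Note that this looks up the import name, which may differ from the name used with pip (for example, scikit-learn is imported as sklearn).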

Latest Installed Packages List

1. TensorFlow: already installed for Python 2.7.6. Currently, the CPU version is available. It has passed our basic tests.

2. scikit-learn (0.17.1)

 Note: you must load the module "intelics/2012.0.032" before using sklearn 0.17.1. Some functions, such as "LinearRegression", depend on libraries such as libmkl_rt.so, which that Intel module provides.