Difference between revisions of "Singularity Guide"

From Storrs HPC Wiki
(Add wrapper script that maps cluster directories)
(Fix typos)

Revision as of 18:02, 12 May 2017

Author Gregory M. Kurtzer and others
Website singularity.lbl.gov
Source GitHub
Category Container, Commandline utility
Help documentation
mailing list
PLoS paper

Singularity allows complex software that would otherwise be difficult or impossible to install to be run on the cluster. Using singularity, you can generate a single "image" file in which you install all your software, then copy that image file to the cluster and run it there. Singularity also allows more direct comparisons of software runs between different clusters, because you can use exactly the same software versions everywhere, even when they are not installed on the clusters themselves.

This guide will help you create your own singularity images.


  • Your computer with macOS, Windows, or a GNU/Linux operating system.
  • VirtualBox to create a virtual machine in which we will install singularity.
  • Vagrant (version 1.9 or later) to greatly simplify administration of our virtual machine.
  • At least 40 GB of disk space.
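
The requirements above could be checked with a short script before starting. This is only a sketch: the helper names `version_ge`, `free_gb` and `check_prereqs` are made up for this example, the version comparison assumes purely numeric components such as 1.9.4, and it assumes that VirtualBox's `VBoxManage` command is on your `PATH`.

```shell
#!/bin/bash
# Sketch only: version_ge, free_gb and check_prereqs are hypothetical helpers.

# Succeed if dotted version $1 is >= version $2 (numeric components only).
version_ge() {
    local IFS=.
    local -a have=($1) want=($2)
    local i h w
    for ((i = 0; i < ${#have[@]} || i < ${#want[@]}; i++)); do
        h=${have[i]:-0}
        w=${want[i]:-0}
        if ((10#$h > 10#$w)); then return 0; fi
        if ((10#$h < 10#$w)); then return 1; fi
    done
    return 0
}

# Print the free space, in whole GiB, of the file system holding directory $1.
free_gb() {
    df -Pk "${1:-.}" | awk 'NR == 2 {print int($4 / 1024 / 1024)}'
}

# Check the tools and the disk space for the current directory.
check_prereqs() {
    local version
    command -v VBoxManage >/dev/null || { echo "VirtualBox not found" >&2; return 1; }
    command -v vagrant >/dev/null || { echo "Vagrant not found" >&2; return 1; }
    version=$(vagrant --version | awk '{print $2}')   # output looks like "Vagrant 1.9.4"
    version_ge "$version" 1.9 || { echo "Vagrant $version is older than 1.9" >&2; return 1; }
    [ "$(free_gb .)" -ge 40 ] || { echo "Less than 40 GB free here" >&2; return 1; }
}
```

Run `check_prereqs` from the directory in which you plan to create the virtual machine; it prints nothing when every check passes.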

Getting started

To create and edit a singularity image, you must have root (administrator) access on a machine. You therefore cannot build singularity containers directly on the cluster; you must install your own copy of singularity on your computer. Singularity uses the Linux kernel to run the container, so you need a GNU/Linux machine, for which we will use a virtual machine. For best results, use a Linux distribution similar to the environment in which the image will finally run. As our cluster uses RHEL 6.7, we will create our singularity image using CentOS which, for our purposes, is functionally identical to RHEL.

If you are using macOS, you can install VirtualBox and Vagrant via Homebrew, as explained on the singularity page [1]; we repeat the commands here for completeness:

# Only run these commands if you are using macOS!

# Install Brew if you do not have it installed already
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

# The next commands will install Vagrant and the necessary bits
brew cask install virtualbox
brew cask install vagrant
brew cask install vagrant-manager

If you are using GNU/Linux, you can install virtualbox and vagrant through your package manager. You may need to log out and log in to your system again for your user account to be recognized as being in the vboxusers group.

Windows users can install virtualbox and vagrant directly from the main websites.

Now that we have the necessary tools, we can create our CentOS virtual machine in which we will install singularity:

# Create a working directory for the Vagrant configuration and
# generate a template Vagrantfile for "centos/7"
mkdir vm-singularity
cd vm-singularity
vagrant init centos/7
vagrant box add --provider virtualbox centos/7  # If this fails, you might need to upgrade vagrant

# Build and start the Vagrant hosted VM
vagrant up --provider virtualbox

# Run the necessary commands within the VM to install Singularity
vagrant ssh -c /bin/sh <<EOF
    sudo yum -y update
    sudo yum -y install @'development tools' emacs-nox
    git clone https://github.com/singularityware/singularity.git
    cd singularity
    ./autogen.sh  # Generate the configure script (needed for a git checkout)
    ./configure --prefix=  # System root to avoid changing sudo secure_path
    sudo make install
EOF

# Singularity is installed in your Vagrant CentOS VM! Now you can
# use Singularity as you would normally by logging into the VM
# directly
vagrant ssh

Creating your singularity image

When you log in with vagrant ssh above, your shell prompt should look similar to:

[vagrant@localhost ~]$

We are inside our CentOS virtual machine.

To create the raw image file use the create command.

sudo singularity create --size 2048 centos7-container.img

We now need to install our software inside the image file by writing a "definition" file. You might be able to use one of the several community-contributed definition files in the main singularity repository under singularity/examples/contrib. The singularity user documentation has more detail, but for now we will look at a small example.

As an example, we will install R with the rstan package. Create the following centos7-container.def file using your favorite command-line text editor:

# Contents of file "centos7-container.def"
BootStrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum

# The default command to run once our image is finished.
%runscript
    Rscript "$@"

# Installing the software in our image.
%post
    yum -y install epel-release
    yum -y install R
    Rscript -e 'if (! require(rstan)) install.packages("rstan", repos="http://cran.rstudio.com/")'
    # Recreate the cluster's mount points so they can be bound at run time.
    rm -rf /work /scratch /gpfs
    mkdir -p /work /scratch /gpfs/{gpfs2,scratchfs1}

# Command run at the end of the bootstrap to check the installation.
%test
    Rscript -e 'library(rstan)'

Now we run the %post software installation section with the singularity bootstrap command:

sudo singularity bootstrap centos7-container.{img,def}

The first time singularity runs the bootstrap process, you will see it installing some extra packages before it does anything in your %post steps. It is setting up a minimal operating system inside the image.

You will notice that the compilation grinds to a halt when R tries to install rstan, because compiling rstan uses a lot of RAM. By default, VirtualBox assigns 512 MB of RAM; let's increase that to 4 GB. Cancel the rstan compilation with Ctrl + C, then exit the VM with Ctrl + D.

# Shut down the VM.
vagrant halt

Uncomment the following in your Vagrantfile, and change the memory from "1024" to "4096":

config.vm.provider :virtualbox do |vb|
#   # Don't boot with headless mode
#   vb.gui = true

  # Use VBoxManage to customize the VM. For example to change memory:
  vb.customize ["modifyvm", :id, "--memory", "4096"]
end

Bring the VM back up with vagrant up, log in with vagrant ssh, and complete the compilation by re-running the bootstrap command above.
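
Before re-running the bootstrap, it may be worth confirming from inside the VM that the new memory setting took effect. The sketch below reads MemTotal from /proc/meminfo; the helper names `mem_total_mb` and `has_memory_mb` are invented for this example. The kernel reserves some memory for itself, so a VM configured with 4096 MB reports somewhat less, which is why the usage note checks against a lower, arbitrarily chosen threshold.

```shell
#!/bin/bash
# Sketch only: mem_total_mb and has_memory_mb are hypothetical helpers.

# Print total RAM in MiB, read from a meminfo-style file
# (default: /proc/meminfo, which exists on GNU/Linux only).
mem_total_mb() {
    awk '/^MemTotal:/ {print int($2 / 1024)}' "${1:-/proc/meminfo}"
}

# Succeed if at least $1 MiB of RAM is visible (optional file in $2).
has_memory_mb() {
    local total
    total=$(mem_total_mb "${2:-/proc/meminfo}")
    [ -n "$total" ] && [ "$total" -ge "$1" ]
}
```

For example, `has_memory_mb 3500 || echo "VM still has too little RAM"` warns when the VM was not restarted with the new setting.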

Finally, you should see the compilation complete:

+ Rscript -e 'library(rstan)'
Loading required package: ggplot2
Loading required package: StanHeaders
rstan (Version 2.15.1, packaged: 2017-04-19 05:03:57 UTC, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
[vagrant@localhost ~]$

Run image on the cluster

Now we can copy our centos7-container.img to the cluster:

# From inside the VM.
scp centos7-container.* abc12345@login.storrs.hpc.uconn.edu:

Then in another terminal, log into your cluster account and you will find the files in your home directory.
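
Because the image is a large file, you may want to confirm that it arrived intact. One way, sketched here, is to compare SHA-256 digests on both sides. The helper names `file_sha256` and `digest_matches` are made up for this example, and the sketch assumes that the GNU coreutils `sha256sum` command is available in the VM and on the cluster.

```shell
#!/bin/bash
# Sketch only: file_sha256 and digest_matches are hypothetical helpers.

# Print only the SHA-256 digest of file $1 (assumes GNU coreutils sha256sum).
file_sha256() {
    sha256sum "$1" | awk '{print $1}'
}

# Succeed if the digest of file $1 equals the expected digest $2.
digest_matches() {
    [ "$(file_sha256 "$1")" = "$2" ]
}
```

Run `file_sha256 centos7-container.img` inside the VM, then on the cluster run `digest_matches centos7-container.img DIGEST && echo OK`, where DIGEST stands for the value printed in the VM.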

After loading the singularity module, you can run commands inside the container and run its tests:

module load singularity
singularity exec centos7-container.img cat /etc/redhat-release
singularity test centos7-container.img

We have set the default action of our container to run Rscript, therefore we can execute the image directly:

./centos7-container.img --help

Now you can submit your job by passing your input file to the image just as you would with Rscript.

However, there are two minor things we still need to correct:

  1. By default the container always starts in the home directory. We need to explicitly change to $PWD.
  2. We need to mount the /scratch directories, etc (/home is automatically mounted).

To do this, we can create a wrapper script called Rscript that does these things:

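#!/bin/bash -x
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" # http://stackoverflow.com/a/246128
# Bind the cluster file systems, then change to the current directory
# inside the container before running Rscript. The arguments are passed
# as positional parameters so that paths containing spaces survive.
singularity \
    exec \
    -B /work:/work \
    -B /scratch:/scratch \
    -B /gpfs/gpfs2:/gpfs/gpfs2 \
    -B /gpfs/scratchfs1:/gpfs/scratchfs1 \
    "${DIR}/centos7-container.img" \
    bash -c 'cd "$1" && shift && Rscript "$@"' bash "$PWD" "$@"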

Add the directory containing the Rscript wrapper to your $PATH, make the file executable, and run it with:

Rscript path/to/your/rstan-script.R
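
The two jobs of the wrapper, binding directories and changing to the working directory, can also be written as small reusable functions. The following is a sketch under stated assumptions, not part of Singularity: the names `build_bind_args`, `run_in_dir` and `container_rscript` are hypothetical, and skipping directories that do not exist on the host is a design choice made for this example. `run_in_dir` shows, without needing a container, how `bash -c` receives the directory and the remaining arguments as positional parameters.

```shell
#!/bin/bash
# Sketch only: all function names here are hypothetical helpers.

# Fill the array BIND_ARGS with "-B dir:dir" for each listed directory
# that exists on this host; missing directories are skipped.
build_bind_args() {
    BIND_ARGS=()
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            BIND_ARGS+=(-B "$dir:$dir")
        fi
    done
}

# Change to directory $1 in a child bash, then run the remaining arguments
# as a command. Arguments containing spaces are passed through unchanged.
run_in_dir() {
    bash -c 'cd "$1" && shift && "$@"' bash "$@"
}

# Run Rscript from image $1 in the current directory with the cluster
# file systems bound. Requires the singularity command.
container_rscript() {
    local image=$1
    shift
    build_bind_args /work /scratch /gpfs/gpfs2 /gpfs/scratchfs1
    singularity exec "${BIND_ARGS[@]}" "$image" \
        bash -c 'cd "$1" && shift && Rscript "$@"' bash "$PWD" "$@"
}
```

For example, `container_rscript ~/centos7-container.img path/to/your/rstan-script.R` would behave like the wrapper script, and `run_in_dir /tmp ls -l` lists /tmp regardless of where it is called from.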