SLURM Job Array Migration Guide
This guide describes how to bypass the 8 job limit by lightly restructuring your code to effectively use multiple CPUs within a single job. As a side benefit, your code will also become more resilient to failure by gaining the ability to resume where it left off.
If you have used SLURM on other clusters, you may be surprised by the 8 job limit; the limit was put in place to reduce the time between submitting your job and it starting to run. It was added at the request of our users to share the cluster more fairly.
The goal of this guide is to explain several concepts underlying job parallelism, starting with SLURM Job Arrays, taking a detour through shell job control and xargs, and finally describing sophisticated parallelism and job control using GNU Parallel.
Method | Multiple CPUs | Multiple Nodes | Resumable | Max CPUs
--- | --- | --- | --- | ---
Job Arrays | Yes | Yes | Manual | 8 (see note)
Bash Jobs | Yes | No | No | 24
xargs | Yes | No | No | 24
GNU Parallel | Yes | Yes | Yes | 192
MPI | Yes | Yes | Maybe | 192

Note: assumes each job step uses 1 CPU.
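To make the table concrete: the Bash Jobs and xargs rows both amount to launching many independent work units inside a single allocation. Here is a minimal sketch, where the 24-task request and the ./fit_one.sh helper are hypothetical placeholders:

#!/bin/bash
#SBATCH --ntasks 24
# Bash jobs: launch 24 background processes, then wait for all of them to finish.
for i in $(seq 1 24); do ./fit_one.sh "$i" & done; wait
# xargs: work through 100 indices, keeping at most 24 running at once.
seq 1 100 | xargs -n 1 -P 24 ./fit_one.sh

Both approaches are explained in more detail later in the guide.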
Let's get started!
Setup
Let's solve a realistic Bayesian inference task using the Stan language.
The command-line version of Stan cannot be shared as a module and instead needs to be compiled in your home directory, because Stan works by compiling each model program before running it. The setup should take about 10 minutes. Run these commands in your shell:
wget https://github.com/stan-dev/cmdstan/releases/download/v2.23.0/cmdstan-2.23.0.tar.gz
tar -xf cmdstan-2.23.0.tar.gz
cd cmdstan-2.23.0/
module purge
module load gcc/9.2.0
# Note that we unset tbb's RTM_KEY because RTM instructions are only available on Skylake,
# but we want to be able to run Stan on older CPUs.
make -j RTM_KEY= build
Build and run the example model as described by make. Compiling a model only uses one CPU core at a time, so there is no need to use -j here.
# Add LDFLAGS so that we can run the model without loading gcc.
LDFLAGS="-Wl,-rpath,/apps2/gcc/9.2.0/lib64 -Wl,-rpath,stan/lib/stan_math/lib/tbb"
make LDFLAGS="${LDFLAGS}" examples/bernoulli/bernoulli
examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.R
bin/stansummary output.csv
Job Array script
Consider this simple job array script, which we will save as submit.slurm:
#!/bin/bash
#SBATCH --partition debug
#SBATCH --ntasks 1
#SBATCH --array 1-5
# Load only required modules.
module purge
module load \
gcc/9.2.0 \
r/3.6.1
# Fit the model for the parameter index given by the SLURM array task ID.
Rscript model_fit.R ${SLURM_ARRAY_TASK_ID}
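Once this script and the model_fit.R script described below are in place, you submit it once and SLURM expands it into five array tasks, one per parameter index:

sbatch submit.slurm      # submits array tasks 1-5 as a single job
squeue -u "$USER"        # each running task appears as <jobid>_<index>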
The R script is model_fit.R. In the sketch below, the mtcars data and the glm fit are only stand-ins; substitute your own data and fitting routine:
# Read the index for the list of parameters from the command line.
args <- commandArgs(trailingOnly = TRUE)
param_idx <- as.integer(args[[1]])

# Read in the parameters for this index.
params <- read.csv("parameters.csv")
params <- subset(params, params$idx == param_idx)

# Load data from the builtin "datasets" package (mtcars is only a stand-in).
fit_data <- data.frame(y = mtcars$mpg, x_0 = mtcars$wt, x_1 = mtcars$hp)

# Fit the model with this run's parameters and save the result
# (glm is only a stand-in for your real fitting routine).
formula <- y ~ x_0 + x_1
fit <- glm(formula, data = fit_data,
           control = glm.control(maxit = params$niter, epsilon = params$tol))
saveRDS(fit, file = sprintf("fit_%d.rds", param_idx))
This is what our parameters.csv looks like:

idx,niter,tol
1,100,1e-4
2,1000,1e-8
3,3000,1e-8
4,2000,1e-8
5,3000,1e-4
We use R in this example because we can run all our statistical computing without loading any additional libraries, and so it should "just work" without needing any additional setup on your part.