# Job script examples

## Help! I don’t know what OpenMP or MPI means!

OpenMP and MPI are parallelization frameworks: an OpenMP program runs multiple threads within a single node, while an MPI program runs multiple processes that can be spread across one or more nodes. If you want to run many similar jobs that each use one core at a time, scroll down to the job array example below.

## Example for an OpenMP job

```bash
#!/bin/bash -l

#############################
# example for an OpenMP job #
#############################

#SBATCH --job-name=example

# we ask for 1 task with 20 cores
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20

# run for five minutes
#              d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we set OMP_NUM_THREADS to the number of available cores
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# we execute the job and time it
time ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```

Save it to a file (e.g. run.sh) and submit it with:

```
$ sbatch run.sh
```
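Once the job is submitted you can check that it is queued or running. A minimal check, assuming the defaults above (the exact column layout may differ between sites):

```
$ squeue -u $USER
```

Unless you redirect it with --output, Slurm collects the stdout/stderr of the job script itself in a file called slurm-&lt;jobid&gt;.out in the directory you submitted from.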


## Example for an MPI job

```bash
#!/bin/bash -l

##########################
# example for an MPI job #
##########################

#SBATCH --job-name=example

# 80 MPI tasks in total
# Stallo has 16 or 20 cores/node and therefore we take
# a number that is divisible by both
#SBATCH --ntasks=80

# run for five minutes
#              d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we execute the job and time it
time mpirun -np $SLURM_NTASKS ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```

Save it to a file (e.g. run.sh) and submit it with:

```
$ sbatch run.sh
```
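The MPI launcher can differ between installations. On Slurm systems where the MPI library is built with PMI/PMIx support, the same step can often be launched with srun instead of mpirun; this is only a sketch, not specific to this cluster:

```bash
# srun takes the task count from the allocation, so -np is not needed
time srun ./my_binary.x > my_output
```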


## Example for a hybrid MPI OpenMP job

```bash
#!/bin/bash -l

#######################################
# example for a hybrid MPI OpenMP job #
#######################################

#SBATCH --job-name=example

# we ask for 4 MPI tasks with 10 cores each
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=10

# run for five minutes
#              d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we set OMP_NUM_THREADS to the number of CPU cores per MPI task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# we execute the job and time it
time mpirun -np $SLURM_NTASKS ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```


Save it to a file (e.g. run.sh) and submit it with:

```
$ sbatch run.sh
```

If you want to start more than one MPI rank per node you can use --ntasks-per-node in combination with --nodes:

```
#SBATCH --nodes=4 --ntasks-per-node=2 --cpus-per-task=8
```

This will start 2 MPI tasks on each of the 4 nodes, where each task can use up to 8 threads.

## Running many sequential jobs in parallel using job arrays

In this example we wish to run many similar sequential jobs in parallel using job arrays. We take Python as an example but this does not matter for the job arrays:

```python
import time

print('start at ' + time.strftime('%H:%M:%S'))

print('sleep for 10 seconds ...')
time.sleep(10)

print('stop at ' + time.strftime('%H:%M:%S'))
```

Save this to a file called “test.py” and try it out:

```
$ python test.py
start at 15:23:48
sleep for 10 seconds ...
stop at 15:23:58
```


Good. Now we would like to run this script 16 times at the same time. For this we use the following script:

```bash
#!/bin/bash -l

#####################
# job-array example #
#####################

#SBATCH --job-name=example

# 16 jobs will run in this array at the same time
#SBATCH --array=1-16

# run for five minutes
#              d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/job-array-example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

cp ${SLURM_SUBMIT_DIR}/test.py ${SCRATCH_DIRECTORY}

# each job will see a different ${SLURM_ARRAY_TASK_ID}
echo "now processing task id:: " ${SLURM_ARRAY_TASK_ID}
python test.py > output_${SLURM_ARRAY_TASK_ID}.txt

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp output_${SLURM_ARRAY_TASK_ID}.txt ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```


Submit the script and after a short while you should see 16 output files in your submit directory:

```
$ ls -l output*txt
-rw------- 1 user user 60 Oct 14 14:44 output_1.txt
-rw------- 1 user user 60 Oct 14 14:44 output_10.txt
-rw------- 1 user user 60 Oct 14 14:44 output_11.txt
-rw------- 1 user user 60 Oct 14 14:44 output_12.txt
-rw------- 1 user user 60 Oct 14 14:44 output_13.txt
-rw------- 1 user user 60 Oct 14 14:44 output_14.txt
-rw------- 1 user user 60 Oct 14 14:44 output_15.txt
-rw------- 1 user user 60 Oct 14 14:44 output_16.txt
-rw------- 1 user user 60 Oct 14 14:44 output_2.txt
-rw------- 1 user user 60 Oct 14 14:44 output_3.txt
-rw------- 1 user user 60 Oct 14 14:44 output_4.txt
-rw------- 1 user user 60 Oct 14 14:44 output_5.txt
-rw------- 1 user user 60 Oct 14 14:44 output_6.txt
-rw------- 1 user user 60 Oct 14 14:44 output_7.txt
-rw------- 1 user user 60 Oct 14 14:44 output_8.txt
-rw------- 1 user user 60 Oct 14 14:44 output_9.txt
```

Observe that they all started (approximately) at the same time:

```
$ grep start *txt

output_1.txt:start at 14:43:58
output_10.txt:start at 14:44:00
output_11.txt:start at 14:43:59
output_12.txt:start at 14:43:59
output_13.txt:start at 14:44:00
output_14.txt:start at 14:43:59
output_15.txt:start at 14:43:59
output_16.txt:start at 14:43:59
output_2.txt:start at 14:44:00
output_3.txt:start at 14:43:59
output_4.txt:start at 14:43:59
output_5.txt:start at 14:43:58
output_6.txt:start at 14:43:59
output_7.txt:start at 14:43:58
output_8.txt:start at 14:44:00
output_9.txt:start at 14:43:59
```
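Each array task differs only in the value of ${SLURM_ARRAY_TASK_ID}, so it can also be used to give every task its own input or parameters. A minimal sketch, assuming hypothetical input files input_1.dat ... input_16.dat in the submit directory:

```bash
# pick the input file that belongs to this array task
INPUT_FILE=${SLURM_SUBMIT_DIR}/input_${SLURM_ARRAY_TASK_ID}.dat
python test.py ${INPUT_FILE} > output_${SLURM_ARRAY_TASK_ID}.txt
```

(The test.py used above ignores its arguments; the point is only how the task id selects the input.)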


## Packaging smaller parallel jobs into one large parallel job

There are several ways to package smaller parallel jobs into one large parallel job. The preferred way is to use job arrays; see the example above. Here we present a more pedestrian alternative which can give a lot of flexibility.

In this example we imagine that we wish to run 5 MPI jobs at the same time, each using 4 tasks, thus totalling 20 tasks. Once they finish, we wish to do a post-processing step and then run another set of 5 jobs with 4 tasks each:

```bash
#!/bin/bash

#SBATCH --job-name=example

# we need 20 tasks in total (5 runs with 4 tasks each)
#SBATCH --ntasks=20
#SBATCH --time=0-00:05:00
#SBATCH --partition short
#SBATCH --mem-per-cpu=500MB

cd ${SLURM_SUBMIT_DIR}

# first set of parallel runs
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &

wait

# here a post-processing step
# ...

# another set of parallel runs
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &
mpirun -n 4 ./my-binary &

wait

exit 0
```

The wait commands are important here: the run script will only continue once all commands started with & have completed.

## Example on how to allocate entire memory on one node

```bash
#!/bin/bash -l

#####################################################
# example for a job where we consume lots of memory #
#####################################################

#SBATCH --job-name=example

# we ask for 1 node
#SBATCH --nodes=1

# run for five minutes
#              d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# total memory for this job
# this is a hard limit
# note that if you ask for more memory than one CPU's share, your account gets
# charged for the other (idle) CPUs as well
#SBATCH --mem=31000MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we execute the job and time it
time ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```
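To find out how much memory the job actually used, you can query the accounting database after the job has finished. A minimal sketch using standard sacct fields (replace &lt;jobid&gt; with the id reported by sbatch):

```
$ sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed
```

MaxRSS shows the peak resident memory of each job step, which helps when tuning --mem for the next run.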