# Job script examples

## Help! I don’t know what OpenMP or MPI means!

OpenMP and MPI are parallelization frameworks. If you want to run many similar jobs that each use one core at a time, scroll down to job arrays.
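As a rough analogy only (plain Python threads, not actual OpenMP or MPI): shared-memory parallelism splits one computation across several workers inside a single process, whereas a job array simply runs many independent sequential processes side by side.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # each worker handles its own slice of the shared data,
    # loosely analogous to OpenMP threads in one process
    return sum(chunk)

data = list(range(100))
chunks = [data[i::4] for i in range(4)]  # split the work four ways

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, chunks))

print(total)  # 4950, same result as the serial sum
```

A job array, by contrast, would run four completely separate copies of a serial program, each with its own input; no data is shared between them.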

## Example for an OpenMP job

```bash
#!/bin/bash

#############################
# example for an OpenMP job #
#############################

#SBATCH --job-name=example

# we ask for 1 node with 20 cores
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20

# run for five minutes
#   d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we set OMP_NUM_THREADS to the number of available cores
export OMP_NUM_THREADS=${SLURM_TASKS_PER_NODE}

# we execute the job and time it
time ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```

Save it to a file (e.g. run.sh) and submit it with:

```
$ sbatch run.sh
```
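A detail worth knowing about the script above: `SLURM_TASKS_PER_NODE` is a string exported by Slurm, and the memory request is per core. A small sketch of the arithmetic, with the environment variable mimicked since we are not inside a job:

```python
import os

# Outside a job this variable is unset, so we mimic what Slurm would
# export for the request above (--nodes=1 --ntasks-per-node=20).
os.environ.setdefault("SLURM_TASKS_PER_NODE", "20")

nthreads = int(os.environ["SLURM_TASKS_PER_NODE"])
mem_per_cpu_mb = 500  # matches --mem-per-cpu=500MB

print(nthreads)                   # number of OpenMP threads
print(nthreads * mem_per_cpu_mb)  # total memory requested on the node, in MB
```

Note that for multi-node jobs Slurm can report a compact form such as `16(x2)`, so parsing with `int()` only works in the single-node case shown here.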


## Example for an MPI job

```bash
#!/bin/bash

##########################
# example for an MPI job #
##########################

#SBATCH --job-name=example

# 80 MPI tasks in total
# Stallo has 16 or 20 cores/node and therefore we take
# a number that is divisible by both
#SBATCH --ntasks=80

# run for five minutes
#   d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we execute the job and time it
time mpirun ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```
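The comment about 80 tasks can be verified with a few lines of arithmetic: 80 is divisible by both 16 and 20, so the job fills whole nodes on either node type. A quick sanity check (not part of the job script):

```python
ntasks = 80

for cores_per_node in (16, 20):
    # whole nodes only: no cores left idle on either node type
    assert ntasks % cores_per_node == 0
    print(ntasks // cores_per_node)  # nodes needed on this node type
```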

Save it to a file (e.g. run.sh) and submit it with:

```
$ sbatch run.sh
```

## Running many sequential jobs in parallel using job arrays

In this example we wish to run many similar sequential jobs in parallel using job arrays. We take Python as an example but this does not matter for the job arrays:

```python
import time

print('start at ' + time.strftime('%H:%M:%S'))

print('sleep for 10 seconds ...')
time.sleep(10)

print('stop at ' + time.strftime('%H:%M:%S'))
```

Save this to a file called "test.py" and try it out:

```
$ python test.py
start at 15:23:48
sleep for 10 seconds ...
stop at 15:23:58
```


Good. Now we would like to run this script 16 times at the same time. For this we use the following script:

```bash
#!/bin/bash

#####################
# job-array example #
#####################

#SBATCH --job-name=example

# 16 jobs will run in this array at the same time
#SBATCH --array=1-16

# run for five minutes
#   d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# 500MB memory per core
# this is a hard limit
#SBATCH --mem-per-cpu=500MB

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/job-array-example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

cp ${SLURM_SUBMIT_DIR}/test.py ${SCRATCH_DIRECTORY}

# each job will see a different ${SLURM_ARRAY_TASK_ID}
echo "now processing task id: ${SLURM_ARRAY_TASK_ID}"
python test.py > output_${SLURM_ARRAY_TASK_ID}.txt

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp output_${SLURM_ARRAY_TASK_ID}.txt ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```
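In this script every array task runs the same test.py. In a real workload each task usually selects its own input via `SLURM_ARRAY_TASK_ID`. A minimal, hypothetical sketch of that pattern (the variable is mimicked here, and the parameter list is invented for illustration):

```python
import os

# Slurm sets this inside each array task; we mimic task number 3 here.
os.environ.setdefault("SLURM_ARRAY_TASK_ID", "3")

task_id = int(os.environ["SLURM_ARRAY_TASK_ID"])

# hypothetical per-task parameters, one entry per array task 1..16
temperatures = [10 * i for i in range(1, 17)]
my_temperature = temperatures[task_id - 1]

print(my_temperature)  # task 3 gets the third parameter
```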

Submit the script and after a short while you should see 16 output files in your submit directory:

```
$ ls -l output*txt
-rw------- 1 user user 60 Oct 14 14:44 output_1.txt
-rw------- 1 user user 60 Oct 14 14:44 output_10.txt
-rw------- 1 user user 60 Oct 14 14:44 output_11.txt
-rw------- 1 user user 60 Oct 14 14:44 output_12.txt
-rw------- 1 user user 60 Oct 14 14:44 output_13.txt
-rw------- 1 user user 60 Oct 14 14:44 output_14.txt
-rw------- 1 user user 60 Oct 14 14:44 output_15.txt
-rw------- 1 user user 60 Oct 14 14:44 output_16.txt
-rw------- 1 user user 60 Oct 14 14:44 output_2.txt
-rw------- 1 user user 60 Oct 14 14:44 output_3.txt
-rw------- 1 user user 60 Oct 14 14:44 output_4.txt
-rw------- 1 user user 60 Oct 14 14:44 output_5.txt
-rw------- 1 user user 60 Oct 14 14:44 output_6.txt
-rw------- 1 user user 60 Oct 14 14:44 output_7.txt
-rw------- 1 user user 60 Oct 14 14:44 output_8.txt
-rw------- 1 user user 60 Oct 14 14:44 output_9.txt
```

Observe that they all started (approximately) at the same time:

```
$ grep start *txt
output_1.txt:start at 14:43:58
output_10.txt:start at 14:44:00
output_11.txt:start at 14:43:59
output_12.txt:start at 14:43:59
output_13.txt:start at 14:44:00
output_14.txt:start at 14:43:59
output_15.txt:start at 14:43:59
output_16.txt:start at 14:43:59
output_2.txt:start at 14:44:00
output_3.txt:start at 14:43:59
output_4.txt:start at 14:43:59
output_5.txt:start at 14:43:58
output_6.txt:start at 14:43:59
output_7.txt:start at 14:43:58
output_8.txt:start at 14:44:00
output_9.txt:start at 14:43:59
```
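If you want to quantify "approximately at the same time", the timestamps in the grep output can be parsed directly. A small sketch, using a few of the lines above as sample data:

```python
from datetime import datetime

# sample lines taken from the grep output above
lines = [
    "output_1.txt:start at 14:43:58",
    "output_10.txt:start at 14:44:00",
    "output_16.txt:start at 14:43:59",
]

times = [datetime.strptime(line.split("start at ")[1], "%H:%M:%S")
         for line in lines]
spread = (max(times) - min(times)).total_seconds()

print(spread)  # seconds between earliest and latest start
```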


## Example of how to allocate the entire memory on one node

```bash
#!/bin/bash

#####################################################
# example for a job where we consume lots of memory #
#####################################################

#SBATCH --job-name=example

# we ask for 1 node
#SBATCH --nodes=1

# run for five minutes
#   d-hh:mm:ss
#SBATCH --time=0-00:05:00

# short partition should do it
#SBATCH --partition short

# total memory for this job
# this is a hard limit
# note that if you ask for more memory than one CPU's share, your
# account gets charged for the other (idle) CPUs as well
#SBATCH --mem=31000MB

# turn on all mail notification
#SBATCH --mail-type=ALL

# you may not place bash commands before the last SBATCH directive

# define and create a unique scratch directory
SCRATCH_DIRECTORY=/global/work/${USER}/example/${SLURM_JOBID}
mkdir -p ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}

# we copy everything we need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
cp ${SLURM_SUBMIT_DIR}/my_binary.x ${SCRATCH_DIRECTORY}

# we execute the job and time it
time ./my_binary.x > my_output

# after the job is done we copy our output back to $SLURM_SUBMIT_DIR
cp ${SCRATCH_DIRECTORY}/my_output ${SLURM_SUBMIT_DIR}

# we step out of the scratch directory and remove it
cd ${SLURM_SUBMIT_DIR}
rm -rf ${SCRATCH_DIRECTORY}

# happy end
exit 0
```
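Before asking for `--mem=31000MB` it helps to estimate what your data actually needs. As a back-of-the-envelope example (the matrix size is invented for illustration), a dense double-precision n×n matrix takes n·n·8 bytes:

```python
n = 60000                    # hypothetical square matrix dimension
bytes_needed = n * n * 8     # 8 bytes per double-precision element

# roughly 27000 MiB, which would fit inside the request above
print(bytes_needed // 2**20) # size in MiB
```

Doing this estimate first avoids both out-of-memory failures and over-requesting memory your job never touches.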