
Submit jobs

To execute workloads in the cluster, users must submit their jobs through the workload scheduler. The number of jobs a user can submit is based on their assigned quota. This quota can be found using the myinfo command.
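To check your quota:

myinfo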

Submitting Jobs

The sbatch command will read a shell script (.sh file) and execute the job when resources are available in the cluster. The workload scheduler can be configured to inform the user when the submitted job is being executed and when the job ends or fails.

Tip

Files written to /tmp/ may be inaccessible after the job has been completed.

As the submitted job may be distributed to other nodes for processing, do not use the /tmp/ directory to store your temporary files or output. Instead, use the assigned scratch directory or your home directory to preserve any logs or output.
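As a rough sketch, the snippet below (which could go inside your job script) writes output to a job-specific folder instead of /tmp/. The scratch path and the program name are placeholders; replace them with the scratch directory assigned to your account and your actual workload.

# Placeholder scratch path; substitute the scratch directory assigned to you
SCRATCH_DIR="/path/to/scratch/$USER/$SLURM_JOB_ID"
mkdir -p "$SCRATCH_DIR"
# my_program is a hypothetical workload; redirect its logs to scratch, not /tmp/
./my_program > "$SCRATCH_DIR/output.log" 2>&1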

A sample job submission script is located here (open it with your favourite text editor to view) —

Download shell script template

These scripts must be granted executable permission. This can be done with the following command:

chmod +x <file path>/shellscript.sh

Commands in sbatch

The parameters listed below allow users to adjust the resources assigned to their job. If no values are specified, the cluster will assign default values to the job.

Important

By default, no GPUs will be assigned to the job.

All parameters in the sbatch file start with the #SBATCH prefix:

#!/bin/bash
#SBATCH --job-name=finalSubmissionFinal
#SBATCH --partition=project
| Command parameters | Description | Argument format |
| --- | --- | --- |
| --job-name | Name of the job submitted, without spaces | <jobid>.log |
| --partition | Name of the partition assigned | project |
| --mail-type | When the cluster should send an email. Must be specified if --mail-user is set | ALL, BEGIN, END or FAIL (multiple options separated by commas) |
| --mail-user | Email address to send notifications to. Must be specified if --mail-type is set | Email address |
| --time | Maximum time before the cluster terminates the job | HH:MM:SS |
| --nodes | Number of nodes to run the job on | Integer |
| --cpus-per-task | Number of CPUs to use for the workload | Integer |
| --mem | Amount of memory required by the job in gigabytes (GB) | (Integer)GB (no space) |
| --gres | Number of GPU(s) assigned to the job, in the format gpu:quantity. By default, no GPU is issued for jobs. | gpu:(Integer) |
| --output | Path for the job's log files. Recommended to be in your home directory, e.g. --output /common/home/Module/moduleaccount/%u.%j.out | File path |
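Putting these parameters together, a minimal submission script might look like the sketch below. The job name, email address, time limit, resource amounts, GPU count, and workload command are placeholders; adjust them to your own assignment and account (the downloadable template above remains the reference). Omit the --gres line for a CPU-only job.

#!/bin/bash
#SBATCH --job-name=myAnalysis
#SBATCH --partition=project
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=user@example.com
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8GB
#SBATCH --gres=gpu:1
#SBATCH --output=/common/home/Module/moduleaccount/%u.%j.out

# Placeholder workload; replace with your actual commands
srun python my_script.py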

To submit the job to the cluster —

sbatch /path/to/sh/file.sh

Method 2 (via the srun command)

The srun command is similar to sbatch. It enables the user to request resources from the cluster to run an interactive job. When you execute your script through srun, the output is directed to your terminal instead.

This is not recommended because it is an interactive session. The job will be terminated if you disconnect from the session.

Instead, srun should be used within a script that is submitted via sbatch (refer to the template).

The command options are similar to those of sbatch:

srun --partition=normal --nodes=1 --cpus-per-task=30 --mem=2GB /path/to/final_finalProject.py

Job queue commands

View the status of your jobs

myqueue

Cancel your jobs

scancel <JOBID>

Cancel all jobs submitted by your account

scancel --me

View detailed information about a running/pending job

Take note

This command only fetches information on jobs that are currently pending/running OR completed within the past 5 minutes.

scontrol show jobid <job id>

View information about previously completed jobs

Query using job id

sacct --job=<jobid> --format=JobID,User,Jobname,partition,state,time,start,end,elapsed,AllocTRES%45

Query using job name

sacct --name=<jobname> --format=JobID,User,Jobname,partition,state,time,start,end,elapsed,AllocTRES%45

View all jobs for past N days

You may view older jobs by changing the '1 day ago' value in the query below (up to 30 days).

sacct --starttime $(date -d '1 day ago' +"%Y-%m-%d") --format=JobID,User,Jobname,partition,state,time,start,end,elapsed,AllocTRES%45
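For example, to list jobs from the past 7 days, change the offset accordingly:

sacct --starttime $(date -d '7 days ago' +"%Y-%m-%d") --format=JobID,User,Jobname,partition,state,time,start,end,elapsed,AllocTRES%45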