Starter jobs
This section provides a sample job to give you hands-on experience with submitting jobs to the cluster.
Tensor addition sample job
Overview
This job illustrates the difference in computational performance between a GPU and a CPU.
It compares the time a GPU and a CPU each take to complete 1 million tensor additions on a 2D matrix.
Download the zip file for this sample here.
Understand the zip file contents
Unzip the downloaded file and open it with a code editor to view the contents.
Add2D.py contains the job code to be run on the cluster. Use the code comments to understand the job better.
sbatchAdd2D.sh contains the shell script for submitting this job to the cluster.
Edit sbatch script
For this sample job, most of the configuration (node, memory, time, etc.) is already set. Refer to the Job submission guide (pip3) and the Shell script generator to find out more about script configuration.
Replace the values in lines 29 - 32 of the script (shown below) with the assigned values:
Use the myinfo command to determine the assigned partition, account, and qos.
Refer to the instructions on Getting your account quota information for more details.
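As a rough illustration, account-specific settings in an sbatch script are normally written as #SBATCH directives. The fragment below is a sketch, not the actual contents of sbatchAdd2D.sh: the values in angle brackets are placeholders, and the provided script may order these lines differently or include other lines. Only the option names (--partition, --account, --qos) are standard Slurm options.

```shell
#!/bin/bash
# Sketch only: not copied from sbatchAdd2D.sh.
# Replace each <placeholder> with the value reported by myinfo.

# Partition (queue) you are allowed to submit to.
#SBATCH --partition=<assigned_partition>
# Account that the job's resource usage is charged to.
#SBATCH --account=<assigned_account>
# Quality of service associated with your account.
#SBATCH --qos=<assigned_qos>
```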
Transfer files onto the cluster
Use SCP to transfer the files onto the cluster.
Open a command prompt and enter the following commands, replacing the parts in <> with the appropriate file path and username:
The second part of each command, <username>@origami.smu.edu.sg:~, is the destination for the file. The ~ denotes the home directory of <username> on origami.smu.edu.sg, and the : separates the host from the path.
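As a minimal sketch of such a transfer, assuming a bash-like shell and that both unzipped files sit in the current local directory, the commands could be built as follows. The your_username value is a placeholder. The echo prefix only prints each command so that it can be checked first; remove echo to perform the transfer.

```shell
# Placeholder value: replace with your own cluster username.
CLUSTER_USER="your_username"
# Destination in user@host:path form; "~" is the remote home directory.
DEST="${CLUSTER_USER}@origami.smu.edu.sg:~"

# Assumes both files are in the current local directory.
for f in Add2D.py sbatchAdd2D.sh; do
    # Dry run: prints the scp command instead of executing it.
    echo scp "$f" "$DEST"
done
```

In Windows Command Prompt, where the loop syntax above does not apply, the printed scp commands can be typed in directly.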
Submit job to cluster
Run the commands below in sequence to submit the job.
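The exact commands depend on the cluster setup. A typical sequence, sketched here under the assumptions that you have already logged in to the cluster over SSH and that the files were copied to your home directory, uses the standard Slurm commands sbatch and squeue. The guard keeps the snippet harmless on a machine without Slurm.

```shell
# Run on the cluster after logging in with SSH.
JOB_SCRIPT="sbatchAdd2D.sh"

if command -v sbatch >/dev/null 2>&1; then
    # Assumes the files were copied to the home directory.
    # Submit the job; Slurm prints the assigned job ID.
    cd ~ && sbatch "$JOB_SCRIPT"
    # List your own queued and running jobs.
    squeue -u "$(whoami)"
else
    echo "sbatch not found: run these commands on the cluster"
fi
```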
The cluster will send an email notification when the job has ended.
Evaluating log
Transfer the job log to your local computer, also using SCP.
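For the download, the remote file becomes the source and a local directory becomes the destination. The sketch below assumes a bash-like shell and that the log was written to your home directory on the cluster. The name job_output.log is hypothetical; use the log file name configured in your sbatch script. As before, echo only prints the command so that it can be checked first.

```shell
# Placeholder value: replace with your own cluster username.
CLUSTER_USER="your_username"
# Hypothetical file name: use the log name set in your sbatch script.
LOG_FILE="job_output.log"
# Source in user@host:path form; the log is assumed to be in the remote home directory.
SRC="${CLUSTER_USER}@origami.smu.edu.sg:~/${LOG_FILE}"

# Dry run: prints the command. Remove "echo" to copy the log
# into the current local directory (".").
echo scp "$SRC" .
```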
Read the log and compare the time taken for tensor addition on the CPU with the time taken on the GPU. A sample log for this job is attached below.
Additionally, refer to Job logs for more insights on job script optimization.