Researchers

Modules

If you require an account for your research work, please reach out to your PI.

If you need more resources, please reach out to the SCIS GPU Cluster Team. Researchers assigned an account on the cluster are provisioned with the following quotas:

Description	Quota
CPU	20 Cores
Memory	30 GB
GPU	1x RTX 3090/V100/A40
Maximum Job Runtime	Up to 5 days
Home Directory Storage	80 GB

For more information about your account, run the myinfo command while logged in to the server.

Queues

The common research queues are made up of both undergraduate compute nodes and compute nodes contributed by various research teams.

Common Queue

Queue name	Maximum job runtime	Remarks
researchlong	5 days	Jobs will be preempted during UG term time
researchshort	2 days	Jobs will be preempted during UG term time
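
As an illustration of the table above, the sketch below picks the shortest common queue whose limit covers a job's expected runtime. pick_queue is a hypothetical helper name, and the limits are copied from the table, so check them against the current cluster configuration before relying on them.

```shell
#!/bin/bash
# Print the common research queue suited to a runtime given in whole days.
# Limits are taken from the table above: researchshort 2 days, researchlong 5 days.
pick_queue() {
    local days="$1"
    if [ "$days" -le 2 ]; then
        echo "researchshort"
    elif [ "$days" -le 5 ]; then
        echo "researchlong"
    else
        # No common queue allows more than 5 days
        echo "runtime of ${days} days exceeds the 5-day limit" >&2
        return 1
    fi
}
```

The printed name can then be used as the value of #SBATCH --partition in a submission script.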

Research Team Queues

The following research queues are prioritised for the research teams that contributed compute nodes to the GPU cluster. If your job is running on one of these compute nodes, it will be preempted when the resources are required by the relevant research team.

Queue name	Hostnames	GPU
pradeepresearch	lexicon	L40
akshatresearch	comet	A5000
antioneresearch	candle	A5000
losiawlingresearch	albert	A100
dgxa100	analog	A100
project	aloha,lava,voodoo,toga	RTX 3090

Job preemption

Job preemption refers to an event where your job is stopped to free up resources for higher-priority jobs. Whenever there are insufficient resources for undergraduate use, jobs in the researchlong queue will be preempted. Likewise, if your job is running on a compute node contributed by a research team, it may be preempted when that team requires the resources for its own work.

A preempted job is automatically requeued only if the #SBATCH --requeue parameter is present in the job submission script. Jobs requeued in this fashion will continue to be executed until the configured time limit is exceeded.
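
Because a requeued job runs its batch script again from the top, it can help to save checkpoints and resume from the latest one. The sketch below shows one possible approach, assuming Slurm sets the SLURM_RESTART_COUNT environment variable for requeued jobs. The checkpoint directory, the *.ckpt file naming, and the --resume flag of your program are hypothetical placeholders, not part of the cluster setup.

```shell
#!/bin/bash
# Hypothetical checkpoint directory; adjust to wherever your program saves state.
CKPT_DIR="${CKPT_DIR:-$HOME/checkpoints}"

# Print the most recently modified *.ckpt file in a directory, or nothing if none exist.
# (Assumes checkpoint file names contain no spaces or newlines.)
latest_checkpoint() {
    local dir="$1"
    ls -1t "$dir"/*.ckpt 2>/dev/null | head -n 1
}

# Print extra arguments for your program:
# "--resume <file>" when this run is a requeue and a checkpoint exists, otherwise nothing.
resume_args() {
    local dir="$1"
    local restarts="${SLURM_RESTART_COUNT:-0}"   # unset on the first run
    local ckpt
    ckpt="$(latest_checkpoint "$dir")"
    if [ "$restarts" -gt 0 ] && [ -n "$ckpt" ]; then
        echo "--resume $ckpt"
    fi
}
```

In a submission script, the result could be appended to the srun line, for example: srun python /path/to/your/python/script.py $(resume_args "$CKPT_DIR"). Your program still has to write the checkpoints and understand the resume flag itself.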

Sample submission script

Anaconda Users

If you are using Anaconda as your preferred package manager, please refer to the guide here.

You will need to modify the sample script below before submitting a job to the cluster. Execute the myinfo command on the server to find out the values assigned to your account.

Refer to the Submitting Jobs page for more information on how to submit jobs.

#!/bin/bash

#################################################
## TEMPLATE VERSION 1.01                       ##
#################################################
## ALL SBATCH COMMANDS WILL START WITH #SBATCH ##
## DO NOT REMOVE THE # SYMBOL                  ## 
#################################################

#SBATCH --nodes=1                   # How many nodes required? Usually 1
#SBATCH --cpus-per-task=4           # Number of CPUs to request for the job
#SBATCH --mem=8GB                   # How much memory does your job require?
#SBATCH --gres=gpu:1                # Do you require GPUS? If not delete this line
#SBATCH --time=02-00:00:00          # How long to run the job for? Jobs exceeding this time will be terminated
                                    # Format <DD-HH:MM:SS>, e.g. 5 days is 05-00:00:00
                                    # 24 hours can be written as 1-00:00:00 or 24:00:00
#SBATCH --mail-type=BEGIN,END,FAIL  # When should you receive an email?
#SBATCH --output=%u.%j.out          # Where should the log files go?
                                    # If you specify a directory, it must be an absolute path, e.g. /common/home/module/username/
                                    # If no path is provided, the output file will be placed in your current working directory
#SBATCH --requeue                   # Remove if you do not want the workload scheduler to requeue your job after preemption

################################################################
## EDIT AFTER THIS LINE IF YOU ARE OKAY WITH DEFAULT SETTINGS ##
################################################################

#SBATCH --partition=researchshort           # The partition you've been assigned
#SBATCH --account=UseMyInfoCommandToCheck   # The account you've been assigned (normally student)
#SBATCH --qos=UseMyInfoCommandToCheck       # What is the QOS assigned to you? Check with myinfo command
#SBATCH --mail-user=email1@scis.smu.edu.sg,email2@scis.smu.edu.sg # Who should receive the email notifications
#SBATCH --job-name=finalSubmissionFinal     # Give the job a name

#################################################
##            END OF SBATCH COMMANDS           ##
#################################################

# Purge the environment, load the modules we require.
# Refer to https://violet.smu.edu.sg/origami/module/ for more information
module purge
module load Python/3.11.7

# Create a virtual environment
python3.11 -m venv ~/myenv

# This command assumes that you've already created the environment previously
# We're using an absolute path here. You may use a relative path, as long as srun is executed in the same working directory
source ~/myenv/bin/activate

# Find out which GPU you are using
srun whichgpu

# If you require any packages, install them as usual before the srun job submission.
# pip3 install numpy

# Submit your job to the cluster
srun --gres=gpu:1 python /path/to/your/python/script.py
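
The --time value in the script above uses the days-hours:minutes:seconds form. As a small sketch, the helper below converts a whole number of hours into that form, which can be handy when generating submission scripts. to_slurm_time is a hypothetical name, not a Slurm command.

```shell
#!/bin/bash
# Convert a whole number of hours into the D-HH:MM:SS time format.
# to_slurm_time is a hypothetical helper name, not a Slurm command.
to_slurm_time() {
    local hours="$1"
    local days=$(( hours / 24 ))   # full days
    local rem=$(( hours % 24 ))    # leftover hours
    printf '%d-%02d:00:00\n' "$days" "$rem"
}

to_slurm_time 48    # prints 2-00:00:00 (the researchshort limit)
to_slurm_time 120   # prints 5-00:00:00 (the researchlong limit)
```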

Submitting the job

Add the executable bit to the template file with the following command:

chmod +x template.sh

Submit the batch file to the job scheduler:

sbatch template.sh
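
Once submitted, the job can be tracked with standard Slurm commands. The following is a minimal sketch, assuming the cluster exposes the usual sbatch, squeue and scancel tools on the login node. submit_job is a hypothetical wrapper name.

```shell
#!/bin/bash
# Submit a batch script and print only the job ID.
# --parsable makes sbatch print "jobid" (or "jobid;cluster") instead of a full sentence.
submit_job() {
    local script="$1"
    local out
    out="$(sbatch --parsable "$script")" || return 1
    echo "${out%%;*}"    # strip the optional ";cluster" suffix
}

# Typical usage on the login node (commented out so this file can be sourced safely):
# jobid="$(submit_job template.sh)"
# squeue -j "$jobid"       # show the job's state (PD = pending, R = running)
# scancel "$jobid"         # cancel the job if it is no longer needed
```

Capturing the job ID this way avoids copying it by hand from the sbatch output.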