
Researchers


If you require an account for your research work, please reach out to your PI.

Researchers assigned an account on the cluster are provisioned with the following quotas. If more resources are needed, please reach out to the SCIS GPU Cluster Team:

Description              Quota
CPU                      20 Cores
Memory                   30 GB
GPU                      1x RTX 3090/V100/A40
Maximum Job Runtime      Up to 5 days
Home Directory Storage   80 GB
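
As a concrete illustration, a job that asks for the full default quota would carry directives like the following (values taken from the table above; confirm the limits for your account with myinfo):

#SBATCH --cpus-per-task=20    # full CPU quota
#SBATCH --mem=30GB            # full memory quota
#SBATCH --gres=gpu:1          # one GPU (RTX 3090/V100/A40)
#SBATCH --time=5-00:00:00     # maximum job runtime of 5 days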

For more information about your account, execute the myinfo command while logged in to the server.

Queues

The common research queues are made up of both undergraduate nodes and compute nodes contributed by various research teams.

Common Queue

Queue name      Maximum runtime   Remarks
researchlong    5 days            Jobs will be preempted during UG term time
researchshort   2 days            Jobs will be preempted during UG term time
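
To target one of these queues, point the job at the corresponding partition, either inside the submission script:

#SBATCH --partition=researchlong

or, equivalently, on the command line at submission time:

sbatch --partition=researchshort template.sh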

Research Team Queues

The following research queues are prioritised for the research teams that contributed compute nodes to the GPU cluster. If your job is running on one of these compute nodes, it will be preempted when the resources are required by the relevant research team.

Queue name           Hostnames                   GPU
pradeepresearch      lexicon                     L40
akshatresearch       comet                       A5000
antioneresearch      candle                      A5000
losiawlingresearch   albert                      A100
dgxa100              analog                      A100
project              aloha, lava, voodoo, toga   RTX 3090
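
If you want to see the nodes behind a queue and their current state before submitting, the standard Slurm sinfo command accepts a partition name (using the queue names from the table above):

sinfo --partition=project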

Job preemption

Job preemption refers to an event where your job is stopped to free up resources for higher-priority jobs. Whenever there are insufficient resources for undergraduate use, jobs in the researchlong queue will be preempted. Likewise, if your job is running on one of the compute nodes contributed by a research team, it may be preempted whenever that team requires the resources for their own work.

If the #SBATCH --requeue parameter is not present in the job submission script, the job will not be automatically requeued after preemption. Requeued jobs will continue to be executed until the configured time limit is exceeded.
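
Because a requeued job restarts from the top of the batch script, long-running work should save and reload its own checkpoints. The sketch below shows one common pattern, assuming standard Slurm behaviour (--open-mode and --signal are generic Slurm flags, not cluster-specific guarantees, and save_checkpoint is a hypothetical stand-in for your application's checkpoint logic):

#SBATCH --requeue              # allow the scheduler to requeue this job after preemption
#SBATCH --open-mode=append     # append to the same log file on requeue instead of truncating it
#SBATCH --signal=B:USR1@120    # ask Slurm to signal the batch shell ~120s before the job is stopped

# Checkpoint when the early-warning signal arrives.
# save_checkpoint is hypothetical; replace it with your application's own logic.
trap 'echo "Received USR1, checkpointing"; save_checkpoint' USR1

# Run the payload in the background and wait, so the trap can fire while it runs.
srun python /path/to/your/python/script.py &
wait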

Sample submission script

Anaconda Users

If you are using Anaconda as your preferred package manager, please refer to the guide here.
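
As a rough sketch, activating a conda environment in a batch script typically looks like the lines below; the Anaconda3 module name and the myenv environment are assumptions, so check module avail and the linked guide for the cluster's actual setup:

module purge
module load Anaconda3               # assumed module name; confirm with module avail
eval "$(conda shell.bash hook)"     # make conda activate usable in a non-interactive shell
conda activate myenv                # assumed environment name
srun python /path/to/your/python/script.py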

You will need to modify the sample script below to submit a job to the cluster. Execute the myinfo command on the server to find out the values assigned to your account.

Refer to the Submitting Jobs page for more information on how to submit jobs.

#!/bin/bash

#################################################
## TEMPLATE VERSION 1.01                       ##
#################################################
## ALL SBATCH COMMANDS WILL START WITH #SBATCH ##
## DO NOT REMOVE THE # SYMBOL                  ## 
#################################################

#SBATCH --nodes=1                   # How many nodes required? Usually 1
#SBATCH --cpus-per-task=4           # Number of CPUs to request for the job
#SBATCH --mem=8GB                   # How much memory does your job require?
#SBATCH --gres=gpu:1                # Do you require GPUs? If not, delete this line
#SBATCH --time=02-00:00:00          # How long to run the job for? Jobs exceeding this time will be terminated
                                    # Format <DD-HH:MM:SS> eg. 5 days 05-00:00:00
                                    # Format <DD-HH:MM:SS> eg. 24 hours 1-00:00:00 or 24:00:00  
#SBATCH --mail-type=BEGIN,END,FAIL  # When should you receive an email?
#SBATCH --output=%u.%j.out          # Where should the log files go?
                                    # To place them elsewhere, provide an absolute path eg /common/home/module/username/
                                    # If no path is provided, the output file will be placed in your current working directory
#SBATCH --requeue                   # Remove this line if you do not want the workload scheduler to requeue your job after preemption

################################################################
## IF THE DEFAULTS ABOVE ARE OKAY, EDIT ONLY AFTER THIS LINE  ##
################################################################

#SBATCH --partition=researchshort           # The partition you've been assigned
#SBATCH --account=UseMyInfoCommandToCheck   # The account you've been assigned (normally student)
#SBATCH --qos=UseMyInfoCommandToCheck       # What is the QOS assigned to you? Check with myinfo command
#SBATCH --mail-user=email1@scis.smu.edu.sg,email2@scis.smu.edu.sg # Who should receive the email notifications
#SBATCH --job-name=finalSubmissionFinal     # Give the job a name

#################################################
##            END OF SBATCH COMMANDS           ##
#################################################

# Purge the environment, load the modules we require.
# Refer to https://violet.smu.edu.sg/origami/module/ for more information
module purge
module load Python/3.11.7

# Create a virtual environment (only needed the first time; it is safe to rerun)
python3.11 -m venv ~/myenv

# Activate the virtual environment created above
# We're using an absolute path here. You may use a relative path, as long as srun is executed from the same working directory
source ~/myenv/bin/activate

# Find out which GPU you are using
srun whichgpu

# If you require any packages, install them as usual before the srun job submission.
# pip3 install numpy

# Submit your job to the cluster
srun --gres=gpu:1 python /path/to/your/python/script.py

Submitting the job

  1. Add the executable bit to the template file with the following command

    chmod +x <name>.sh

  2. Submit the batch file to the job scheduler

    sbatch template.sh
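
Once submitted, sbatch prints the assigned job ID. The job can then be tracked with the standard Slurm tools (<jobid> is a placeholder):

squeue -u $USER              # list your pending and running jobs
scontrol show job <jobid>    # detailed state of a specific job
scancel <jobid>              # cancel the job if it is no longer needed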