Researchers
If you require an account for your research work, please reach out to your PI.
Researchers assigned an account on the cluster are provisioned with the quotas below. If you need more resources, please reach out to the SCIS GPU Cluster Team.
Description | Quota |
---|---|
CPU | 20 Cores |
Memory | 30 GB |
GPU | 1x RTX 3090/V100/A40 |
Maximum Job Runtime | Up to 5 days |
Home Directory Storage | 80 GB |
For more information about your account, run the myinfo command while logged in to the server.
Queues
The common research queues are made up of both undergraduate compute nodes and compute nodes contributed by various research teams.
Common Queue
Queue name | Maximum runtime | Remarks |
---|---|---|
researchlong | 5 days | Jobs will be preempted during UG term time |
researchshort | 2 days | Jobs will be preempted during UG term time |
Research Team Queues
The following research queues are prioritised for the research teams that contributed compute nodes to the GPU cluster. If your job is running on one of these compute nodes, it will be preempted when the relevant research team requires the resources.
Queue name | Hostnames | GPU |
---|---|---|
pradeepresearch | lexicon | L40 |
akshatresearch | comet | A5000 |
antioneresearch | candle | A5000 |
losiawlingresearch | albert | A100 |
dgxa100 | analog | A100 |
project | aloha, lava, voodoo, toga | RTX 3090 |
Job preemption
Job preemption is an event in which your job is stopped to free up resources for higher-priority jobs. Whenever there are insufficient resources for undergraduate use, jobs in the researchlong queue will be preempted. If your job is running on a compute node contributed by a research team, it may also be preempted when that team requires the resources for its own work.
If the #SBATCH --requeue parameter is not present in the job submission script, the job will not be automatically requeued after preemption. Jobs requeued in this way continue to run until the configured time limit is exceeded.
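A requeued job runs its submission script again from the top, so it can help to detect a restart and resume from saved state. The following is a minimal sketch, not a cluster-provided template. It relies on SLURM_RESTART_COUNT, which Slurm sets for jobs that have been requeued or restarted. The checkpoint file name and the commented-out workload line are hypothetical placeholders.

```shell
#!/bin/bash
# Ask Slurm to requeue this job automatically if it is preempted.
#SBATCH --requeue

# SLURM_RESTART_COUNT is set by Slurm once a job has been requeued or
# restarted; default to 0 so the script also works on a first run.
restart_count="${SLURM_RESTART_COUNT:-0}"

# Hypothetical checkpoint file written by your own program.
checkpoint="checkpoint.pt"

if [ "$restart_count" -gt 0 ] && [ -f "$checkpoint" ]; then
    run_mode="resume"
else
    run_mode="fresh"
fi

echo "Restart count: ${restart_count}, mode: ${run_mode}"

# Replace with your actual workload, for example (hypothetical):
# srun python train.py --resume "$checkpoint"
```

Saving a checkpoint at regular intervals keeps the work lost to a preemption small, since the job only repeats what ran after the last save.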
Sample submission script
Anaconda Users
If you use Anaconda as your preferred package manager, please refer to the guide here.
You will need to modify the sample script below to submit a job to the cluster. Run the myinfo command on the server to find the values assigned to your account.
Refer to the Submitting Jobs page for more information on how to submit jobs.
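As an illustration, here is a minimal sketch of a submission script using standard Slurm sbatch directives. It is not the cluster's official template. The queue name and limits are taken from the tables on this page. The job name, resource amounts, log file pattern, and the placeholder workload are assumptions to replace with your own values. Whether your account also needs the account or QOS directives is an assumption to check against the myinfo output.

```shell
#!/bin/bash
# Hypothetical job name shown in the queue listing.
#SBATCH --job-name=example-job
# Common research queue from the table above (2-day limit).
#SBATCH --partition=researchshort
#SBATCH --nodes=1
# Stay within the 20-core quota.
#SBATCH --cpus-per-task=4
# Stay within the 30 GB memory quota.
#SBATCH --mem=16G
# The quota allows one GPU.
#SBATCH --gres=gpu:1
# 1 day (days-hours:minutes:seconds), under the queue limit.
#SBATCH --time=1-00:00:00
# Requeue automatically if the job is preempted.
#SBATCH --requeue
# Log file named <user>.<jobid>.out
#SBATCH --output=%u.%j.out
# Assumption: enable and fill these in only if myinfo lists such values.
##SBATCH --account=<your_account>
##SBATCH --qos=<your_qos>

# SLURM_JOB_ID is set by Slurm inside a running job; fall back to "local"
# so the script can be dry-run outside the scheduler.
job_label="${SLURM_JOB_ID:-local}"
echo "Job ${job_label} started on ${HOSTNAME:-unknown}"

# Replace with your actual workload, for example (hypothetical):
# srun python your_script.py
```

Requesting only what the job needs, rather than the full quota, generally helps it get scheduled sooner.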
Submitting the job
- Add the executable bit to the template file with the following command:
      chmod +x <name>.sh
- Submit the batch file to the job scheduler:
      sbatch template.sh