Job logs
Return values and printed output generated during execution are recorded in that job's log.
Note
By default, log files are saved to the directory from which the sbatch command was executed.
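If a different location or file name is preferred, it can usually be set explicitly at the top of the job script. The directives below are a minimal sketch; the logs/ path and the filename pattern are only examples, and the logs/ directory must already exist:

#SBATCH --job-name=example_job      # hypothetical job name
#SBATCH --output=logs/%x_%j.out     # write the log to logs/<job-name>_<job-id>.out
#SBATCH --error=logs/%x_%j.err      # optional: keep error messages in a separate file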
A log report can be broken down into three main parts:
First part: Setting up
Records the loading of required modules for the job and any environment variables.
Second part: Standard output and error
Records all outputs from the job, such as print statements, return values, and error messages.
Third part: Job report
States the outcome of the job. If the job failed, a reason will be provided (time limit reached, erroneous code, preemption, etc.).
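For reference, a minimal job script that produces a log with these three parts might look like the sketch below; the module name and script name are only placeholders:

#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH --cpus-per-task=1

module load python        # recorded in the first part of the log (setting up)
python my_script.py       # its output and errors form the second part of the log
# the job report is appended as the third part once the job finishes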
Failure from erroneous code
If beautifulsoup (wrong) is imported instead of BeautifulSoup (right), Python will be unable to find the name beautifulsoup in the bs4 module. The job fails and the log will record this issue as shown below.
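For illustration, the difference is a single line of Python; the snippet below is a minimal sketch and assumes the bs4 package is installed:

# from bs4 import beautifulsoup   # wrong: bs4 defines no name "beautifulsoup", so this raises ImportError
from bs4 import BeautifulSoup     # right: the class is named BeautifulSoup

soup = BeautifulSoup("<p>hello</p>", "html.parser")
print(soup.p.text)                # prints "hello"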
Failure from improper parameter values
Improper memory
If too little memory is allocated (1 MB, for example purposes), the job will fail when it runs out of memory, and the issue will be reported in the log as shown below.
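In the job script, the memory request is set with the --mem directive; the 1 MB request described above would look like this (shown only to illustrate the failure, a realistic value should be requested in practice):

#SBATCH --mem=1M    # far too little memory: the job is killed once it exceeds this limit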
Improper time
If too little time is allocated (1 minute, for example purposes), the job will also fail when the time limit is reached, and the issue will be reported in the log as shown below.
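Similarly, the time limit is set with the --time directive; the 1-minute limit described above would look like this:

#SBATCH --time=00:01:00    # far too little time: the job is cancelled when the limit is reached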
Optimization
Tip
If running the same job multiple times, use the CPU and memory efficiency readings to allocate only the resources the job actually needs.
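On Slurm clusters, a report like the one below can usually be generated after the job finishes with the seff utility (assuming it is installed on the cluster), e.g.:

seff 00001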
Job ID: 00001
Cluster: crimson
User/Group: example.user.2023/example.user.2023
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 4
CPU Utilized: 03:28:12
CPU Efficiency: 92.78% of 03:44:24 core-walltime
Job Wall-clock time: 00:56:06
Memory Utilized: 3.41 GB
Memory Efficiency: 10.65% of 32.00 GB
Using the above log report to optimize the job script (as sketched below):
- Use the CPU core-walltime to determine that only about 4-4.5 hours of time are needed
- Use the memory utilized to determine that only about 4-4.5 GB of memory are needed
For both CPU and memory, achieving an efficiency between 80% and 90% would be ideal.
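Applied to the job script, the optimized request might look roughly like the sketch below; the exact values are a judgment call based on the report above:

#SBATCH --time=04:30:00      # about 4-4.5 hours, based on the 03:44:24 CPU core-walltime
#SBATCH --mem=4500M          # about 4-4.5 GB, based on the 3.41 GB of memory utilized
#SBATCH --cpus-per-task=4    # unchanged: CPU efficiency was already about 93%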
Why optimize?
It is good practice to request only the resources you need, leaving the rest available to other users
Resource-heavy jobs may wait longer in the queue