Job logs
Return values and printed output generated during execution are recorded in that job's log.
Note
By default, log files are saved to the directory from which the sbatch command was executed.
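If a different location or file name is preferred, it can usually be set explicitly at the top of the job script. The directives below are a minimal sketch; the logs/ path and the filename pattern are only examples, and the logs/ directory must already exist:

#SBATCH --job-name=example_job      # hypothetical job name
#SBATCH --output=logs/%x_%j.out     # write the log to logs/<job-name>_<job-id>.out
#SBATCH --error=logs/%x_%j.err      # optional: keep error messages in a separate file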
A log report can be broken down into three main parts:
First part: Setting up
Records the loading of required modules for the job and any environment variables.
Second part: Standard output and error
Records all outputs from the job, such as print statements, return values, and error messages.
Third part: Job report
States the outcome of the job. If the job failed, a reason will be provided (time limit reached, erroneous code, preemption, etc.).
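For reference, a minimal job script that produces a log with these three parts might look like the sketch below; the module name and script name are only placeholders:

#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH --cpus-per-task=1

module load python        # recorded in the first part of the log (setting up)
python my_script.py       # its output and errors form the second part of the log
# the job report is appended as the third part once the job finishes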
Failure from erroneous code
If beautifulsoup (wrong) is imported instead of BeautifulSoup (right), Python will be unable to find the name beautifulsoup in the bs4 module. The job fails and the log will record this issue as shown below.
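For illustration, the difference is a single line of Python; the snippet below is a minimal sketch and assumes the bs4 package is installed:

# from bs4 import beautifulsoup   # wrong: bs4 defines no name "beautifulsoup", so this raises ImportError
from bs4 import BeautifulSoup     # right: the class is named BeautifulSoup

soup = BeautifulSoup("<p>hello</p>", "html.parser")
print(soup.p.text)                # prints "hello"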
Failure from improper parameter values
Improper memory
If too little memory is allocated (1 MB, for example purposes), the job will fail when it runs out of memory, and the issue will be reported in the log as shown below.
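In the job script, the memory request is set with the --mem directive; the 1 MB request described above would look like this (shown only to illustrate the failure, a realistic value should be requested in practice):

#SBATCH --mem=1M    # far too little memory: the job is killed once it exceeds this limit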
Improper time
If too little time is allocated (1 minute, for example purposes), the job will also fail when the time limit is reached, and the issue will be reported in the log as shown below.
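Similarly, the time limit is set with the --time directive; the 1-minute limit described above would look like this:

#SBATCH --time=00:01:00    # far too little time: the job is cancelled when the limit is reached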
Optimization
Tip
If running the same job multiple times, use the CPU and memory efficiency readings to allocate only the resources the job actually needs.
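On Slurm clusters, a report like the one below can usually be generated after the job finishes with the seff utility (assuming it is installed on the cluster), e.g.:

seff 00001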
Job ID: 00001
Cluster: crimson
User/Group: example.user.2023/example.user.2023
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 4
CPU Utilized: 03:28:12
CPU Efficiency: 92.78% of 03:44:24 core-walltime
Job Wall-clock time: 00:56:06
Memory Utilized: 3.41 GB
Memory Efficiency: 10.65% of 32.00 GB
Using the above log report to optimize the job script (as sketched below):
- Use the CPU core-walltime to determine that only about 4-4.5 hours of time are needed
- Use the memory utilized to determine that only about 4-4.5 GB of memory are needed
For both CPU and memory, achieving an efficiency between 80% and 90% would be ideal.
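Applied to the job script, the optimized request might look roughly like the sketch below; the exact values are a judgment call based on the report above:

#SBATCH --time=04:30:00      # about 4-4.5 hours, based on the 03:44:24 CPU core-walltime
#SBATCH --mem=4500M          # about 4-4.5 GB, based on the 3.41 GB of memory utilized
#SBATCH --cpus-per-task=4    # unchanged: CPU efficiency was already about 93%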
Why optimize?
It is good practice to request only the resources you need, leaving the rest available to other users
Resource-heavy jobs may wait longer in the queue