Advanced Job Management II

16 June 2021

Virtual

Overview

This advanced job management training covers array jobs, job dependencies, checkpoint and restart, file stage-in/out, and troubleshooting job submission issues. It will help users run their jobs more efficiently and make the best use of the hardware.

Speaker: Manjunath Doddam (Altair)

Agenda

1. Introduction
2. Job management and project info in brief
3. Job exit codes
4. MPI jobs in batch mode
a) “mpiprocs” parameter
b) MPI tight integration
5. Multithreaded jobs and OMP_NUM_THREADS in batch mode
6. Details on memory enforcement
7. Job Arrays
8. Job dependencies
9. PBS Reservations
10. Using checkpointing
11. File stage-in/out
12. Using IME
13. Troubleshooting
14. Lab Session
15. Using Compute Manager
16. Using Display Manager

Prerequisites

1. A valid user account on the NSCC system, ASPIRE1
2. An SSH client such as PuTTY or MobaXterm installed on the user’s laptop to connect to ASPIRE1
3. Basic understanding of Linux commands:
a) File management
b) “vi” editor
c) Using “modules” in Linux
d) Process management
4. Basic knowledge of PBS Pro job management

By the end of this course, participants will have a fair understanding of advanced job management topics such as array jobs, reservations, job dependencies, file stage-in/out, IME, Compute Manager and Display Manager.