Slurm: Job, Step, Task
In Slurm, job, step, and task describe different layers of work. The terminology can be confusing, so a clean breakdown helps.

🧩 Slurm Concepts: Job vs Task (and Job Step)

1. Job
A job is the top-level unit you submit to Slurm using sbatch, srun, or salloc.
- Represents the entire workload you want Slurm to run.
- Has resource requests: nodes, CPUs, memory, time limit, etc.
- Can contain one or more job steps.
Think of a job as the container.

2. Job Step
A job step is a subdivision of a job, created with srun inside a job allocation.
- Each step can run a different program or phase.
- Steps share the job's allocated resources.
- Steps can run sequentially or in parallel.
Example: preprocessing → simulation → postprocessing.

3. Task
A task is the smallest unit: typically one process (often one MPI rank).
- Tasks are launched by srun (that is, by Slurm) when it starts a job step.
- If you request --ntasks=8, srun launches 8 tasks for the step by default.
- Each task can be allocated one or more CPUs (--cpus-per-task), for example to run several threads.
Thin...
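The job / step / task layering can be sketched as a small batch script. This is a minimal illustration, not a recommended template: the file name pipeline.sbatch and the programs ./preprocess, ./simulate, and ./postprocess are hypothetical placeholders, and the resource numbers are arbitrary. The snippet only writes the script to disk; submitting it requires a cluster with Slurm installed.

```shell
# Write a batch script: the whole file describes ONE job.
cat > pipeline.sbatch <<'EOF'
#!/bin/bash
# Job-level resource requests (the "container").
#SBATCH --job-name=pipeline
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=2
#SBATCH --time=00:30:00

# Each srun below creates one job step inside the job's allocation.
# Step 0: a single task for preprocessing (hypothetical program).
srun --ntasks=1 ./preprocess

# Step 1: 8 tasks, each with 2 CPUs (e.g. 8 MPI ranks with 2 threads each).
srun --ntasks=8 --cpus-per-task=2 ./simulate

# Step 2: a single task for postprocessing (hypothetical program).
srun --ntasks=1 ./postprocess
EOF

# On a Slurm cluster, the job would then be submitted with:
#   sbatch pipeline.sbatch
```

Here the three srun lines run one after another, so the steps execute sequentially. The options are passed to srun explicitly so that each step's task and CPU counts are visible in the script rather than inherited implicitly from the job.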