Reports to: Head of Software
DUG uses HPC job scheduling software to run jobs on its compute clusters, and internal software to integrate with project management and billing systems. The scheduling and accounting engineer is a hybrid IT / software development role, responsible for developing, maintaining, and supporting these systems.
- Maintenance and development of the DUG batch job queueing system (Slurm, with DUG characteristics)
- Working with SchedMD to resolve bugs, push patches upstream when possible, and track their major releases
- Developing Slurm test infrastructure
- Testing and deploying updates in a timely fashion
- Developing tools for collecting, aggregating, QCing, and reporting accurate accounting data from Slurm to the McCloud billing system
- C or C++ software development
- Experience with the basic tools of software development, such as version control (e.g. git) and issue tracking (e.g. Jira)
- Experience with HPC job scheduling software (e.g. Slurm, PBS, SGE)
- Linux system administration skills (e.g. software installation, userspace troubleshooting, system performance analysis tools)
- Slurm experience
- HPC accounting
- Bash scripting