Reports to: System Architect
DUG runs custom monitoring and cluster provisioning systems. The Monitoring and Automation Engineer will own these systems, along with other existing and future automated systems. The role requires some knowledge of software development and Linux administration in order to implement or recommend changes and to advise on new systems.
- Maintaining and extending the monitoring system
- Alert notifications
- Basic IT automation
- Purchasing solutions where possible; where not, recommending what needs to be built
- Working with storage, network and Linux experts to enhance monitoring
- Experienced software engineer
- Strong Linux administration background
- Experience with a number of monitoring systems
- Experience with Lustre monitoring
- Experience with ZFS monitoring
- Experience with monitoring Mellanox switches