[What the role is]
The Next Generation Programme Office (NGPO) Strategy branch is looking for motivated platform engineers with a collaborative, proactive attitude and a passion for continuous learning.
[What you will be working on]
Key responsibilities include:
Design & Maintain Mission-Critical Infrastructure: Architect, implement, and maintain platform solutions for safety-critical applications, ensuring compliance with industry standards for reliability, availability, and performance in either cloud or on-premises environments.
Platform Management: Work with either cloud-native technologies and services (AWS, Azure) including compute, storage, networking, and managed services, OR on-premises infrastructure including virtualisation platforms, bare metal servers, and traditional networking solutions, adapting quickly to new technologies as needed.
Infrastructure as Code & Automation: Implement infrastructure automation using Infrastructure as Code tools and configuration management systems to ensure consistent, repeatable deployments in your designated environment.
Container Orchestration & Platform Services: Design and manage containerised environments using container orchestration platforms and related ecosystem tools to support application deployment and scaling.
Collaborate with Cross-Functional Teams: Work closely with software developers, architects, and other stakeholders to design, implement, and deploy scalable platform solutions. The role may also include Site Reliability Engineering (SRE) functions.
Monitoring & Observability: Implement comprehensive monitoring, logging, and alerting solutions to ensure platform health, performance, and early issue detection across all environments.
Security & Compliance: Apply security best practices, implement security controls, and ensure compliance with regulatory requirements for aviation systems in your designated platform environment.
Data Platform Operations: Maintain and optimise data pipeline infrastructure, streaming platforms, and analytics workloads as a secondary responsibility.
Deployment & Integration: Perform platform deployment, integration testing, and validation in production environments, ensuring seamless service delivery.
Troubleshooting & Performance Optimisation: Proactively identify and resolve infrastructure, performance, and reliability issues across development and production environments.
Documentation & Knowledge Sharing: Maintain thorough documentation for infrastructure, processes, and operational procedures, ensuring knowledge transfer and compliance requirements are met.
[What we are looking for]
Bachelor's Degree in Computer Science, Information Technology, Engineering or equivalent.
Platform engineering and infrastructure management experience in either cloud or on-premises environments.
Strong knowledge of either cloud-native architectures and microservices patterns OR traditional infrastructure patterns and distributed systems design.
Core technical experience with:
Infrastructure as Code tools (e.g., Terraform, CloudFormation)
Container orchestration platforms (e.g., Kubernetes, Docker Swarm, OpenShift)
Package management and deployment tools (e.g., Helm, Kustomize)
Monitoring and observability stack (e.g., Prometheus/Grafana, ELK stack, Datadog)
For Cloud Platform Focus - experience with:
Major cloud platforms (AWS, Azure)
Cloud-native services and managed solutions
For On-Premises Platform Focus - experience with:
Virtualisation platforms (e.g., VMware vSphere, Hyper-V, KVM)
Configuration management tools (e.g., Ansible, Puppet, Chef)
Bare metal server management and data centre operations
Knowledge of CI/CD pipelines, DevOps practices, and automation tools.
Understanding of cybersecurity concepts, including network security, identity management, and compliance frameworks.
Knowledge of networking concepts including VPCs/VLANs, load balancers, DNS, and service mesh technologies.
Proficiency in scripting and automation languages (e.g., Python, Bash, PowerShell, Go).
Engineer desired skills and experience
Site Reliability Engineering (SRE) background and experience, including familiarity with service level objectives (SLOs), incident management, and production system reliability practices.
Experience with disaster recovery, backup strategies, and business continuity planning.
Experience in high-availability, fault-tolerant system design and implementation.
Experience with Government Commercial Cloud (GCC) environments.
Basic understanding of data platform technologies and streaming systems.
Understanding of cost optimisation strategies and financial management.
Knowledge of compliance frameworks relevant to aviation and critical infrastructure systems.
Your appointment designation will be commensurate with your relevant work experience. Successful candidates will be offered a 3-year contract in the first instance and may be considered for permanent tenure or subsequent contract renewal.