At Aptiv, we build the software backbone for the next generation of vehicles. These are systems where milliseconds matter and downtime isn't just a metric - it's a safety concern. Our technology reaches millions of vehicles on roads worldwide, and the platforms behind it need to be just as resilient as the products they power.
Your Role
We're looking for a Site Reliability Engineer who goes beyond reacting to incidents - someone focused on preventing them in the first place. You should care deeply about observability, not just as dashboards on a wall, but as a way to truly understand how our platform behaves. If you want to work somewhere that takes reliability seriously, keep reading.
You'll join Aptiv's Global DevOps & Platform Engineering team - a cross-functional group responsible for the infrastructure and workflows that power automotive software delivery at scale.
In your daily work you will:
- Build and improve the observability platform - metrics, logs, traces - so teams have clear, actionable visibility into how their services behave
- Define and track SLOs, SLIs, and SLAs that translate business expectations into concrete engineering targets
- Lead incident response when things go wrong - run blameless post-mortems, find root causes and make sure the same issue doesn't happen again
- Plan capacity ahead of demand - study traffic patterns, forecast growth, and scale infrastructure before it becomes urgent
- Automate what you can - if you're doing it twice, script it; if you're scripting it often, make it a self-service tool
- Mentor other engineers through architecture reviews, knowledge sharing, and helping build a culture of continuous improvement
- Work with development, security, and product teams to keep reliability front and center throughout the software lifecycle
- Partner with DevSecOps to harden infrastructure, manage secrets properly and enforce least-privilege access
Your Background
- Deep, hands-on experience running K8s in production - cluster lifecycle, networking (CNI, service mesh), RBAC, resource management, HPA/VPA and debugging pod failures at scale
- Solid skills with AWS (EKS, EC2, IAM, VPC, S3, CloudWatch, Route 53) and practical cost optimization
- Experience building and running monitoring platforms with Prometheus, Grafana, Datadog, ELK/OpenSearch, and ideally OpenTelemetry for distributed tracing
- Real-world Terraform experience (state management, modules, workspaces)
- Track record of leading incident response, running blameless post-mortems and driving measurable reliability gains
- Strong Python and Bash skills with a habit of automating operational workflows end-to-end
- Good grasp of TCP/IP, DNS, load balancing, TLS, firewall rules and zero-trust principles
Why join us?
- You can grow at Aptiv. Aptiv provides an inclusive work environment where all individuals can grow and develop, regardless of gender, ethnicity or beliefs.
- You can have an impact. Safety is a core Aptiv value; we want a safer world for us and our children, one with: Zero fatalities, Zero injuries, Zero accidents.
- You have support. We ensure you have the resources and support you need to take care of your family and your physical and mental health with a competitive health insurance package.
Your Benefits at Aptiv:
- Private health care (Signal Iduna) and life insurance for you and your loved ones
- Well-Being Program that includes regular webinars, workshops, and networking events
- Access to sports groups and Multisport card
- Hybrid work (min. 47 days/yr of remote work, flexible working hours)
- Employee Pension Plan paid by the employer (an additional 3.5% on top of each gross salary)
Apply today, and together let’s change tomorrow!
Privacy Notice - Active Candidates: https://www.aptiv.com/privacy-notice-active-candidates
Aptiv is an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, gender identity, sexual orientation, disability status, protected veteran status or any other characteristic protected by law.