ABOUT THIS POSITION
SRE Principal Engineer
We are seeking a highly skilled SRE Principal Engineer to design, build, scale, and optimize our cloud platform and infrastructure. This role demands deep hands-on experience with AWS cloud services across compute, storage, databases, networking, and security, combined with strong cost optimization skills. The SRE Principal Engineer will help define the cloud roadmap and strategy, design scalable solutions, and ensure the reliability, security, and cost-efficiency of the platform and infrastructure. The role is responsible for the scalability of the platform and infrastructure, ensuring it can support business growth while maintaining high availability and performance. Additional responsibilities include mentoring junior members of the SRE team, reviewing and approving infrastructure code, and participating in key architectural discussions with product engineering and security teams to ensure new and existing services follow best practices and meet operational excellence standards.
If you are an experienced SRE Principal Engineer who is passionate about high availability, security, automation, and cost efficiency, we would love to hear from you.
WHAT YOU'LL DO
Key Responsibilities
Cloud Strategy & Roadmap
- Help define and implement the cloud roadmap and strategy to drive scalability, reliability, security, and cost efficiency.
- Lead and contribute to cloud adoption initiatives, ensuring alignment with business objectives.
- Provide technical leadership and expertise on cloud governance, architectural best practices, and modernization strategies.
Incident Response & Operational Excellence
- Participate in and help refine incident management processes for the SRE team, ensuring minimal downtime and fast recovery.
- Collaborate with Engineering and other teams to define SLOs, SLIs, and error budgets to drive system reliability.
- Participate in post-mortems and root cause analysis to prevent recurring issues.
Engineering Leadership & Code Review
- Review and approve merge and pull requests, ensuring high-quality, scalable, and secure infrastructure code.
- Mentor and upskill the junior members of the SRE team, fostering a culture of continuous learning.
- Participate in architecture discussions with product engineering teams for onboarding new services, ensuring they are scalable, cost-optimized, and aligned with best engineering practices.
- Collaborate with software developers to optimize application performance and cloud-native designs.
Automation & Reliability Engineering
- Develop Infrastructure as Code (IaC) using Terraform, CloudFormation, or AWS CDK for fully automated provisioning and deployment.
- Implement self-healing, fault-tolerant architectures that can automatically recover from failures.
- Optimize infrastructure monitoring and observability using Prometheus, Grafana, Loki, Tempo, Mimir, AWS CloudWatch, AWS CloudTrail, and New Relic.
Security, Compliance, and Best Practices
- Ensure cloud security best practices are embedded into all solutions, including IAM policies, VPC security, encryption, and compliance with industry standards (such as SOC 2, HIPAA).
- Implement least privilege access, network segmentation, and automated security controls across AWS services.
- Collaborate with InfoSec teams to enforce threat detection, logging, and security monitoring using tools such as AWS GuardDuty, Security Hub, CloudTrail, ReliaQuest GreyMatter, and Google Chronicle.
AWS Cost Optimization & FinOps
- Continuously monitor and optimize AWS infrastructure costs using AWS Cost Explorer, Trusted Advisor, and Savings Plans/Reserved Instances.
- Drive FinOps culture, ensuring teams design and deploy cost-efficient cloud solutions.
- Implement auto-scaling, rightsizing strategies, and storage lifecycle policies to reduce costs.
Solution Architecture & Infrastructure Design
- Design and build highly available, scalable, and fault-tolerant AWS architectures using AWS services such as EC2, S3, RDS, DocumentDB, Lambda, EKS, Secrets Manager, SSM, API Gateway, and CloudFront, along with related technologies such as HashiCorp Terraform, Vault, and Consul, and Ansible (AWX).
- Architect and implement resilient storage, compute, and database solutions optimized for performance and cost.
- Help define multi-region disaster recovery (DR) and backup strategies.
- Provide subject matter expertise in the design and implementation of Kubernetes-based infrastructure, including Amazon EKS and containerized workloads.
WHAT YOU'LL NEED
Required Qualifications
- 7+ years of experience in AWS cloud architecture and SRE/DevOps roles.
- Hands-on expertise with AWS services, including EC2, S3, Lambda, EKS, VPC, IAM, Secrets Manager, and SSM, and with technologies such as HashiCorp Vault and Consul.
- Strong knowledge of cost optimization techniques in AWS, including autoscaling, right-sizing, storage lifecycle policies, and Reserved Instances/Savings Plans.
- Deep experience with Infrastructure as Code (IaC) and configuration management using Terraform, CloudFormation, and Ansible.
- Proficiency in Linux administration and in Python or Bash scripting for automation.
- Experience with Kubernetes (EKS), Docker, and container orchestration.
- Strong security and compliance knowledge, including IAM, security groups, encryption, WAF, and logging with CloudTrail.
- Hands-on experience with monitoring and observability tools like Prometheus, Grafana, AWS CloudWatch, Loki, and New Relic.
- Experience reviewing and approving merge and pull requests, ensuring high-quality infrastructure code.
- Strong leadership, mentoring, and communication skills.
Preferred Qualifications
- AWS Certifications (e.g., AWS Certified Solutions Architect - Professional, AWS Certified DevOps Engineer).
- Experience with multi-account AWS organizations and AWS Control Tower.
- Familiarity with service meshes (Istio, Linkerd) and API gateways.
- Experience with Fortinet (FortiGate) firewalls and AWS networking (VPC, Transit Gateway, Direct Connect, etc.).
- Background in database administration (PostgreSQL, MySQL, DocumentDB, or NoSQL databases).
- Experience implementing resilience testing and chaos engineering.
ABOUT WAYSTAR
Through a smart platform and better experience, Waystar helps providers simplify healthcare payments and yield powerful results throughout the complete revenue cycle.
Waystar’s healthcare payments platform combines innovative, cloud-based technology, robust data, and unparalleled client support to streamline workflows and improve financials so providers can focus on what matters most: their patients and communities. Waystar is trusted by 1M+ providers, 1K+ hospitals and health systems, and is connected to over 5K commercial and Medicaid/Medicare payers. We are deeply committed to living out our organizational values: honesty; kindness; passion; curiosity; fanatical focus; best work, always; making it happen; and joyful, optimistic & fun.
Waystar products have won multiple Best in KLAS® or Category Leader awards since 2010 and earned multiple #1 rankings from Black Book™ surveys since 2012. The Waystar platform supports more than 500,000 providers, 1,000 health systems and hospitals, and 5,000 payers and health plans. For more information, visit waystar.com or follow @Waystar on Twitter.
WAYSTAR PERKS
- Competitive total rewards (base salary + bonus, if applicable)
- Customizable benefits package (3 medical plans with Health Savings Account company match)
- We offer generous paid time off for our non-exempt team members, starting with 3 weeks + 13 paid holidays, including 2 personal floating holidays. We also offer flexible time off for our exempt team members + 13 paid holidays.
- Paid parental leave (including maternity + paternity leave)
- Education assistance opportunities and free LinkedIn Learning access
- Free mental health and family planning programs, including adoption assistance and fertility support
- 401(k) program with company match
- Pet insurance
- Employee resource groups
Waystar is proud to be an equal opportunity workplace. We celebrate, value, and support diversity and inclusion. Qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, marital status, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.
This applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.