AVID

Site Reliability Engineer

Philippines, Full-time

It's fun to work in a company where people truly BELIEVE in what they're doing!

We're committed to bringing passion and customer focus to the business.

ABOUT AVID

Avid makes technology and collaborative tools so creators can entertain, inform, educate and enlighten the world. Our customers are the visionaries behind the most inspiring feature films, television programs, news broadcasts, televised sporting events, music recording and live concerts. To learn how Avid powers greater creators or for more information, visit www.avid.com.

JOB SUMMARY:

Come and join us at Avid as a Site Reliability Engineer (Remote, Philippines), where you will play a key role in ensuring the reliability, performance, and scalability of our cloud infrastructure and production systems. You’ll work closely with cross-functional engineering teams to design resilient architectures, automate deployments, and deliver a highly available platform.

WHAT YOU WILL DO:

  • Champion and continuously improve platform reliability, observability, and DevOps culture across the engineering organization.

  • Define and track SLAs, SLOs, and SLIs to drive reliability goals and monitor service health across the platform.

  • Design, implement, and tune application and component monitoring, alerting, and dashboards using Prometheus, Grafana, CloudWatch, and Elastic (or similar tools).

  • As part of site reliability work, you will also be responsible for improving and hardening core systems together with the larger cloud engineering team. This may include the following:

  • Design, operate, and optimize Kubernetes workloads on Amazon EKS, managing containerized applications across multiple environments.

  • Implement and maintain Istio service mesh for secure, resilient, and observable service-to-service communication within Kubernetes.

  • Build and manage GitOps pipelines using ArgoCD, ensuring Kubernetes manifests and Helm charts are deployed and audited correctly.

  • Automate CI/CD workflows with GitHub Actions, enabling fast and safe software delivery.

  • Automate infrastructure provisioning with Terraform, enabling consistent, repeatable, and auditable AWS deployments.

  • Participate in a 24/7 on-call rotation, handle incident response, perform postmortems, and maintain up-to-date runbooks.

  • Secure applications and infrastructure with tools like Snyk and follow security best practices.

  • Manage edge and DNS configurations with Cloudflare and Amazon Route 53, ensuring performance and global availability.

  • Operate and tune AWS services such as RDS, OpenSearch, and IAM, supporting data and identity needs.

WHAT YOU CAN DELIVER:

Minimum Requirements:

  • Bachelor’s degree in Information Technology, Computer Science, Software Engineering, or a related field.

  • 5+ years of experience in Site Reliability Engineering, DevOps, or an equivalent role.

  • Strong proficiency with Kubernetes (preferably Amazon EKS) and containerized application deployments.

  • Proficiency with observability stacks (Prometheus, Grafana, ELK) and alerting best practices.

  • Strong scripting skills (Bash, Python, or similar) for automation and tooling.

Preferred Skills, Experience, Capabilities:

  • Expertise with infrastructure-as-code tools such as Terraform, and GitOps workflows using ArgoCD.

  • Hands-on experience with AWS services including RDS, IAM, OpenSearch, CloudWatch, and Route 53.

  • Experience with CI/CD automation using GitHub Actions or similar tools.

  • Knowledge of service mesh technologies such as Istio.

In addition to the minimum requirements and preferred qualifications above, the successful candidate should have the following behavioral traits and technical skills:

  • Understanding of security best practices in cloud environments, including vulnerability scanning and remediation.

  • Excellent troubleshooting skills, especially in distributed, cloud-native systems.

  • Strong communication skills and ability to work cross-functionally in a collaborative environment.

WHAT TO LOOK FORWARD TO:

  • Join a global team and experience a dynamic, collaborative work environment that fosters innovation and growth

  • Remote work model offering flexibility to balance work and life

  • Access to development programs with strong support and mentoring to help you grow and advance within the company

  • Attractive benefits package including health & life insurance, referral rewards, and generous leave policies to ensure a healthy work-life balance

OUTSIDE OF SCOPE:

  • Direct software feature development unrelated to infrastructure or reliability.

  • Desktop IT support, end-user troubleshooting, or helpdesk functions.

  • On-premises server hardware maintenance.

  • Manual infrastructure deployments without automation or version control.

  • Marketing, sales, or customer account management duties.

Avid is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!