Hiring.fm
Site Reliability Engineer
Clearwater Analytics
Office - Noida
Full time
Key Responsibilities
Build and maintain observability stacks using Prometheus and Grafana; define SLOs, SLIs, SLAs and error budgets.
Own incident response: on-call rotation, triage, mitigation, and blameless post-mortems.
Automate repetitive operational tasks and eliminate toil through scripting and tooling (Python, Bash, Go).
Design, deploy, and maintain highly available infrastructure on AWS using Terraform and Ansible for infrastructure-as-code workflows.
Manage and optimize Kubernetes clusters (EKS) and containerized workloads with Docker to support microservices architecture.
Collaborate with engineering teams during design reviews to embed reliability and scalability requirements.
Monitor capacity and performance trends; proactively identify and resolve bottlenecks.
Maintain and improve CI/CD pipelines and deployment automation.
Required Qualifications
2–8 years of experience in Site Reliability Engineering, DevOps, or a closely related discipline.
Working knowledge of monitoring and logging tools such as Prometheus, Grafana, Dynatrace, Datadog, OpenSearch, and VictoriaMetrics.
Experience tracking and monitoring SLAs for all critical services.
Experience with Linux systems administration.
Hands-on experience with Kubernetes and Docker in production environments.
Proficiency with AWS services (EC2, EKS, RDS, S3, VPC, IAM, CloudWatch).
Experience with Infrastructure-as-Code tools such as Terraform or Ansible.
Strong scripting skills in Python or Bash.
Familiarity with CI/CD tools (e.g., GitHub Actions, Jenkins, GitLab CI).
Familiarity with GitOps workflows (e.g., ArgoCD, Rancher).
Preferred
Experience in financial services, FinTech, or other regulated industries.
Knowledge of service mesh technologies (Istio, Linkerd).
Familiarity with distributed tracing tools (Jaeger, OpenTelemetry).
AWS certifications (Solutions Architect, DevOps Engineer, or equivalent).
Experience with cost optimization strategies in cloud environments.