Site Reliability Engineer I
Are you our “TYPE”?

Monotype (Global)
Named "One of the Most Innovative Companies in Design" by Fast Company, Monotype brings brands to life through type and technology that consumers engage with every day.
The company's rich legacy includes a library that can be traced back hundreds of years, featuring famed typefaces like Helvetica, Futura, Times New Roman and more.
Monotype also provides a first-of-its-kind service that makes fonts more accessible for creative professionals to discover, license, and use in our increasingly digital world. We work with the biggest global brands, and with individual creatives, offering a wide set of solutions that make it easier for them to do what they do best: design beautiful brand experiences.

Monotype Solutions India
Monotype Solutions India is a strategic center of excellence for Monotype and is a certified Great Place to Work® three years in a row. The focus of this fast-growing center spans Product Development, Product Management, Experience Design, User Research, Market Intelligence, Research in areas of Artificial Intelligence and Machine Learning, Innovation, Customer Success, Enterprise Business Solutions, and Sales.
Headquartered in the Boston area of the United States and with offices across four continents, Monotype is the world’s leading company in fonts. It’s a trusted partner to the world’s top brands and was named “One of the Most Innovative Companies in Design” by Fast Company.
We are looking for a reliability-focused Site Reliability Engineer I to join our 24x7 production operations team supporting our SaaS and e-commerce platforms. Our environment is primarily AWS (80–90%), with heavy Kubernetes (EKS) usage and a microservices architecture. We are targeting an uptime improvement from 99.9% to 99.95%, and we seek engineers who not only respond to incidents but also improve systems for the long term. This role also provides structured exposure to platform engineering for high-performing candidates.

What you’ll be doing:
•Participate in a 24x7 on-call rotation, handling production incidents
•Perform initial triage, troubleshooting, and escalation to L3 teams when required
•Drive clear communication during high-severity incidents
•Conduct root cause analysis (RCA) and contribute to post-incident corrective actions
•Help reduce MTTR and improve overall service reliability
•Define, tune, and improve monitoring alerts in collaboration with engineering teams
•Reduce alert noise and improve signal-to-noise ratio
•Build and maintain dashboards using tools such as CloudWatch, Datadog, ELK, Prometheus, and Grafana
•Leverage AI-assisted tools for log analysis, alert triage, and incident summarization
•Automate repetitive operational tasks using Bash/Python
•Identify recurring issues and implement long-term preventive fixes
•Contribute to Kubernetes resource optimization (CPU/memory tuning, scaling policies)
•Gradually contribute to infrastructure improvements using Infrastructure-as-Code (Terraform/CloudFormation) under guidance
•Support reliability initiatives aimed at achieving 99.95% uptime
•Gain exposure to multi-cloud environments (AWS primary, Azure/GCP/AliCloud presence)
•Collaborate with platform and engineering teams on resilience improvements
•Opportunity for high performers to contribute to DevOps, security, MLOps, and AIOps initiatives

What we’re looking for:
•2–3 years of experience in production cloud environments (monitoring, troubleshooting, scripting, system engineering)
•Strong understanding of Linux systems and networking fundamentals (Windows Server knowledge is a plus)
•Hands-on experience with AWS services (EC2, VPC, S3, RDS, Route 53, CloudFront, API Gateway, Auto Scaling, etc.)
•Practical experience with Kubernetes (EKS preferred) and containerized workloads
•Experience with monitoring/observability tools (CloudWatch, Datadog, ELK, Prometheus, Grafana)
•Understanding of microservices-based architectures
•Familiarity with SLAs, SLIs, SLOs, and reliability metrics
•Working knowledge of infrastructure security concepts (IAM, least privilege, network security, secrets management)
•Strong ownership mindset during production incidents
•Ability to operate calmly in high-severity situations
•Structured troubleshooting and analytical thinking
•Proactive approach to improving systems rather than only resolving tickets
•Ability to work in a global and distributed environment
•Familiarity with Agile development practices
•Working knowledge of Docker, Redis, RabbitMQ, Jenkins, and GitHub
•Awareness of cloud cost optimization principles
•Experience with Infrastructure-as-Code (Terraform / CloudFormation) or willingness to learn
•Experience tuning Kubernetes resource requests/limits and autoscaling is a plus
•Knowledge of CI/CD strategies (blue-green, canary deployments) is preferred
•Understanding of Node.js, Python, PHP, or Groovy is advantageous

You will have an opportunity to:
•PARTNER with engineering and product teams to provide highly available and reliable systems while building best practices and standards
•BUILD and MAINTAIN high-performance, flexible, and highly scalable web and mobile applications
•PERFORM technical root cause analysis and outline corrective actions for given problems
•PARTICIPATE in a 24x7 rotation for production issues
•PROVIDE reliable solutions to a variety of problems using sound problem-solving techniques
•CONTRIBUTE to business continuity by driving efforts to make systems highly resilient
•GROW into platform engineering areas (DevOps, security, MLOps, AIOps)
•DEVELOP engineering excellence by implementing standard practices

Monotype is an Equal Opportunities Employer. Qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.