The Hartford

IND Senior Staff Reliability Engineer

India GCC-Puppalaguda Village Full time
IND Senior Staff Reliability Engineer - GCC019

We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.

Position Overview 

We are seeking an experienced and highly motivated Sr Staff Reliability Engineer. The Sr Staff Reliability Engineer will have end-to-end accountability for the reliability of IT services within a defined application portfolio. The role requires a “build-to-manage”, problem-solving, and innovative mindset, applied to the design, build, test, deployment, change, and maintenance of services and drawing on deep engineering expertise.

Key measures of success will include service stability, effective delivery and environmental instrumentation, deployment quality, technical debt reduction, asset resiliency, risk/security compliance, cost efficiency, proactive and preventative maintenance mechanisms, and top-quartile operating norms.

The Sr Staff Reliability Engineer will actively contribute to the sustained advancement of the Reliability Engineering (RE) practice within and beyond a given area of responsibility.

Key Responsibilities 

  • Guide the use of best-in-class software engineering standards and design practices for instrumenting code/application technology stack to enable the generation of relevant metrics on overall technology health - availability, performance, quality, currency and resiliency.​ 

  • Serve as key liaison between the architecture and software engineering teams to influence the technical strategy for the organization, keeping in mind its cross-functional impacts, integration across the organization, and architecture rationalization.​ 

  • Function as the go-to technical leader for the applications supported, requiring depth and breadth of knowledge in technologies, applications, integration, interfaces and business domain.​ 

  • DevSecOps Solution Responsibilities:​ 

  • Design, build, and maintain scalable and reliable systems for production environments. 

  • Automate infrastructure provisioning, CI/CD pipelines, and incident response processes.

  • Identify and mitigate risks to system reliability, security, and performance. 

  • Develop effective tooling, alerts, and response mechanisms to identify and address reliability risks leveraging automation to support problem prevention, detection, mitigation, and resolution.​ 

  • Enhance the delivery flow by engineering the appropriate solutions to increase delivery speed while adhering to technology standards for sustained reliability.​ 

  • Progressively implement preventative controls and drive increased automation and self-healing capabilities. Continue to improve cost efficiency baselines.

  • Promote and implement innovative solutions.​ 

  • IT Ops Responsibilities:​ 

  • Ensure operational excellence. Independently drive the triaging and service restoration of all high-impact incidents to minimize the mean time to service restoration and the impact to the business. Demonstrate end-to-end ownership.

  • Partner with infrastructure teams to design and implement intelligent incident routing, enhanced monitoring/alerting capabilities, and automated service restoration processes. Take proactive measures to prevent high-impact incidents.

  • Achieve and maintain the continuity of Hartford and third-party assets that support a business function. Remain accountable for keeping the IT application and infrastructure metadata repositories current.

Required Skills & Experience 

  • System Thinking end-to-end - Broad understanding of enterprise architectures and complex (backend) systems (understand more than the component itself)​ 

  • Highly collaborative: partners effectively with peers and stakeholders, with a passion for delighting customers.

  • Expert-level experience with performance and observability tools such as Dynatrace, Splunk, TrueSight, CloudWatch, CloudTrail, and related tools.

  • Strong solution architecture orientation to enable expedient troubleshooting, issue-resolution and root-cause removal in a hybrid cloud environment.​ 

  • Experience with continuous integration and DevOps methodologies, preferably with tools such as GitHub, Jenkins, Nexus, Rally, and SonarQube.

  • Experience with cloud platforms (AWS, GCP, or Azure).

  • Deep understanding of Linux systems, containers (Docker), and orchestration tools (Kubernetes).

  • Expertise with Infrastructure as Code (Terraform, CloudFormation). 

  • Knowledge of complex traditional and modern enterprise architectures and systems (understand more than the component itself).​ 

  • Strong hybrid cloud experience (private and public) across various service delivery models – IaaS, PaaS, SaaS.​ 

  • Strong communication (verbal and written), collaboration, and negotiation skills, with experience working in a diverse team across business units.

 

Preferred Qualifications 

  • Understanding of FinOps or cost-optimization practices in the cloud.

  • Experience with API gateways and network-level observability.

  • Experience in regulated environments (e.g., insurance).

  • AWS Solutions Architect certification 

  • Keeps abreast of new market technologies and is adept at learning and adopting new models. Promotes and applies continuous learning.

What We Offer 

  • Opportunity to work on cutting-edge automation technologies including GenAI in testing.  

  • Collaborative and innovative work culture.  

  • Competitive compensation and benefits.  

  • Continuous learning and growth opportunities.  
