Deutsche Bank

Senior Site Reliability Engineer - AVP

Bangalore, Velankani Tech Park Full time

Job Description:

Job Title: Senior Site Reliability Engineer

Corporate Title: AVP

Location: Bangalore, India

Role Description

  • We are seeking a Site Reliability Engineer for Observability platforms in the Bank to enhance, scale, and modernise our enterprise observability capability.
  • This role focuses on owning and evolving Observability and Monitoring tools across the Bank, driving a shift towards OpenTelemetry (OTel)-based telemetry standardisation.
  • The successful candidate will contribute to automation, AI adoption, and observability-by-design practices to improve reliability, scalability, and developer experience.

What we’ll offer you

As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:

  • Best-in-class leave policy
  • Gender-neutral parental leave
  • 100% reimbursement under childcare assistance benefit (gender-neutral)
  • Sponsorship for industry-relevant certifications and education
  • Employee Assistance Program for you and your family members
  • Comprehensive hospitalization insurance for you and your dependents
  • Accident and term life insurance
  • Complimentary health screening for employees aged 35 and above

Your key responsibilities

Tools Reliability and Governance:

  • Own the availability, performance, and resilience of the Observability tool stack in the Bank
  • Act as admin of the tool stack, ensuring platforms effectively support enterprise monitoring requirements
  • Drive standardisation of telemetry using OpenTelemetry (OTel) across Metrics, Events, Logs, and Traces (MELT)
  • Define and implement telemetry collection, enrichment, and routing strategies using OTel collectors and pipelines
  • Identify and implement automation and self-healing for common issues, and adopt AI practices to improve tool availability and user experience

Own the Incident and Problem Management framework (severity, escalation, response, and resolution):

  • Ensure quick incident response, containment, and service restoration
  • Perform deep root cause analysis and deliver permanent resolutions
  • Oversee major incidents and proactively identify systemic risks
  • Identify and eliminate audit and control risks

Align with and adhere to SRE best practices:

  • Provide frameworks, playbooks, and automation capabilities
  • Conduct reliability reviews and implement and improve SLO/SLI tracking
  • Maintain and govern error budgets
  • Promote observability-by-design principles across application and platform teams

Strong SRE / production engineering experience:

  • Expertise in SLOs, error budgets, incident governance, and modern observability practices
  • Experience with distributed systems, GCP, Kubernetes, and OpenShift
  • Leverage OTel-driven telemetry insights to improve reliability and proactive issue detection
  • Strong understanding of risk, audit, and compliance (financial services preferred)
  • Own and evolve the Observability platform ecosystem – ITRS Geneos, New Relic (SaaS), Netcool, Grafana (KDB), and OTel-based telemetry pipelines

Your skills and experience

  • Strong experience as an administrator of at least two of the following observability tools: ITRS Geneos, New Relic (SaaS), Netcool, Grafana (KDB)
  • Strong understanding of MELT concepts and modern Observability architectures
  • Hands-on experience with OpenTelemetry (OTel):
      • Application and infrastructure instrumentation (auto and manual)
      • OTel collectors, exporters, and telemetry pipelines
      • Integration of OTel with tools such as Grafana and New Relic
  • Understanding of vendor-agnostic telemetry frameworks
  • Hands-on experience working on Unix servers (Windows Server experience would be an added benefit), Google Cloud, and OpenShift
  • Strong hands-on experience in at least one scripting language, such as shell, Bash, or Python; experience with Ansible playbooks and Terraform would be beneficial
  • Experience with Oracle and MSSQL databases; KDB knowledge would be an added advantage

How we’ll support you

  • Training and development to help you excel in your career.
  • Coaching and support from experts in your team.
  • A culture of continuous learning to aid progression.
  • A range of flexible benefits that you can tailor to suit your needs.

About us and our teams

Please visit our company website for further information:

https://www.db.com/company/company.html

We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.

Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.

We welcome applications from all people and promote a positive, fair and inclusive work environment.