About the Team & Role
As Senior Manager, Technical Account Management (TAM), you will lead a team of Technical Account Managers focused on Site Reliability Engineering (SRE)-aligned customer outcomes. You will be responsible for ensuring that customers achieve high levels of availability, performance, resilience, and operational maturity in their production environments.
You will lead a team that acts as a strategic partner to customers, helping them adopt SRE principles, improve reliability posture, and operate mission-critical systems at scale.
How you’ll make an impact
Leadership & Team Management
Lead and develop a team of Technical Account Managers with strong SRE and production operations expertise
Establish a high-performance culture focused on reliability, accountability, and continuous improvement
Define competency frameworks around SRE practices (SLIs/SLOs, error budgets, incident management)
Coach team members on technical depth, customer engagement, and incident leadership
Customer Reliability & Operational Excellence
Own customer outcomes related to availability, latency, scalability, and resilience
Guide customers in implementing SRE best practices such as:
Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
Error budgets and reliability trade-offs
Incident management and postmortem culture
Conduct regular operational reviews, architecture reviews, and reliability assessments
Act as an executive escalation point during major incidents (Sev1/Sev2)
Incident & Crisis Leadership
Oversee TAM involvement in major incident response, ensuring structured communication and resolution
Ensure customers adopt best practices in:
Incident command frameworks
Blameless postmortems
Root cause analysis and remediation tracking
Drive improvements in MTTR (Mean Time to Resolution) and incident prevention
Technical Strategy & Advisory
Partner with customers to design resilient, scalable architectures
Provide guidance on observability (metrics, logs, tracing), alerting strategies, and automation
Align customer environments with SRE maturity models
Collaborate with Product and Engineering to address systemic reliability issues
Operational Scaling & Process Excellence
Define and standardize TAM engagement models for production-critical customers
Build playbooks for:
Incident response
Reliability reviews
Capacity planning and load testing
Track and report on key reliability and operational metrics:
Uptime / availability
Error rates
MTTR / MTBF (Mean Time Between Failures)
Change failure rate
Cross-functional Collaboration
Partner closely with SRE, Support, Engineering, and Product teams to resolve complex issues
Work with Sales and Customer Success to support renewals and growth through technical credibility and trust
Act as the voice of the customer in reliability and operational discussions
Experience you’ll bring
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience)
ITIL 4 certification
10+ years in technical roles (SRE, DevOps, Production Engineering, TAM, or similar)
At least 3–5 years of experience managing high-performing technical teams
Deep expertise in:
Distributed systems and cloud architectures
Incident management and production operations
Observability and monitoring tools
Proven experience working with enterprise customers running mission-critical systems
Preferred Qualifications
Hands-on experience in Site Reliability Engineering (SRE) or Production Engineering roles
Strong familiarity with cloud platforms (AWS, Azure, GCP)
Background in driving operational maturity transformations
Experience managing global or follow-the-sun teams
Key Competencies
Reliability-first mindset with strong operational discipline
Ability to lead under pressure during high-severity incidents
Strong systems thinking and problem-solving skills
Executive communication and stakeholder management
Coaching and developing technically deep teams
Success Metrics
Customer uptime and SLO attainment
Reduction in incident frequency and severity
Improvements in MTTR and operational efficiency
Customer satisfaction and retention
Adoption of SRE practices across customer environments
Team engagement and capability growth