Company
Cox Automotive - USA
Job Family Group
Job Profile
Management Level
Flexible Work Option
Travel %
Work Shift
Compensation
Compensation includes a base salary of $159,400.00 - $265,600.00. The base salary may vary within the anticipated base pay range based on factors such as the ultimate location of the position and the selected candidate’s knowledge, skills, and abilities. Position may be eligible for additional compensation that may include an incentive program.
Job Description
The Director of Incident Response & Enterprise Monitoring will lead a mission-critical organization responsible for ensuring operational resilience, visibility, and reliability across Cox Automotive’s global digital ecosystem. This role is central to the Next Generation Operations strategy—driving the modernization of incident response, enterprise monitoring, and automation capabilities across a complex, hybrid technology landscape.
This leader is expected to play a major hands-on role during high-impact incidents, serving as the primary incident commander when needed and as the visible operational leader during enterprise or client-facing events.
Equally important, the Director will be accountable for modernizing Cox Automotive’s enterprise monitoring practice — evolving from traditional alert-based monitoring to a data-driven, proactive observability discipline that leverages AI, automation, and advanced analytics to predict and prevent service disruptions.
Leadership & Strategy
Lead the Incident Response and Enterprise Monitoring teams within the Enterprise Operations organization.
Define and execute a modernization strategy for enterprise monitoring, transforming the practice from reactive alerting to proactive, insight-driven observability.
Partner with the Engineering Platform and Engineering Teams to embed observability, automation, and governance directly into CI/CD pipelines and service delivery processes.
Establish enterprise-wide standards for incident management, escalation, communication, and governance across all CAPTG (Cox Automotive Product & Technology Group) teams.
Represent Enterprise Operations in executive-level forums, articulating readiness posture, incident trends, and monitoring health.
Incident Response & Command
Serve as the executive incident commander as needed during major or business-critical outages, coordinating rapid technical recovery and engaging directly with senior leadership.
Drive consistent use of the Incident Resolution Framework, ensuring data-driven root cause analysis and long-term prevention.
Lead continual refinement of incident playbooks, automation, and communication protocols to reduce mean time to resolve (MTTR).
Collaborate with Security, Platform Engineering, and Engineering Teams to ensure unified response and governance across CAPTG.
Monitoring & Observability Modernization
Lead the modernization of Cox Automotive’s enterprise monitoring practice, building an integrated observability ecosystem that spans infrastructure, applications, and digital experiences.
Partner with business, product, and engineering teams to define SLOs, SLIs, and error budgets that tie operational health to client outcomes and business value.
Champion predictive analytics and automation to proactively identify and mitigate emerging risks before they impact customers.
People & Culture
Lead, mentor, and grow a team of Incident Response Engineers, Observability Engineers, and Analysts.
Build a culture of ownership, speed, and precision in both incident response and monitoring disciplines.
Foster close collaboration with Platform, Reliability, and Security Engineering Teams to embed reliability as a shared responsibility.
Reinforce NextGen Ops principles—empowering engineers, simplifying operations, and elevating reliability standards across Cox Automotive.
10+ years of experience in IT Operations, Site Reliability Engineering, or Platform Engineering; 5+ years leading enterprise-scale incident response or monitoring functions.
Proven success leading and personally managing major incidents in distributed or hybrid cloud environments.
Deep expertise in modern observability and monitoring platforms (ServiceNow, Splunk, New Relic, etc.).
Strong understanding of event correlation, AIOps, and data-driven operational intelligence.
Technical fluency across cloud platforms (AWS, GCP, Azure), infrastructure, and CI/CD ecosystems.
Exceptional communication and composure under pressure; able to lead at both the executive and engineering levels.
Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred).
Demonstrated experience modernizing enterprise monitoring or observability programs in large-scale environments.
Experience implementing AI/ML-based monitoring or predictive analytics in operational contexts.
Passion for building resilient systems, empowering engineering teams, and advancing client trust through operational excellence.