Vantage Data Centers powers, cools, protects and connects the technology of the world’s well-known hyperscalers, cloud providers and large enterprises. Developing and operating across North America, EMEA and Asia Pacific, Vantage has evolved data center design in innovative ways to deliver dramatic gains in reliability, efficiency and sustainability in flexible environments that can scale as quickly as the market demands.
Position Description
Vantage is seeking a resourceful, detail-oriented, and self-motivated Problem Manager to drive our Operational Excellence initiatives, with a primary focus on leading and improving all aspects of the Problem Management process across the region.
The Problem Manager will play a crucial role in the region, being accountable for the end-to-end Problem Management capability across EMEA. The Problem Manager will work closely with cross-functional teams across the organisation to identify, analyze, and permanently eliminate recurring Incidents, systemic failures, and latent operational risks, ensuring the long-term stability and reliability of our services.
As our key advocate for the Problem Management discipline, you will champion best practices, facilitate root cause analysis, and help implement preventive measures to reduce operational risks and minimize future Incidents. You will also play an active role in ensuring our internal tools and processes are effectively utilized to support high-quality Problem resolution, establishing and nurturing strong working relationships with internal Vantage teams.
Key Responsibilities
Own and ensure compliance with the Problem Management process, standards, and governance across EMEA
Analyze and communicate the influence of new policies, procedures, or regulations on the current Problem Management process
Oversee and manage the end-to-end Problem Management process, ensuring timely and thorough investigation of problems and implementation of solutions
Work closely with Event and Incident Management teams to identify and then manage the root causes of recurring Incidents and Major Incidents
Partner with Reliability Engineering and other Operations teams, both in EMEA and globally, to drive permanent, engineered solutions
Own and manage the EMEA Known Error Database and the process for converting Known Errors to viable solutions
Facilitate and lead problem review and stakeholder meetings, promoting collaboration, accountability and ultimately the timely production of Root Cause Analysis (RCA) and After Action Review (AAR) documents
Examine information from internal departments impacted by the problem to find areas for improvement
Generate reports and statistics on the performance of the regional Problem Management process
Review and assess the quality, accuracy, and completeness of Problem records, providing feedback and guidance to teams as needed
Compile and present key performance indicators (KPIs) and reports to stakeholders, providing insights into operational health and areas for improvement
Provide training to other teams regarding Problem Management processes
Act as the regional lead for Problem Management within the global Digital Operations Management Office team, identifying system enhancement options and promoting them with your peers in APAC and NA
Own the definition, prioritization and user acceptance testing (UAT) of Problem Management enhancements within the region
Act as a subject matter expert (SME) on Problem Management, supporting training, awareness, and continuous improvement efforts across teams
Experience and Skills
Experience in a Problem, Incident, or Reliability Management role within a mission-critical environment
A strong understanding of ITIL Incident, Problem, Event, and Change Management
Experience of using ServiceNow or an equivalent platform in an enterprise-level environment
Prior experience of working within an ITIL-based Service Management environment
ITIL certification, or a commitment to achieve ITIL certification relevant to the role
Experience of working in a datacentre, high-tech, or rapid growth industry
Strong sense of personal accountability regarding decision-making and team leadership
Strong analytical, communication, and stakeholder management skills
Problem-solving skills to troubleshoot and resolve issues; experience or certification in recognized analytical methods such as Lean Six Sigma, Kepner-Tregoe or Ishikawa is highly desirable but not essential
Must be able to work in a collaborative team environment as well as individually
Excellent verbal and written communication skills
Strong organizational skills and attention to detail
International travel is expected
We operate with No Ego and No Arrogance. We work to build each other up and support one another, appreciating each other’s strengths and respecting each other’s weaknesses. We find joy in our work and each other, actively seeking opportunities to inject fun into what we do. Our hard and efficient work is rewarded with an above-market total compensation package. We offer a comprehensive suite of health and welfare, retirement, and paid leave benefits exceeding local expectations.
Throughout the year, the advantage of being part of the Vantage team is evident with an array of benefits, recognition, training and development, and the knowledge that your contribution adds value to the company and our community.
Don't meet all the requirements? Please still apply if you think you are the right person for the position. We are always keen to speak to people who connect with our mission and values.
Vantage Data Centers is an Equal Opportunity Employer
Vantage Data Centers does not accept unsolicited resumes from search firm agencies. Fees will not be paid in the event a candidate submitted by a recruiter without an agreement in place is hired; such resumes will be deemed the sole property of Vantage Data Centers.