Job Summary
Synechron is seeking an experienced Cloud Incident Management Lead to oversee operational resilience, incident response, and support for cloud-based enterprise systems. This role involves managing complex cloud environments across AWS, Azure, and Google Cloud, coordinating incident resolution efforts, and implementing proactive measures to minimize downtime. The successful candidate will drive continuous improvement in incident response processes, develop strategic support frameworks, and build strong stakeholder relationships to uphold enterprise operational excellence.
Software Requirements
Required:
Minimum 5 years of hands-on experience supporting enterprise cloud environments in AWS, Azure, or Google Cloud Platform (GCP)
Knowledge of cloud infrastructure components, including compute, network, security, and storage services
Experience with cloud automation, scripting (PowerShell, Bash, Python), and support tools (CloudWatch, Cloud Logging, Cloud Monitoring)
Familiarity with incident and problem management tools such as ServiceNow, Jira, or equivalent
Strong understanding of cloud security best practices, including IAM, encryption, and compliance standards (e.g., GDPR, HIPAA, PCI DSS)
Proficiency in monitoring and alerting tools for cloud environments (e.g., Datadog, Prometheus, Grafana, Splunk)
Preferred:
Experience with DevOps or SRE practices supporting incident prevention, automation, and resilience
Knowledge of infrastructure as code tools like Terraform or CloudFormation
Exposure to multi-cloud or hybrid cloud management solutions
Overall Responsibilities
Lead incident response and resolution for cloud infrastructure supporting enterprise applications, minimizing service disruptions and downtime
Develop, refine, and implement incident management frameworks, escalation procedures, and root cause analysis processes
Collaborate with technical teams on deployment, ongoing support, and disaster recovery planning
Automate incident detection and remediation workflows, supporting proactive and predictive operational practices
Monitor system health, performance, and security alerts to diagnose and resolve issues efficiently
Conduct post-incident reviews, track KPIs, and drive continuous improvement initiatives to enhance operational resilience
Establish and enforce cloud security protocols, compliance standards, and best practices
Foster strong stakeholder relationships through regular reporting, strategic planning, and collaborative problem-solving
Technical Skills (By Category)
Cloud Platforms:
AWS, Azure, GCP (supporting monitoring, incident response, security)
Support & Automation Tools:
CloudWatch, Cloud Logging, Cloud Monitoring, Datadog, Prometheus, Splunk, Terraform, CloudFormation
Scripting & Automation:
PowerShell, Bash, Python for automation and operational workflows
Security & Compliance:
IAM policies, encryption standards, security auditing tools, support for relevant standards (GDPR, HIPAA, PCI DSS)
Experience Requirements
At least 5 years supporting cloud environments in enterprise or large-scale operations
Proven expertise in incident management, root cause analysis, and system troubleshooting in multi-cloud or hybrid environments
Experience with mission-critical applications that support business operations, ideally within regulated industries (financial, healthcare, etc.)
Demonstrated success in automating incident detection, resolution, and preventive workflows
Industry experience in supporting high-availability mission-critical systems
Day-to-Day Activities
Monitor cloud environment health, review alerts, and analyze logs for early incident detection
Respond to and resolve system outages, security threats, and performance issues promptly
Lead incident investigations and root cause analysis, and develop corrective action plans
Automate incident response processes and support proactive monitoring solutions
Support deployment activities, disaster recovery drills, and environment change management
Document incident procedures, troubleshooting steps, and operational workflows
Conduct regular support meetings, post-incident reviews, and support strategy sessions
Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related discipline
5+ years of experience supporting cloud infrastructure operations, incident management, and security
Certifications such as AWS Certified Solutions Architect, Azure Solutions Architect, or equivalent are preferred
Proven ability to lead incident response teams and improve operational resilience in cloud environments
Professional Competencies
Strong analytical and troubleshooting skills for complex cloud incidents
Effective communication skills for stakeholder updates and cross-disciplinary collaboration
Leadership qualities to guide incident response teams and foster a culture of operational excellence
Strategic mindset for continuous improvement of incident management and support processes
Adaptability to evolving cloud technologies, security standards, and industry best practices
Time management and organizational skills to handle multiple incidents and support activities efficiently
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.