Job Summary
Synechron is seeking an experienced Staff Engineer specializing in Site Reliability Engineering (SRE) and Cloud Infrastructure (AWS) to lead the design, deployment, and management of resilient, scalable, and secure enterprise systems. You will oversee end-to-end system health, automate operational processes, and implement best practices in cloud architecture and security. Playing a pivotal role within our technical leadership team, you will drive innovation, ensure service continuity, and mentor teams to achieve high standards of operational excellence aligned with business and compliance requirements.
Software Requirements
Required Skills:
Extensive experience in cloud architecture and management with AWS (including services such as EC2, S3, RDS, Lambda, CloudFormation)
Strong expertise in Site Reliability Engineering (SRE) principles, including automation, observability, and incident management
Proficiency in scripting and automation using Python, Shell scripting, or similar tools
Experience with infrastructure-as-code (IaC) tools such as CloudFormation, Terraform, or similar
Knowledge of containerization (Docker) and orchestration (Kubernetes)
Familiarity with monitoring, logging, and alerting tools such as CloudWatch, Prometheus, Grafana, ELK Stack, or Splunk
Strong understanding of system security best practices, threat detection, vulnerability management, and compliance standards
Preferred Skills:
Experience with automation and configuration management tools such as Ansible, Chef, or Puppet
Knowledge of CI/CD pipelines built with Jenkins, GitLab CI, or Azure DevOps
Experience with microservices architecture and cloud-native design patterns
Familiarity with compliance standards such as ISO, SOC2, or GDPR
Overall Responsibilities
Lead the end-to-end management of enterprise system environments, ensuring high availability, scalability, and security
Architect and implement cloud-based solutions, leveraging AWS services and best practices in cloud security and cost optimization
Drive automation initiatives to improve operational efficiency, incident response, and system reliability
Manage system health, conduct proactive monitoring, and perform capacity planning and upgrades
Oversee incident response, root cause analysis, and problem resolution to ensure continuous service delivery
Develop and implement security controls, vulnerability assessments, and compliance procedures
Mentor and develop technical teams, sharing knowledge on cloud technologies, SRE practices, and automation strategies
Collaborate with business and technology teams to plan future system enhancements and migrations
Maintain comprehensive documentation of architecture, configurations, runbooks, and operational procedures
Lead service continuity testing, penetration testing, and vulnerability management programs to meet regulatory and security standards
Technical Skills (By Category)
Cloud Architecture & Services:
Required: Deep expertise in AWS core services (EC2, S3, RDS, Lambda, CloudFormation)
Preferred: Multi-cloud experience (Azure, GCP), serverless architectures, and advanced cloud security implementation
SRE & Automation:
Required: Automation of deployment, scaling, and incident response processes
Preferred: Monitoring with Prometheus, Grafana, ELK Stack, or Splunk; scripting using Python and Shell
Containerization & Orchestration:
Required: Docker containerization; Kubernetes for orchestrating and managing microservices
Preferred: Helm charts, service mesh tools like Istio, and advanced deployment strategies
Security & Compliance:
Required: Implementation of security best practices, vulnerability management, and threat detection
Preferred: Experience with security audits, compliance frameworks, and encryption standards
Experience Requirements
10-12 years of proven experience in cloud infrastructure, site reliability, or enterprise systems management
Extensive experience designing, deploying, and managing AWS cloud architectures at scale
Strong background in SRE principles, automation, and incident management
Demonstrated leadership in managing cross-functional teams and guiding best practices in cloud operations
Experience with compliance, security, and vulnerability management, including effective practices for ISO, SOC2, and GDPR
Industry domain background in finance, banking, or fintech is highly beneficial but not mandatory
Day-to-Day Activities
Architect, deploy, and manage cloud-based enterprise systems, ensuring high availability and resilience
Automate system provisioning, scaling, and incident response to meet SLAs and reduce manual intervention
Monitor system health metrics, conduct capacity planning, and optimize resource utilization
Lead root cause analysis efforts, manage incident response, and implement preventive measures
Conduct vulnerability assessments, coordinate penetration testing, and implement security controls
Collaborate with development teams to incorporate security and reliability best practices into deployment pipelines
Lead service continuity testing, disaster recovery planning, and compliance audits
Develop and maintain operational documentation, runbooks, and automation scripts
Mentor technical staff, promote a culture of continuous improvement, and share knowledge industry-wide
Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
AWS certifications (e.g., AWS Certified Solutions Architect, AWS Certified DevOps Engineer) and security certifications (e.g., CISSP, CISA) are preferred
Extensive hands-on experience in cloud architecture, SRE practices, automation, and security for large enterprise systems
Professional Competencies
Strong analytical and problem-solving skills
Excellent leadership and mentorship abilities
Effective communication skills with both technical and non-technical stakeholders
Strategic thinking with a focus on operational excellence, security, and cost efficiency
Adaptability to evolving technology stacks, industry standards, and regulatory requirements
Proactive and solution-oriented mindset, with a focus on continuous improvement
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture that promotes equality, diversity, and an environment that is respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and abilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.