We are seeking a seasoned professional to lead the reliable, secure, and scalable operation of our hybrid infrastructure environment, spanning on-premises and multi-cloud platforms. This senior-level role oversees daily operations, system administration, lifecycle management, and strategic infrastructure planning. The ideal candidate will bring deep technical expertise, strong problem-solving skills, and a proactive approach to automation and infrastructure management. They will drive operational excellence, ensure compliance with regulatory standards, and lead initiatives to enhance resilience, performance, and efficiency across the enterprise.
RESPONSIBILITIES
Manage and optimize servers, storage, networking, and virtualization platforms to ensure high availability and performance across data centers and cloud environments.
Administer and monitor multi-cloud platforms (Azure, AWS, GCP, OCI), including provisioning, cost optimization, and security compliance.
Implement Infrastructure as Code (IaC) using tools like Terraform, ARM templates, or CloudFormation.
Automate deployment, configuration, and monitoring using scripting tools such as PowerShell, Python, or Ansible.
Support core infrastructure services including Active Directory, DNS, load balancers, patching (BigFix), and domain controller management.
Collaborate with IAM and Security teams to troubleshoot issues related to CyberArk, Okta, compliance, and audit.
Collaborate with cross-functional teams to support business continuity, disaster recovery, and incident response planning.
Deploy and manage monitoring solutions (e.g., Azure Monitor, Splunk, Datadog, LogicMonitor) for proactive issue detection and resolution.
Optimize system performance, availability, and capacity using platforms like OpsRamp, Splunk, New Relic, Rubrik, and Cohesity.
Collaborate with security teams to enforce policies, conduct regular vulnerability assessments, manage patching, and ensure compliance with standards (e.g., ISO, SOC, HIPAA).
Lead and own root cause analysis for infrastructure incidents, and maintain documentation for operational procedures and troubleshooting.
Support administration and integration of collaboration and productivity platforms, including Microsoft 365 (O365/M365) and Zoom.
Work closely with cross-functional teams including DevOps, Security, and Application teams.
Maintain architectural documentation, SOPs, and system diagrams.
Participate in daily standups, operational syncs, and change management activities in alignment with Inspire’s policies.
Drive continuous improvement through automation, performance tuning, and incident response/root cause analysis (RCA).
EDUCATION AND EXPERIENCE QUALIFICATIONS
Four-year Bachelor’s degree in Information Technology or a related field, or equivalent professional experience.
Minimum 7 years of experience in infrastructure and cloud operations.
KNOWLEDGE, SKILLS, AND ABILITIES
Expertise in cloud platforms (especially Azure and AWS), including provisioning, networking, cost optimization, and cloud security.
Strong experience managing Microsoft Active Directory, Azure AD, and O365 environments.
Strong expertise in Windows/Linux server administration, virtualization (VMware/Hyper-V), and networking.
Proficiency in scripting and automation (PowerShell, Bash, Python).
Familiarity with DevOps practices and CI/CD pipelines.
Excellent troubleshooting, communication, and documentation skills.
Cloud certifications are a plus.
Hands-on experience with infrastructure monitoring, patching, and data-protection tools such as Splunk, New Relic, BigFix, Rubrik, and Cohesity.
Proficiency in identity and access management (IAM) and information security best practices.
Familiarity with enterprise collaboration platforms (Zoom, O365), secure MFT tools, and compliance frameworks.
Knowledge of cloud-native architecture, containers, and infrastructure automation tools.
Demonstrated experience managing critical production incidents, conducting RCA, and implementing sustainable solutions.