Key Responsibilities
• Infrastructure Management
o Managing Linux and Windows environments
o Proficiency with the command line (RHCSA/RHCE certification or equivalent)
• Virtualization
o Managing virtualized environments (VMware)
• Containerization
o Docker, Podman
o Kubernetes, OpenShift
• Networking
o Firewall rules
o DNS, DHCP
o Network storage (NFS/NAS, SAN, CIFS/SMB)
o Load balancing
• Automation
o Scripting knowledge to automate repetitive tasks
o Building and integrating tools that will assist in improving system availability,
reliability and performance
o Using CI/CD tooling (Azure DevOps, Jenkins) for automation
• Configuration Management
o Enforcing configuration (Ansible, Puppet, Chef)
• Monitoring and Troubleshooting
o Performance monitoring
o Proactively identifying problem areas of a system (e.g. software bugs,
misconfigurations, bottlenecks)
o Trend analysis
• Incident and Problem Management
o Coordinating incident management and service restoration
• Capacity Planning
• Service Level Management
o Proactive monitoring and adherence to SLAs
o Holding IT Engineering, Security and Architecture accountable for
remediation of any SLA degradation
• IT Risk Key Controls
o Ensuring that IT is ‘in CONTROL’ by holding IT groups accountable for
adherence
o Collating and providing necessary evidence to auditors for these controls
(Vulnerability Management, Identity & Access Management, Platform
Security, Foundational Controls and Security Detection & Response)
• Disaster Recovery (DR) & Business Continuity Planning (BCP)
o High availability and resilience patterns
Key Capabilities/Experience
• Work comfortably with Linux and Windows
• Work with virtualization technologies such as VMware
• Develop and run applications in the cloud (Azure, AWS, GCP)
• Containerize and orchestrate applications (Docker, OpenShift, Kubernetes,
microservices)
• Continuously integrate and deploy (CI/CD, Git, Jenkins, Ansible)
• Comfortably write automation scripts (Bash, Go, Python)
• Administer and query SQL databases
• Work with and analyze aggregated logs (ELK)
• Configure networks (load balancing, firewalls)
• Work with observability tools (Prometheus, Grafana)