Astreya

Incident Response Analyst II

Dublin, Ireland | Full time


Key Responsibilities

  • Monitor alarms and alerts across data center infrastructure, including network and server devices, hardware health indicators (disk usage, temperature, etc.), WAN circuits, local connectivity, and facility systems such as environmental controls (temperature, humidity), power, racks, and PDUs.

  • Monitor Oracle Cloud Infrastructure (OCI), cloud dashboards, and relevant monitoring tools for system health and service availability.

  • Assess external risks such as natural events (floods, earthquakes, fires, severe weather) that may impact data center operations.

  • Monitor and review CCTV activity for security events within data center facilities.

  • Validate, investigate, and analyze alerts to determine impact and urgency.

  • Create, update, and manage incident tickets using Jira or similar ITSM platforms.

  • Escalate incidents promptly to cross-functional teams including Network, Compute, Storage, Facilities, Security, and Cloud teams.

  • Communicate incident updates and notifications using tools such as Everbridge and email.

  • Prepare Post-Incident Reviews (PIRs), document timelines, and maintain accurate incident records and handover notes.

  • Review and track vendor emails regarding planned maintenance, emergency changes, and service advisories.

  • Maintain documentation including SOPs, process updates, and knowledge base articles.

  • Utilize DCIM systems, monitoring tools, CCTV platforms, and communication systems to support daily operations.

Basic Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or related field, or equivalent practical experience.

  • 2–5 years of experience in a data center operations center, NOC/SOC, monitoring environment, or incident management role.

  • Basic understanding of network fundamentals (TCP/IP, routing, WAN/LAN concepts) and server or hardware components (CPU, memory, storage).

  • Ability to interpret system alerts, analyze issues, and perform initial triage.

  • Excellent verbal and written communication skills, with the ability to work independently, meet goals, and maintain attention to detail.

  • Demonstrated ability to interact effectively at all levels within the organization, including with clients, while being a collaborative team player.

Preferred Qualifications

  • Experience in a Data Center, NOC, SOC, or similar operations command center environment.

  • Familiarity with tools such as Jira, Everbridge, DCIM systems, monitoring platforms (SolarWinds, Nagios, Grafana, etc.), and CCTV monitoring tools.

  • Experience writing incident reports or post-incident reviews.

  • Exposure to cloud platforms such as Oracle Cloud Infrastructure, AWS, Azure, or GCP.

  • Knowledge of ITIL processes, especially Incident, Problem, and Change Management.

  • Strong situational awareness and ability to manage multiple priorities during critical events.

  • Effective collaboration skills when working with cross-functional support teams.