N-iX

Site Reliability Engineer

Poland, Full Time

We are seeking experienced Site Reliability Engineers (SREs) to help monitor, maintain, and scale DocuSign’s software production environments, with a primary focus on onboarding new microservices.

You will work closely with development and platform teams to automate and manage the onboarding lifecycle, from initial requirements and environment setup through deployment, testing, documentation, and handover, ensuring reliability, scalability, performance, and compliance at every step.

Key Responsibilities

1. Service Onboarding & Automation
  • Lead and support the end-to-end onboarding process for new microservices into production environments.
  • Identify gaps in the current onboarding workflow (deployment, configuration, monitoring, scaling, etc.) and close them through automation.
  • Provide program management for onboarding activities, including timelines, dependencies, and stakeholder communication.
  • Collaborate with development and operations/platform teams to ensure smooth and consistent rollout of new services.
2. Monitoring, Logging & Observability
  • Design and implement monitoring, logging, and alerting for all onboarded services.
  • Ensure comprehensive metrics collection (e.g., availability, latency, error rates, throughput) to support SLOs/SLIs.
  • Tune alerts to minimize noise while ensuring rapid detection and response to production issues.
3. Scalability, Load & Performance
  • Perform load and stress testing to validate that services can scale to meet current and projected demand.
  • Implement and refine auto‑scaling mechanisms and capacity planning practices.
  • Conduct ongoing performance tuning and optimization to achieve minimal latency and high throughput.
4. Reliability, Resilience & Uptime
  • Drive high service reliability and uptime for all onboarded microservices.
  • Help teams design and implement fault‑tolerant architectures, including failover and redundancy mechanisms.
  • Work with teams to adopt SRE best practices (e.g., error budgets, post‑incident reviews, runbooks).
5. Security & Compliance
  • Ensure all onboarded services meet security and compliance requirements.
  • Integrate security best practices into deployment, monitoring, and operational processes.
  • Maintain audit trails and documentation for onboarding activities to support regulatory and internal compliance.
6. Documentation, Training & Knowledge Transfer
  • Create detailed documentation for the service onboarding process, including standards, patterns, and templates.
  • Develop and maintain runbooks, playbooks, and SOPs for ongoing operations.
  • Conduct training sessions and workshops for internal teams to enable self‑service onboarding and long‑term maintainability.
7. Planning, Testing & Post‑Onboarding Support
  • Participate in requirements analysis for new services; define onboarding success criteria and KPIs.
  • Develop onboarding plans outlining steps, timelines, responsibilities, and acceptance criteria; present plans to stakeholders for review and approval.
  • Prepare and validate environments, ensuring appropriate access, permissions, and tooling are in place.
  • Conduct comprehensive functional, performance, reliability, and security testing prior to go‑live.
  • Provide post‑onboarding support: monitor services to ensure continued reliability and quickly address any issues that arise.

Required Qualifications

  • Proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role in microservices-based environments.
  • Strong understanding of microservices architecture, distributed systems, and cloud‑native concepts.
  • Hands-on experience with:
    • Production monitoring, logging, and alerting (e.g., metrics, tracing, log aggregation tools).
    • Automation of deployment and operational workflows (e.g., scripts, pipelines, IaC, or similar).
    • Load/performance testing and capacity planning.
  • Demonstrated ability to improve service reliability, scalability, and performance in production.
  • Familiarity with security best practices related to service deployment, monitoring, and operations.
  • Experience working across cross‑functional teams (development, operations, security, compliance) to deliver complex initiatives.
  • Excellent documentation, communication, and stakeholder management skills.

Preferred Qualifications

  • Experience defining and tracking SRE KPIs/SLOs/SLIs for onboarding and production services.
  • Background in program or project management of technical initiatives (especially service onboarding or platform rollouts).
  • Prior experience in high‑availability, regulated, or large‑scale SaaS environments.

We offer*:

  • Flexible working format: remote, office-based, or hybrid
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team-building activities
  • Other location-specific benefits

*Not applicable to freelancers