The Site Reliability Engineer is a pivotal role in our SaaS strategy. You will work closely with our engineering team to ensure unrivaled observability, availability, and performance of Tricentis SaaS Products.

Where you’ll work
You’ll find us in our Prague office, where we like to see each other 3 days a week (yes, coffee tastes better together ☕). We work in a hybrid setup with no core working hours — early bird or night owl, it’s up to you.

This role is not fully remote and doesn’t come with visa sponsorship — but it does come with great people and a friendly team 🙂

As a Site Reliability Engineer (SRE), you'll be the driving force behind our user-facing services and production systems. We're seeking individuals with pragmatic operational skills and software craftsmanship who apply engineering principles and operational discipline to elevate our operating environments and codebase to new heights.

At the core of your responsibilities, you'll specialize in areas such as operating systems, storage subsystems, observability, and networking while implementing best practices for availability, reliability, and scalability. But that's just the beginning of your thrilling journey with us!

Your Impact as an SRE 🚀

Design, build, and maintain the product cloud infrastructure that enables seamless scaling to support hundreds of thousands of concurrent users.
Develop advanced monitoring systems that proactively alert on symptoms, ensuring rapid response to potential issues.
Leverage tools like Terraform, GitHub Actions, and Kubernetes to efficiently manage our AWS or Azure infrastructure.
Continuously enhance operational processes, making deployments, upgrades, and other tasks as boring and automated as possible.
Collaborate with product engineers on a daily basis and influence product architecture designs.
Be part of an on-call (PagerDuty) rotation to respond swiftly to incidents affecting availability, offering support to product engineers during customer incidents.

As a valuable member of our SRE team, you'll have the opportunity to 💪

Act as a reliability champion for stable counterpart assignments, ensuring a robust and resilient infrastructure.
Propose innovative ideas and solutions within the SRE organization and engineering.
Plan, design, and execute solutions to achieve goals agreed upon by the team.
Lead by example with a positive and inclusive attitude, fostering constructive discussions between SRE and engineering.
Proactively identify opportunities to enhance system availability and performance by applying insights gained from monitoring and observation.
Share your learnings with the wider community.
Be the first responder during emergencies and on-call duties, promptly addressing symptoms and conducting root cause analysis to implement corrective actions and prevent recurring issues.

Our Tech Stack 🌐

Azure, AWS, Terraform, GitHub Actions, ArgoCD, Kubernetes, DataDog, Prometheus, Grafana, Betterstack, incident.io, Jira

About You 🎯

Proficiency in Terraform syntax and GitHub Actions configuration, including pipelines and job management using GitOps.
Working knowledge of SaaS architecture concepts and designs.
Understanding of Kubernetes, including CLI usage and service re-provisioning.
Ability to provision and set up metrics and to manage alerts and silences.
Ability to identify Service Level Indicators (SLIs) that align the team with availability and latency objectives.
Experience with Linux operating system configuration, package management, and troubleshooting.
Working experience with cloud environments like Azure or AWS and provisioning infrastructure there.
Good cultural fit: clear communication, empathy, curiosity and continuous learning, and a supportive, no-blame attitude.

Our Culture 🦄

We don't just preach our values; we embody them in everything we do. We are committed to creating an environment that empowers, supports, and includes individuals, where trust, transparency, creativity, curiosity, and continuous improvement thrive on a daily basis.

You can look forward to: 🙂

Flexible working schedule (no core hours)
Learning and career growth opportunities
25 days of paid time off
3 Sick Days
2 days of paid Volunteering Leave per year to get involved in your local community or in a cause that matters to you
Hybrid work environment, with home-office allowance
Meal allowance
Pension Contribution
Life & Disability Insurance
Paid Sickness Leave
A team of passionate professionals who are experts in their fields
Events for employees to learn, celebrate and socialise (training sessions, hackathons, parties, sports events, board game gatherings, BBQs) and much more

Tricentis Core Values:

Knowing what we need to achieve and how to achieve it is important. Tricentis core values define our ways of working and the behaviours we model that create an enjoyable and successful Tricentis life.

Demonstrate Self-Awareness: Own your strengths and limitations.
Finish What We Start: Do what we say we are going to do.
Move Fast: Create momentum and efficiency.
Run Towards Change: Challenge the status quo.
Serve Our Customers & Communities: Create a positive experience with each interaction.
Solve Problems Together: We win or lose as one team.
Think Big & Believe: Set extraordinary goals and believe you can achieve them.

If you're ready to make a lasting impact as a Site Reliability Engineer and be at the forefront of revolutionizing Tricentis SaaS Products, don't miss this opportunity.

Tricentis is proud to be an equal opportunity workplace. Qualified applicants will receive consideration for employment without regard to race, color, ethnicity, gender, religious affiliation, age, sexual orientation, socioeconomic status, or physical and mental disability and other statuses protected by law.

Global Sanctions Compliance

We comply with all applicable global sanctions and export control laws. Candidates must not be listed on any government restricted party lists (including OFAC SDN List and U.S. Commerce Department restricted lists) and must certify that their employment would not violate any sanctions or export control regulations. Candidates must notify us of any changes to their status during the application process or subsequent employment.

Senior SRE/DevOps Engineer (Cloud & SaaS) for Prague office
