Why Join Exadel
We’re an AI-first global tech company with 25+ years of engineering leadership, 2,000+ team members, and 500+ active projects powering Fortune 500 clients, including HBO, Microsoft, Google, and Starbucks.
From AI platforms to digital transformation, we partner with enterprise leaders to build what’s next.
What powers it all? Our people: ambitious, collaborative, and constantly evolving.
What You’ll Do
Reliability Improvements Within Java Applications
- Review current Java services and identify reliability gaps
- Introduce patterns such as rate limiting, backpressure, load shedding, and circuit breakers
- Support uplift plans that raise applications to an agreed level of resilience
- Guide development teams toward practical changes that improve stability and consistent delivery
Production Experience With Java Systems
- Troubleshoot and perform root cause analysis of memory issues, thread behavior, and runtime failures
- Run load or stress tests using tools such as JMeter
OpenTelemetry Instrumentation in Java Code
- Add metrics, logs, and traces using OpenTelemetry libraries
- Ensure the Java service exposes meaningful telemetry
- Use the client’s existing platform for ingestion and dashboards
Collaboration and Client Interaction
- Explain reliability concepts clearly and simply to product owners and senior stakeholders
- Lead conversations about SLOs, SLIs, service availability, and service behavior
What You Bring
- Demonstrated expertise in SLOs, SLIs, and error budgets, and in tying them to release velocity and operational decision-making
- Experience creating or improving Service Level Contracts across distributed systems
- Ability to apply failure-mode analysis, chaos practices, and resilience engineering patterns to assess application readiness
- Deep understanding of traffic management techniques: rate limiting, backpressure, load shedding, circuit breakers, concurrency limits
- Experience designing or evaluating progressive delivery (canary releases, feature flags, blue/green, rollout strategies) with reliability risk in mind
- Strong familiarity with container orchestration (K8s, ECS, etc.) from a production reliability perspective rather than simple deployment automation
- Ability to assess application teams on CI/CD maturity, test coverage quality (unit/integration/e2e), pipeline reliability, and deployment safety
- Experience conducting reliability audits or operational maturity reviews for applications or services
- Ability to produce structured maturity scoring, gap analysis, and concrete improvement roadmaps
- Hands-on experience implementing or improving distributed tracing, structured logging, and actionable metrics (RED/USE frameworks)
- Ability to identify observability blind spots, instrumentation gaps, cardinality issues, or mis-configured alerts
- Experience designing alerting strategies focused on signal-to-noise ratio, actionable pages, and reducing alert fatigue
- Strong understanding of distributed systems concepts: latency vs. throughput tradeoffs, timeouts, retries, idempotency, consistency patterns
- Experience with capacity planning, load testing, and performance benchmarking, including identifying bottlenecks at the application or infrastructure layer
- Ability to collaborate with engineering teams to design systems that are horizontally scalable, fault-tolerant, and operable
- Demonstrated experience running or participating in incident response, including on-call rotations
- Experience running blameless post-mortems, writing incident reviews, and turning findings into systemic improvements
- Strong understanding of operational readiness reviews and production launch criteria
- Experience working embedded directly in teams to raise reliability maturity, not just to automate pipelines
- Ability to communicate reliability trade-offs and influence engineering teams, product teams, and leadership
- Proven ability to drive cross-team initiatives around reliability, observability, and operational practices
English level
Intermediate+
Legal & Hiring Information
- Exadel is proud to be an Equal Opportunity Employer committed to inclusion across minority status, gender identity, sexual orientation, disability, age, and more
- Reasonable accommodations are available to enable individuals with disabilities to perform essential functions
- Please note: this job description is not exhaustive. Duties and responsibilities may evolve based on business needs
Your Benefits at Exadel
Exadel benefits vary by location and contract type. Your recruiter will fill you in on the details.
- International projects
- In-office, hybrid, or remote flexibility
- Medical healthcare
- Recognition program
- Ongoing learning & reimbursement
- Well-being program
- Team events & local benefits
- Sports compensation
- Referral bonuses
- Top-tier equipment provision
Exadel Culture
We lead with trust, respect, and purpose. We believe in open dialogue, creative freedom, and mentorship that helps you grow, lead, and make a real difference. Ours is a culture where ideas are challenged, voices are heard, and your impact matters.