Staff Software Engineer — Data Platform (Python)
If you want to own the data infrastructure that powers real-time pricing decisions for thousands of hotels worldwide, and you're energised by an engineering culture where AI isn't a future ambition but how work actually gets done today, this is the role. You'll be the technical authority on Duetto's data lakehouse, driving everything from pipeline architecture and data quality to the shift from batch to near-real-time streaming.
What Makes Us Different?
Duetto is the hospitality industry's leading revenue management platform, founded in 2012 by former Wynn Resorts executives who knew the industry needed better technology. We built the world's first Revenue & Profit Operating System — a suite of tools (GameChanger, ScoreBoard, BlockBuster, Advance and more) that goes beyond room pricing to give hotels, resorts and casinos a complete picture of their revenue and profitability. Trusted by clients ranging from independent boutique hotels to global chains, we've been named the #1 Revenue Management Software by HotelTechAwards four years running and the #1 Best Place to Work in Hotel Tech in 2025. Backed by GrowthCurve Capital since 2024, we're accelerating our investment in AI — and we're genuinely passionate about the industry we serve. We build products we're proud of, for customers we care about.
What You'll Be Doing
- You'll own the design, performance, and reliability of Duetto's data lakehouse — evolving the Python/PySpark pipeline framework across a bronze → silver → gold architecture on AWS, including Glue jobs, Iceberg MERGE operations, schema evolution, and partitioning strategies.
- You'll architect the shift from batch to near-real-time streaming, building SQS-driven stream pipelines with Iceberg sinks and expanding ingestion, normalisation, and analytics layers across the full lakehouse.
- You'll drive data quality and governance at scale — extending the Great Expectations framework, leading adoption of data contracts to formalise schemas between producers and consumers, and owning the Athena SQL layer that analysts and product teams depend on.
- You'll strengthen observability and reliability through Datadog, Sentry, and Sumo Logic, while optimising Glue job performance — worker sizing, DPU allocation, Spark tuning, and cost management.
- You'll build and maintain shared internal Python libraries published to JFrog, and drive improvements to GitHub Actions, Docker-based testing, and CI/CD deployment workflows.
- You'll work AI-first every day — using Claude Code and MCP tools in your regular workflow, and contributing to AI-assisted pipeline generation, schema inference, and automated data-quality checks alongside a custom multi-agent system with 17 specialised agents.
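To give a flavour of the lakehouse work above, here is a minimal sketch of how an Iceberg MERGE upsert might be assembled inside a PySpark Glue job. All table, view, and column names (`silver.rates`, `staged_rates`, `hotel_id`, and so on) are illustrative placeholders, not Duetto's actual schema:

```python
def build_iceberg_merge(target: str, source_view: str,
                        keys: list[str], columns: list[str]) -> str:
    """Build an Iceberg MERGE statement for upserting a staged batch
    into a lakehouse table (e.g. a bronze -> silver promotion)."""
    # Match rows on the business keys, update the rest on conflict.
    on = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in columns)
    cols = ", ".join(keys + columns)
    vals = ", ".join(f"s.{c}" for c in keys + columns)
    return (
        f"MERGE INTO {target} t USING {source_view} s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals})"
    )

# Illustrative only; in a Glue job the result would be run via spark.sql(sql).
sql = build_iceberg_merge(
    "silver.rates", "staged_rates",
    keys=["hotel_id", "stay_date"],
    columns=["rate", "updated_at"],
)
```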
What We're Looking For
You may be a good fit if you have:
- 7+ years building production data systems in Python
- Deep expertise in PySpark and distributed data processing — Glue, EMR, or Databricks
- Strong experience with lakehouse architectures: Iceberg, Delta Lake, or Hudi on S3
- Production experience with Airflow or a comparable workflow orchestrator
- Solid AWS production experience across S3, Glue, Athena, Lambda, and SQS
- A track record of improving data quality, governance, and pipeline reliability at scale
Strong candidates may also have:
- Working knowledge of Java — enough to read and debug upstream systems
- Experience with Trino or Presto for interactive SQL analytics at scale
- Experience with dbt for data transformation and modelling
- Familiarity with Great Expectations or similar data quality frameworks
- Genuine interest in AI-assisted development and LLM-based tooling
- Familiarity with hospitality data — reservations, rates, inventory, demand signals
Why Duetto?
- AI-first is not a marketing line here. Every engineer uses Claude Code and a custom multi-agent system daily. You'll contribute to AI-assisted pipeline generation and work alongside 17 specialised agents with human-in-the-loop approval gates — this is what the future of data engineering looks like, and we're already living it.
- The technical challenge is real. Millions of rate decisions processed daily, 80+ integration partners, and a live evolution from batch to near-real-time streaming — the scale and complexity are genuine.
- You'll own something meaningful. This is a Staff-level role with true technical authority over the lakehouse that powers every pricing decision, forecast, and report in the product.
- A team worth joining. Low ego, high EQ, genuine intellectual curiosity, and active mentorship in both directions across a collaborative US and Europe team.
- Modern stack, real scale. Python/PySpark, Apache Iceberg, Airflow, AWS Glue, SageMaker, Terraform — tools selected for the problem, not the press release.
The Details
- Location: Remote (US)
- Team: Data Platform
- Direct reports: None
- Travel: None
Duetto is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other characteristic protected by applicable law.
Sound like you?
You don't need to tick every box — if you're a strong Python data engineer who's excited by lakehouse architecture, streaming pipelines, and working in a genuinely AI-first engineering culture, we'd love to hear from you.
#LI-REMOTE