About Us
At Sully.ai, We’re Building the Most Impactful Healthcare Company on Earth
We believe that access to a great doctor is a basic human right. Today, that’s not a reality. Delays, misdiagnoses, administrative chaos, and burnout plague the system.
Our Mission:
One Human, One Doctor. We build AI teammates that augment clinicians — scribes, nurses, receptionists, translators — all powered by our own world-class models and deployed in real-world care.
Our Traction
- 450+ organizations signed in 16 months
- AI agents cut admin work by ~2.8 hours daily and reduce onboarding by 85%
- 5M+ clinical tasks completed to date across 36+ specialties
- Raised $25M from YC, Eric Yuan, Amity, Semper Virens
- Patented AI architecture (MedCon-1) outperforms GPT-4.5, Gemini, and Claude on clinical reasoning tasks
Sully requires A-players capable of delivering a year's worth of output in four months.
What You'll Do
Build and scale automated evaluation pipelines (LLM-as-judge + human review) with clinical-grade benchmarks.
What You Must Bring
- Proven experience designing agentic processes and LLM evaluation/benchmarking frameworks.
- Strong Python and ML background (PyTorch/TensorFlow, Hugging Face, LangChain/LlamaIndex).
- Demonstrated ability to design rigorous experiments and translate findings into production.
- Track record of published research or deep applied work in LLMs and agent evaluation.
- Strong communication and technical writing skills to articulate complex findings clearly.
First-Month Focus
- Audit existing evaluation approaches for clinical and agentic tasks.
- Define initial benchmarks and build early automated pipelines.
- Partner with engineering to land the first set of CI gates for accuracy, factuality, and safety.
90 Days
- Deliver a repeatable evaluation framework with automated pipelines in production.
- Demonstrate measurable improvements in robustness, hallucination reduction, or safety.
- Publish or present internal research findings that directly shape product reliability.
If you’ve ever said, “I want to do work that actually matters”, this is it. Let’s build something life-changing, together.
Who Thrives Here
- Entrepreneurial to your core: You think in outcomes, thrive in chaos, and take ownership without limits.
- Mission-obsessed: You’re here to save lives, not just ship features — patients and doctors are your why.
- Impact-driven & fast-moving: You sprint toward hard problems and ship with sharp judgment.
- Elite teammate: You raise the bar through high standards, direct feedback, and craft excellence.
Why Join Sully.ai?
🔥 Revolutionizing the antiquated $800B+ Healthcare market
🧠 50%+ of us are ex-founders. We hire A-players, not passengers
⚡️ Speed matters - we operate with urgency, autonomy, and ownership
🧪 You’ll work on real, first-of-their-kind problems at the edge of AI and medicine
❤️ Your work helps doctors reclaim their time - and patients get better, faster care
🚀 Y Combinator Company Info
Y Combinator Batch: S21
Team Size: 49 employees
Industry: B2B Software and Services -> Productivity
Company Description: AutonomousOS for healthcare organizations
💰 Compensation
Salary Range: $180,000 - $220,000
📋 Job Details
Job Type: Full-time
Experience Level: 3+ years
Engineering Type: Machine learning