Code and Theory

Senior Data Quality Engineer, ML (India)

Bengaluru, Karnataka, India | Full Time

Our AI/ML engineering team ensures Code and Theory delivers innovative, immersive web experiences that delight our clients and their customers. We always strive to balance the challenges of working with cutting-edge technologies against the real-world demands of high performance, high security, and accessibility. Working in collaboration with our multi-disciplinary engineering, design, and quality assurance teams, you will build software that solves real-world problems for incredible clients.

WHAT YOU’LL DO

  • Write Python and SQL scripts to evaluate outputs from large language models (LLMs)
  • Design and implement LLM-as-Judge evaluations with clear scoring rubrics (faithfulness, relevance, completeness, correctness)
  • Define and calculate quality metrics such as exact match, token-level F1, ROUGE, and subjective rubric scores
  • Build and maintain ground-truth datasets for benchmarking and regression testing
  • Automate evaluation pipelines and integrate them into CI/CD workflows
  • Conduct in-depth analysis of large unstructured datasets to identify inconsistencies, anomalies, missing values, and potential biases
  • Diagnose and report failure modes (hallucinations, irrelevant answers, formatting errors)
  • Serve as a crucial link among AI engineers, QA, data scientists, and product managers to set quality standards and release criteria
  • Document processes and maintain reproducibility of evaluation runs
  • Create comprehensive technical documentation, including design specifications, architecture diagrams, and code comments

WHAT YOU’LL NEED

  • Strong proficiency in Python and SQL (data handling, scripting, test automation)
  • Experience with data cleaning and standardization techniques to facilitate ingestion and analysis by various teams
  • Understanding of generative AI concepts (prompts, hallucinations, grounding)
  • Experience designing structured LLM prompts for evaluations
  • Familiarity with at least one evaluation framework (RAGAS, DeepEval, TruLens, LangSmith) or ability to learn quickly
  • Familiarity with running and automating workloads in the cloud (GCP preferred) or ability to learn quickly
  • Ability to translate ambiguous quality expectations into measurable metrics
  • Excellent problem-solving abilities and analytical thinking
  • Effective communication skills to collaborate with cross-functional teams and present technical concepts to both technical and non-technical stakeholders

ABOUT US

Born in 2001, Code and Theory is a digital-first creative agency that sits at the center of creativity and technology. We pride ourselves on not only solving consumer and business problems, but also helping to establish new capabilities for our clients. With a global client roster of Fortune 100s and start-ups alike, we crave the hardest problems to solve. We have teams distributed across North America, South America, Europe, and Asia. The Code and Theory global network of agencies is growing and includes Kettle, Instrument, Left Field Labs, Create Group, Mediacurrent, Rhythm, and TrueLogic.

Striving never to be pigeonholed, we work across every major category: from tech to CPG, financial services to travel & hospitality, government and education to media and publishing. We value collaboration with our client partners, including but not limited to Adidas, Amazon, Con Edison, Diageo, EY, J.P. Morgan Chase, Lenovo, Marriott, Mars, Microsoft, Thomson Reuters, and TikTok.

The Code and Theory network comprises nearly 2,000 people: 50% engineers and 50% creative talent. We’re always on the lookout for smart, driven, and forward-thinking people to join our team.