This role is with Mercor. Mercor uses RippleMatch to find top talent.
Role Overview
Mercor is collaborating with a leading AI lab on a short-term project focused on improving preference ranking models for conversational AI systems. We’re seeking detail-oriented generalists—ideally with prior experience in data labeling or content evaluation—to assess and rank model outputs across a variety of domains. This opportunity is well-suited to professionals who are comfortable making nuanced judgments and working independently in a remote setting.
Key Responsibilities
Evaluate and compare AI-generated responses based on quality, coherence, and helpfulness
Assign preference rankings to pairs or sets of model outputs
Follow detailed labeling guidelines and adjust based on evolving criteria
Provide brief written explanations for ranking decisions when required
Flag edge cases or inconsistencies in task design or model output
Ideal Qualifications
Prior experience in data labeling, content moderation, or preference ranking tasks
Excellent critical thinking and reading comprehension skills
Comfort working with evolving guidelines and ambiguity
Strong attention to detail and consistency across repetitive tasks
Availability for regular part-time work on a weekly basis
More About the Opportunity
Remote and asynchronous — set your own hours
Expected commitment: 10–20 hours/week
Flexible workload depending on your availability and performance
Compensation & Contract Terms
$25–35/hour depending on experience and location
Payments issued weekly via Stripe Connect
This is a freelance engagement; you’ll be classified as an independent contractor
How to Apply
If your application moves forward on RippleMatch, you will receive a link to an external assessment, which must be completed on Mercor's website.
After creating your Mercor profile and submitting your application, you'll complete a brief general interview.
Most applicants hear back within a week of applying.