Welo Data works with technology companies to provide high-quality, ethically sourced, relevant, diverse, and scalable datasets that supercharge their AI models. As a Welocalize brand, Welo Data leverages over 25 years of experience partnering with the world’s most innovative companies and brings together a curated global community of over 500,000 AI training and domain experts to offer services that span:
ANNOTATION & LABELING: Transcription, summarization, image and video classification and labeling.
ENHANCING LLMs: Prompt engineering, SFT, RLHF, red teaming and adversarial model training, model output ranking.
DATA COLLECTION & GENERATION: From institutional languages to remote field audio collection.
RELEVANCE & INTENT: Culturally nuanced and aware ranking, relevance, and evaluation to train models for search, ads, and LLM output.
Want to join our Welo Data team? We bring practical, applied AI expertise to projects, combining strong academic experience with a deep working knowledge of state-of-the-art AI tools, frameworks, and best practices. Help us elevate our clients' data at Welo Data.
Project Overview
We are seeking experienced bilingual evaluators to support a multilingual AI safety project focused on evaluating model responses across culturally specific prompt-image datasets.
This project involves applying a structured safety rubric to assess AI-generated responses for appropriateness, safety, and reliability within the target locale’s cultural context.
Each language stream will process approximately 1,000 prompt-image pairs. Every item will receive two independent evaluations, with arbitration applied in cases of disagreement. Evaluations will primarily be documented in English, with a defined in-language sample.
Project Details
Location: Remote – Singapore
Team: Welo Data – AI Services
Engagement Type: Freelance – Remote
Start Date:
Duration: 2–3 weeks
Weekly Commitment: 20–40 hours per week
Schedule Options:
• 4 hours per day, Monday–Friday
OR
• 2 hours per day, Monday–Friday + 10 weekend hours
Hourly Rate: 28 USD
Responsibilities
- Evaluate AI-generated responses using a structured safety rubric
- Complete two independent evaluations per item
- Provide concise, well-structured rationales in English
- Participate in calibration sessions
- Support arbitration when evaluation discrepancies occur
- Maintain quality and throughput targets during the evaluation window
Qualifications
- Fluency in the target language and English
- Deep cultural understanding of the target locale
- Strong written English skills for documentation and rationales
- Prior experience in safety evaluation, policy review, content moderation, or rubric-based assessment preferred
- Ability to apply detailed guidelines consistently
- Strong analytical skills and attention to nuance
- Reliable availability during the production window
- Priority may be given to contributors who have previously worked on prompt-image or similar evaluation projects, to reduce onboarding time and maintain continuity
Disclaimer: This role involves working with explicit and sensitive content. Applicants should be comfortable working with adult material in a professional capacity. Please apply only if you fully understand and are prepared for the nature of this role.