Position: LLM – AI Quality Analyst (Personalization) – Dutch
Type: Short-Term Contract
Location: Remote
Commitment: 20-40 hours/week, with 4 hours of overlap with PST
Engagement Length: 1 month
Start Date: Immediate
Role Responsibilities
• Design multi-turn conversational prompts based on personal context
• Evaluate personalized AI responses for relevance, grounding, and helpfulness
• Assess correct and incorrect use of personal data in model outputs
• Perform side-by-side (SxS) evaluation and ranking of AI responses
• Identify grounding errors, poor inferences, and forced personalization
• Write clear, structured rationales referencing specific conversation turns
• Extract and verify model debug information and data source usage
• Maintain strict data hygiene by deleting evaluation conversations
Requirements
• Dutch fluency (reading and writing) is mandatory, as Dutch is the focus language for this project
• Experience in data annotation, AI quality evaluation, content moderation, or related roles is strongly preferred
• Strong analytical thinking and attention to detail
• Ability to evaluate nuanced and ambiguous AI responses
• Comfortable using a primary personal Google account with data sources enabled
• BS/BA degree or equivalent experience in a relevant analytical field
• Strong written communication and structured feedback skills
• Self-motivated and able to work independently in a remote setting
• Reliable desktop or laptop with a stable internet connection