Arabic Language Expert

Innodata Inc. | Work From Home, Saudi Arabia

Job description

Overview

We are looking for native Arabic experts from Saudi Arabia to contribute to our ongoing AI / LLM project. This is remote, freelance work.

Position Overview :

Experience Level : Degree in Linguistics or a related field
Location : Remote | Freelance
Mandatory : Complete the LLM Evaluation assessment by Sunday, September 28th

LLM Evaluation Guidelines :

No. of Questions : 122

No. of Sections : 7

Test Duration : 120 minutes (2 hours)

Total Marks : 122

Passing Criteria : High (at least 100 of the 122 questions must be answered correctly to pass)

Required : Complete by Sunday, September 28th

Note : Assessment auto-submits after 120 minutes; maintain speed with accuracy.

Job Description :

We are seeking highly analytical and detail-oriented professionals with hands-on experience in Red Teaming, Prompt Evaluation, and AI / LLM Quality Assurance. The ideal candidate will help us rigorously test and evaluate AI-generated content to identify vulnerabilities, assess risks, and ensure compliance with safety, ethical, and quality standards.

Key Responsibilities :

Conduct Red Teaming exercises to identify adversarial, harmful, or unsafe outputs from large language models (LLMs).

Evaluate and stress-test AI prompts across multiple domains (e.g., finance, healthcare, security) to uncover potential failure modes.

Develop and apply test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential in AI-generated responses.

Collaborate with data scientists, safety researchers, and prompt engineers to report risks and suggest mitigations.

Perform manual QA and content validation across model versions, ensuring factual consistency, coherence, and guideline adherence.

Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.

Document findings, edge cases, and vulnerability reports with high clarity and structure.

Requirements :

Proven experience in AI red teaming, LLM safety testing, or adversarial prompt design.

Familiarity with prompt engineering, NLP tasks, and ethical considerations in generative AI.

Strong background in Quality Assurance, content review, or test case development for AI / ML systems.

Understanding of LLM behaviors, failure modes, and model evaluation metrics.

Excellent critical thinking, pattern recognition, and analytical writing skills.

Ability to work independently, follow detailed evaluation protocols, and meet tight deadlines.

Preferred Qualifications :

Prior work with teams like OpenAI, Anthropic, Google DeepMind, or other LLM safety initiatives.

Experience in risk assessment, red team security testing, or AI policy & governance.

Background in linguistics, psychology, or computational ethics is a plus.

Seniorities and Employment Details :

Seniority level : Mid-Senior level

Employment type : Part-time

Job function : Writing / Editing and Art / Creative

Industries : Business Consulting and Services and IT Services and IT Consulting
