Assessment Science

Built on Psychometric Best Practices

Our assessments are designed by language assessment experts, powered by advanced AI, and validated against the globally recognized CEFR framework.

Global Standard

CEFR-Aligned Scoring

The Common European Framework of Reference for Languages (CEFR) is the international standard for describing language ability. Our assessments map directly to CEFR levels A1 through C2, providing universally understood proficiency ratings.

A1-A2
Basic User
Can understand and use familiar everyday expressions
B1-B2
Independent User
Can handle most situations in the target language
C1-C2
Proficient User
Can express ideas fluently, with precision and nuance

Sample Score Report

Detailed breakdown by skill dimension

B2
Upper Intermediate
78/100
Speaking (20% each)
Pronunciation: 82
Fluency: 75
Grammar: 78
Vocabulary: 80
Coherence: 74

Writing (20% each)
Task Response: 79
Organization: 76
Vocabulary: 81
Grammar: 77
Conventions: 80
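
The arithmetic behind the sample report can be sketched as follows. This is an illustrative sketch, not the production scoring code: the dimension scores and the 20% weights come from the report above, while the function names and the assumption that speaking and writing count equally toward the overall score are ours. Under that assumption the result rounds to the 78/100 shown.

```python
# Dimension scores copied from the sample score report.
SPEAKING = {"Pronunciation": 82, "Fluency": 75, "Grammar": 78,
            "Vocabulary": 80, "Coherence": 74}
WRITING = {"Task Response": 79, "Organization": 76, "Vocabulary": 81,
           "Grammar": 77, "Conventions": 80}


def section_score(dimension_scores, weight=0.20):
    """Weighted sum of dimension scores, each weighted 20% as in the report."""
    return sum(score * weight for score in dimension_scores.values())


def overall_score(speaking, writing):
    # Assumption (not stated in the report): both sections count equally.
    return (section_score(speaking) + section_score(writing)) / 2


speaking_score = section_score(SPEAKING)
writing_score = section_score(WRITING)
overall = overall_score(SPEAKING, WRITING)

print(f"Speaking: {speaking_score:.1f}")   # Speaking: 77.8
print(f"Writing:  {writing_score:.1f}")    # Writing:  78.6
print(f"Overall:  {round(overall)}/100")   # Overall:  78/100
```
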
Test Format

Comprehensive Assessment Structure

Our assessments evaluate both productive skills, speaking and writing, through carefully designed sections that progress in complexity.

Speaking Assessment

~7 minutes • 4 questions • 3 sections

1
Warm-Up & Read Aloud
Self-introduction and passage reading to assess pronunciation
2
Topic Discussion
Extended responses on general topics, with 10 seconds of think time
3
Situational Response
Workplace scenarios requiring practical communication
Scoring Dimensions
Pronunciation
Fluency
Grammar
Vocabulary
Coherence

Writing Assessment

~15 minutes • 2 questions • 2 sections

1
Professional Email
5 minutes to compose a workplace email (50-100 words)
2
Extended Essay
10 minutes to write an opinion piece (150-250 words)
Scoring Dimensions
Task Response
Organization
Vocabulary
Grammar
Conventions
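
The word limits for the two writing tasks lend themselves to a simple length check. The sketch below is hypothetical: the limits (50-100 and 150-250 words) come from the task descriptions above, but the function names, the whitespace-based word count, and the idea of checking length before scoring are assumptions made for illustration.

```python
# (min_words, max_words) per writing task, taken from the task descriptions.
WORD_LIMITS = {"email": (50, 100), "essay": (150, 250)}


def word_count(text):
    # Simplification: a word is any whitespace-separated token.
    return len(text.split())


def check_length(task, text):
    """Return 'too short', 'too long', or 'ok' for the given task."""
    low, high = WORD_LIMITS[task]
    n = word_count(text)
    if n < low:
        return "too short"
    if n > high:
        return "too long"
    return "ok"


print(check_length("email", "Dear team, " * 30))  # 60 words -> ok
```
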
AI-Powered Analysis

Advanced Language Models with Guardrails

Our scoring engine combines state-of-the-art AI with rigorous quality controls to deliver consistent, fair, and accurate assessments.

Multi-Modal Analysis

Speaking and writing assessments use specialized AI models optimized for each modality, capturing nuances specific to spoken and written communication.

  • Native audio analysis (Google Gemini)
  • Advanced text understanding (OpenAI)
  • 5 dimensions scored per section

Security Guardrails

Our system includes multiple layers of protection against manipulation attempts, ensuring assessment integrity.

  • Prompt injection detection
  • Content sanitization
  • Score anomaly flagging

Structured Feedback

Every response receives detailed, actionable feedback with specific examples to help candidates improve.

  • “What went well” highlights
  • Specific improvement areas
  • Dimension-level reasoning
Dynamic Question Bank
Expert-Crafted Prompt Library
Speaking Prompts
Writing Prompts
Prompt Types
Warm-Up
Topic Discussion
Situational
Read Aloud
Email
Essay
Domains Covered
Workplace
Academic
Social
Travel
Technology
Question Design

Scenario-Based Prompts

Our question bank features carefully designed prompts that elicit authentic language use. Each prompt targets specific CEFR levels and assesses multiple competencies.

Randomized Selection
Each candidate receives a unique combination of prompts, preventing memorization and ensuring fair assessment.
Difficulty Calibration
Prompts are tagged with CEFR difficulty ranges and selected to match assessment goals.
Skill Targeting
Each prompt is designed to elicit specific language skills like argumentation, formal register, or narrative ability.
Real-World Contexts
Questions simulate authentic workplace and social situations candidates will encounter.
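
Randomized selection and difficulty calibration, as described above, might be combined along these lines. This is a minimal sketch under stated assumptions: the `Prompt` fields, the sample bank entries, and the `select_prompts` function are all hypothetical and do not describe the actual question bank.

```python
import random
from dataclasses import dataclass

CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


@dataclass(frozen=True)
class Prompt:
    prompt_id: str     # hypothetical identifier
    prompt_type: str   # e.g. "Topic Discussion", "Email"
    domain: str        # e.g. "Workplace", "Travel"
    min_level: str     # lower bound of the tagged CEFR range
    max_level: str     # upper bound of the tagged CEFR range


def covers(prompt, target_level):
    """True if the prompt's tagged CEFR range includes the target level."""
    i = CEFR_ORDER.index(target_level)
    return (CEFR_ORDER.index(prompt.min_level) <= i
            <= CEFR_ORDER.index(prompt.max_level))


def select_prompts(bank, prompt_types, target_level, rng=None):
    """Pick one random prompt per requested type that covers the target level."""
    if rng is None:
        rng = random.Random()
    chosen = []
    for ptype in prompt_types:
        candidates = [p for p in bank
                      if p.prompt_type == ptype and covers(p, target_level)]
        if not candidates:
            raise ValueError(f"no {ptype!r} prompt covers {target_level}")
        chosen.append(rng.choice(candidates))
    return chosen


# Made-up entries for illustration only.
SAMPLE_BANK = [
    Prompt("spk-001", "Topic Discussion", "Technology", "B1", "B2"),
    Prompt("spk-002", "Topic Discussion", "Travel", "A2", "B1"),
    Prompt("spk-003", "Topic Discussion", "Workplace", "B2", "C1"),
    Prompt("wrt-001", "Email", "Workplace", "B1", "C1"),
    Prompt("wrt-002", "Email", "Workplace", "A1", "A2"),
]

picked = select_prompts(SAMPLE_BANK, ["Topic Discussion", "Email"], "B2")
print([p.prompt_id for p in picked])
```

Passing a seeded `random.Random` as `rng` makes a selection reproducible, which can help when auditing which prompts a candidate received.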
Fairness & Validity

Commitment to Unbiased Assessment

We actively work to identify and mitigate bias in our assessments, ensuring fair evaluation regardless of accent, dialect, or background.

Accent-Neutral

Trained on diverse English accents from around the world

Content Focus

Evaluates language ability, not cultural knowledge

DIF Analysis

Statistical monitoring for differential item functioning

Human Review

Expert oversight for edge cases and quality assurance

Research-Backed

Continuous Validation

We continuously validate our assessments against human rater benchmarks and industry standards to ensure reliability and accuracy.

Inter-rater reliability (target): 0.92 (Cohen's Kappa)
Consistency (target): 0.85 (Cronbach's Alpha)
AI-human deviation: < 2% (average score difference)
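
For readers unfamiliar with the two statistics named above, the sketch below shows their textbook definitions. It is illustrative only: the function names and toy data are ours, and it assumes unweighted Cohen's kappa over categorical CEFR labels and Cronbach's alpha over per-dimension scores, which may differ from how the figures above are actually computed.

```python
from collections import Counter
from statistics import variance


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same responses.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of candidate scores per item."""
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    totals = [sum(candidate) for candidate in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: AI vs. human CEFR labels for four responses.
ai_levels = ["B1", "B2", "B2", "C1"]
human_levels = ["B1", "B2", "C1", "C1"]
print(f"kappa = {cohens_kappa(ai_levels, human_levels):.3f}")  # kappa = 0.636
```
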

See Research-Backed Assessment in Action

Try Evalingo free with 10 assessments — no credit card required. Experience how our AI-powered scoring delivers reliable, CEFR-aligned results.
