The ranker · benchmark 06 / 2026
Every model. Same audio. Same metrics. Re-run on every model or pricing change.
Models Ranked: 28
Total Benchmarks: 932
Languages Tested: 11
Last Updated: Jun 12, 2026
01 · Leader
Ink-Whisper (Cartesia)
Score 81.3 · WER 22.80% · Latency 947ms · Cost $0.0022/min
02 · Runner-up
GPT-4o Mini Transcribe (OpenAI)
Score 78.1 · WER 20.32% · Latency 1.9s · Cost $0.003/min
03 · Third
Nova-2 Phone Call (Deepgram)
Score 76.4 · WER 17.93% · Latency 1.0s · Cost $0.0058/min
Rank | Model | Score | WER | Latency | Cost
01 | Ink-Whisper | 81.3 | 22.80% | 947ms | $0.0022/min
02 | GPT-4o Mini Transcribe | 78.1 | 20.32% | 1.9s | $0.003/min
03 | Nova-2 Phone Call | 76.4 | 17.93% | 1.0s | $0.0058/min
04 | Nova-2 Voicemail | 75.7 | 17.99% | 1.2s | $0.0058/min
05 | Nova-2 Meeting | 73.9 | 17.22% | 2.0s | $0.0058/min
06 | Nova-2 Conversational AI | 72.2 | 19.10% | 2.2s | $0.0058/min
07 | Nova-2 Finance | 71.5 | 19.39% | 2.4s | $0.0058/min
08 | Nova-3 | 69.7 | 22.03% | 1.3s | $0.0077/min
09 | Scribe v2 | 68.5 | 22.74% | 4.0s | $0.004/min
10 | Nova-2 | 67.8 | 31.44% | 1.6s | $0.0058/min
11 | Universal-3 Pro | 67.1 | 17.30% | 5.8s | $0.0035/min
12 | Whisper 1 (API) | 65.1 | 24.09% | 3.6s | $0.006/min
13 | GPT-4o Transcribe | 63.1 | 33.37% | 2.8s | $0.006/min
14 | Whisper Large V3 | 63.0 | 24.07% | 4.3s | $0.006/min
15 | Enhanced | 61.7 | 25.24% | 1.9s | $0.0165/min
16 | Speechmatics Standard | 59.2 | 19.05% | 7.1s | $0.005/min
17 | AssemblyAI Best | 57.5 | 20.48% | 4.1s | $0.09/min
18 | Speechmatics Enhanced | 49.6 | 15.60% | 8.7s | $0.0083/min
19 | GPT-4o Transcribe Diarize | 49.1 | 17.85% | 18.1s | $0.006/min
20 | Base | 48.8 | 56.14% | 1.1s | $0.0145/min
21 | Amazon Transcribe | 47.7 | 20.63% | 11.9s | $0.006/min
22 | Chirp 3 | 43.0 | 14.05% | 10.3s | $0.0107/min
23 | Amazon Transcribe Medical | 42.1 | 15.86% | 11.4s | $0.075/min
24 | Google Telephony | 41.8 | 16.44% | 22.3s | $0.016/min
25 | Google Latest (Long) | 36.3 | 27.50% | 13.5s | $0.0107/min
26 | Google Command & Search | 30.3 | 39.37% | 11.6s | $0.016/min
27 | Google Default | 30.0 | 39.90% | 11.9s | $0.016/min
28 | Google Latest (Short) | 21.2 | 57.54% | 12.1s | $0.016/min
WER vs. cost & speed
no single winner
Code-Switching: Scribe v2 (ElevenLabs)
Conversational: Chirp 3 (Google Cloud)
Finance: GPT-4o Transcribe Diarize (OpenAI)
General: Amazon Transcribe Medical (Amazon Web Services)
Legal: Universal-3 Pro (AssemblyAI)
Medical: Chirp 3 (Google Cloud)
Noisy Environment: Nova-3 (Deepgram)
Technical: Scribe v2 (ElevenLabs)
weighted composite
50% · Accuracy: WER + CER vs. reference transcripts
30% · Speed: Median end-to-end latency
20% · Cost efficiency: Price per minute of audio
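To make the weighting concrete, here is a minimal Python sketch of a 50/30/20 composite. Only the three weights come from this page. The min-max scaling in `lower_is_better`, the `ModelResult` type, and the function names are assumptions made for illustration, and accuracy is approximated by WER alone because CER is not listed in the table. It will not reproduce the published scores.

```python
from dataclasses import dataclass

# Weights as stated on this page: 50% accuracy, 30% speed, 20% cost efficiency.
WEIGHTS = {"accuracy": 0.5, "speed": 0.3, "cost": 0.2}


@dataclass
class ModelResult:
    name: str
    wer: float           # word error rate as a fraction (0.2280 means 22.80%)
    latency_s: float     # median end-to-end latency in seconds
    cost_per_min: float  # USD per minute of audio


def lower_is_better(value: float, best: float, worst: float) -> float:
    """ASSUMPTION: min-max scaling to 0..100, where the lowest value scores 100."""
    if worst == best:
        return 100.0
    return 100.0 * (worst - value) / (worst - best)


def composite_scores(results: list[ModelResult]) -> dict[str, float]:
    """Combine the three sub-scores with the published weights."""
    wers = [r.wer for r in results]
    latencies = [r.latency_s for r in results]
    costs = [r.cost_per_min for r in results]
    scores = {}
    for r in results:
        parts = {
            "accuracy": lower_is_better(r.wer, min(wers), max(wers)),
            "speed": lower_is_better(r.latency_s, min(latencies), max(latencies)),
            "cost": lower_is_better(r.cost_per_min, min(costs), max(costs)),
        }
        scores[r.name] = round(sum(WEIGHTS[k] * parts[k] for k in WEIGHTS), 1)
    return scores


if __name__ == "__main__":
    # Three rows copied from the leaderboard above.
    rows = [
        ModelResult("Ink-Whisper", 0.2280, 0.947, 0.0022),
        ModelResult("Nova-2 Phone Call", 0.1793, 1.0, 0.0058),
        ModelResult("Chirp 3", 0.1405, 10.3, 0.0107),
    ]
    for name, score in composite_scores(rows).items():
        print(f"{name}: {score}")
```

Whatever scaling is used, the weighting explains why a model with the lowest WER in the table can still rank low: half of the score depends on latency and price.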
open methodology
01
Golden set
Curated test audio with verified reference transcripts across languages, accents, and noise levels
02
Same audio, every model
Every provider runs the identical test set. No cherry-picked clips, no provider-tuned inputs.
03
Event-driven & automated
Benchmarks re-run automatically on every model or pricing change, with no manual step that could introduce human bias.
04
Scoring
Overall score is a weighted composite: 50% accuracy (WER and CER against reference transcripts), 30% speed (median end-to-end latency), 20% cost efficiency (price per minute of audio).
how errors are counted
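WER · Word Error Rate
Word substitutions, deletions, and insertions divided by the number of words in the reference transcript (lower is better)
CER · Character Error Rate
The same calculation at the character level: character edits divided by the number of characters in the reference (lower is better)
MER · Match Error Rate
Ratio of errors to total alignment length, that is, correct matches plus errors (lower is better)
WIL · Word Information Lost
Fraction of word information lost in transcription, based on correct matches relative to reference and hypothesis lengths (lower is better)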
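The four metrics above can be sketched in plain Python from a single edit-distance alignment. This is an illustrative sketch using the standard textbook definitions, not the benchmark's actual scoring code. The function names are hypothetical, and it applies no text normalization (casing, punctuation), which a real pipeline would likely add.

```python
def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) for one minimum-edit alignment."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(diagonal, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Walk back from the corner to count each kind of operation.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def error_metrics(reference: str, hypothesis: str) -> dict[str, float]:
    """Compute WER, CER, MER, and WIL as fractions (multiply by 100 for percent)."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    h, s, d, i = align_counts(ref_words, hyp_words)
    errors = s + d + i
    n_ref, n_hyp = h + s + d, h + s + i
    ch, cs, cd, ci = align_counts(list(reference), list(hypothesis))
    return {
        "wer": errors / n_ref,
        "cer": (cs + cd + ci) / (ch + cs + cd),
        "mer": errors / (h + errors),
        "wil": 1.0 if n_hyp == 0 else 1.0 - (h * h) / (n_ref * n_hyp),
    }


if __name__ == "__main__":
    print(error_metrics("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors but do not add to the reference length, WER can exceed 100% on a hypothesis that is much longer than the reference. MER and WIL stay between 0 and 1.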
One endpoint, every provider. Pin the leader or let us auto-route to the best model under your accuracy and latency budget.
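As a rough illustration of what routing under an accuracy and latency budget could mean, the sketch below picks the highest-scoring leaderboard entry that satisfies both limits. It is not the service's API. `Entry`, `select_model`, and both parameter names are hypothetical, and treating "best" as "highest overall score" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    score: float      # overall leaderboard score
    wer: float        # fraction, 0.1793 means 17.93%
    latency_s: float  # seconds


# A few rows copied from the leaderboard above.
LEADERBOARD = [
    Entry("Ink-Whisper", 81.3, 0.2280, 0.947),
    Entry("GPT-4o Mini Transcribe", 78.1, 0.2032, 1.9),
    Entry("Nova-2 Phone Call", 76.4, 0.1793, 1.0),
    Entry("Nova-2 Meeting", 73.9, 0.1722, 2.0),
    Entry("Chirp 3", 43.0, 0.1405, 10.3),
]


def select_model(entries: list[Entry], max_wer: float, max_latency_s: float) -> Optional[Entry]:
    """Return the highest-scoring entry within both budgets, or None if none qualifies."""
    eligible = [e for e in entries if e.wer <= max_wer and e.latency_s <= max_latency_s]
    return max(eligible, key=lambda e: e.score, default=None)


if __name__ == "__main__":
    choice = select_model(LEADERBOARD, max_wer=0.18, max_latency_s=2.0)
    print(choice.name if choice else "no model fits the budget")
```

A tight accuracy budget combined with a tight latency budget can leave no eligible model, so a caller would need a fallback for the `None` case.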