Catalog · 32 models · 8 providers

Transcription models.

Every speech-to-text model worth using, one rate card.

Total models: 32
Languages: 106
Cheapest per minute: $0.0022
Best WER: 14.1%
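The WER and CER figures in this catalog are error rates, but the listing does not say how transcripts are normalized before scoring. As a rough sketch under the standard definitions (word-level and character-level edit distance divided by the reference length), the two metrics could be computed as follows. The function names are illustrative and not part of any provider's API.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One substituted word out of four gives a WER of 25%.
print(wer("the quick brown fox", "the quick brown box"))  # 0.25
```

Real benchmarks usually lowercase text and strip punctuation before scoring, so numbers from this sketch will not match the catalog's figures exactly.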

32 models

Base

Deepgram

Batch
United States

Deepgram Base model — high volume, cost-effective

$0.0145/min

WER

56.14%

CER

72.41%

Speed Factor

0.1x

Uptime

100.0%

Billing

per 1s

Retention

No retention

19 languages

Auto-detect · Diarization · Word Timestamps · +1 more

AssemblyAI Best

AssemblyAI

Batch
United States

AssemblyAI's best transcription model

$0.09/min

WER

20.48%

CER

11.80%

Speed Factor

0.3x

Uptime

100.0%

Billing

per 1s

Retention

No retention

78 languages

Auto-detect · Diarization · Word Timestamps · +1 more

Chirp 3

Google Cloud

Batch
United States

Google's latest generative ASR foundation model — 85+ languages

$0.0107/min
Most Accurate
Best Medical
Best Conversations

WER

14.05%

CER

8.65%

Speed Factor

0.3x

Uptime

100.0%

Billing

per 15s

Retention

Custom

72 languages

Generative · Auto-detect · Diarization · +2 more

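Billing increments matter for short audio: the Deepgram and AssemblyAI entries above bill per 1s, while Chirp 3 bills per 15s. The sketch below estimates the cost of a single file, assuming each request is rounded up to the next billing increment. That rounding rule is an assumption, since the catalog lists only the increment, and `billed_cost` is a hypothetical helper.

```python
import math


def billed_cost(duration_s, price_per_min, increment_s):
    """Estimated cost of one file, rounding duration up to the billing increment."""
    billed_seconds = math.ceil(duration_s / increment_s) * increment_s
    return billed_seconds / 60 * price_per_min


# A 7-second clip: Chirp 3 ($0.0107/min, per 15s) vs. Deepgram Base ($0.0145/min, per 1s).
chirp3 = billed_cost(7, 0.0107, 15)
base = billed_cost(7, 0.0145, 1)
print(f"Chirp 3: ${chirp3:.6f}  Base: ${base:.6f}")
```

Under this assumption the model with the lower per-minute rate is the more expensive one for a 7-second clip, because 15 seconds are billed instead of 7. The gap shrinks as files get longer.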
Google Command & Search

Google Cloud

Batch
United States

Optimized for short queries and voice commands

$0.016/min

WER

39.37%

CER

28.27%

Speed Factor

0.1x

Billing

per 15s

Retention

Custom

40 languages

Word Timestamps

Google Default

Google Cloud

Batch
United States

General-purpose model

$0.016/min

WER

39.90%

CER

28.43%

Speed Factor

0.2x

Billing

per 15s

Retention

Custom

40 languages

Diarization · Word Timestamps

Enhanced

Deepgram

Batch
United States

Deepgram Enhanced model — high accuracy for uncommon words

$0.0165/min

WER

25.24%

CER

15.53%

Speed Factor

0.1x

Uptime

100.0%

Billing

per 1s

Retention

No retention

15 languages

Auto-detect · Diarization · Word Timestamps · +1 more

Speechmatics Enhanced

Speechmatics

Batch
Europe

Highest accuracy model — 55+ languages, best-in-class

$0.0083/min
Best Accented

WER

15.60%

CER

8.63%

Speed Factor

0.3x

Uptime

100.0%

Billing

per 1s

Retention

No retention

47 languages

Async API · Auto-detect · Diarization · +3 more

Flux

Deepgram

Realtime
United States

First conversational ASR model built for voice agents — model-integrated endpointing

$0.0077/min

Realtime model — batch benchmarks not applicable

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

endpointing · CS · turn detection · +1 more

GPT-4o Mini Transcribe

OpenAI

Batch
United States

GPT-4o Mini optimized for fast transcription

$0.003/min

WER

20.32%

CER

13.73%

Speed Factor

0.1x

Uptime

100.0%

Billing

per 1s

Retention

No retention

98 languages

Auto-detect · Word Timestamps

GPT-4o Transcribe

OpenAI

Batch
United States

GPT-4o optimized for transcription with improved WER

$0.006/min

WER

33.37%

CER

26.27%

Speed Factor

0.2x

Billing

per 1s

Retention

No retention

98 languages

Auto-detect · Word Timestamps

GPT-4o Transcribe Diarize

OpenAI

Batch
United States

GPT-4o transcription with built-in speaker diarization

$0.006/min

WER

17.85%

CER

11.86%

Speed Factor

0.2x

Billing

per 1s

Retention

No retention

12 languages

Auto-detect · Diarization

Ink-Whisper

Cartesia

Batch & Realtime
United States

Whisper rearchitected for real-time and batch voice AI — fastest TTCT, 99-language coverage

$0.0022/min
Best Value

WER

22.80%

CER

15.08%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

100 languages

Word Timestamps · dynamic chunking

Google Latest (Long)

Google Cloud

Batch
United States

Conformer model for long-form audio (minutes to hours)

$0.0107/min

WER

27.50%

CER

17.65%

Speed Factor

0.4x

Billing

per 15s

Retention

Custom

40 languages

Word Timestamps

Google Latest (Short)

Google Cloud

Batch
United States

Conformer model for short utterances (< 60s)

$0.016/min

WER

57.54%

CER

51.69%

Speed Factor

0.1x

Billing

per 15s

Retention

Custom

40 languages

Word Timestamps

Nova-2

Deepgram

Batch & Realtime
United States

Deepgram's Nova-2 speech recognition

$0.0058/min

WER

31.44%

CER

41.14%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

33 languages

Auto-detect · Diarization · CS · +2 more

Nova-2 Conversational AI

Deepgram

Batch & Realtime
United States

Optimized for human-to-bot interactions (IVR, voice assistants)

$0.0058/min

WER

19.10%

CER

13.08%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

Diarization · CS · Word Timestamps · +1 more

Nova-2 Finance

Deepgram

Batch & Realtime
United States

Optimized for earnings calls with finance vocabulary

$0.0058/min

WER

19.39%

CER

13.41%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

Diarization · CS · Word Timestamps · +1 more

Nova-2 Meeting

Deepgram

Batch & Realtime
United States

Optimized for conference room audio

$0.0058/min

WER

17.22%

CER

11.78%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

Diarization · CS · Word Timestamps · +1 more

Nova-2 Phone Call

Deepgram

Batch & Realtime
United States

Optimized for low-bandwidth phone call audio

$0.0058/min

WER

17.93%

CER

12.44%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

Diarization · CS · Word Timestamps · +1 more

Nova-2 Voicemail

Deepgram

Batch & Realtime
United States

Optimized for low-bandwidth single speaker voicemail

$0.0058/min

WER

17.99%

CER

12.35%

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

CS · Word Timestamps · Custom Vocabulary

Nova-3

Deepgram

Batch & Realtime
United States

Deepgram's flagship model — 53% lower WER vs competitors, code-switching support

$0.0077/min
Fastest

WER

22.03%

CER

14.88%

Speed Factor

0.1x

Uptime

100.0%

Billing

per 1s

Retention

No retention

47 languages

Auto-detect · Diarization · Smart Format · +3 more

Scribe v2

ElevenLabs

Batch
United States

State-of-the-art batch STT — 90+ languages, speaker diarization, audio tagging

$0.004/min
Best Technical

WER

22.74%

CER

27.14%

Speed Factor

0.2x

Billing

per 1s

Retention

30 days

76 languages

Auto-detect · Diarization · CS · +1 more

Scribe v2 Realtime

ElevenLabs

Realtime
United States

Most accurate low-latency STT — <150ms, 90+ languages

$0.0065/min

Realtime model — batch benchmarks not applicable

Speed Factor

0.0x

Billing

per 1s

Retention

30 days

76 languages

streaming · Auto-detect · CS · +2 more

Speechmatics Standard

Speechmatics

Batch
Europe

Cost-effective model — fast turnaround, 55+ languages

$0.005/min

WER

19.05%

CER

11.31%

Speed Factor

0.1x

Uptime

100.0%

Billing

per 1s

Retention

No retention

47 languages

Async API · Auto-detect · Diarization · +3 more

Google Telephony

Google Cloud

Batch
United States

Optimized for telephony audio (8kHz)

$0.016/min

WER

16.44%

CER

10.52%

Speed Factor

0.2x

Uptime

100.0%

Billing

per 15s

Retention

Custom

9 languages

Word Timestamps

Amazon Transcribe

Amazon Web Services

Batch
United States

AWS foundation model-powered ASR — 100+ languages

$0.006/min

WER

20.63%

CER

11.85%

Speed Factor

0.4x

Uptime

100.0%

Billing

per 1s

Retention

Custom

77 languages

Async API · Auto-detect · Diarization · +2 more

Amazon Transcribe Medical

Amazon Web Services

Batch
United States

Medical transcription with HIPAA eligibility

$0.075/min

WER

15.86%

CER

11.08%

Speed Factor

0.4x

Uptime

100.0%

Billing

per 1s

Retention

Custom

1 language

HIPAA Compliant · Async API · Diarization · +2 more

Universal-3 Pro

AssemblyAI

Batch
United States

AssemblyAI's most powerful speech language model — up to 1000 keyterm phrases

$0.0035/min
Best Legal
Best Noisy

WER

17.30%

CER

9.93%

Speed Factor

0.2x

Uptime

100.0%

Billing

per 1s

Retention

No retention

6 languages

Auto-detect · Diarization · CS · +2 more

Universal Streaming

AssemblyAI

Realtime
United States

Purpose-built for real-time voice agents — ~300ms immutable transcripts

$0.0025/min

Realtime model — batch benchmarks not applicable

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

1 language

streaming · Word Timestamps

Universal Streaming Multilingual

AssemblyAI

Realtime
United States

Multilingual streaming STT — English, Spanish, French, German, Italian, Portuguese

$0.0025/min

Realtime model — batch benchmarks not applicable

Speed Factor

0.1x

Billing

per 1s

Retention

No retention

6 languages

streaming · CS · Word Timestamps

Whisper 1 (API)

OpenAI

Batch
United States

OpenAI's Whisper API model

$0.006/min

WER

24.09%

CER

13.61%

Speed Factor

0.3x

Billing

per 1s

Retention

No retention

98 languages

Auto-detect · Word Timestamps

Whisper Large V3

OpenAI

Batch
United States

OpenAI's Whisper large-v3 model

$0.006/min

WER

24.07%

CER

13.64%

Speed Factor

0.3x

Billing

per 1s

Retention

No retention

98 languages

Auto-detect · Word Timestamps

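Price and WER from the listing can be combined to shortlist a model. The sketch below copies five batch entries from this catalog and picks the cheapest one under a WER ceiling. The `Model` class and `cheapest_within` function are hypothetical names for illustration, and the WER values are only as representative as the catalog's own benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    price_per_min: float  # USD per minute, as listed
    wer_pct: float        # word error rate in percent, as listed


# A subset of the catalog, copied from the entries above.
CATALOG = [
    Model("Chirp 3", "Google Cloud", 0.0107, 14.05),
    Model("Speechmatics Enhanced", "Speechmatics", 0.0083, 15.60),
    Model("Universal-3 Pro", "AssemblyAI", 0.0035, 17.30),
    Model("GPT-4o Mini Transcribe", "OpenAI", 0.003, 20.32),
    Model("Ink-Whisper", "Cartesia", 0.0022, 22.80),
]


def cheapest_within(models, max_wer_pct):
    """Return the lowest-priced model whose WER is at or below the ceiling, or None."""
    eligible = [m for m in models if m.wer_pct <= max_wer_pct]
    return min(eligible, key=lambda m: m.price_per_min) if eligible else None


choice = cheapest_within(CATALOG, max_wer_pct=18.0)
print(choice.name if choice else "no model meets the ceiling")  # Universal-3 Pro
```

Per-minute price ignores billing increments, retention policy, and language coverage, so a real shortlist would filter on those fields as well.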