AssemblyAI Universal-3 Pro: model profile

Universal-3 Pro is AssemblyAI's promptable speech-to-text model for pre-recorded audio, released on February 3, 2026. AssemblyAI describes it as a "SpeechLLM" and positions it as the company's most capable model for entity-rich and domain-specific transcription.

Specifications

Developer	AssemblyAI
Released	February 3, 2026
Model type	Promptable speech-to-text model, described by AssemblyAI as a SpeechLLM
Training data	Not publicly disclosed for Universal-3 Pro
Native languages	English, Spanish, Portuguese, French, German, Italian
Extended language coverage	99-language workflow through fallback to Universal-2
Modes	Pre-recorded cloud API; related Universal-3 Pro Streaming model for real-time workflows
Deployment	AssemblyAI cloud API, EU endpoint, and self-hosted streaming for the streaming model
Pricing	Universal-3 Pro async: $0.21/hour; Universal-3 Pro Streaming: $0.45/hour base rate
Key add-ons	Prompting, keyterms, speaker diarization, Medical Mode, PII redaction
License	Proprietary API service

Parameters	Not disclosed

Known limitations

No public Universal-3 Pro model card, architecture paper, parameter count, tokenizer description, training-data disclosure, or U3-specific bias audit was found in the reviewed sources.
Native Universal-3 Pro language coverage is six languages. The advertised 99-language coverage relies on Universal-2 fallback outside that set.
Async mode supports prompt or keyterms_prompt, but not both in the same request. Streaming supports prompt and keyterms together, so behavior differs across modes.
Public latency numbers use different measurement boundaries (for example, first partial versus completed transcript) and should not be compared without checking how each one is defined.
Keyterms can cause overcorrections or hallucinations if the list is too large or low quality.
The reviewed sources do not show a self-hosted async Universal-3 Pro product or an official on-device deployment program.

Full technical breakdown

Overview

Universal-3 Pro is built around a control surface rather than a disclosed new architecture. The public docs emphasize prompts, keyterms, audio tags, disfluency handling, speaker cues, and code-switching hints. AssemblyAI markets this as a way to improve transcription before post-processing, especially for voice agents, healthcare, customer conversations, and other workflows where rare terms and identifiers matter.

AssemblyAI has not published a Universal-3 Pro model card, architecture note, parameter count, tokenizer description, training-corpus size, data-source mix, optimizer setup, or fine-tuning recipe. The company has published detailed architecture information for Universal-1, but the public sources do not show whether Universal-3 Pro uses the same architecture.

Universal-3 Pro has six native languages in the pre-recorded API: English, Spanish, Portuguese, French, German, and Italian. For 99-language pre-recorded coverage, AssemblyAI recommends routing with speech_models: ["universal-3-pro", "universal-2"], which tries Universal-3 Pro first where supported and falls back to Universal-2 elsewhere.
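The routing list above can be sketched as a request body. This is a minimal illustration, not AssemblyAI's client code: only the speech_models key and the two model identifiers come from the text, while the audio_url field name and the build_transcript_request helper are assumptions made for the example and should be checked against the API reference.

```python
import json

# Model identifiers as quoted in the text, in priority order.
FALLBACK_CHAIN = ["universal-3-pro", "universal-2"]


def build_transcript_request(audio_url: str) -> dict:
    """Return a JSON-serializable body carrying the routing list.

    The "audio_url" field name is an assumption for illustration only.
    """
    return {"audio_url": audio_url, "speech_models": list(FALLBACK_CHAIN)}


body = build_transcript_request("https://example.com/call.mp3")
print(json.dumps(body, indent=2))
```

Keeping the list in one constant makes it harder for a test harness and a production service to drift apart on which fallback order they exercise.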

Capabilities and features

Natural-language prompt guidance for pre-recorded transcription. In async mode, the request can use either prompt or keyterms_prompt, but not both in the same request.
Keyterms prompting for up to 1,000 words or phrases, with a maximum of 6 words per phrase. AssemblyAI says effective prompting can improve domain-specific term accuracy by up to 45%.
Audio event tagging based on more than 50 audio-event tags, with prompting support for domain-specific tags.
Code-switching support in the native language set, with fallback routing to Universal-2 for broader language coverage.
Disfluency and verbatim controls, including guidance for filler words and non-speech tags.
Medical Mode through domain="medical-v1", documented for medications, procedures, conditions, dosages, and other clinical vocabulary. Medical Mode supports English, Spanish, German, and French. Unsupported languages skip the add-on and are not charged.
Streaming counterpart: Universal-3 Pro Streaming uses the u3-rt-pro speech model, supports prompt plus keyterms together, and allows mid-stream configuration updates.
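The async constraints listed above (prompt or keyterms_prompt but not both, up to 1,000 keyterms, at most 6 words per phrase) can be checked before a request is sent. The sketch below is a hypothetical client-side pre-flight check; the function name and the error messages are invented for illustration, and the service itself remains the authority on what it accepts.

```python
MAX_KEYTERMS = 1000        # limit quoted in the text
MAX_WORDS_PER_KEYTERM = 6  # limit quoted in the text


def check_async_options(prompt=None, keyterms_prompt=None):
    """Return a list of problems; an empty list means the options pass."""
    problems = []
    if prompt is not None and keyterms_prompt is not None:
        problems.append("prompt and keyterms_prompt cannot be combined in async mode")
    terms = keyterms_prompt or []
    if len(terms) > MAX_KEYTERMS:
        problems.append(f"{len(terms)} keyterms exceeds the limit of {MAX_KEYTERMS}")
    for term in terms:
        # Words are counted by whitespace splitting, which is an assumption.
        if len(term.split()) > MAX_WORDS_PER_KEYTERM:
            problems.append(f"keyterm longer than {MAX_WORDS_PER_KEYTERM} words: {term!r}")
    return problems


print(check_async_options(keyterms_prompt=["atorvastatin", "left anterior descending artery"]))
print(check_async_options(prompt="Transcribe verbatim.", keyterms_prompt=["atorvastatin"]))
```

Because streaming accepts prompt and keyterms together, a check like this would apply only to the pre-recorded path.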

Language support

Universal-3 Pro's native pre-recorded coverage is six languages: English, Spanish, Portuguese, French, German, and Italian. AssemblyAI's 99-language coverage for pre-recorded transcription comes from model routing to Universal-2, not from native Universal-3 Pro support across all 99 languages.

The distinction matters for evaluation. A benchmark or production result on an unsupported language may exercise Universal-2 fallback rather than Universal-3 Pro itself. AssemblyAI's docs make the fallback strategy explicit, so buyers should test the actual speech_models configuration they plan to deploy.
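One way to keep the two models apart in an evaluation is to bucket test samples by the model each language is expected to reach. The sketch below assumes samples are labeled with ISO 639-1 style codes (optionally with a region suffix); the helper names are hypothetical, and the prediction reflects only the documented native-language set, not what the service actually did for a given file.

```python
from collections import defaultdict

# Native Universal-3 Pro languages from the text, as ISO 639-1 codes (assumed labels).
NATIVE_LANGUAGES = {"en", "es", "pt", "fr", "de", "it"}


def expected_model(language_code: str) -> str:
    """Predict which model in the fallback chain a language should reach."""
    base = language_code.lower().replace("_", "-").split("-")[0]
    return "universal-3-pro" if base in NATIVE_LANGUAGES else "universal-2"


def group_by_expected_model(samples):
    """Group (sample_id, language_code) pairs by the expected model."""
    groups = defaultdict(list)
    for sample_id, language_code in samples:
        groups[expected_model(language_code)].append(sample_id)
    return dict(groups)


samples = [("call-001", "en"), ("call-002", "hi"), ("call-003", "pt-BR")]
print(group_by_expected_model(samples))
```

Reporting WER per bucket, rather than one blended figure, avoids attributing Universal-2 fallback results to Universal-3 Pro.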

Performance and benchmarks

AssemblyAI's pre-recorded benchmark docs report Universal-3 Pro at a mean English WER of 5.6% and a median English WER of 4.9%, compared with Universal-2 at 6.1% mean and 6.5% median. On FLEURS multilingual benchmarks, AssemblyAI reports an average WER of 4.58% for Universal-3 Pro and 7.42% for Universal-2.
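The gaps in these figures are easier to compare as relative error reductions. The short calculation below uses only the mean English WER and FLEURS averages quoted above; the relative_reduction helper is an illustrative name, and the result inherits whatever methodology AssemblyAI used to produce the inputs.

```python
def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` lowers the error rate of `baseline`."""
    return (baseline - candidate) / baseline * 100.0


english_mean = relative_reduction(6.1, 5.6)   # Universal-2 vs Universal-3 Pro, mean English WER
fleurs_avg = relative_reduction(7.42, 4.58)   # Universal-2 vs Universal-3 Pro, FLEURS average

print(f"English mean WER: {english_mean:.1f}% relative reduction")  # 8.2%
print(f"FLEURS average WER: {fleurs_avg:.1f}% relative reduction")  # 38.3%
```

On these vendor-reported numbers, the multilingual improvement is much larger in relative terms than the English one.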

AssemblyAI's benchmark site reports Universal-3 Pro at 8.23% global multilingual WER, close to Speechmatics Enhanced at 8.22%, and ahead of OpenAI GPT-4o Transcribe at 9.52%, OpenAI Whisper-1 at 14.39%, and Deepgram Nova-3 at 15.71% on that suite. The same benchmark site shows Universal-3 Pro leading the displayed code-switching and diarization comparisons.

For streaming, AssemblyAI reports Universal-3 Pro Streaming with 5.53% average WER and a 10.46% streaming medical missed-entity rate in the displayed comparisons. The Pipecat-linked benchmark view shows a median TTCT of 335 ms, which is faster than some competitors but slower than Deepgram Nova-3's 247 ms result in that benchmark.

Independent evidence is thinner. The open Pipecat STT Benchmark is the main public framework cited in the reviewed sources. A separate Voice of India preprint reports severe failures for "AssemblyAI Universal" on some Indian-language cases, but that finding is better read as evidence about the broader Universal stack and fallback behavior than as a clean Universal-3 Pro result on its six native languages.

Latency and throughput

The reviewed sources give several latency figures with different measurement boundaries. AssemblyAI docs describe Universal-3 Pro Streaming as sub-300 ms for time-to-complete transcript latency. Other tutorial materials cite figures around 150 ms or 307 ms, and the Pipecat benchmark page shows 335 ms median TTCT and 534 ms P95. These numbers should not be treated as interchangeable. Teams should test first partial latency, turn completion latency, and total response latency separately.
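Keeping the three boundaries separate can be as simple as summarizing each one on its own. The sketch below computes a median and a P95 per boundary with the standard library; the boundary names and the sample values are placeholders invented for illustration, not measurements from any benchmark.

```python
import statistics


def summarize(samples_ms):
    """Return the median and 95th percentile of a list of latencies in ms."""
    ordered = sorted(samples_ms)
    # quantiles(n=100) yields 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=100, method="inclusive")[94]
    return {"median": statistics.median(ordered), "p95": p95}


# Placeholder measurements, one list per boundary; replace with real timings.
measurements = {
    "first_partial": [120, 135, 150, 142, 160, 171, 138],
    "turn_completion": [290, 310, 335, 402, 298, 520, 344],
    "total_response": [810, 790, 905, 1100, 860, 1320, 880],
}

for boundary, samples in measurements.items():
    print(boundary, summarize(samples))
```

A vendor figure should be compared only against the row that uses the same start and stop events.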

For pre-recorded jobs, free accounts get 5 parallel transcriptions and paid accounts start at 200+ parallel transcriptions, with higher limits available. For streaming, paid accounts start at 100+ new sessions per minute, and AssemblyAI documents automatic scale-up behavior. Each self-hosted streaming instance supports up to 48 concurrent streams without runtime degradation, according to AssemblyAI.
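The 48-streams-per-instance figure translates into a simple capacity estimate for self-hosted streaming. The sketch below is a back-of-the-envelope calculation under that stated figure; the headroom parameter is an assumption added for illustration, and real sizing would also depend on hardware and traffic shape, which the reviewed sources do not quantify.

```python
import math

STREAMS_PER_INSTANCE = 48  # per-instance figure stated by AssemblyAI


def instances_needed(peak_concurrent_streams: int, headroom: float = 0.0) -> int:
    """Estimate self-hosted instances for a peak load.

    `headroom` is a hypothetical safety margin, e.g. 0.25 for 25% spare capacity.
    """
    if peak_concurrent_streams <= 0:
        return 0
    target = peak_concurrent_streams * (1.0 + headroom)
    return math.ceil(target / STREAMS_PER_INSTANCE)


print(instances_needed(100))                 # 3
print(instances_needed(96, headroom=0.25))   # 3
```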

Deployment and integrations

Universal-3 Pro is available through AssemblyAI's cloud API for pre-recorded audio. AssemblyAI also offers an EU endpoint for data residency.

Universal-3 Pro Streaming is available through the streaming API and self-hosted streaming. The self-hosted docs describe containerized deployment inside customer infrastructure, with audio, transcripts, and PII remaining inside the customer environment. The reviewed sources do not document a self-hosted async Universal-3 Pro product.

A Cloudflare AI catalog entry for assemblyai/universal-3-pro suggests partner-hosted availability, but the reviewed sources do not show an official AssemblyAI on-device or edge deployment program.

Security and compliance

AssemblyAI states that it offers HIPAA BAA support, incorporates a DPA into customer terms, supports EU data residency, and holds SOC 2 Type 2 and ISO 27001 certifications. Product and security materials also cite PCI DSS v4.0. AssemblyAI documents encryption in transit and at rest, deletion APIs, and retention controls.

AssemblyAI says certain submitted files may be used for model training after PII redaction, where permitted by contract. Files are not used for training if the customer is under a BAA, uses EU servers, or opts out. For streaming customers who opt out of model training, AssemblyAI describes zero retention of audio and transcripts in the streaming production environment, apart from limited metadata for logging and billing.

Pricing

Item	Public price
Universal-3 Pro async	$0.21/hour
Universal-2	$0.15/hour
Prompting add-on	$0.05/hour, listed as beta in the reviewed source
Keyterms Prompting	$0.05/hour
Speaker Diarization	$0.02/hour
Medical Mode	$0.15/hour
Universal-3 Pro Streaming	$0.45/hour base rate; keyterms included, Streaming Diarization +$0.12/hour, Prompting beta +$0.05/hour
Voice Agent API	$4.50/hour

AssemblyAI states that standard usage requires no commitments.
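The table above can be turned into a rough cost estimate for pre-recorded workloads. The sketch below uses only the listed public rates and makes three labeled assumptions: all audio is billed at the Universal-3 Pro async rate (Universal-2 fallback traffic is ignored), add-ons are simply additive per hour, and languages are identified by ISO 639-1 codes. The function and dictionary names are hypothetical.

```python
from decimal import Decimal

BASE_RATE = Decimal("0.21")  # Universal-3 Pro async, USD per hour
ADDON_RATES = {              # USD per hour, from the pricing table
    "prompting": Decimal("0.05"),
    "keyterms": Decimal("0.05"),
    "diarization": Decimal("0.02"),
    "medical_mode": Decimal("0.15"),
}
MEDICAL_MODE_LANGUAGES = {"en", "es", "de", "fr"}  # per the Medical Mode description


def async_cost(hours, addons=(), language="en"):
    """Estimate the USD cost of `hours` of pre-recorded audio with add-ons."""
    if "prompting" in addons and "keyterms" in addons:
        raise ValueError("async requests take prompt or keyterms_prompt, not both")
    rate = BASE_RATE
    for addon in addons:
        if addon == "medical_mode" and language not in MEDICAL_MODE_LANGUAGES:
            continue  # unsupported languages skip the add-on and are not charged
        rate += ADDON_RATES[addon]
    return rate * Decimal(str(hours))


print(async_cost(1000, ["keyterms", "diarization", "medical_mode"]))  # 430.00
print(async_cost(10, ["medical_mode"], language="it"))                # 2.10
```

Using Decimal rather than float keeps the per-hour cents exact when the estimate is multiplied across large volumes.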

Development and ownership

Universal-3 Pro is developed and operated by AssemblyAI. The public sources identify AssemblyAI as a research-oriented company led by founder and CEO Dylan Fox. The Universal-3 Pro launch and enablement materials were authored by Madison Bernstein, Ryan Seams, Martin Schweiger, and Kelsey Foster.

The closest detailed research lineage in the public record is Universal-1. AssemblyAI's Universal-1 paper describes a 600M-parameter Conformer RNN-T model pretrained with BEST-RQ on 12.5 million hours of unlabeled multilingual audio, then fine-tuned with supervised and pseudo-labeled data. Public sources do not verify whether the Universal-3 Pro team reused that architecture.

Release history

Date	Milestone	Notes
December 2023	AssemblyAI Series C	AssemblyAI said the funding would support work on "superhuman" Speech AI models
February 3, 2026	Universal-3 Pro launch	Promptable pre-recorded speech model released
March 2026	Universal-3 Pro Streaming	Streaming counterpart released for real-time workflows
2026	Medical and self-hosted expansion	Medical Mode and self-hosted streaming are documented in the current product surface

Sources

Introducing Universal-3 Pro: A new class of speech language model optimized for Voice AI. https://www.assemblyai.com/blog/introducing-universal-3-pro
AssemblyAI pre-recorded audio benchmarks. https://www.assemblyai.com/docs/pre-recorded-audio/benchmarks
AssemblyAI FAQ: Can you sign a BAA. https://www.assemblyai.com/docs/faq/can-you-sign-a-baa
Universal-3 Pro async docs. https://www.assemblyai.com/docs/pre-recorded-audio/universal-3-pro
AssemblyAI pricing. https://www.assemblyai.com/pricing
Self-hosted streaming docs. https://www.assemblyai.com/docs/streaming/self-hosted-streaming
Universal-1 research. https://www.assemblyai.com/research/universal-1
Pipecat STT Benchmark. https://github.com/pipecat-ai/stt-benchmark
Voice of India preprint. https://arxiv.org/pdf/2604.19151
Medical Mode docs. https://www.assemblyai.com/docs/pre-recorded-audio/medical-mode
AssemblyAI benchmark site. https://www.assemblyai.com/benchmarks
Expanding enterprise security and data residency capabilities. https://www.assemblyai.com/blog/expanding-enterprise-security-and-data-residency-capabilities
Universal-3 Pro streaming docs. https://assemblyai.com/docs/streaming/universal-3-pro
Cloudflare AI catalog entry for assemblyai/universal-3-pro. https://developers.cloudflare.com/ai/models/assemblyai/universal-3-pro/
Rate limits docs. https://www.assemblyai.com/docs/pre-recorded-audio/rate-limits
FAQ: Are files submitted to the API used for model training. https://www.assemblyai.com/docs/faq/are-files-submitted-to-the-api-used-for-model-training
Delete transcripts docs. https://www.assemblyai.com/docs/pre-recorded-audio/delete-transcripts
AssemblyAI About page. https://www.assemblyai.com/about
Series C announcement. https://www.assemblyai.com/blog/announcing-our-50m-series-c-to-build-superhuman-speech-ai-models
Universal-3 Pro product page. https://www.assemblyai.com/universal-3-pro
Optimizing accuracy and latency in streaming. https://www.assemblyai.com/docs/streaming/getting-started/optimizing-accuracy-and-latency
Universal-3 Pro Streaming launch and pricing. https://www.assemblyai.com/blog/universal-3-pro-streaming