OpenTranscription/ Blog
2026-07-03 · MODEL PROFILE

Deepgram Enhanced: model profile

Reference spec sheet for Deepgram Enhanced, a 2022 speech-to-text tier, including its Future AGI Agent Command Center gateway integration.

model-profile · speech-to-text · deepgram · ai-gateway · asr

Deepgram Enhanced is a speech-to-text tier in Deepgram's model lineup, introduced in May 2022 as a higher-accuracy option above the Base tier, built on Deepgram's "next generation End-to-End Deep Learning speech model architecture."

Specifications

Developer: Deepgram
Released: May 26, 2022 (English); German support added October 2022
Model type: Speech-to-text, "next generation End-to-End Deep Learning speech model architecture"
Languages: English (May 2022); German (October 2022)
Modes (batch / streaming): Both: pre-recorded via the /v1/listen family and streaming via WebSockets
Latency: Not publicly disclosed for Enhanced specifically. Platform-level 2022 claims: one hour of pre-recorded audio in about 30 seconds; less than 300 ms real-time streaming lag.
Throughput / concurrency: Up to 50 concurrent pre-recorded requests and up to 150 concurrent streaming requests on starter-like limits (North America/Europe tables), rising on Growth and Enterprise plans
Deployment: Deepgram hosted API; self-hosted deployments available; also routable through Future AGI Agent Command Center
Pricing: Not publicly disclosed. Future AGI's Deepgram calculator pages mark pricing for Enhanced variants as not currently public or pending.

Not disclosed: Parameters · Training data · License


Overview

Deepgram introduced Enhanced in May 2022, claiming 19% higher relative accuracy than the previous model and improved handling of long-tail vocabulary. Deepgram positioned Enhanced as the middle tier between Base and custom-trained models: better out-of-the-box quality than Base, without the cost and effort of full custom model training. In current Deepgram documentation, Enhanced remains available with variants such as general, meeting, phonecall, and finance, while Deepgram's 2025 to 2026 product narrative centers on newer speech products such as Nova-3 and Flux.

Enhanced is also exposed through Future AGI's Agent Command Center, a gateway and control-plane layer that sits between an application and model providers. Future AGI's public model catalog lists routable model IDs including deepgram/enhanced, deepgram/enhanced-general, deepgram/enhanced-phonecall, deepgram/enhanced-meeting, and deepgram/enhanced-finance, advertised as callable through Agent Command Center with unified observability, caching, fallback, and "15 routing strategies including cost-optimized fallback."

Capabilities and features

Vendor-reported capabilities from Deepgram's 2022 launch and product materials:

  • 19% higher relative accuracy than the previous model, better word recognition, and stronger long-tail vocabulary handling.
  • An "increased effective vocabulary" that handles infrequent and uncommon words significantly better than lower tiers, positioned for teams that need stronger recognition of domain terms without committing to a custom-training program.
  • Variants: general, meeting (beta), phonecall, finance (beta).

Enhanced uses Deepgram's standard STT interfaces rather than a bespoke API. Pre-recorded STT uses the /v1/listen family and accepts audio and video inputs, with JSON responses for transcription results or a request ID for callback and asynchronous paths. Streaming uses WebSockets and supports standard streaming controls and transcript options such as sample rate, smart formatting, diarization, interim and final results, and supported audio encodings. Deepgram's docs state support for 100+ audio formats and encodings overall.
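As a rough illustration of the two interfaces described above, the sketch below builds a pre-recorded URL and a streaming URL for the same /v1/listen path and reads the top transcript from a JSON response. It sends nothing over the network. The host name, the `Token` authorization scheme, and the response path reflect Deepgram's public API as generally documented; the helper names are hypothetical, and the `enhanced-general` model value follows the variant naming in this profile and should be checked against current docs.

```python
from urllib.parse import urlencode

API_HOST = "api.deepgram.com"  # Deepgram's hosted API host


def listen_url(streaming: bool, **options) -> str:
    """Build a /v1/listen URL: wss for streaming, https for pre-recorded audio."""
    scheme = "wss" if streaming else "https"
    # Booleans travel as lowercase strings in the query string.
    params = {k: str(v).lower() if isinstance(v, bool) else v
              for k, v in options.items()}
    return f"{scheme}://{API_HOST}/v1/listen?{urlencode(params)}"


def auth_headers(api_key: str, content_type: str = "audio/wav") -> dict:
    """Headers for a pre-recorded request that uploads raw audio bytes."""
    return {"Authorization": f"Token {api_key}", "Content-Type": content_type}


def first_transcript(response: dict) -> str:
    """Read the top alternative of the first channel from a JSON response."""
    return response["results"]["channels"][0]["alternatives"][0]["transcript"]


# Pre-recorded request with smart formatting and diarization.
batch_url = listen_url(False, model="enhanced-general",
                       smart_format=True, diarize=True)

# Streaming request with an explicit encoding, sample rate, and interim results.
live_url = listen_url(True, model="enhanced-general", encoding="linear16",
                      sample_rate=16000, interim_results=True)

print(batch_url)
print(live_url)
```

The only difference between the two modes in this sketch is the URL scheme and the options passed; the transport (HTTP POST versus a WebSocket client) is left to whichever client library the application already uses.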

Deepgram advised customers that model upgrades can change transcript outputs and recommended side-by-side testing with features such as keywords before production rollout.
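One way to run the side-by-side test Deepgram recommends is to diff the two transcripts word by word and check whether expected domain terms survive. The sketch below uses only the Python standard library; the two transcripts and the keyword are made-up illustrations, not real model output, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def word_changes(old: str, new: str) -> list:
    """Return (operation, old_words, new_words) for every span that differs."""
    a, b = old.lower().split(), new.lower().split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def keyword_recall(transcript: str, keywords: list) -> float:
    """Fraction of expected keywords that appear as whole words."""
    words = set(transcript.lower().split())
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits / len(keywords) if keywords else 1.0


# Invented example outputs for the same audio from two model versions.
old_output = "the patient takes met forming twice daily"
new_output = "the patient takes metformin twice daily"

print(word_changes(old_output, new_output))
print(keyword_recall(old_output, ["metformin"]),
      keyword_recall(new_output, ["metformin"]))
```

Running a diff like this over a sample of production audio shows where outputs change before any downstream parser or keyword rule depends on the new model.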

Language support

English availability was announced at launch in May 2022. Enhanced support expanded to German in October 2022. No further language coverage for Enhanced is stated in the sources reviewed.

Performance and benchmarks

Deepgram does not publish model-specific WER benchmarks, named benchmark datasets, or competitor tables for Enhanced in the sources reviewed. By 2025 to 2026, Deepgram's public benchmark language had moved to newer models such as Nova-2, Nova-3, and Flux.

| Scope | Figure or claim | What it means | Caveat |
| --- | --- | --- | --- |
| Deepgram Enhanced launch | 19% higher relative accuracy vs previous model | Official launch claim for Enhanced tier | No public WER table or named benchmark dataset in reviewed source |
| Deepgram long-tail vocabulary | "Increased effective vocabulary," better handling of uncommon words | Product rationale for Enhanced | Qualitative claim, not benchmarked in reviewed docs |
| Deepgram family speed | One hour of audio in ~30 seconds; <300 ms real-time lag | Historical Deepgram platform-level speed claims | Not Enhanced-specific and partly older marketing |
| Deepgram Nova-3 | 6.84% median WER streaming; 5.26% batch | Current official flagship benchmark | Applies to Nova-3, not Enhanced |
| Future AGI gateway throughput | ~28,889 req/s on 4 vCPU / 16 GB t3.xlarge profile | Gateway throughput under mock-upstream benchmark | Measures gateway overhead, not upstream model latency |
| Future AGI gateway latency | P95 2.8 ms at ~1k RPS; full chat proxy ~66 µs internal wall time | Fast gateway/control-plane overhead | Self-published benchmark against mock upstream |
| Independent STT critique | 44% average transcription error on spoken U.S. street names across evaluated top-provider models | Real-world high-stakes error failure mode | Not Enhanced-specific; cross-provider study |
| Independent accent critique | Deepgram and peers vary materially on non-native accented English | Robustness gaps beyond headline WER | Not Enhanced-specific |
| Independent audit critique | Standard ASR audits can understate weakness for aphasia speakers | Quality varies by speech type and evaluation method | Not Enhanced-specific |

Third-party evaluation context: independent research shows that speech systems from top providers, including Deepgram, can fail on short high-stakes utterances, non-native accented English, aphasia speech, and other off-benchmark conditions. These studies are cross-provider and not Enhanced-specific. Deepgram itself has published several articles advising customers to evaluate against production audio rather than headline WER.
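Because no Enhanced-specific WER is published, teams evaluating on their own audio typically compute it themselves. The sketch below is a minimal word error rate function (word-level edit distance divided by reference length) plus a relative-reduction helper. Reading "19% higher relative accuracy" as a 19% relative WER reduction is an assumption made here for illustration, since Deepgram's launch material does not define the metric, and the 10% baseline is an invented number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Share of the baseline error rate that the new model removes."""
    return (baseline_wer - new_wer) / baseline_wer


# One substituted word out of five reference words.
print(word_error_rate("turn left on main street", "turn left on maine street"))

# Hypothetical: a 10% baseline WER falling to 8.1% is a 19% relative reduction.
print(round(relative_reduction(0.10, 0.081), 2))
```

Text normalization (casing, punctuation, number formatting) changes WER noticeably, so the same normalization should be applied to every model being compared.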

Deepgram model context

| Model/tier | Official positioning | Public architecture/training disclosure | Public accuracy disclosure | Public options noted | Service limits visible in docs |
| --- | --- | --- | --- | --- | --- |
| Base | Cost-effective tier on Deepgram's end-to-end STT architecture | Built on Deepgram's "signature end-to-end deep learning" architecture; no detailed model card on reviewed page | No headline WER on reviewed page | general, meeting, phonecall, voicemail, finance, conversationalai, video | Enhanced/Base/Nova families share concurrency tables in current limits docs |
| Enhanced | Higher-accuracy tier above Base | "Next generation End-to-End Deep Learning speech model architecture"; no parameter count or dataset card publicly disclosed | 19% higher relative accuracy vs prior model; improved long-tail vocabulary | general, meeting beta, phonecall, finance beta | Starter/Growth/Enterprise concurrency tables list Enhanced for both pre-recorded and streaming |
| Nova | Predecessor to Nova-2 | Docs say training spanned 100+ domains and 47B tokens | No single WER figure on the model page excerpt reviewed | general, phonecall | Same concurrency table family |
| Nova-3 | Current flagship general ASR | Deepgram's current benchmark and product pages emphasize it; formal architecture details still limited in docs reviewed | 6.84% median WER streaming, 5.26% batch; Deepgram claims 54.3% and 47.4% relative reductions vs competitors | General-purpose current flagship | Current rate limits list Nova-3 prominently |
| Flux | Conversational streaming ASR with turn detection | Positioned as latest-generation streaming/conversational model | Independent Coval validation cited by Deepgram for latency/turn-taking; not directly an Enhanced successor | Conversational/turn-based experiences | Current limits list Flux streaming separately |

Latency and throughput

Deepgram does not disclose latency-to-first-token or latency distributions for Enhanced specifically. Vendor-reported platform-level figures from 2022 materials state that Deepgram's use-case models could transcribe one hour of pre-recorded audio in about 30 seconds, and that real-time streaming ran with less than 300 ms of lag. These are family and platform-level claims, not Enhanced-only measurements.
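The pre-recorded claim is easier to reason about as a speed-up factor. The arithmetic below converts "one hour in about 30 seconds" into that factor and applies it to a shorter file. Treating the factor as constant across file lengths is an assumption for illustration only; real processing time also includes upload and queueing, which the vendor figure does not describe.

```python
def speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall time."""
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_seconds: float, factor: float) -> float:
    """Naive estimate that assumes the speed-up factor stays constant."""
    return audio_seconds / factor


claimed_factor = speedup(audio_seconds=3600, processing_seconds=30)
print(claimed_factor)                                          # 120.0
print(estimated_processing_seconds(10 * 60, claimed_factor))   # 5.0
```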

Current Deepgram rate-limit docs still list Enhanced. On starter-like limits visible in the docs, Enhanced is shown at up to 50 concurrent pre-recorded requests and up to 150 concurrent streaming requests in North America/Europe tables, rising on Growth and Enterprise plans. These figures are service limits, not intrinsic model throughput measurements.
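Since these are service limits, a client that fans out many jobs usually caps its own concurrency rather than relying on the API to reject the excess. The sketch below shows one common way to do that with an `asyncio.Semaphore`. The value 50 is the pre-recorded figure quoted above and varies by plan; `fake_transcribe` is a hypothetical stand-in for a real API call.

```python
import asyncio

PRERECORDED_LIMIT = 50  # figure quoted above; plan-dependent


async def run_bounded(jobs, limit, worker):
    """Run worker(job) for every job with at most `limit` calls in flight."""
    gate = asyncio.Semaphore(limit)

    async def guarded(job):
        async with gate:
            return await worker(job)

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo():
    in_flight = 0
    peak = 0

    async def fake_transcribe(job):  # hypothetical stand-in for an API call
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.001)   # simulate network and processing time
        in_flight -= 1
        return f"transcript-{job}"

    results = await run_bounded(range(200), PRERECORDED_LIMIT, fake_transcribe)
    return len(results), peak


completed, peak_in_flight = asyncio.run(demo())
print(completed, peak_in_flight)
```

The same pattern applies to streaming connections with the streaming limit substituted for the pre-recorded one.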

Deployment and integrations

Deepgram exposes Enhanced through its hosted API (pre-recorded /v1/listen and WebSocket streaming). Deepgram publishes official developer SDKs and playground tooling for STT, TTS, and voice APIs, including the API Playground, JavaScript SDK, and STT getting-started guides for pre-recorded and live transcription. Deepgram's trust and self-hosting materials state that self-hosted deployments typically keep audio and transcripts inside customer infrastructure, apart from license validation and usage reporting, with hardened systems, RBAC, encryption, SOC 2 controls, HIPAA-compatible offerings for enterprise customers, and GDPR readiness with EU data residency.

Future AGI Agent Command Center

Future AGI's Agent Command Center is a gateway and control-plane layer between the client application and upstream providers. Each request passes through a fixed-order plugin chain: IP ACL, authentication, RBAC, cache lookup, budget checks, guardrails, tool policy, validation, and rate limiting before the provider call; the response then continues through cost tracking and logging. Cache hits can short-circuit the provider call entirely.
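The request flow described above can be pictured as an ordered list of stages with a cache check in the middle. The sketch below is a toy model of that ordering, not Future AGI's implementation: the stage names mirror the list in the text, every stage is a no-op placeholder, and the choice to keep running the remaining pre-call stages after a cache hit (skipping only the provider call) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage order as listed in the text; the stages themselves are placeholders.
PRE_CALL_ORDER = ["ip_acl", "auth", "rbac", "cache_lookup", "budget",
                  "guardrails", "tool_policy", "validation", "rate_limit"]
POST_CALL_ORDER = ["cost_tracking", "logging"]


@dataclass
class Context:
    request: dict
    response: Optional[dict] = None
    trace: list = field(default_factory=list)


def handle(request: dict, cache: dict, call_provider) -> Context:
    """Walk the fixed-order chain; a cache hit skips the provider call."""
    ctx = Context(request)
    key = (request["model"], request["audio_id"])  # hypothetical cache key
    for stage in PRE_CALL_ORDER:
        ctx.trace.append(stage)  # a real plugin could reject the request here
        if stage == "cache_lookup" and key in cache:
            ctx.response = cache[key]
    if ctx.response is None:
        ctx.trace.append("provider_call")
        ctx.response = call_provider(request)
        cache[key] = ctx.response
    ctx.trace.extend(POST_CALL_ORDER)
    return ctx


provider_calls = []


def fake_provider(request):  # hypothetical upstream
    provider_calls.append(request)
    return {"text": "hello world"}


shared_cache = {}
req = {"model": "deepgram/enhanced-general", "audio_id": "call-001"}
first = handle(req, shared_cache, fake_provider)
second = handle(req, shared_cache, fake_provider)
print(first.trace)
print(second.trace)
```

The second trace has no provider call, which is the short-circuit behavior the text describes; cost tracking and logging still run on both paths.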

Future AGI's public Deepgram model pages state that calling models such as deepgram/enhanced-general through Agent Command Center returns metadata on x-agentcc-* headers including provider, cost, latency, cache hit, and request ID. The catalog advertises 36 Deepgram models, including Enhanced variants, as routable through Agent Command Center.
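A client can collect that metadata by filtering response headers on the documented prefix. In the sketch below only the `x-agentcc-` prefix comes from the source; the individual header names in the sample are hypothetical, since the exact names are not listed in the pages reviewed.

```python
PREFIX = "x-agentcc-"


def gateway_metadata(headers: dict) -> dict:
    """Collect x-agentcc-* response headers, keyed by the suffix after the prefix."""
    return {name.lower()[len(PREFIX):]: value
            for name, value in headers.items()
            if name.lower().startswith(PREFIX)}


# Sample response headers; the names after the prefix are hypothetical.
sample_headers = {
    "Content-Type": "application/json",
    "X-AgentCC-Provider": "deepgram",
    "x-agentcc-cache-hit": "false",
}

print(gateway_metadata(sample_headers))
```

Lower-casing before the comparison matters because HTTP header names are case-insensitive and client libraries differ in how they present them.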

Documentation mismatch: Agent Command Center's provider docs enumerate OpenAI, Anthropic, Gemini, Bedrock, Azure, Cohere, Groq, Mistral, Together, Fireworks, DeepInfra, Perplexity, Cerebras, xAI, OpenRouter, Hugging Face, Anyscale, Replicate, and several self-hosted backends, but do not explicitly mention Deepgram. The primary API reference frames /v1/audio/transcriptions in Whisper/OpenAI-compatible terms. The public sources reviewed do not resolve this discrepancy.

Gateway characteristics from Future AGI's public docs and README:

  • Observability: every request is logged with request ID, trace ID, session ID, model requested vs model actually used, provider, token counts, cost, latency, cache status, guardrail results, and fallback/error events. Metrics export to Prometheus and traces to OpenTelemetry; users can propagate their own trace IDs via request headers.
  • Caching: server-side exact and semantic strategies with configurable TTL, namespaces, LRU eviction, force-refresh, and Cache-Control: no-store. The open-source README states support for 6 exact-cache backends (mem, redis, disk, s3, gcs, azblob) and 4 semantic-cache backends (mem, pinecone, qdrant, weaviate).
  • APIs and SDKs: OpenAI-compatible endpoints including /v1/chat/completions, /v1/embeddings, /v1/audio/transcriptions, /v1/audio/speech, /v1/images/generations, and /v1/responses. Future AGI publishes a client SDK repository for Python and TypeScript; the control-plane/gateway service lives in the platform monorepo.
  • Deployment: hosted gateway, Docker, and Go-binary self-hosting from a config.yaml, intended to keep requests inside the customer's infrastructure. The README states clustering/HA uses Raft-based clustering; official Kubernetes manifests and Helm charts were "coming soon" at the time of the README snapshot reviewed.
  • Security: virtual API keys, encrypted storage of provider credentials, RBAC, rate limits, budget controls, IP allowlists, and append-only audit logging. Prompts and completions are not stored by default; caching is opt-in and configurable per organization.
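To make the caching bullet concrete, the sketch below shows an in-memory exact-match cache with a TTL, LRU eviction, a force-refresh flag, and a no-store flag. It is a generic illustration of those four ideas, not Future AGI's code; the class name, parameters, and the injectable clock are all hypothetical.

```python
import time
from collections import OrderedDict


class ExactCache:
    """In-memory exact-match cache with TTL expiry and LRU eviction."""

    def __init__(self, max_entries: int, ttl_seconds: float, clock=time.monotonic):
        self._items = OrderedDict()  # key -> (expires_at, value), oldest first
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key, force_refresh: bool = False):
        """Return the cached value, or None on miss, expiry, or force-refresh."""
        entry = self._items.get(key)
        if entry is None or force_refresh:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]        # expired entries are dropped lazily
            return None
        self._items.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value, no_store: bool = False):
        """Store a value unless the caller asked for no-store."""
        if no_store:
            return
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict the least recently used


cache = ExactCache(max_entries=2, ttl_seconds=60)
cache.put("a", "first")
cache.put("b", "second")
cache.get("a")             # touching "a" makes "b" the eviction candidate
cache.put("c", "third")    # exceeds capacity, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

A semantic cache would replace the exact key lookup with a nearest-neighbor search over embeddings, which is why the README lists vector stores as its backends.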

Routing strategies

Future AGI's public README names 15 routing and reliability strategies: roundrobin, latency, costopt, adaptive, complexity, conditional, providerlock, accessgroups, race/hedged, mirror/shadow, modelfallback, failover, circuitbreaker, retry, healthmonitor. The formal routing docs fully explain only a subset.

| Strategy | What the public docs support | Detail level in public docs |
| --- | --- | --- |
| Round robin | Evenly rotates traffic across providers; default example strategy | Fully specified |
| Weighted | Splits traffic by assigned weights | Fully specified |
| Least latency | Routes to fastest provider using recent response times | Fully specified |
| Cost optimized | Chooses cheapest provider supporting the requested model | Fully specified |
| Adaptive | Dynamically adjusts weights using real-time performance | Fully specified at high level |
| Complexity-based | Scores requests on 8 signals and maps them to a model tier | Fully specified at high level |
| Conditional | Docs/config mention "conditional routing rules"; exact rule language is not fully documented on the pages reviewed | Partially specified |
| Provider lock | README names it; likely forces/pins a request to a chosen provider for compliance or policy reasons, but exact semantics were not publicly documented in reviewed pages | Named only |
| Access groups | Public docs explain access groups as logical sets of models/aliases for policy management | Partially specified |
| Race / hedged | README names hedged/race requests; implies parallel or near-parallel upstream calls to reduce tail latency, but exact policy knobs were not documented in reviewed pages | Named only |
| Mirror / shadow | Shadow experiments mirror a sampled portion of production traffic to a target model/provider without affecting users | Well specified |
| Model fallback | Per-model ordered fallback chains; e.g., if one model fails, try alternatives in sequence | Fully specified |
| Failover | Triggers on 429, 5xx, timeouts, and connection errors; routes to backup provider list | Fully specified |
| Circuit breaker | Opens on repeated failures, then half-opens for recovery probes | Fully specified |
| Retry | Exponential backoff with configurable retry counts and backoff windows | Fully specified |
| Health monitor | Public provider-health docs describe continuous health tracking plus cooldown/probe-based reentry | Partially specified |
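The reliability strategies in the table (failover, circuit breaker, retry) are standard patterns, and a compact generic sketch shows how they fit together. Nothing below is taken from Future AGI's code: the thresholds, cooldown, backoff base, and all names are hypothetical defaults chosen for illustration. Only the failover triggers (429, 5xx, timeouts, connection errors) come from the docs summarized above.

```python
import time


def is_failover_status(status: int) -> bool:
    """Failover triggers named in the docs: 429 and any 5xx response."""
    return status == 429 or 500 <= status <= 599


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


class CircuitBreaker:
    """Opens after repeated failures, then allows probes after a cooldown."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0,
                 clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Half-open: permit a recovery probe once the cooldown has elapsed.
        return self.clock() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None  # a success closes the circuit
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()  # open, or re-open after a failed probe


def call_with_failover(providers: list, breakers: dict, send):
    """Try providers in order, skipping any whose circuit is open."""
    for name in providers:
        breaker = breakers[name]
        if not breaker.allow():
            continue
        try:
            status, body = send(name)
        except (TimeoutError, ConnectionError):
            breaker.record(False)
            continue
        if is_failover_status(status):
            breaker.record(False)
            continue
        breaker.record(True)
        return name, body
    raise RuntimeError("all providers failed or are circuit-open")


print(backoff_delays(6))
```

In a fuller version the retry schedule would wrap each provider attempt before failover moves to the next entry; the sketch keeps the two separate so each pattern stays readable.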

Pricing

Not publicly disclosed for Enhanced in the sources reviewed. Future AGI's public Deepgram calculator pages mark pricing, context window, and benchmarks for Enhanced variants as not currently public or pending.

Development and ownership

Deepgram was founded in 2015. Its official company story states the company emerged from machine-learning work for waveform analysis in a dark-matter detector in China, and that Scott Stephenson, identified by Deepgram as CEO and co-founder, later explored deep learning for audio analysis at the University of Michigan before building Deepgram around end-to-end deep learning. Deepgram's official author bio states Stephenson earned a PhD in particle physics from the University of Michigan and left a postdoctoral research role to found the company.

In January 2026, Deepgram raised a $130M Series C at a $1.3B valuation and acquired OfOne. Its Series C press release cites US patent 12,499,875, "Deep Learning Internal State Index-Based Search and Classification." USPTO patentee indexes also show Deepgram assignee entries for "Hardware efficient automatic speech recognition" (granted June 2025) and "End-to-end automatic speech recognition with transformer" (granted August 2025). These patents do not provide a public model card for Enhanced.

Future AGI is an open-source, end-to-end AI agent engineering platform spanning tracing, evaluation, optimization, protection, and gateway functions. Public profiles identify Nikhil Pareek as Founder and CEO, and Charu Gupta as co-founder; public sources indicate Future AGI was founded in 2024, though that founding year is not clearly stated on the official homepage reviewed.

| Company | Person | Publicly confirmed role | Bio details visible in reviewed sources | Public profiles |
| --- | --- | --- | --- | --- |
| Deepgram | Scott Stephenson | CEO, co-founder | Deepgram describes him as a dark-matter physicist turned deep-learning entrepreneur; PhD in particle physics, University of Michigan; left postdoc work to found Deepgram. | Deepgram author page; LinkedIn public profile snippet |
| Deepgram | Other co-founders | Not fully specified in official pages reviewed | Deepgram's official story references "Scott Stephenson and his teammate" but does not name the teammate on the pages reviewed. | Unspecified in official material reviewed |
| Deepgram | Andrew Seagraves | VP of Research | Official Deepgram byline identifies him as VP of Research; additional biography details were not published on the page reviewed. | Deepgram article byline |
| Future AGI | Nikhil Pareek | Founder & CEO | Public profiles describe him as Founder/CEO; Forbes profile says he has about a decade in AI, a prior exit, and this is his second funded startup; a founder interview says his first job involved autonomous agents for drones. | LinkedIn; Forbes Council profile; Cerebral Valley interview |
| Future AGI | Charu Gupta | Co-founder | Public profiles and a founder story describe her as co-founder with 15+ years of business scaling experience. | LinkedIn public profile snippet; founder story |
| Future AGI | N.V.J.K. Kartik | Founding Engineer | Public LinkedIn snippet identifies him as Founding Engineer at Future AGI; education listed as IIIT Dharwad. Official webinar page says he "ships the routing and caching surface" of Agent Command Center. | LinkedIn; Future AGI webinar page |
| Future AGI | Rishav Hada | Senior Applied Scientist | Official Future AGI webinar page says he leads the guardrails segment; multiple Future AGI research/blog pages identify him as Senior Applied Scientist. A personal site describes him as a Mila graduate researcher focused on reliable ML systems. | Future AGI blog/webinar pages; personal site |
| Future AGI | Nikita Sklyarov | Technical Lead | Public LinkedIn snippet describes him as Technical Lead building agentic AI systems; more detailed background was not specified in the reviewed official sources. | LinkedIn public snippet |

Release history

| Date | Deepgram | Future AGI |
| --- | --- | --- |
| 2015 | Deepgram founded | |
| May 2022 | Enhanced launched; English availability announced | |
| Oct 2022 | Enhanced support expanded to German | |
| 2024 | | Public sources identify Future AGI as founded in 2024 |
| Feb 2025 | Nova-3 launched | |
| Oct 2025 | | Future AGI public roundup highlights open-source stack progress |
| Jan 2026 | Deepgram raises $130M Series C at $1.3B valuation; expands patent emphasis; acquires OfOne | |
| Mar to May 2026 | Deepgram publishes newer benchmark/voice-AI comparison materials; Enhanced still documented but not central | Future AGI publishes multi-model routing comparisons, gateway docs, and Deepgram model catalog pages carrying Agent Command Center routing claims |

Deepgram's current marketing and changelog emphasis favors Nova-3, Flux, multilingual expansion, medical specialization, and voice-agent stacks rather than further public narrative around Enhanced; Enhanced is still supported but is no longer Deepgram's flagship public innovation story.
