Today, Copenhagen-based healthcare AI company Corti is launching Symphony for Speech-to-Text, a new generation of clinical-grade speech recognition models engineered specifically for real-time dictation, conversational transcription, and batch audio processing. The company says their accuracy rate is the highest yet recorded for this specific use case.
"We’re focused on ensuring our AI scribes can be trusted by physicians, medical practitioners and patients…the entire healthcare system," said Andreas Cleve, co-founder and CEO of Corti, in an exclusive video call interview with VentureBeat.
The performance data the company is bringing to the table paints a stark picture of the current state of enterprise AI: when it comes to highly regulated, specialized industries, domain-specific models can beat out the foundation model providers.
In a newly published research paper, Corti revealed that its new clinical-grade speech models reduced word error rates (WER) by up to 93% compared with leading generalist speech models and APIs on medical terminology.
On English medical terminology, Symphony for Speech-to-Text achieved a remarkably low 1.4% WER. By comparison, OpenAI’s speech model registered a 17.7% WER, ElevenLabs hit 18.1%, Whisper recorded 17.4%, and Parakeet scored 18.9%.
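For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of reference words. The sketch below is a minimal, generic illustration of that definition and of the relative-reduction arithmetic. It is not Corti's evaluation code; the function names and the example sentences are hypothetical, and real evaluations usually apply more careful text normalization than the lowercasing assumed here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: lowercasing and whitespace splitting is enough normalization.
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, computed one row at a time.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # reference word dropped (deletion)
                current[j - 1] + 1,                   # extra hypothesis word (insertion)
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical sentences: one misheard dose and one dropped word.
reference = "patient takes 50 mg metoprolol twice daily"
hypothesis = "patient takes 15 mg metoprolol daily"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")

# WER figures quoted in the article.
print(f"18.9% to 1.4%: {relative_reduction(18.9, 1.4):.1%} lower")
```

With the figures quoted above, the widest gap (18.9% versus 1.4%) works out to a reduction of roughly 93%, which is consistent with the "up to 93%" headline number.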
Corti’s announcement serves as a critical inflection point for healthcare developers. While general-purpose APIs like OpenAI’s Whisper are sufficient for broad-domain transcription, they often stumble over medical acronyms, complex medication dosages, shorthand, and noisy emergency room environments. Symphony for Speech-to-Text aims to solve this by providing developers with a highly specialized, production-grade API designed from the ground up for medical workflows.
The agentic era demands flawless data inputs
The launch of Symphony for Speech-to-Text highlights a fundamental shift in how healthcare uses voice technology. For decades, medical speech recognition was primarily about producing a static text document for human doctors to review: a digital replacement for a notepad.
But as the healthcare industry hurtles into what technologists call the "agentic era," where autonomous AI agents actively assist in medical decision-making, EHR navigation, and real-time support, the transcript is no longer the final product. It’s the foundational data layer.
“Speech has always been one of healthcare’s most important inputs,” Cleve said in a statement provided to VentureBeat. “What’s changing is what happens after the words are captured. In the agentic era, speech recognition requires more than simply producing a transcript – we need to give AI systems accurate medical information to reason from. If a model mishears a medication, dosage, or symptom, every downstream step becomes less reliable. Symphony for Speech-to-Text gives healthcare developers a speech layer accurate enough to thrive in medical reality.”
This is where the compounding danger of high word error rates comes into play. If a general-purpose AI model hallucinates a transcription, turning "hyperthyroidism" into "hypothyroidism" or misinterpreting a critical medication dosage, every subsequent AI agent relying on that transcript will operate on corrupted data. Corti’s architecture mitigates this risk by producing structured, clinically usable output directly from the API, helping downstream AI applications reason over clean information rather than messy, unformatted text.
Nowhere is this more evident than in Corti’s entity recall benchmarks. Symphony for Speech-to-Text reached an astonishing 98.3% recall rate on formatted medical entities, such as dosages, measurements, and dates. In contrast, Corti reported that the strongest general-purpose baseline model maxed out at just 44.3% recall for the same entities.
For developers building ambient AI documentation tools, that 54-percentage-point gap is the difference between a tool that saves a physician time and a tool that constitutes a medical liability.
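Entity recall is the share of reference entities (dosages, measurements, dates) that appear correctly in the system's output. The sketch below illustrates the idea under a loud assumption: an entity counts as recalled only if its formatted string matches the reference exactly. The article does not describe Corti's matching rules, so the function name, the matching rule, and the example entities are all hypothetical.

```python
from collections import Counter


def entity_recall(reference_entities: list[str], hypothesis_entities: list[str]) -> float:
    """Fraction of reference entities found in the hypothesis (exact string match)."""
    reference_counts = Counter(reference_entities)
    hypothesis_counts = Counter(hypothesis_entities)
    total = sum(reference_counts.values())
    if total == 0:
        raise ValueError("reference must contain at least one entity")
    # Multiset intersection: a repeated entity is only credited as often as it was produced.
    matched = sum((reference_counts & hypothesis_counts).values())
    return matched / total


# Hypothetical example: the words were heard, but two entities were not
# rendered in the expected written form, so they do not count as recalled.
reference = ["50 mg", "120/80 mmHg", "12 March 2024", "2.5 mL"]
hypothesis = ["50 mg", "120 over 80", "March twelfth", "2.5 mL"]
print(f"Entity recall: {entity_recall(reference, hypothesis):.0%}")

# Recall figures quoted in the article, compared two ways.
symphony_recall, baseline_recall = 98.3, 44.3
print(f"Gap: {symphony_recall - baseline_recall:.1f} percentage points")
print(f"Ratio: {symphony_recall / baseline_recall:.2f}x the baseline recall")
```

The difference between 98.3% and 44.3% is 54 percentage points; in relative terms, the reported recall is a little more than twice the baseline's.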
Dethroning the industry leaders
While Corti’s benchmarks against modern LLM developers like OpenAI and ElevenLabs are striking, the company is also taking aim at legacy medical transcription giants.
For years, the gold standard for dedicated clinician dictation has been Dragon Medical One. However, these legacy systems were historically optimized strictly for intentional clinician dictation, not as underlying infrastructure for ambient AI, complex multi-party conversations, or real-time medical support tools.
In evaluations of real-world English medical dictation, Corti achieved a 4.6% WER, outperforming Dragon’s 5.7% (a 19% relative improvement).
Furthermore, Corti demonstrated a higher medical term recall than Dragon (93.5% versus 92.9%).
By providing this level of accuracy through an API endpoint, Corti is enabling third-party developers, EHR vendors, and virtual care platforms to build their own custom dictation and ambient listening tools that outperform the industry's legacy incumbent.
"We want people to build apps atop our models," Cleve said. "The goal is to diffuse the technology as widely as it’s needed so it can be as helpful as possible to patients and their doctors and professionals."
For Cleve and his co-founders, the mission is a personal one: Cleve's own mother was a healthcare professional who was attacked by a patient and spent years struggling to recover. He sought to improve healthcare processes as a way of honoring her sacrifice.
Solving the healthcare model puzzle
The demands of healthcare extend far beyond English-speaking hospitals, and global health systems have historically been underserved by medical NLP models. Early adopters are already leveraging Corti’s new models in linguistically demanding environments, proving the technology's viability in complex international markets.
Switzerland, for instance, requires care delivery across multiple languages, often simultaneously within a single medical institution. It serves as one of the most stringent proving grounds for multilingual medical speech models in the world. Corti’s Symphony models demonstrated massive performance gains in these non-English tests, reaching a 2.4% WER in German (compared to 13.0% for the next-best system) and a 3.9% WER in French (versus 10.6%).
“In a medical conversation, every word matters – a missed medication name, a misheard dosage, or a mistranscribed symptom can change the meaning of an encounter,” said Pierre Corboz, Head of Solutions & Business Development at Voicepoint, a Swiss healthcare technology provider, in a statement provided to VentureBeat. “Symphony’s accuracy on medical terminology gives us the foundation to bring more trusted AI capabilities into medical workflows with our Voicepoint Xenon platform. When Corti improves the speech layer, the workflows we build together become sharper, safer, and more useful for clinicians in Switzerland.”
AI verticalization and specialization are yielding gains
Today’s announcement of Symphony for Speech-to-Text isn’t an isolated event; it’s the culmination of a strategic narrative Corti has been aggressively pushing over the last several weeks.
The broader Symphony platform, which powers medical and administrative applications for a global network of EHR vendors and life sciences organizations, has been systematically proving the defensibility of vertical AI labs against horizontal tech giants.
This marks the third major benchmark Corti has released in just six weeks, touching different layers of healthcare AI performance.
In April, the company revealed that its Symphony for Medical Coding system outperformed general-purpose models by more than 25% in medical accuracy benchmarks, tackling one of healthcare’s most notoriously complex workflows.
And just last week, Corti announced that its flagship clinical-grade model outscored OpenAI on HealthBench Professional, OpenAI’s own healthcare benchmark.
Taken together, these three data points (medical coding, medical reasoning, and speech-to-text accuracy) illustrate a growing consensus in the enterprise technology sector: generalized models are hitting a ceiling in regulated industries.
Models deployed in hospitals must inherently understand complex acronyms, sudden interruptions, medical shorthand, specialty-specific language, and strict compliance constraints. By training specifically on these unique edge cases, vertical AI labs like Corti are building a formidable moat that companies relying solely on API calls to generalized large language models can’t easily cross.
Availability and product lineup
Developers are clearly taking notice of the performance gap. According to momentum data provided to VentureBeat, Corti is seeing 30% growth in new sign-ups for its platform in quarter-to-date comparisons, signaling that developers and healthcare builders are actively gravitating toward vertical, clinical-grade models over generalist APIs.
Corti, which already serves over 100 million patients annually across major health systems including the UK’s National Health Service (NHS), is positioning Symphony for Speech-to-Text as the default engine for the next generation of healthcare software.
It is important to note that Corti isn’t launching the overarching Symphony platform itself today; rather, Symphony for Speech-to-Text operates as a new, distinct capability within that broader ecosystem, accessible through its own API endpoints.
Symphony for Speech-to-Text is generally available starting today. Developers and enterprise architects can access the models through the Corti API console, with full technical documentation available to help integrate the clinical-grade speech layer into their existing applications.
In a move toward research transparency, Corti has also published its full research paper detailing its methodology, along with a separate comparison tool designed to support transparent evaluation of medical speech recognition systems across the industry.
As the healthcare industry continues its rapid embrace of AI-driven automation, the foundational data layer has never been more critical. Corti’s latest launch is a stark reminder that in the medical field, generic AI simply isn't good enough. The future belongs to the specialists.

