
3Play Media Study Finds Artificial Intelligence Innovation Has Led to Significant Improvements in Automatic Speech Recognition (ASR)

Impressive new entrants have raised the bar for industry leaders, with AssemblyAI, Speechmatics, and Whisper leading the pack

BOSTON--(BUSINESS WIRE)--ASR technology has never been as accurate as it is today thanks to advances in artificial intelligence (AI), according to a report from 3Play Media, the leading media accessibility provider, released today. The annual State of ASR study analyzes the general state of speech-to-text technology as it applies to the task of captioning and transcription.

According to the study, in which the company tested speech recognition with ten relevant ASR engines, the accuracy of the technology has improved measurably since the company’s last evaluation in 2022. As ASR improves, it's important to understand which engine is best for different use cases. Some nuances to consider include performance on different error types, transcription styles, formatting, and industry-specific content.

“The advances in AI we’ve seen across industries have also had an impact on ASR,” Chris Antunes, co-CEO and co-Founder, 3Play Media, said. “Longtime industry leader Speechmatics and newer entrants AssemblyAI and Whisper performed at the top of the pack, with each excelling in different areas. This proves that not all engines are created equal - the training material and models matter - and that there is room at the top for multiple engines to specialize in different use cases.”

Accuracy is the key component in captioning for several reasons, most importantly because individuals who are deaf or hard of hearing and rely on captions as an accommodation must receive information that fully reflects the original content. For captions to be accessible and legally compliant, they need to be 99% accurate, the industry requirement for accessibility. While there was improvement across industry leaders, the study found that even the best engines performed well below 99% accuracy, indicating a continued need for human revision.

This report measures accuracy using two metrics: Word Error Rate (WER) and Formatted Error Rate (FER). WER is the standard measure of transcription accuracy, while FER also takes into account formatting, sound effects, grammar, and punctuation, and so better represents the accuracy a caption viewer actually experiences. Accuracy is harder to achieve under FER: even the best engines tested were only 82% accurate by that measure, whereas the best engines tested were 93% accurate by WER.
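
For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of words in the reference; accuracy is then 1 minus WER. The Python sketch below illustrates that standard calculation only. It is not 3Play Media's scoring code, and the function name `word_error_rate` and its `normalize` flag are hypothetical names chosen for illustration. Turning normalization off, so that case and punctuation differences count as errors, is only a rough stand-in for formatted scoring and does not reproduce how the report calculates FER.

```python
import string


def word_error_rate(reference: str, hypothesis: str, normalize: bool = True) -> float:
    """Return (substitutions + deletions + insertions) / words in the reference.

    normalize=True lowercases the text and strips ASCII punctuation first,
    which matches the usual WER convention. normalize=False leaves case and
    punctuation in place (an assumption used here to illustrate why formatted
    scoring is stricter; it is not the report's FER definition).
    """
    if normalize:
        strip_punct = str.maketrans("", "", string.punctuation)
        reference = reference.lower().translate(strip_punct)
        hypothesis = hypothesis.lower().translate(strip_punct)

    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Word-level edit distance, keeping only the previous row of the table.
    # prev[j] is the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word was dropped
                curr[j - 1] + 1,                  # extra word was inserted
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    reference = "The cat sat on the mat."
    hypothesis = "the cat sat on a mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
    print(f"Unnormalized: {word_error_rate(reference, hypothesis, normalize=False):.3f}")
```

Under this convention, an engine described as 93% accurate by WER gets roughly 7 of every 100 reference words wrong, and the gap between the normalized and unnormalized scores in the example shows why a formatting-aware measure is harder to satisfy.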

Additionally, the study identified a new type of error: hallucinations, in which an engine generates text that has no basis in the audio. The State of ASR report found evidence of hallucinations in the Whisper transcriptions, often occurring when the topic shifted. Some of the hallucinations were significant and could pose issues for the captioning use case in particular. However, hallucinations seemed rare and did not prevent Whisper from performing competitively.

To download the report, please visit: https://go.3playmedia.com/rs-2023-asr

About 3Play Media

3Play Media is an integrated media accessibility platform with patented solutions for closed captioning, transcription, live captioning, audio description, and subtitling. 3Play Media combines machine learning (ML) and automatic speech recognition (ASR) with human review to provide innovative, highly accurate services. Customers span multiple industries, including media & entertainment, corporate, ecommerce, fitness, higher education, government, and elearning.

Contacts

Phil LeClare
phil.leclare@3playmedia.com
617-209-9406
www.3playmedia.com
@3playmedia

3Play Media

Details
Headquarters: Boston, Massachusetts
CEO: Chris Antunes
Employees: 50
Organization: PRI
