Free Transcription in Teams and Zoom

Live captioning for live events is readily available in Zoom, Microsoft PowerPoint, Google Slides, and Microsoft Teams. But when it comes to transcribing live meetings to make them compliant with the Americans with Disabilities Act (ADA), the costs can add up quickly. Online service providers like Thumbtack and Yelp estimate the cost of transcription at between 75 cents and $1.50 per audio minute, and some services charge an hourly rate of $15 to $30 for transcribing audio files.
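To make the per-minute figures concrete, here is a minimal sketch of the arithmetic. The function name and the 60-minute example are illustrative assumptions; only the 75-cent and $1.50 rates come from the estimates above, and actual quotes for a given event may differ.

```python
def transcription_cost_range(audio_minutes, low_rate=0.75, high_rate=1.50):
    """Return the (low, high) estimated cost in dollars for a recording.

    The default rates are the per-audio-minute estimates quoted above;
    the function itself is a hypothetical helper for illustration.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    return audio_minutes * low_rate, audio_minutes * high_rate


# Example: a one-hour meeting (60 audio minutes).
low, high = transcription_cost_range(60)
print(f"${low:.2f} to ${high:.2f}")  # prints "$45.00 to $90.00"
```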

“Transcriptions cost at least $1,000 for an all-day event; that’s incredibly expensive,” said James Cho, coordinator of technology services at the University of Nevada–Reno’s Joe Crowley Student Union. He has been sharing a no-cost workaround that can generate ADA-compliant transcriptions from Zoom and Microsoft sound sources by integrating their captioning features into a live event’s program video feed using a video switcher.

“By using the free captioning feature in Teams Live or Zoom, you can burn that content into a window in YouTube,” he said. Using a video editor’s masking feature, Cho crops out the entire video screen of the laptop running the software, except for the captioning area. He can then move that section anywhere, but the standard for ease of use is the upper or lower third of the screen.
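The cropping step can be pictured as computing a rectangle that keeps only the caption band of the frame. The sketch below is not Cho's procedure or any video editor's API; the `CropBox` type, the `caption_band` function, and the 1920x1080 frame size are assumptions chosen to illustrate the upper-third or lower-third placement described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    """A crop rectangle in pixels, measured from the top-left corner."""
    x: int
    y: int
    width: int
    height: int


def caption_band(frame_width, frame_height, fraction=1 / 3, position="lower"):
    """Return the rectangle covering the upper or lower band of a frame.

    Hypothetical helper: `fraction` is the share of the frame height to
    keep (one third by default), and `position` is "upper" or "lower".
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    if position not in ("upper", "lower"):
        raise ValueError('position must be "upper" or "lower"')
    band_height = round(frame_height * fraction)
    # The upper band starts at the top; the lower band ends at the bottom.
    y = 0 if position == "upper" else frame_height - band_height
    return CropBox(x=0, y=y, width=frame_width, height=band_height)


# Example: keep the lower third of an assumed 1920x1080 laptop capture.
print(caption_band(1920, 1080))
# prints "CropBox(x=0, y=720, width=1920, height=360)"
```

In practice the caption area in Teams or Zoom may occupy less than a full third of the window, so the fraction would be tuned by eye against the actual capture.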

Once the process is running, there is about a five-second delay between the live speaker and the production of the transcript, which Cho estimates has an accuracy of approximately 95%. Accuracy can be affected if the speaker is wearing a mask or if people talk over one another. By comparison, a human transcriber has an estimated accuracy rate of between 97% and 99%, and that rate can also decrease when speakers wear masks or people speak simultaneously.

Cho has created a step-by-step guide, with screenshots, that explains how to integrate and then activate Zoom and Microsoft Teams captioning software to provide transcripts during live events. The guide, downloadable here, also provides a process for incorporating multiple microphones into the service by using a sound mixer to support a Zoom or Teams video conferencing session with on-screen transcripts viewable in a large event space.

Zoom supports live captioning during meetings and webinars, but the captions have to be typed directly into Zoom or added to Zoom via an integration with third-party software or a third-party service. And while most campuses have agreements with third-party captioning services, that still doesn’t address the ability to provide transcriptions. The same is true if a Zoom meeting or webinar is recorded and saved to the cloud. Zoom can generate captions for the archived video using automatic speech recognition, but the result typically requires editing before the content is suitable for publication.

Automatic speech recognition software can save substantial time over captioning video from scratch, but it is not currently accurate enough to be reliable for people who depend on captions. Depending on the quality of the captions the software produces, it may be more cost-effective to download the MP4 video recording of the Zoom meeting and send it to a third-party captioning provider rather than fixing all the errors yourself in Zoom. Still, that is another charge and, again, it does not address providing transcriptions as a service. Event managers should also be aware of automated captioning solutions, such as those in PowerPoint and Google Slides, that cannot be turned off by students who find them distracting.

