Cloud & Captioning


ASR + machine learning

captioning serving the needs of the Deaf and Hard of Hearing community


High-quality captioning is usually delivered by highly trained real-time captioners who transcribe speech on stenotype machines. Because this method is costly and captioners are in limited supply, companies are moving to automatic speech recognition (ASR), which delivers captions at a lower price.

Most ASR-based real-time captioning products on the market vary in quality, and many are not accurate enough to provide real access to the intended audience: the Deaf and Hard of Hearing community.

This is why Cloud & Captioning combines several engines and data sources in a machine learning system to produce an accurate text representation of the audio.


1  Live audio from a studio feeds the Cloud & Captioning device. This audio can be analog line-level or digital in SD/HD/3G-SDI format.

2  The audio signal is sent to the cloud, where it is first received by the Sound Analyzer filtering module.

3  Then, the filtered audio is converted into text by sophisticated AI (artificial intelligence) algorithms that draw on daily content from the Internet, a customized list of common terms and bad words, and proprietary databases to build accurate sentences.

4  After that, an assembler module converts the text into EIA-608/708 closed captioning format, following the previously defined row position and caption format settings.

5  The captions created by the assembler are sent back to the Cloud & Captioning device.

6  Depending on the selected output, the captions are made available through a specific closed captioning encoder or a live web streaming platform such as Facebook Live, YouTube Live, or Zoom.
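
To make steps 3 and 4 more concrete, here is a minimal Python sketch of two small pieces of that workflow: applying a customized term and bad-word list to recognized text, and laying the result out in caption rows. It is an illustration under stated assumptions, not the Cloud & Captioning implementation. The names CaptionSettings, apply_vocabulary and assemble_rows are hypothetical, speech recognition itself is not shown, and a real assembler would also encode the rows as EIA-608/708 caption data rather than plain text. The 32-character row width and the 15-row grid come from the CEA-608 (EIA-608) caption layout.

```python
import re
import textwrap
from dataclasses import dataclass, field

CEA608_ROW_WIDTH = 32   # a CEA-608 caption row holds at most 32 characters
CEA608_BOTTOM_ROW = 15  # the CEA-608 grid has 15 rows; 15 is the bottom one


@dataclass
class CaptionSettings:
    """Hypothetical stand-in for the 'previously defined' settings in step 4."""
    base_row: int = CEA608_BOTTOM_ROW
    max_rows: int = 2  # assumption: two visible rows, roll-up style
    custom_terms: dict = field(default_factory=dict)  # lowercase -> preferred spelling
    blocked_words: set = field(default_factory=set)   # lowercase words to mask


def apply_vocabulary(text, settings):
    """Part of step 3: restore preferred spellings and mask blocked words."""
    def fix(match):
        word = match.group(0)
        key = word.lower()
        if key in settings.blocked_words:
            return "*" * len(word)
        return settings.custom_terms.get(key, word)

    return re.sub(r"[A-Za-z']+", fix, text)


def assemble_rows(text, settings):
    """Part of step 4: wrap text into rows and assign each a row position."""
    lines = textwrap.wrap(text, width=CEA608_ROW_WIDTH)
    lines = lines[-settings.max_rows:]  # keep only the most recent rows
    first_row = settings.base_row - len(lines) + 1
    return [(first_row + i, line) for i, line in enumerate(lines)]


if __name__ == "__main__":
    settings = CaptionSettings(
        custom_terms={"youtube": "YouTube"},
        blocked_words={"darn"},
    )
    recognized = "that darn stream is live on youtube and facebook right now"
    cleaned = apply_vocabulary(recognized, settings)
    for row, line in assemble_rows(cleaned, settings):
        print(f"row {row:2d}: {line}")
```

Keeping the vocabulary pass separate from the row layout mirrors the split between the recognition stage and the assembler module described above, so either part could be replaced without touching the other.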