Meet the Deaf Developer Behind Google’s Live Transcribe App
Seeing Is Believing When It Comes To Speech-To-Text Transcription
08 February 2019
Google’s new Live Transcribe app may actually be the real-time speech-to-text transcription technology I’ve been waiting for. And if seeing is believing, Google’s launch video may convince you that truly reliable real-time transcription for everyday use is finally here.
Watch Dimitri Kanevsky, a deaf Google research scientist, use the Android app to order tea at Starbucks and chat with a colleague about a weekend chili party.
And watch Dr. Mohammed Obiedat, a deaf professor at Gallaudet University, play a board game with his kids, chat with them about their schoolwork, and actively participate in a parent-teacher conference.
I loved the scene where Kanevsky orders at Starbucks. For years before I got my cochlear implants, I never ordered anything more than a tall coffee. Otherwise I risked having to engage in an impossible «excuse me?….what?….say it again, please» conversation with the barista.
And even the simple coffee order was a challenge before I realized they asked the same questions every time: «Ooo-fuh-eem?» was «Room for cream?,» «Eh-eh-eh eth?» was «Anything else?,» and «Oo-oo ah-a ee-eek?» was «Do you want a receipt?»
Real conversations, real-time transcription
But when Kanevsky orders his tea, the barista’s questions pop up on his smartphone screen as soon as she asks them. And he’s ready with immediate answers. He’s having a real conversation!
Kanevsky has worked on speech recognition and communications technology for 30 years. Deaf since early childhood, he had been disappointed by speech-to-text solutions that always seemed inadequate in spite of promising advances in digital transcription technologies. So he teamed up with Google engineer Chet Gnegy to develop the Live Transcribe app.
They also collaborated with Dr. Obiedat and others at Gallaudet University to better understand what features in the app would be most useful to people with hearing loss.
Combining proven technologies
Until now, truly effective real-time speech-to-text transcription for everyday use has seemed just around the corner, but not yet here.
Sure, there’s been a flood of speech recognition technologies («Hey Siri») and other applications such as fast translation of voice mail to text messages. And real-time captioning of phone conversations has been an invaluable service for people with hearing loss (although live operators are still necessary to assist with the transcriptions).
But those promising technologies never quite seemed to come together in a truly intuitive smartphone app that people can use for routine, everyday conversations. Now Google promises that its Live Transcribe app, introduced this week, is the answer.
Google says real-time voice-to-text transcriptions of conversations will be available in up to 70 languages and dialects. The app supports external microphones in wired headsets, Bluetooth headsets and USB mics. It’s built into Google’s new Pixel 3 phone, and it will be available for phones running Android 5.0 and later.
Beta testers wanted
Currently in beta test, an «unreleased» version is listed by Google Research on the Google Play Store, and users who want to join the beta test can sign up on the Android web site. Or if you have a new Pixel 3 phone, you can activate Live Transcribe in its Accessibility settings.
Google has been playing catch-up to Apple in offering accessibility options for people with hearing loss. Now Live Transcribe, which is available only with Android phones, may give Google a nice boost in its ongoing competition with the iPhone ecosystem.
Waiting for the holy grail
For years, I’ve been waiting for someone to deliver the holy grail: an app that combines various proven technologies and solves problems such as overall accuracy of speech recognition, latency (the processing delay in transcribing the speech and showing it as text on the screen) and ease of use.
Now, when Dimitri says in the video that «speech recognition finally became so good I could finally fulfill my dream,» it’s music to my ears and (thank you, YouTube captions) to my eyes as well.
In fact, if Google’s Live Transcribe app works as advertised, it might even entice me to trade in my beloved iPhone.
Why not make it available for google apps on all phones?
Microsoft has generally done this on Skype across all platforms.
Not really. The accuracy on Skype is not good and it can’t tell context.
Maybe it works well for a hearing person, since you can hear and subconsciously pick up the inaccuracies from your audio feed to put the pieces together and make sense of them. But as a deaf person, you don’t have that ability. Turn your speakers completely off and try having your conversation for a real idea of how well it works for the deaf. Not very well.
Go to the Bose store and just try their «HEAR PHONE». Free and quick. I have found it better than any of the expensive hearing aids. It’s an amazing improvement for me being able to converse in a normal way. It’s comparatively inexpensive and doesn’t require batteries….just plug it in at night for 12 hours of use. FYI: I’m not associated with Bose.
In terms of COST, the BOSE are $500, and the PHONAK are $4,000. So you could purchase the Bose 8 times with the money you’d spend on the Phonak.
For phone calls, they BOTH are EXCELLENT. You can hear in BOTH EARS with either one of them. They BOTH allow you to MIX the sounds from the MICROPHONES with the sound of STREAMING, adjusting them to your liking. I can highly recommend BOTH the BOSE and the PHONAK. I will continue using both.