Six Colors

by Jason Snell & Dan Moren

Linked by Dan Moren

The human side of voice recognition

Bloomberg’s Matt Day, Giles Turner, and Natalia Drozdiak on the humans who audit recordings from Amazon’s Alexa in order to improve speech recognition:

Amazon.com Inc. employs thousands of people around the world to help improve the Alexa digital assistant powering its line of Echo speakers. The team listens to voice recordings captured in Echo owners’ homes and offices. The recordings are transcribed, annotated and then fed back into the software as part of an effort to eliminate gaps in Alexa’s understanding of human speech and help it better respond to commands.

And just in case you think Amazon’s the only one doing this:

Apple’s Siri also has human helpers, who work to gauge whether the digital assistant’s interpretation of requests lines up with what the person said. The recordings they review lack personally identifiable information and are stored for six months tied to a random identifier, according to an Apple security white paper. After that, the data is stripped of its random identification information but may be stored for longer periods to improve Siri’s voice recognition.

At Google, some reviewers can access some audio snippets from its Assistant to help train and improve the product, but it’s not associated with any personally identifiable information and the audio is distorted, the company says.

There are a couple of takeaways here. First, our technology apparently isn’t yet good enough for any of these systems to get by without human intervention.

Second, there really should be some sort of standard for how companies around the world treat this data. If human intervention is required, it shouldn’t be up to each company to decide how it’s going to protect that information, which is occasionally sensitive.