The chairman and CEO of voice recognition pioneer Nuance Communications confirmed for the first time this past week that his company's technology powers Apple's Siri voice-enabled virtual assistant. He also forecast the coming of cross-platform voice agents.
Paul Ricci, interviewed at The Wall Street Journal's D11 technology conference in the Los Angeles area, acknowledged this poorly kept secret and noted that his company's technology is used in Siri both on the client side and on the back end.
Ricci added that Nuance's technology is not used in Google's voice recognition software, although it is used elsewhere on Android, such as in Samsung's S Voice and HTC's virtual assistant. He pointed out that the natural language processing for Siri began with the technology that Apple acquired when it bought the company by that name in 2010.
He predicted that voice agents will become even better at "command and control" of a device, learning from users' habits and preferences and becoming better at recognizing context and situations. He also envisioned that virtual assistants will become available across platforms, so that, for instance, you can continue a "conversation" with a specific voice agent as you move from your TV to your phone to your tablet.
In fact, Ricci said, he could envision microphones as standard components in a home, so that your voice agent can react to your voice-delivered commands for your appliances.
Nuance has seen the future for some time now. Ricci noted that the company had been showing examples of Star Trek's voice-recognizing, omniscient computer during its early presentations, when its only products were Dragon NaturallySpeaking speech-to-text and various call center applications. In addition to embedded technology for Siri and other advanced smartphone voice agents, Nuance's products now include the Swype predictive keyboard.
Ricci said that, aside from working better in noisy environments, the main area for improvement in today's speech recognition is cognition: understanding all the nuances, so to speak, that go into common speech in common situations, not to mention the complexities of uncommon contexts.
With voice agents becoming more widely available in smartphones and game consoles such as Microsoft's Xbox, Ricci said he expects to see the technology become common and built-in on other platforms, such as TVs and cars, where it currently exists but is rare and non-standard. He noted that in moving "devices" like cars, the client needs to be robust, since the back end, where most of the processing takes place, cannot always be relied upon.
One possible expansion area for Nuance is login via voice. Earlier this month, for instance, the company released the results of two surveys it conducted, which found that 85 percent of users are dissatisfied with current authentication methods, and that 90 percent would be eager to use voice biometric solutions for logins. Voice biometrics are intended to provide secure authentication through a user's unique voiceprint.