Smartphone users have gotten used to speaking to virtual assistants like Apple's Siri and Google Now, and a new set of startups is guessing users will be excited about talking with devices ranging from thermostats to smartwatches.
"There is a very big fascination about man talking to machines," says Alex Lebrun, founder of Wit.ai, which provides speech recognition services for Internet of Things devices. "There is a kind of cultural fascination with that, starting with 2001: A Space Odyssey."
And even once the novelty wears off, makers of voice recognition tools are betting users will be won over by the convenience of simply speaking to their appliances instead of hunting for remote controls, or for tablets and smartphones with control apps.
"Our long-term goal is to give users complete autonomy over their homes and smart products," Insteon CEO Joe Dada said this summer in announcing the home automation company would be integrating Microsoft's virtual assistant Cortana into the Windows Phone app that controls its lights, thermostats, and other networked devices. "Adding a voice-driven, personal assistant into the mix is just another way that we can make people's lives easier."
And vendors say they're working to make sure 2001's vision of out-of-control computers doesn't prove too prophetic, integrating passwords and user voice recognition features so that users can control just who can order their appliances around.
"It's what we call speaker authentication, or speaker ID," says Lebrun. "This is exactly the kind of thing we'll add in the next months to the API."
The possibility of appliances responding to unauthorized commands was a joke on 30 Rock in 2011, with a voice-controlled TV reacting to on-screen dialogue, and, earlier this year, some Xbox owners claimed a Breaking Bad commercial showing characters playing a voice-controlled Xbox actually activated their video game systems.
Lebrun says Wit.ai, which translates spoken commands into structured data that appliances can parse, will soon be able to give different levels of access to homeowners and their guests, or to parents and their children. The company is also adding emotion-detection capabilities, so a device can react differently to an angry or frustrated user than to a calm one, he says.
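As a rough illustration of what role-based access could look like once a spoken command has been turned into structured data, here is a minimal Python sketch. It is not Wit.ai's API: the `Command` fields, the role names, and the `PERMISSIONS` table are all assumptions invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the intents each one may trigger (illustrative only).
PERMISSIONS = {
    "owner": {"set_temperature", "unlock_door", "lights_on"},
    "guest": {"lights_on"},
}


@dataclass
class Command:
    speaker_role: str  # assumed to come from a speaker-identification step
    intent: str        # what the speaker wants the appliance to do
    entities: dict = field(default_factory=dict)  # details, e.g. a target temperature


def is_allowed(command: Command) -> bool:
    """Return True if the speaker's role may trigger the command's intent."""
    # Unknown roles get an empty permission set, so they are always refused.
    return command.intent in PERMISSIONS.get(command.speaker_role, set())
```

In this sketch a guest could turn on the lights but not unlock the door, and an unrecognized speaker would be refused everything.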
Even before many voice recognition platforms offer such features, users can add some security by customizing the phrases they use to control devices like garage doors, says Leor Grebler, cofounder of Ubi, which makes a standalone home voice recognition system that can communicate with other devices and apps like IFTTT. Custom phrases effectively work as passwords, he says, so an outsider won't be able to unlock a home simply by guessing command phrases.
And, it turns out, users aren't just using new voice recognition systems to boss their appliances around; they're also using them to talk to each other. Grebler says that during Ubi's Kickstarter campaign, he was surprised to see users buying multiple devices to use as intercoms, and he started building in more human-to-human communication features.
"I can now actually have preconfigured triggered messages," he says of the devices, which are now on sale to the public for $299. "You can have Ubi be the bad guy in the family like 'Okay kids, brush your teeth.'"
And some users have Ubi announce when a family member arrives home or a pet with a proximity sensor on its collar makes an unexpected getaway.
"If the dog runs out, and the proximity tag changes to not in proximity, it'll announce, 'the dog just ran out the front door,'" says Grebler.
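The trigger Grebler describes amounts to reacting to a change in state rather than to the state itself. A minimal Python sketch of that idea follows; the function name and its parameters are hypothetical and are not taken from Ubi's software.

```python
from typing import Optional


def proximity_announcement(
    name: str, was_in_range: bool, is_in_range: bool
) -> Optional[str]:
    """Return a message to speak when a tag leaves range, otherwise None."""
    # Announce only on the transition from "in proximity" to "not in proximity",
    # so the message is not repeated while the tag stays out of range.
    if was_in_range and not is_in_range:
        return f"{name} just ran out the front door"
    return None
```

Calling it with the previous and current readings for the dog's tag would produce an announcement only at the moment the tag drops out of range.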
And Wit.ai's emotion detection will be useful for automatically adding emoticons to dictated texts and instant messages, a common use case, Lebrun says. He was surprised to find Wit.ai's systems being used for human-to-human communication, but it makes sense, he says, since software can parse out elements of messages like meeting times and places and save users the trouble of dropping in calendar invites and map links.
"You can use it with voice, but it will also analyze what you say, and provide context," he says. "So if you say, let's meet this place at 6, it will show you a map and a calendar event."
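To give a sense of what pulling a time and place out of a message involves, here is a deliberately simple Python sketch based on a regular expression. It is an assumption-laden toy, not Wit.ai's method: the `extract_meeting` function and its pattern are invented for this example, and real systems handle far more varied phrasing.

```python
import re
from typing import Optional

# Toy pattern: "meet [at] <place> at <hour>[:<minute>]".
MEETING = re.compile(
    r"meet (?:at )?(?P<place>.+?) at (?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?",
    re.IGNORECASE,
)


def extract_meeting(message: str) -> Optional[dict]:
    """Return the place and time mentioned in a message, or None if absent."""
    match = MEETING.search(message)
    if match is None:
        return None
    return {
        "place": match.group("place"),
        "hour": int(match.group("hour")),
        # Minutes default to 0; whether "6" means a.m. or p.m. is left unresolved.
        "minute": int(match.group("minute") or 0),
    }
```

An app could pass the extracted place to a map lookup and the time to a calendar entry, which is the kind of context Lebrun describes.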