Friday, June 2, 2017

You're Going To Talk To All Your Gadgets. And They're Going To Talk Back.

It’s a familiar scene: a crowd of people poking slabs of illuminated glass, completely enraptured by their pocketable computers. They tap, tap, swipe while waiting for the bus, walking down the street, or slouched on a couch at a party made boring by all that inattention. Ever since the introduction of the iPhone a decade ago, touchscreens have kept our eyes cast downward instead of on the world around us.

Today, there are some 2 billion devices running Android, and analysts estimate another 700 million or so iPhones are in use. A generation of people, especially in markets like India and China, has come online with smartphones, bypassing mouse-and-keyboard desktop PCs altogether. Tap, tap, swipe is now more ubiquitous than type, type, click ever was.

That kind of growth has left device manufacturers anxious for another hit. But so far the touchscreen smartphone has proved too neat a trick to repeat. No matter what Next Big Thing comes its way — Google Glass, Apple Watch, Oculus Rift — people just seem to keep their heads down. Swipe, tap, poke, pinch.

But while we were paying attention to the things on our wrists and (ugh) faces, another major technological shift took hold. And this new interface, just now becoming mainstream, defines the next era of computing, just as surely as the punch card, the command line, the mouse-and-keyboard graphical interface, and the touch interface defined the eras that came before it. It’s also the oldest interface in the world, our first method of communicating with each other, and even with other animals — one that predates letters or language itself.

It’s your voice. And it’s the biggest shift in human-computer interaction since the smartphone.

It was hard to see, initially, just how transformative the portable touchscreen computer was, precisely because it lacked a keyboard. But it was the ability to hold the world’s information in our hand, in a visually accessible way that we could manipulate with our fingers, that turned out to be so powerful. Pinch to zoom, pull to refresh, press and hold to record and share.

Voice-based computing will be everywhere, listening to what we say, learning from what we ask.

Right now, something similar is underway with voice. When voice-powered devices come up, the conversation often turns to what’s missing: the screen. Especially because for most of us, our first experience with a voice-based interface took place on a touchscreen phone. (Hey, Siri!)

But when you take away the keyboard, something very interesting happens, and computing becomes more personal than it has ever been. Soon, human-computer interaction will be defined by input methods that require little know-how: speaking, pointing, gesturing, turning your head to look at something, even the very expressions on your face. If computers can reliably translate these methods of person-to-person communication, they can understand not just what we say in a literal sense, but what we mean and, ultimately, what we are thinking.

In the not-too-distant future between now and Black Mirror, voice-based computing will be everywhere — in cars, furniture, immersion blenders, subway ticket counters — listening to what we say, learning from what we ask. Advanced supercomputers will hide under the guise of everyday objects. You’ll ask your router, “Hey Wi-Fi, what the hell is wrong with you?” Or your fridge, “What’s a recipe that uses all of the vegetables about to go bad?” Or just, to the room, aloud, “Do I need a jacket?”

Best of all, most people will be able to use this new species of gadgets, not just those with technological proficiency. Proxy devices, like keyboards and mice, require training and practice. But in this vision of the future, you’ll be able to use natural language — the kind of speech you’d use with a date, your kids, your colleagues — to access the same functions, the same information that typing and tapping can.

Make no mistake: The touchscreen isn’t going anywhere. But increasingly we’re going to live in a world that’s defined by cameras and screens, microphones and speakers, all powered by cloud services that are with us everywhere we go, interpreting all our intents, be they spoken, gestured, or input via a touchscreen keypad.

Welcome to the age of ubiquitous computing, or the ability to access an omnipresent, highly knowledgeable, looking, listening, talking, joke-making computer at any time, from anywhere, in any format. In many ways, we’re already living with it, and it all starts with your voice.

The Rise of Alexa

Photo Illustration by BuzzFeed News; Images courtesy Amazon, MGM

Dotsy lives in Palm Beach, Florida, and she won’t tell me exactly how old she is. “You can google that!” she says, laughing. “It frightens me to say it out loud.”

The octogenarian is Really Cool. About a month ago, she picked up a new instrument — the drums — and recently “jammed” at a friend’s “crib” (her words). She’s an avid reader who tries to keep her schedule open and in flux. But there’s one thing she needs a little help with: seeing.

Dotsy owns not one but two of Amazon’s Echo devices: She keeps a smaller Dot in her bedroom (her daily alarm) and the larger flagship Echo speaker on the porch (where its AI personal assistant, Alexa, reads her Kindle books aloud). Alexa’s primary role in Dotsy’s life, though, is providing information that her eyes have trouble discerning. “I find it mostly helpful for telling the time! I don’t see very well, so it’s a nuisance for me to find out what time it is.”

Her vision isn’t good enough for her to use a smartphone (though she wishes she could), and she doesn’t use a computer. But Dotsy loves being able to ask Alexa questions, or have it read her books.

Overall, “I think it’s wonderful!” Dotsy proclaims. She sometimes even thanks Alexa after a particularly good response, which prompts the AI bot to say, “It was my pleasure” — a nice, human touch.

It turns out that the internet-connected, artificially intelligent Echo is more accessible and more powerful than a mobile device or laptop for someone whom touchscreens have left behind.

Alexa and the Echo have lots of room for improvement. Dotsy still can’t add new “skills” on her own, for example, because that requires the Alexa mobile or web app. (“Skills” is Amazon’s term for capabilities developers can add to Alexa, like summoning an Uber or announcing what time the next bus will arrive.) But what Amazon showed with the Echo is where voice computing is best positioned to prevail: your private spaces.
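
To make the “skills” idea concrete, here is a minimal sketch of what a skill’s backend can look like, written as a bare AWS Lambda handler in Python rather than with Amazon’s SDKs. The intent name (NextBusIntent) and the spoken reply are hypothetical, but the envelope follows the general shape of the JSON a custom skill exchanges with Alexa.

```python
# Hypothetical Alexa skill backend as a bare AWS Lambda handler.
# Alexa sends a JSON event describing what the user asked for;
# the skill returns JSON telling the device what to say.

def lambda_handler(event, context):
    request = event.get("request", {})
    intent = request.get("intent", {}).get("name")

    if request.get("type") == "IntentRequest" and intent == "NextBusIntent":
        speech = "The next bus arrives in twelve minutes."  # made-up answer
    else:
        speech = "Sorry, I can't help with that yet."

    # Minimal response envelope: plain-text speech, then end the session.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```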

Google and Apple had voice assistants long before Alexa came along, but those were tied to your phone. Not only does that mean you have to pull the phone out of your pocket to use them, but the phone is also prone to running out of battery or being left behind (even if behind is just the other room). The Echo, on the other hand, is plugged into a wall, always on, always at attention, always listening. It responds only to queries that begin with one of its so-called wake words, such as “Alexa” or “Computer,” and is designed to perform simple tasks while you’re busy with your own. So when your hands are tied up with chopping vegetables, folding laundry, or getting dressed in the morning, you can play a podcast, set a timer, turn on the lights, even order a car.
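
The wake-word pattern itself is easy to illustrate. The Python toy below gates on the first word of each utterance and ignores everything else; a real device does this with an on-device acoustic model running over raw audio, so the text transcripts here are just stand-ins.

```python
# Toy wake-word gating: act only on utterances addressed to the device.
WAKE_WORDS = ("alexa", "computer")

def handle(transcript):
    """Return the command to process, or None if we weren't addressed."""
    words = transcript.lower().split()
    if words and words[0].strip(",.!?") in WAKE_WORDS:
        return " ".join(words[1:])
    return None  # stay silent

# Simulated stream of overheard speech standing in for a microphone feed.
for line in ["what time is it", "alexa, set a timer for ten minutes"]:
    command = handle(line)
    print("ignored" if command is None else f"processing: {command}")
```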

“A lot of personal technology today involves friction,” said Toni Reid Thomelin, VP of Alexa experience and Echo devices. “We envision a future where that friction goes away, the technology disappears into the background, and customers can be more present in their daily lives.”

Amazon’s Alexa-powered speaker was a sleeper hit. The company, which rarely issues public numbers, won’t say officially how many Echo speakers are active or have been sold (just that it receives “several millions of queries every day” from “millions” of customers). A recent survey estimates that around 10.7 million customers own an Echo device, and that sales have more than doubled since the product first launched in late 2014; but because that figure doesn’t account for customers who own multiple devices, it likely understates how many Echoes have actually been sold.

“A lot of personal technology involves friction...We envision a future where that friction goes away, the technology disappears into the background, and customers can be more present in their daily lives.”

The Echo’s sales are still small compared with Siri and Google Assistant’s reach, but the device has garnered mainstream popularity (even the Kardashians have one) and legions of superfans in a way that other assistants simply haven’t. The flagship Echo has over 29,000 reviews on Amazon from “verified purchases” (people who actually bought their Echo through Amazon), and among those, nearly 24,000 are positive. The most telling numbers, though, are the ones on Reddit. The /r/amazonecho subreddit has about 37,000 subscribers, eclipsing Google Assistant’s subreddit (280) and Siri’s (1,502). It’s also worth noting that /r/SiriFail alone has even more subscribers (3,800) than Siri’s own subreddit.

The Echo is, for many of its highly satisfied users, the ideal at-home smart device. It’s so easy to use that toddlers who can’t read yet, and seniors who have never used a smartphone, can immediately pick up a conversation with Amazon’s AI-powered robot.

For years, the online bookstore turned e-commerce giant had been unintentionally working on the infrastructure for a voice-enabled AI bot. “We were using machine-learning algorithms internally at Amazon for a long period of time,” said Thomelin, who has been with the company for nearly two decades. “Mostly in the early days, for our Amazon.com recommendations engines. And seeing the success of our recommendations engines, we thought, How could you use those similar techniques in other areas throughout Amazon? That was a big piece of what helped bring the Echo to life.”

Leaps in cloud computing — or the ability to process data on a remote, internet-hosted server instead of a local computer — were also crucial to the Echo’s development. “About five years ago,” Thomelin said, “we saw internally how fast cloud computing was growing with AWS...so we wanted to capitalize on all that computing power being in our own backyard and bring it into a new device category like Echo.” (AWS, or Amazon Web Services, is a platform originally built to run Amazon’s own website, but now handles the traffic for some of the biggest internet companies in the world, including Netflix, Spotify, and Instagram.) When people talk about the cloud, what they often really mean is a bunch of Amazon server farms. It’s those machines that host Alexa’s knowledge and instantly sling its responses to millions of Echo devices simultaneously.
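
Stripped to its essence, that division of labor is a simple round trip: the device ships what you said to a remote server and speaks whatever text comes back. The sketch below shows the idea; the endpoint URL and the response field are invented for illustration, and the real Alexa protocol streams compressed audio rather than text.

```python
# Hypothetical cloud round trip: send an utterance, get speech text back.
# The endpoint and the "speech" response field are stand-ins, not a real API.
import json
import urllib.request

def ask_cloud(utterance):
    payload = json.dumps({"text": utterance}).encode("utf-8")
    req = urllib.request.Request(
        "https://assistant.example.com/v1/query",  # made-up endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["speech"]  # assumed response schema

# e.g. print(ask_cloud("do I need a jacket?"))  # would fail: fake endpoint
```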

But the magic of the Echo isn’t that it’s particularly smart — it’s that it is an exceptionally good listener. The Echo can hear a command from across a room, even with a TV or side conversation running in the background. Alexa has a far better ear than anything that has come before, and stands out among all those other “Sorry, I didn’t catch that” assistants.

Rohit Prasad, VP of Alexa machine learning, said that a voice-first user interface was a far-fetched idea when the team began developing the technology. “Most people, including industry experts, were skeptical that state-of-the-art speech recognition could deliver high enough accuracy for conversational AI,” Prasad said. There are a lot of challenges when it comes to recognizing that “far-field” (or faraway) speech, and a particular one for the Echo is being able to hear the wake word, “Alexa,” while the device is playing music loudly. Advancements in highly technical areas — such as deep learning for modeling noisy speech and a uniquely designed seven-microphone array — made that far-field voice recognition possible.
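
One classic array technique, delay-and-sum beamforming, gives a feel for why seven microphones matter: delay each mic’s signal so that sound from the talker’s direction lines up, then average, so the speech adds coherently while uncorrelated noise partially cancels. The sketch below simulates this with a tone standing in for speech; the per-mic delays are invented, since a real device has to estimate them, and beamforming is only one ingredient alongside the deep-learning models Prasad describes.

```python
# Illustrative delay-and-sum beamforming over a simulated 7-mic array.
import numpy as np

def delay_and_sum(signals, delays):
    """Align each mic's signal by its known delay (in samples), then average."""
    aligned = [np.roll(sig, -d) for sig, d in zip(signals, delays)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
t = np.arange(1600) / 16000.0              # 0.1 s of audio at 16 kHz
clean = np.sin(2 * np.pi * 440 * t)        # a tone standing in for speech
delays = [0, 2, 4, 6, 8, 10, 12]           # invented per-mic delays

# Each mic hears the delayed tone plus its own independent noise.
mics = np.stack([np.roll(clean, d) + 0.5 * rng.standard_normal(t.size)
                 for d in delays])
enhanced = delay_and_sum(mics, delays)

# Averaging 7 independent noise streams cuts noise power by about 7x.
print(f"single-mic noise std ~0.50, residual std {np.std(enhanced - clean):.2f}")
```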

And Amazon is now handing out that technology — that special sauce. For free! To anyone who wants to build it into a device of their own! One of Amazon’s priorities is getting its assistant in as many places as possible, and it is doing that by providing an API and a number of reference designs to developers, so that the Echo is just one of the many places Alexa can be found.

Amazon is moving fast, with thousands of people in the Alexa organization alone (up from one thousand at this time last year). The company is also investing huge sums of money in companies interested in building Alexa into their products, like the smart thermostat maker Ecobee, which got $35 million in a funding round led by Amazon. In April, Steve Rabuchin, VP of Alexa voice services, told me the team is focused on integrating the voice assistant with a breadth of devices, including wearables, automobiles, and appliances, in addition to smart home products. Amazon wants to make sure that users can ask, demand, and (most importantly) buy things from Alexa from anywhere, at any time.

This massive, almost desperate effort isn’t surprising. Amazon at last made the AI assistant people love to talk to. But it had a late start compared to Google, Apple, and even Microsoft. And what’s more, a big hurdle still stands in the way of Alexa becoming the go-to assistant we access from anywhere, anytime. Because there’s already a device we carry with us everywhere, all the time: the smartphone.

Advantage Apple

Photo Illustration by BuzzFeed News; Image courtesy Apple

At the end of 2016, Apple beat out Samsung to become the number one smartphone maker in the world, with 78.3 million iPhones sold that holiday quarter alone, compared with 77.5 million Samsung handsets in the same period. It’s a massive device advantage over not just Samsung but everyone. And what’s more, every one of those devices shipped with Siri.

Apple acquired Siri, a voice-command app company, in 2010 and introduced an assistant with the same name built into the iPhone 4S in 2011. When Siri hit the market, it instantly became the first widely used voice assistant. Google Now, the predecessor to Google Assistant, wouldn’t ship for another year, and Alexa and Microsoft Cortana for another three.

The only problem? It sucked.

The voice assistant’s high error rate at launch has dogged Siri’s reputation to this day, even though its recognition has improved significantly since then: thanks to deep learning and hardware optimization, Siri’s error rate has reportedly been cut by a factor of two.

After the fifth anniversary of Siri’s launch last October, app developer Julian Lepinski nailed why users can’t get into the assistant: because they just can’t trust it. “Apple doesn’t seem to be factoring in the cost of a failed query, which erodes a user’s confidence in the system (and makes them less likely to ask another complex question),” Lepinski writes. Instead of asking clarifying questions or requesting more context, “Apple has a bias towards failing silently when errors occur, which can be effective when the error rate is low.”

Siri is by far the most widely deployed assistant, with a global reach that spans 21 languages in 34 countries. Google Assistant supports seven languages, while Alexa supports just two. Still, Siri usage isn’t what it could be. Apple says that it receives 2 billion non-accidental requests a week, which means that, if the estimate of 700 million active iPhones is correct, the average phone produces only about three queries per week (2 billion ÷ 700 million ≈ 2.9).

Meanwhile, it’s under assault on its own devices. There are now multiple options for assistants on the iPhone, all vying to be the AI of choice for iOS users: Amazon baked Alexa into its main shopping app for iPhone, and Google launched an iOS version of Assistant this year.

Of course, neither is as deeply integrated or as accessible on the iPhone as Siri. So Google and Amazon are racing to prove that their assistants are worth the extra taps. They’re also trying to do end runs around the phone altogether, by releasing tools that let developers build their assistants into all the devices we surround ourselves with. It starts with speakers, but cars and thermostats and all manner of other things are on deck.

Apple, meanwhile, has been trying to change Siri’s image — so that no matter what’s around you, you’ll say “Siri” and not “Alexa” or “Okay, Google.” In an August 2016 interview with the Washington Post, CEO Tim Cook was asked whether Apple can catch up with Facebook, Google, and Amazon’s AI capabilities, to which Cook responded, “Let me take exception to your question. Your question seems to imply that we’re behind.”

Siri is under assault on Apple's own devices.



from BuzzFeed - Tech https://www.buzzfeed.com/nicolenguyen/my-voice-is-my-passport?utm_term=4ldqpia
