We've seen a number of paradigm shifts in how we interact with technology over the decades. The first interaction paradigm I experienced with computers involved a keypunch machine: you created punch cards, fed them into a hopper for the computer to read, and the output came back as a printout from the printer. Next came a keyboard and cathode ray tube (CRT), followed by the personal computer. Then came a mobile phone with T9 texting, followed by the amazing multitouch iPhone.
We're now witnessing another phenomenally important paradigm shift in interaction: the use of voice as input and audio as output. I've used voice as an input mechanism before. In fact, I wrote much of my book, User-Centered Design: An Integrated Approach, using voice dictation with IBM's ViaVoice. That generation of voice interface was limited to dictation and voice commands for controlling a computer interface. The former was quite popular, especially in specialized fields like medicine, but voice commands never caught on. The technology wasn't ready for prime time then, but it is now. Today's voice interfaces don't require a computer and they're free-form. Amazon Echo is an ambient voice interface: you can speak to it from anywhere in a room. Apple's AirPods, by contrast, are a personal one, making Siri accessible with a simple double-tap. Because they're personal, the AirPods essentially act as an augmentation of the human brain. Issue a question or a command, as you would to your own brain, and your trusty AirPods deliver the answer or carry out the action for you personally, without anyone else knowing.
So, what are the implications of this new interaction method for designers? It means that the traditional mainstays of design, like typography, iconography, and color, are no longer the only skills that are relevant. Voice, earconography, and timbre are now important too. Is the voice whimsical, authoritative, or neutral? Is the earconography recognizable, meaningful, and pleasant? Is the timbre that of a woman's voice, a man's voice, or a mechanical voice? These are entirely new challenges for designers to understand, master, and apply.
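To make those new materials a bit more concrete, here is a minimal sketch using the browser's Web Speech API. It is not from the original post, and the voice name, pitch, and rate values are purely illustrative assumptions; the point is simply that tone and timbre become parameters a designer tunes, much as they once tuned type and color.

```typescript
// A minimal sketch (assumes a browser with Web Speech API support) showing
// how voice characteristics become design parameters rather than visual ones.
// The voice name "Samantha" and the pitch/rate values are illustrative
// assumptions, not recommendations.

function speakReminder(): void {
  const utterance = new SpeechSynthesisUtterance("Your meeting starts in ten minutes.");

  // Timbre: choose among the voices the platform exposes (female, male, or synthetic).
  const voices = window.speechSynthesis.getVoices();
  utterance.voice = voices.find(v => v.name === "Samantha") ?? voices[0] ?? null;

  // Tone: a slightly lower pitch and slower rate tend to read as more
  // authoritative; higher and faster reads as more whimsical.
  utterance.pitch = 0.9;  // range 0-2, default 1
  utterance.rate = 0.95;  // range 0.1-10, default 1

  window.speechSynthesis.speak(utterance);
}

speakReminder();
```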
Given that the interaction is now more natural and human, expectations are also higher: people expect human-like interactions. How intelligent the content of the interaction is turns out to be crucially important too. I'll deal with that and the broader topic of artificial intelligence, or what IBM more encompassingly calls "cognitive computing," in a future post. It's worth noting that much content is now also consumed via audio; the popularity of podcasts and audiobooks is evidence of that trend.
The point I'd like to leave designers with is that future "user interfaces" may not at all be what you've considered UIs thus far, and the skills you'll need in this new world will also be different from the ones you've developed to date. Of course, not all interfaces will be voice and audio based, but more and more will be, much as full desktop user interfaces gave way to increasingly mobile ones. The future will likely see certain interactions delivered by voice, others by a mobile device, and still others by a computer. It's an exciting time to be a designer, as long as you add voice and audio interface design to the skills you're going to focus on in your Career Workouts (see my last post for more information on this).