At the CEWIT 2013 Conference in October, I spoke about the challenges of visual computing and of developing an exclusively voice-driven user interface. In preparation, I began by researching forecasts for the wearable computing market, and I was amazed at how widely they varied. In the graph below, you can see that forecasts for device shipments through 2018 range from nearly 500 million units all the way down to fewer than 100 million. Although the discrepancy is significant, it’s safe to conclude that the wearable computing market is going to grow. How much, how fast, and how long will it take? Who knows? What we do know is that it will grow!
I also looked at the evolution of head-mounted computers and was fascinated by how far we’ve come: from Steve Mann’s backpack computer in the 80s to today’s fully self-contained head-wearable computers like the Motorola HC1 and Google Glass. You can now put a smart device on your head and actually speak to it, and it listens. Or does it?
What are the major challenges of visual computing?
As we began developing the voice-driven UI that drives one of our products, we came across two major challenges related to the speech engine:
1. Features – What are they, and are they obvious to the user?
2. Phrasing – What can the user say to make the application work, and how should they say it?
These two challenges may sound easy to solve, but in our experience it took years to get the UI to a place where it addresses these issues effectively. Now take these two speech-engine challenges and add a head-wearable computer to the mix, where voice is the only input and a micro display and mini speaker are the only feedback as the user interacts with the computer. This combination posed several new visual computing challenges, including:
- Dealing with the speech recognition engine’s personality
- Dealing with environmental issues (e.g., noise)
- Handling varying accents
- Processing different vocabularies
- Cueing the user about what they can say
- Laying out an application and workflow that is voice-optimized
- Addressing security concerns, such as logging into the application
We addressed many of these challenges by establishing a standard operating model across the application, so that wherever users are within the application, the methods for getting what they need are consistent. We also limited the number of commands on each screen and carefully selected each screen’s vocabulary. This made the speech engine extremely accurate and efficient!
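To make the idea concrete, here is a minimal sketch of that per-screen command model. The screen names, command phrases, and function names are all hypothetical illustrations, not our actual product code; the point is simply that each screen exposes a small, fixed vocabulary plus a handful of global commands, and anything outside that set becomes an opportunity to cue the user.

```python
# Hypothetical per-screen command model: each screen exposes a small,
# fixed vocabulary, so the recognizer only distinguishes a handful of
# phrases instead of open-ended speech. All names here are illustrative.

# Global commands available on every screen, for a consistent operating model.
GLOBAL_COMMANDS = {"window close", "help", "go back"}

# Per-screen vocabularies, deliberately kept small.
SCREEN_COMMANDS = {
    "inbox":  {"open item", "next page", "previous page"},
    "viewer": {"zoom in", "zoom out", "select item"},
}

def active_vocabulary(screen: str) -> set:
    """Commands the recognizer should listen for on the given screen."""
    return GLOBAL_COMMANDS | SCREEN_COMMANDS.get(screen, set())

def handle_utterance(screen: str, utterance: str) -> str:
    """Dispatch a recognized phrase, or cue the user with what they can say."""
    phrase = utterance.strip().lower()
    if phrase in active_vocabulary(screen):
        return f"executing: {phrase}"
    # Cue the user: list the valid commands for this screen.
    options = ", ".join(sorted(active_vocabulary(screen)))
    return f"say one of: {options}"
```

Keeping the active vocabulary this small is also what makes recognition accurate: the engine is matching against a dozen phrases, not an open dictionary.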
Additionally, we spent hours in brainstorming sessions, UX storyboarding and workflow sessions, planning meetings, and design meetings figuring out the best approach to creating the UI. The key to success in all of these meetings was to include not just designers and engineers, but also members of other departments who knew little to nothing about the application. Inviting these folks to the table gave us a fresh perspective and helped us develop the system from a human-centered point of view. How many times have you picked up a new hardware device or purchased new software and immediately been able to tell it had not been designed with you in mind? We did not want that to happen with our product.
Lastly, how do you create a secure login when voice is the only input? We did it by using the built-in head tracker on the HC1 device to move a mouse pointer over each password item, then selecting it with a voice command like “select item”.
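A rough sketch of that login flow, under my own simplifying assumptions (a 3×3 symbol grid, head-tracker deltas moving a pointer, and a single “select item” phrase confirming the symbol under it; none of this is the HC1’s actual API). The key property is that no part of the secret is ever spoken aloud.

```python
# Hypothetical head-tracked login flow: head movement moves a pointer over
# a grid of password symbols, and a voice command confirms the symbol
# under the pointer. Grid contents and class names are illustrative.

GRID = [["A", "B", "C"],
        ["D", "E", "F"],
        ["G", "H", "J"]]

class LoginSession:
    def __init__(self, secret):
        self.secret = list(secret)   # the expected symbol sequence
        self.entered = []
        self.row, self.col = 0, 0    # pointer position, driven by head tracking

    def on_head_move(self, d_row, d_col):
        """Head-tracker deltas move the pointer, clamped to the grid."""
        self.row = max(0, min(len(GRID) - 1, self.row + d_row))
        self.col = max(0, min(len(GRID[0]) - 1, self.col + d_col))

    def on_voice_command(self, phrase):
        """'select item' confirms the symbol under the pointer.
        Returns True once the full secret has been entered."""
        if phrase.strip().lower() == "select item":
            self.entered.append(GRID[self.row][self.col])
        return self.entered == self.secret
```

Because the confirmation phrase is the same for every symbol, an eavesdropper hears only “select item” repeated, learning nothing about which symbols were chosen.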
Developing a voice-driven user interface was certainly an exciting project, but by no means was it always fun. In my sleep I can still hear echoes of one of my engineers saying, ad nauseam: “Window Close”; “Window Close”!! “Window Close”!!!!!