Differences Between UX and VX: What Really Makes An Effective Voice Experience?

So, you wanna design voice interfaces, 'ey?

Recently within the digital community, there’s been some discussion about the key differences between UX (User Experience) and VX (Voice Experience) design. As a UX designer who’s had the chance to work on and develop some of the industry’s biggest VX projects, I’ve had the lucky opportunity to walk the tightrope between the two worlds.

Spoiler alert: they are the same world.

This differentiation between the two ‘disciplines’ boils down to a simple misconception: voice itself.

The reality is that people don’t get voice yet. One of my pet peeves is hearing clients and designers alike refer to voice as a platform. The sad reality is that it isn’t. Voice leverages existing web technology and gives users a new method of interaction. A more succinct way to describe voice is that it enables a new form of interaction built upon existing digital systems. This by no means reduces the impact that voice is going to have on UX design, but it does change the lens through which we, as designers and brands alike, should approach it. It gives designers the opportunity to create products with a level of polish and human experience that traditional input methods struggle to achieve.

Voice is easy to design, but extremely difficult to design well.

Here at VERSA, we’ve developed a model that informs our design outcomes, so that we’re always ensuring that we are designing for the optimal user experience within our voice projects.

There have been a few attempts in the past to map Jakob Nielsen’s 10 usability heuristics onto voice; however, visibility of system status simply isn’t applicable when we’re talking about voice. These usability heuristics were created with GUIs (Graphical User Interfaces) and physical systems in mind, not VUIs (Voice User Interfaces). While there is a significant level of overlap, assessing or equating the success of VUIs by these means can, in some cases, be argued to be disingenuous.

In the end, truly captivating voice experiences boil down to the following relationships that go into building an impactful conversational experience.

Without a doubt, this is voice’s greatest strength.

The ability to turn simple data inputs into a joyful experience through conversational design is unparalleled by any other interaction method. As humans, we use our voices to communicate, and the same lens should be applied to VX design. It allows us to bridge the gap between human-to-human conversation and digital systems. Even hints of emotionless responses are amplified by voice, ultimately resulting in cold, robotic outputs being met with even colder user inputs.

Context should dictate responses

Conversational design should always convey a level of transparency, empowering users with the confidence to step through an experience without questioning whether their next phrase will be accepted as an input. If you looked up voice interface in the dictionary, chances are you’d be confronted with an image of a Viking Hurstwic, a.k.a. a double-edged sword. This is for two reasons:

  1. Conversations are hard to predict at the best of times, even between humans.
  2. A conversationally designed UI is built around the ability to map conversations.

These two truths are two sides of the same coin. Ultimately, it comes down to us as designers to design experiences that support our users at any stage of their voice journey. Users should always be able to ask for instruction. Open experiences have a level of elegance that can’t be mimicked; however, sometimes the most efficient responses are the best responses.
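The "users should always be able to ask for instruction" principle can be sketched in code. This is a minimal, hypothetical example (no real voice-platform SDK, and the state names and prompts are invented for illustration): universal intents like help are checked before any state-specific logic, so support is available at every stage of the journey.

```python
# Hypothetical per-state help prompts; a real skill would load these from content.
HELP_PROMPTS = {
    "choosing_size": "You can say small, medium, or large.",
    "confirming_order": "Say yes to confirm, or no to change your order.",
}

def handle_turn(state: str, utterance: str) -> str:
    """Route one dialogue turn, keeping 'help' valid in every state."""
    text = utterance.lower().strip()
    # Universal intents come first, so the user can always ask for instruction.
    if "help" in text:
        return HELP_PROMPTS.get(state, "You can ask for help at any point.")
    if "start over" in text:
        return "Okay, starting over. What would you like?"
    # State-specific handling would go here; unrecognised input gets a
    # gentle reprompt rather than an error.
    return f"Sorry, I didn't catch that. {HELP_PROMPTS.get(state, '')}".strip()
```

The point isn't the keyword matching (a real system would use an NLU model); it's the ordering: supportive, universal paths are evaluated before anything else.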

Efficiency of input

In a not-so-distant future, voice is everywhere. You log into your computer, start up your browser, and are confronted with the results of two different surveys. Damn, you think to yourself. Even though you love user data, Excel at the best of times can be tedious and anger-inducing. The new intern, Jimmy, has nothing to do today. You want to sift through this data and pull out the tasty morsels. If only there were a way to easily decode the data with another person, you ask yourself. Oh wait, there is!

You open your Alexa/Google Assistant/Bixby and simply say:

“Can you show the relationships between the two datasets?”

A complex and time-consuming task has been reduced to a single sentence.

Personality should be included to enhance an experience, without significantly impacting function or efficiency

In turn, when I ask Alexa for the weather, I don’t expect it to be relayed to me in a funny voice. When I ask the Jim Carrey skill for the weather, I would be shocked if I was not presented with the most over-the-top, metaphorical version of “13 degrees with a chance of showers.”

The context of how and where users are using your experience should dictate the responses from the system.
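One way to sketch that idea in code (names and personas are invented for illustration): the same weather intent can be rendered through different response templates depending on the context or persona the user invoked.

```python
def render_weather(temp_c: int, rain_chance: float, persona: str = "default") -> str:
    """Render the same weather intent differently depending on context."""
    if persona == "over_the_top":
        # An entertainment skill can afford theatrical flourish.
        return (f"Brace yourself! A positively thrilling {temp_c} degrees, "
                f"with the heavens threatening a {rain_chance:.0%} chance of showers!")
    # The default assistant stays brief and neutral.
    return f"{temp_c} degrees with a {rain_chance:.0%} chance of showers."
```

The data is identical in both branches; only the presentation changes, which keeps personality from compromising the underlying function.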

This is the power of voice

I’m not saying our roles as UX designers are going to be replaced by AI-powered voice assistants. All I am saying is that complex, time-consuming tasks that require multiple individual inputs through traditional GUIs can be reduced to a single, simple, human sentence.

There’s a flip side to efficiency as well. Upon stating my request, I don’t need a smart assistant telling me, “I hope you’re having a lovely day today; here are your results.” Sometimes the best response is the most efficient one.

If I ask to turn my lights on, it’s totally fine just to hear a sound prompt telling me that my input has been heard, and for my lights to come on. I don’t need a conversation about the fact that I’m turning on my lights.
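A minimal sketch of that efficiency rule (the intent names and response fields are hypothetical, loosely modelled on how voice-skill responses are typically structured): simple device commands get a confirmation tone and end the session, while open-ended requests keep the conversation going.

```python
# Intents that need acknowledgement, not conversation (illustrative names).
SIMPLE_COMMANDS = {"lights_on", "lights_off", "volume_up"}

def respond(intent: str) -> dict:
    """Pick a response shape based on whether the intent is a simple command."""
    if intent in SIMPLE_COMMANDS:
        # No spoken reply at all: a short confirmation earcon, then end the session.
        return {"speech": None, "earcon": "confirm.mp3", "end_session": True}
    # Open-ended intents invite further dialogue.
    return {"speech": "Sure, what would you like to know?", "end_session": False}
```

The design choice is in `end_session`: a command that needs no follow-up should release the user immediately rather than hold them in a conversation.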

In addition, voice provides users with a level of accessibility that no other interaction method can match.