- Bradley Lehman, May 2008
I'm a performing musician and a composer. Some of the music I perform is fully written-out. Some of it is improvised from only a bass line, shorthand symbols, experience, and listening skills. Whatever is written on the page, it only matters if it affects the resulting sound in a performance. The performance is a crafted stream of sound, constrained to make sense as sound alone: with no visual component to aid the delivery or distract from it. Timing, articulation, pitch, and pacing are some of the most important tools.
Good radio programming is more compelling than television, because it engages my attentive imagination. Television makes things visually explicit, encouraging a more passive experience. The medium of radio forces the soundstream to stand alone, on merit.
Telephone systems (both touchtone and speech recognition) must deliver their interactive information only through sound. The system presents a context and some options, and the caller participates by choosing some path through them. Everything must be clear the first time it's heard. As a musician I assess things by the sound. I listen for delivery, pacing, intelligibility, and the telephone system's ability to supply context. Poorly-chosen ideas, clunky pacing, and confused phrasing bother me as much on the phone as they do in mediocre music. Didn't the designers, programmers, or company care enough to supply a good product? Is telephone support important to their continued business, or not?
The more I listen to telephone systems, some good and some unimpressive, the more I feel the phrase crafting and editing should be done by sensitive musicians. So should the voice coaching when the recordings are made. A well-crafted soundstream is that important. It advertises a company's attention to detail, and commitment to customer service.
There is more to add to this list, as I continue to study and apply other people's theories of Best Practices. There is always more to learn, and to bring into practical approaches along with my own experiences. Disputes of style will always be with us, just as they are in music.
It's stimulating to identify practical examples that sound "wrong", anywhere that they may be found, and to seek strategies of improvement. It's also stimulating to hear problems that have been solved inspiringly well.
There are also some remarkably good and inspiring systems out there: remarkable for their unremarkability. They work fine and courteously, without creating silly user irritations. But any little thing can go wrong, and the designer needs to know about it.
There are at least 14 typical actions that the caller might take at every available decision point. (0,1,2,3,4,5,6,7,8,9,#,*, hang up, or wait through a timeout -- and that's not counting any of the double-keystroke actions, or the entry of multi-digit fields such as membership numbers or phone numbers.) The system has to have a contextually relevant and intelligent response to every possibility, following up with a prompt and a path that make sense. Good systems furthermore do something different at the third or fourth timeout as opposed to the first several, and they take some other intelligent branch if the caller has had three or four keyed errors on a single question.
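The point about covering every action can be made concrete with a small sketch. It is only an illustration, not the code of any real system: the `DecisionPoint` class, the step names, the threshold of three, and the choice of "transfer_to_agent" as the escalation branch are all assumptions made for the example.

```python
from dataclasses import dataclass

# The 12 touchtone keys plus the two non-key events named in the text.
KEYS = list("0123456789#*")
ACTIONS = KEYS + ["hangup", "timeout"]

ESCALATE_AFTER = 3  # assumed threshold; the text says "third or fourth"


@dataclass
class DecisionPoint:
    """One question in the call flow (hypothetical structure)."""
    name: str
    valid_keys: dict  # key -> name of the next step
    timeouts: int = 0
    errors: int = 0

    def handle(self, action: str) -> str:
        """Return the name of the next step for any of the 14 actions."""
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        if action == "hangup":
            return "end_call"
        if action == "timeout":
            self.timeouts += 1
            if self.timeouts >= ESCALATE_AFTER:
                return "transfer_to_agent"  # assumed escalation branch
            return f"{self.name}:reprompt_timeout_{self.timeouts}"
        if action in self.valid_keys:
            return self.valid_keys[action]
        # Any other key is a keyed error at this question.
        self.errors += 1
        if self.errors >= ESCALATE_AFTER:
            return "transfer_to_agent"  # assumed escalation branch
        return f"{self.name}:reprompt_error_{self.errors}"
```

Because `handle` returns a next step for every member of `ACTIONS`, no keystroke or silence can leave the caller stranded, and the counters let the later attempts branch differently from the first ones.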
It is in everyone's best interest when well-crafted prompts get the caller to cooperate, politely and without fuss. The caller gets some appropriately deliverable customer service, with (one hopes!) not too much wasted time and effort. The company gets a reasonable phone bill plus not-yet-fatally-dissatisfied customers.
Somebody at the company should keep testing the odd paths all the time, and listening to feedback from real-world users of the system. Give the callers a "Suggestion Box" on phone and/or web to leave comments about the phone system. Pay the designer to spend a couple of hours every week calling all the systems, checking the usability for any problems. Remember the movie The Doctor, where William Hurt's character has a terrible bedside manner until he becomes a patient in his own hospital? Well-designed service must show empathy for the users and their frustrations. The phone system ought never to make the customers more upset than they already are, when calling in for help.
A corporate IVR/VUI designer is just as important as a corporate webmaster. The company's public image is at stake. Does the company really care about its customers? And keeping its customers?
I like re-prompts that phrase the questions in a slightly different way the second time, and with different inflection. That's how people talk when trying to elicit information from other people. It shows flexibility, and empathy with the other person's point of view (or confusion). The varied tone of voice shows concern, and a desire to help. It shows that the company is interested in communication with human callers.
If an IVR system merely plays the same small set of prompts over and over on every repeat, it shows that the designer or the company valued computer restrictions ahead of customers. Perhaps it was programming laziness, or a cost-cutting rush during development, or an unwillingness to prepare perfect phrases for each context in which they'll be used. Perhaps the builders didn't even bother to think deeply about usability. Perhaps they built or tested the thing visually, more than listening to it with a fresh mind. Clearly, no human-factors expert was brought in to push for improvements before release.
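One way to get varied re-prompts is to store several wordings per question and pick one by attempt number. The sketch below assumes a hypothetical "account_number" question; the wordings and the `prompt_for` helper are invented for illustration, and real recordings would need different inflection as well as different words.

```python
# Hypothetical wordings for one question; attempt 0 is the first asking.
PROMPTS = {
    "account_number": [
        "Please enter your account number, followed by the pound key.",
        "I didn't get that. Using your keypad, key in the account number "
        "printed on your statement, then press pound.",
        "Let's try once more. Your account number is the ten-digit number "
        "at the top of your bill. Enter it now, then press pound.",
    ],
}


def prompt_for(question: str, attempt: int) -> str:
    """Pick the wording for this attempt.

    Once the list runs out, stay on the last (most explicit) variant.
    """
    variants = PROMPTS[question]
    return variants[min(attempt, len(variants) - 1)]
```

Each later variant adds context rather than repeating the first sentence louder, which is closer to how a person rephrases a question for someone who seems confused.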
And why force the caller to conform to one and only one rigid path to get to the needed information? Spoken "menus" stink. The world doesn't fit into singly-connected lists of things. The callers can't see what's coming, but can only hear what arrived before they pressed something to move on. Human conversation doesn't go through (or back to!) stiff menus, and neither should automated support. Ideas and thoughts bounce around, with interconnections.
The caller's keystrokes and timeouts tell a story about a human need. The caller dialed in for some presumably legitimate reason: needing help with something. Callers aren't trained, and they're not looking at anything in particular. (They're certainly not looking at the same thing the system developers were looking at, onscreen or on paper!) Callers could be inattentive or in any sort of mood. They can press anything they want, or nothing. They can be patient or in a hurry. They can be multi-tasking. They can be unsure what they really want, or unsure what the system is able to deliver.
The system has to guide all these callers in useful directions, while the callers feel respected and in reasonable control of their own requests. The system must provide service that seems empathetic with that human need. It has to be flexible and resourceful in response...without the advantage of any normal human clues for interpreting intentions or emotions.
I'm a caller. I need service. Now. I dialed for help. The system can't sense how I'm feeling today, or how well I'm paying attention to it, but it has to handle me anyway. Give me multiple reasonable ways to get to my answer. If possible, give me multiple acceptable ways to frame my own request! There's a better chance I'll get served decently instead of hanging up. There's a better chance the company gets to keep my business. I'm highly educated, but I shouldn't have to use that as the customer of a doggone phone system. Prompt me courteously at a 5th-grade reading level, offering logical choices and sensitivity that would score with a kindergartner.
This is why IVR is difficult, time-consuming, and expensive to do well. [Tip of the iceberg: speech recognition is ten times harder yet.] It needs a conscientious and ruthless person to stay on top of all those possible problems of bad design. Everything must be designed and thoroughly tested from the user's point of view. Er, point of hearing. "View" is irrelevant, and menus are invisible fluff. The sound, pacing, and continued accuracy of the system are everything. The thing either makes crystal clear sense as a soundstream, or it needs improvement. Any broken navigation reflects badly on the company's customer service commitments.
And what about those Store-Locator features on phone systems? They're useful, but they could always be improved. Too many of them sound like brochure-ware, reading addresses exactly as they appear on paper or a computer screen. It doesn't work on IVR. If the caller is riding in a vehicle, or worse, driving around looking for the building, the fast reading of an address is informative but not helpful.
Try this experiment. Go to a company's web site, bring up the Store Locator, put in a zip code, and spend two seconds glancing at the screen listing four or five store locations. What are the two pieces of information your eye looks for first? Driving distance (if available) and city name, most likely. The eye jumps directly to the city name on the third line of the printed address. If you don't want to go to that city, you don't care about its street address or phone number. You don't bother reading them. Your eye skips automatically to the city of the next address.
Now, what do you hear on most IVR store locators? The system dutifully reads you a recording starting with the street address, maybe also the name of a shopping center, and only later tells you what city the store is in! The ear hasn't heard the all-important city name or distance first to set mental context, and the ear can't skip ahead.
On the phone, the delivery also has to allow enough padding time before and during the address: giving the caller time and mental context to jot down the wanted information. Here is the way I did it for an Illinois government-office locator in 2007:
Read the following presentation aloud as someone writes down the information you're speaking! (The caller is also allowed to interrupt the address at any time by saying "NO" or pressing any key, and the system moves ahead to the next location found.)
OK, I have the office information. When you are ready to write it down, say yes. "YES"
I found more than one office, so I will read them to you one at a time.
After you have heard the information you want, you can simply hang up.
It is in the city of: Belleville. [Pause 1 second]
Family Community Resource Center. [Pause 1 second]
The street address is: 1-2-2-0 Centreville Avenue.
Belleville, Illinois. [Pause 1 second]
The ZIP code is: 6-2-2-2-0. [Pause 1 second]
The office's phone number is: 6-1-8, 2-5-7, 7-4-0-0. [Pause 2 seconds]
Do you want me to repeat that address? "NO"
Here is the next office. [Pause 1/2 second]
It is in the city of: East Saint Louis. [Pause 1 second]
Family Community Resource Center. [Pause 1 second]
The street address is: 2-2-5 North 9th Street.
East Saint Louis, Illinois. [Pause 1 second]
The ZIP code is: 6-2-2-0-1. [Pause 1 second]
The office's phone number is: 6-1-8, 5-8-3, 2-3-0-0. [Pause 2 seconds]
Do you want me to repeat that address? "NO"
OK, that's all for now. [Pause 1/2 second]
This office search is also available on the internet. Go to w-w-w....
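The ordering and pacing of the script above can be generated from an ordinary address record. What follows is a minimal sketch, not the code of the 2007 system: the `Office` record, the `spell` and `spoken_lines` helpers, and the representation of pauses as seconds are assumptions made for illustration. The wording and pause lengths are copied from the script.

```python
from dataclasses import dataclass


@dataclass
class Office:
    """Hypothetical record for one office location."""
    name: str
    street_number: str
    street_name: str
    city: str
    state: str
    zip_code: str
    phone: str  # ten digits, no punctuation


def spell(digits: str) -> str:
    """Render digits for one-at-a-time reading: '62220' -> '6-2-2-2-0'."""
    return "-".join(digits)


def spoken_lines(office: Office) -> list:
    """Return (text, pause_seconds) pairs, city first, as in the script."""
    p = office.phone
    phone = f"{spell(p[:3])}, {spell(p[3:6])}, {spell(p[6:])}"
    street = f"{spell(office.street_number)} {office.street_name}"
    return [
        (f"It is in the city of: {office.city}.", 1.0),
        (f"{office.name}.", 1.0),
        # The script marks no pause between the street line and the city.
        (f"The street address is: {street}.", 0.0),
        (f"{office.city}, {office.state}.", 1.0),
        (f"The ZIP code is: {spell(office.zip_code)}.", 1.0),
        (f"The office's phone number is: {phone}.", 2.0),
    ]
```

Keeping the pause next to each phrase means the pacing is part of the data handed to the playback layer, so it can be reviewed and tuned by ear instead of being buried in the call-flow logic.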
Again, everything in a phone system needs to be designed so it makes usable sense as a sound stream. The callers won't be sitting at desks or looking at anything. They might be distracted by other simultaneous tasks. They might need to write down what they hear.
Poorly-designed systems expect callers to be empathetic with computer restrictions (or design restrictions!), patient and forgiving with bad phrasing, and able to figure out bizarre and unnecessary problems...just to get any service at all. The callers are unwilling and unpaid servants of the bad design. Their own needs have to be squashed into the several rigid things the system is able to deliver, on its own terms, at its own speed. Well, who should drive the requests during the call? The automated system, or the customers needing to be served, getting something of value from the call? Didn't they take the initiative to dial in? Can't they hang up in disgust whenever their needs aren't satisfied, and usually take their business elsewhere?
Good systems let the callers "get in, get out, and get on with their lives" (to paraphrase the slogan of a restaurant chain). The well-designed computer system is the customer's helpful and courteous servant, not the customer's master. The customers' needs must be heard and satisfied, intelligently. Dozens or hundreds of customers, all at once.
I am passionate, maybe even obsessive, about designing these systems well.
End of manifesto, for now....
© 2008 Bradley Lehman