Heeding Customer Requests

T. Rowe Price implements a natural-language voice-recognition system that lets retirement-account customers check their balances and fund prices, request statements, and, eventually, initiate transactions, all by talking to a computer over the phone.

information Staff, Contributor

July 5, 2001


How would you prefer to access information--wade through a menu of computer-generated options, or ask someone a simple question and get a precise answer? Conventional interactive voice-response systems radically limit user interaction to touch tones or menu-driven responses. Fortunately, next-generation technology that understands natural language can remove these strictures and achieve the elusive goal of having applications conform to users instead of the other way around.

That approach is being realized at T. Rowe Price Inc., a financial-management company that administers corporate 401(k) plans. A pilot project that uses IBM's natural-language-understanding voice-recognition system will give retirement-plan customers voice access to applications that will perhaps rival interaction with customer-service representatives.

While interactive voice-response systems have partially addressed this need in the past, many people have found navigating through menus to be confusing and frustrating. By speaking to an automated phone attendant based on IBM's latest voice-recognition products, T. Rowe Price customers will be able to ask questions and get immediate answers about their retirement accounts and investment-fund offerings; eventually, they'll also be able to initiate transactions such as stock trades.

Growing complexity is what led the Baltimore company to invest in cutting-edge natural-language processing. As retirement plans have grown more complex, with added products and services, T. Rowe Price has responded by making inventory and account information available through an interactive voice-response system. However, about 2-1/2 years ago, company officials recognized that the voice-response system was growing unwieldy: its menu had become too complex for the average customer to navigate. It was great that the company was able to automate a lot of sophisticated features, but the more options there were, the less usable the system was--so T. Rowe Price looked for ways to simplify the system. Speech recognition was an attractive alternative.

Voice access to enterprise systems such as databases and order-processing applications will grow rapidly this decade, analysts say. "Voice is the next killer application poised at the start of a major technology investment cycle," says Marianne Wolk, a senior analyst at Robertson Stephens. Market analysts at the Kelsey Group predict that worldwide revenue from voice applications will reach $41 billion by 2005.

Speech technology that's widely available, including IBM's WebSphere Voice Server, lets companies build functional low-end speech-recognition applications that use complex grammars. However, support for complex grammars essentially means that you must hire a linguist to build a grammar that captures the superset of utterances users might make. This type of system doesn't scale well: it's hard for a linguist to predict everything customers might ask, and then hard for a compiler to deliver the performance needed.
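To make the scaling problem concrete, here is a minimal Python sketch of a hand-written grammar. It is an illustration only, not IBM's grammar format: the `GRAMMAR` table, the intent names, and the `match_grammar` function are all hypothetical, and regular expressions stand in for grammar rules.

```python
import re

# Hypothetical hand-written "grammar": each intent lists only the
# phrasings its author anticipated. Regular expressions stand in for
# grammar rules; this is not any vendor's actual grammar format.
GRAMMAR = {
    "account_balance": [
        r"^what is my (account )?balance$",
        r"^(tell me|give me) my (account )?balance$",
    ],
    "fund_price": [
        r"^what is the price of .+$",
    ],
}


def match_grammar(utterance):
    """Return the intent whose rule matches the utterance, or None."""
    # Lowercase and strip punctuation so rules only deal with words.
    text = re.sub(r"[^a-z0-9 ]", "", utterance.lower()).strip()
    for intent, patterns in GRAMMAR.items():
        for pattern in patterns:
            if re.match(pattern, text):
                return intent
    return None


if __name__ == "__main__":
    for phrase in ["What is my account balance?", "How much do I own?"]:
        print(phrase, "->", match_grammar(phrase))
```

The first phrase matches a rule, while the second falls through to `None` because nobody wrote a rule for it. Every new phrasing means another rule, which is the maintenance burden described above.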

T. Rowe Price instead opted for slightly ahead-of-the-curve technology that understands natural language. It wanted a system that could let customers simply ask for what they want, instead of having to find it on a menu. The company had already been working with IBM on a voice-response system for its retail mutual-fund customers to do "direct-to-dialog" speech recognition. This is a fill-in-the-blank speech interface, where customers speak fund names or a keyword instead of navigating long lists.

Around that same time, IBM was looking for a partner to test the natural-language-understanding speech recognition it had developed. IBM's technology lets the user speak unstructured, conversational dialog rather than specific commands.

IBM's technology is provided as a component of its ViaVoice Telephony technologies, which use statistically based models that eliminate the limitations of speech-recognition grammars used in many speech applications. The software effectively acts like a polite, intelligent customer-service representative. It can understand comments and requests, provide context-sensitive prompting, and it doesn't even get mad if you interrupt while it's speaking. "We just thought it was a great fit, so we really jumped at the chance to work with IBM in developing a fully functional natural-language system," says Tom Kazmierczak, T. Rowe Price's VP of business operations.

Elements Of IBM WebSphere Voice Server

-VoiceXML Browser: Software component that parses and implements the VoiceXML 1.0 standard

-Voice Recognition Engine: Software component that converts spoken words to text and acoustic data optimized for the telephony environment

-Text-To-Speech Engine: Software component that converts text input to spoken output for playback of recognized text and application prompts

-System Management: Tools for managing a WebSphere Voice Server deployment

-WebSphere Voice Server Developers Kit: Rapid development environment for voice-enabled Web applications

DATA: IBM

The pilot application will serve as an interface to the T. Rowe Price Plan Account Line, a toll-free service that lets 401(k) participants access their accounts over the phone. The application is fronted by a Genesys Telecommunications Laboratories computer telephony application that connects to the call center. It uses a WebSphere voice server, IBM DirectTalk 2.2, a natural-language-understanding server, and a ViaVoice Speech Recognition 1.1 voice-recognition server. WebSphere runs on an RS/6000 platform running AIX 4.2. This server is in turn connected to Corba-based middleware running on a Sun Microsystems server, which connects to a back-end mainframe running CICS, as well as a DB2 MVS database. Call-center sites are located in Owings Mills, Md.; Tampa, Fla.; and Colorado Springs, Colo.

The onus was on T. Rowe Price to be prepared for the hundreds of thousands of different ways that a caller can ask for certain things. For example, there are probably a couple of hundred different ways to ask for an account balance: What is my account balance? How much do I own? Can I retire yet?

"We leveraged the expertise in our call centers," Kazmierczak says. "We had our front-line representatives literally document all of the ways customers ask for an account balance."

So the company collected all of this data--more than 35,000 sentences and phrases. IBM then took that data and expanded on it. The application wound up with an engine that understands hundreds of ways that people can ask for different things.
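As a rough illustration of how collected phrases can drive a statistical engine, the Python sketch below trains a small naive Bayes classifier on example sentences and scores a new utterance against each intent. This is an assumption-laden stand-in, not IBM's patented models: the training phrases, the intent names, and the `train`, `classify`, and `respond` functions are hypothetical, and a real system would learn from tens of thousands of sentences rather than ten.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training phrases standing in for sentences collected
# from call-center representatives. Not T. Rowe Price's actual data.
TRAINING = [
    ("what is my account balance", "account_balance"),
    ("how much do i own", "account_balance"),
    ("how much money is in my account", "account_balance"),
    ("can i retire yet", "account_balance"),
    ("what is the price of my fund", "fund_price"),
    ("how is the fund doing today", "fund_price"),
    ("give me the latest fund prices", "fund_price"),
    ("send me a statement", "request_statement"),
    ("i need a copy of my statement", "request_statement"),
    ("mail my quarterly statement", "request_statement"),
]


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per intent; return (word_counts, intent_counts, vocab)."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocab = set()
    for text, intent in examples:
        tokens = tokenize(text)
        word_counts[intent].update(tokens)
        intent_counts[intent] += 1
        vocab.update(tokens)
    return word_counts, intent_counts, vocab


def classify(text, model, min_margin=0.5):
    """Return the most probable intent, or None if nothing stands out."""
    word_counts, intent_counts, vocab = model
    tokens = [t for t in tokenize(text) if t in vocab]
    if not tokens:
        return None  # no known words at all
    total = sum(intent_counts.values())
    scores = {}
    for intent, n in intent_counts.items():
        # Laplace (add-one) smoothing keeps unseen words from zeroing out.
        denom = sum(word_counts[intent].values()) + len(vocab)
        score = math.log(n / total)
        for t in tokens:
            score += math.log((word_counts[intent][t] + 1) / denom)
        scores[intent] = score
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Treat a narrow lead over the runner-up as "not sure".
    if ranked[0][1] - ranked[1][1] < min_margin:
        return None
    return ranked[0][0]


MODEL = train(TRAINING)


def respond(text, model=MODEL):
    """Route a confident match; otherwise prompt with suggestions."""
    intent = classify(text, model)
    if intent is None:
        return "You can ask about: " + ", ".join(sorted(model[1]))
    return "Routing to: " + intent
```

Unlike a hand-written grammar, the classifier can handle a phrasing it never saw, such as "How much do I have in my account?", because it weighs individual words rather than matching whole sentences. The `min_margin` threshold is an arbitrary choice for this sketch; when no intent clearly wins, `respond` falls back to suggesting what the caller can ask for.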

"We've used IBM-patented statistical models to build a probabilistic perspective of how users interact with a [voice-response system]," says Sunil Soares, program director of product management at IBM Voice Systems. "We collected phrases by interviewing customer-service representatives and began a pilot. Based upon that pilot, we amended the phrases to get better hits." At the same time, if the users don't know what to say, the system will prompt them with suggestions. The company plans to conduct a focus group next month to test some of the options.

The pilot went live at the end of April with five customers. T. Rowe Price has months of data, but it has yet to fully replicate all of the functionality within its existing touch-tone system. This first phase is primarily an inquiry-only system; customers won't be able to initiate transactions yet. The company plans to launch a pilot that includes the transactions capability sometime this year. Eventually, it plans to make the application available to its entire client base of approximately 1.1 million plan participants.

"We want to make sure our customers are comfortable with this technology," Kazmierczak says. Customers were pretty comfortable asking for account balances, but showed some hesitation about actual trading, he says. T. Rowe Price will use customer satisfaction to measure the application's success. "At the end of the day, we ask ourselves if the customer is more satisfied with the service we're providing them," he says.

The 401(k) voice-recognition application program is scheduled to be made available to all of T. Rowe Price's customers by year's end. IW
