After much initial promise, speech recognition technology has battled with
accuracy problems. But the next generation of tools now available is
increasingly capable of handling both regional accents and everyday
speech patterns.
And analyst Gartner predicts the global
market will be worth $191m (£93m) by 2010.
Currently, the US accounts for 70 per cent of speech recognition spending. But
the proportion is dropping as Europe starts to follow its lead.
The technology has many potential uses. In the healthcare sector, it offers a
cost-effective way to manage the burden of medical reporting – see Hammersmith
Hospitals case study.
And it can also be used to handle incoming customer queries – see Vodafone
case study.
CASE STUDY: Hammersmith Hospitals
The Hammersmith Hospitals NHS Trust
has cut the time it takes to process x-ray reports by a third, thanks to speech
recognition software adopted as part of wider IT upgrades in the trust’s
radiology departments.
Feedback from the trust’s Charing Cross Hospital shows that 84 per cent of
x-ray reports are now filed within the target time of 24 hours. And report
completion time has dropped from more than six hours to less than two.
The system has significantly improved efficiency, said Hammersmith director
of imaging Professor Philip Gishen.
“We used to see how many reports we could get done in 24 hours,” said Gishen.
“Now we are able to measure in minutes how long it takes to finish a report
from the moment an x-ray first hits our digital imaging system.”
Before the upgrades took place, the trust’s four London hospitals relied on a
combination of analogue images and human transcription for handling x-ray
reports. Doctors worked from a hard copy of each scan, giving dictations to a
stenographer, who would then write up the official paperwork.
Images are now uploaded to the trust’s digital picture archiving and
communication system. After selecting the relevant records for a specific
patient, medical staff can dictate their findings directly using either a
headset or a handheld dictaphone device.
Unlike human transcribers, the dictation software is capable of noting down
text at a natural rate of speech. The reporting doctors are also able to edit
and format their text using brief vocal commands.
CASE STUDY: Vodafone
Vodafone’s use of speech recognition
technology is focused on providing a more human experience for customers using
its automated call centre services.
The company receives about six million customer calls every week, which are
handled by either human agents or by the company’s interactive voice response
network.
By equipping its automated systems with a new “persona” – using upgraded
voice recognition software – the company hopes to encourage the use of its
self-service systems.
Presenting a more human interface was a vital step to overcoming negative
perceptions, said Vodafone head of self service Mel Rowland.
“The thing people hate most about using an automated service is that it feels
as if they are talking to a robot,” said Rowland.
“Our customers pay their bill 12 times a year, so we want them to use the
service 12 times a year. It has to be a good experience, or they simply will not
use it,” she said.
The virtual agent, known as Vicky, is designed to be more realistic. It uses
and understands natural speech patterns, which cuts down on stilted, one-way
conversations.
Since it was introduced in April, the system has acted as a guide through
Vodafone’s customer service options.
And by the start of 2008, Vodafone customers will also be able to register
new pre-pay phones, or pay their monthly bills using Vicky’s automated services.
Speech recognition is a useful tool, but it is not used to avoid talking
directly to customers, says Vodafone. The firm will retain its human operators,
and callers can still choose to speak directly to a person. A trigger system
will also automatically transfer them after three errors.
“There are 56 regional dialects in the UK, so there will be voices that the
system will struggle to understand,” said Rowland.
“If that happens, any information that is already completed is passed on to
the human agent, so the caller will not have to start the whole process over
again.”
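The fallback Rowland describes, a transfer after three errors with completed details carried over to the agent, could be sketched roughly as follows. This is an illustration only, not Vodafone's or Nuance's implementation: the names CallSession, handle_turn and handoff_payload are hypothetical, and the sketch assumes the three errors are counted across the whole call rather than consecutively.

```python
from dataclasses import dataclass, field

MAX_ERRORS = 3  # the "three errors" threshold mentioned in the article


@dataclass
class CallSession:
    """State for one automated call (hypothetical structure)."""
    collected: dict = field(default_factory=dict)  # details understood so far
    errors: int = 0                                # failed recognition attempts
    transferred: bool = False                      # True once handed to a person


def handle_turn(session, field_name, recognised_value):
    """Record one caller response; recognised_value is None on a failure.

    Returns "continue", or "transfer" once MAX_ERRORS failures have occurred.
    """
    if session.transferred:
        return "transfer"
    if recognised_value is None:
        session.errors += 1
        if session.errors >= MAX_ERRORS:
            session.transferred = True
            return "transfer"
        return "continue"
    session.collected[field_name] = recognised_value
    return "continue"


def handoff_payload(session):
    """Details passed to the human agent so the caller need not repeat them."""
    return dict(session.collected)
```

In this sketch, a caller whose account number is understood but whose next answer fails three times would be transferred, with the account number already in the handoff payload.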
The Vodafone and Hammersmith Hospitals speech recognition projects both use
systems provided by Nuance Communications.