How to drive the Vocollect™ voice engine on enterprise
handheld devices and have it serve your application in minutes.
Foreword
You have probably heard that Vocollect Voice® systems are considered
best-in-class for implementing enterprise voice-driven applications. I also
know that the power of these systems is considered somewhat
"difficult" to control, requiring many professionals and a lot of time to
implement.
Over the last few years, Vocollect has released new
tools to make the implementation engineer's life easier: above
all, VoiceArtisan and Voice Interface Objects, which are very powerful and flexible
toolkits for building robust voice-driven applications.
Handheld devices
Yes, but Vocollect Voice does its best when used on Vocollect-made
devices, which are designed for high-quality audio performance. This was absolutely
true until Vocollect released its state-of-the-art Bluetooth headset, the SRX2,
and the new series of Vocollect VoiceClient and VoiceCatalyst for handheld
devices (MP). Thanks to Vocollect® R&D on the audio streaming protocol
over Bluetooth, and to the SoundSense® noise-suppression feature, we now have a
very good audio device on the most common enterprise handhelds from Intermec, Motorola
and Honeywell as well.
Handheld terminals are not the best choice for intensive voice-driven work, where the wearability of the hardware plays an important role. But when voice-driven work takes up just a portion of the working time, a handheld device is the right choice, as it can run classic non-voice applications too. On the other hand, the effort of developing a direct interface between the voice application and the process management system for a simple, non-intensive workflow may discourage IT people. What follows is a real-life example I recently developed in response to a request from a local software house.
The Need
The challenge was to run a pilot voice application to drive
the picking process for web-ordered items in a grocery point of sale.
The customer was already running a web-based application on a Motorola
handheld device. We had a short timeframe, and the software house that maintains
the web application had no time to develop a brand-new interface to serve the
Vocollect voice dialogs. KFI did want to run that pilot, as demand for this kind of
application is growing. So we took on the technical challenge: the web
application, from within the web page, had to drive the basic voice dialogs
used to guide the operators to locations, pick items, and put them into the right customer box.
The Solution
Since we are talking about HTML web pages, and the enterprise handheld
devices are equipped with Windows Embedded Handheld, we decided to
implement a few simple JavaScript functions whose task is to launch minimal
voice dialog tokens. We settled on a simple "Say" method (the terminal
speaks a sentence), a "Navigate" method (directs the operator
to a specified location), and an "IdentifyItem" method (the operator reads a three-digit
check number or scans a barcode).
We had two options for pushing commands to an application external to the web page: embedding an ActiveX object in the "voice driving"
web page, or using AJAX techniques to communicate with the external software service. We chose the first, although the AJAX interface was implemented too.
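As a rough illustration of the AJAX alternative, the sketch below shows how a page function might send a voice request to a local service over HTTP. The endpoint URL, the query parameter names, and both function names are assumptions made for this example; they are not the actual KFI interface.

```javascript
// Hypothetical address of the local voice service; not the real KFI endpoint.
var VOICE_SERVICE_URL = "http://127.0.0.1:8080/vio";

// Build the request URL for a dialog method and its arguments.
// The "method" and "argN" parameter names are assumptions for this sketch.
function buildVoiceRequestUrl(baseUrl, method, args) {
    var parts = ["method=" + encodeURIComponent(method)];
    for (var i = 0; i < args.length; i++) {
        parts.push("arg" + i + "=" + encodeURIComponent(args[i]));
    }
    return baseUrl + "?" + parts.join("&");
}

// Send the request asynchronously and hand the reply text to a callback.
function callVoiceService(method, args, onDone, onError) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", buildVoiceRequestUrl(VOICE_SERVICE_URL, method, args), true);
    xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) { return; }   // 4 = request finished
        if (xhr.status === 200) {
            onDone(xhr.responseText);
        } else {
            onError(xhr.status);
        }
    };
    xhr.send(null);
}
```

A page could then call callVoiceService("Navigate", ["aisle 12", "slot 23"], showResult, showError), where showResult and showError are callbacks of your own. Unlike a direct ActiveX method call, the reply arrives in a callback, so the dialog steps have to be chained rather than written one after another.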
Powered by Vocollect™
To run the pilot we used an Intermec CN70 handheld terminal
equipped with Vocollect Voice® MP 2.0 and the Vocollect® SRX2 Bluetooth headset.
To read barcode data hands-free, a Motorola RS507 ring scanner was used. To
build the voice application dialog, we leveraged the flexibility of the
Vocollect Voice Interface Objects toolkit (VIO).
How does it work?
The solution consists of three components:
- The KFI ActiveX DLL object
- The KFI VIO Server
- The Vocollect® VoiceClient MP
The core of the solution is the KFI VIO Server component.
It listens for "voice dialog element" requests from the KFI VIOX ActiveX object (embedded in the HTML
page) and maps each request to a Vocollect VIO sequence of voice objects,
which actually carries out the voice dialog. Data returned from the voice dialog is
then routed back to the calling ActiveX method and captured by the web page's
JavaScript function.
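The request-to-sequence mapping can be pictured as a lookup table. The sketch below is written in JavaScript only for consistency with the page script; it illustrates the idea and is not the KFI VIO Server's code. The step names and the shape of the request object are invented for the example.

```javascript
// Hypothetical table: each dialog element maps to an ordered list of voice steps.
var dialogSequences = {
    Say: function (args) {
        return [{ step: "speak", text: args[0] }];
    },
    Navigate: function (args) {
        return [
            { step: "speak", text: "go to " + args[0] },
            { step: "speak", text: args[1] },
            { step: "listen", expect: "ready" }
        ];
    },
    IdentifyItem: function (args) {
        return [
            { step: "speak", text: args[0] },
            { step: "listen", expect: "check digits or barcode" }
        ];
    }
};

// Map an incoming request to its sequence, or fail on an unknown element.
function mapRequest(request) {
    // Own-property check so inherited names such as "toString" are rejected.
    if (!Object.prototype.hasOwnProperty.call(dialogSequences, request.element)) {
        throw new Error("Unknown dialog element: " + request.element);
    }
    return dialogSequences[request.element](request.args);
}
```

Keeping the mapping in one table means a new dialog element can be added without touching the code that receives the requests.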
The following is an example of a web page running a simple
picking script:
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html
xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>VIOX Test Page</title>
<script type="text/javascript"
language="JavaScript">
// Run the sample picking dialog as soon as the page loads.
function runvoice()
{
    Say("welcome!");
    Navigate("aisle 12", "slot 23");
    IdentifyItem("spaghetti pasta, 3,45 euro");
    IdentifyItem("tote 23");
}
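// Speak a prompt on the terminal and show the VIO response on the page.
function Say(prompt)
{
    try
    {
        // .object exposes the ActiveX instance hosted by the <object> element
        var myObject = document.getElementById('MPcontrol').object;
        myObject.callSpeaking(prompt);
        var myresponse = myObject.vioResponse;
        document.getElementById('fase_1').innerHTML =
            "<p>callSpeaking " + myresponse + "</p>";
    }
    catch(e)
    {
        // JavaScript error objects expose "message" in lowercase
        document.getElementById('fase_1').innerHTML = e.message;
    }
}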
// Direct the operator to a location and show the VIO response on the page.
function Navigate(aisle, slot)
{
    try
    {
        var myObject = document.getElementById('MPcontrol').object;
        // Pass the function's own parameters to the ActiveX method
        myObject.callNavigation(aisle, slot);
        var myresponse = myObject.vioResponse;
        document.getElementById('fase_2').innerHTML =
            "<p>Navigate to " + myresponse + "</p>";
    }
    catch(e)
    {
        document.getElementById('fase_2').innerHTML = e.message;
    }
}
// Ask the operator to confirm an item by check digits or barcode scan,
// then show the VIO response on the page.
function IdentifyItem(prompt)
{
    try
    {
        var myObject = document.getElementById('MPcontrol').object;
        myObject.callBarcode(prompt);
        var myresponse = myObject.vioResponse;
        document.getElementById('fase_3').innerHTML =
            "<p>Identify " + prompt + ": " + myresponse + "</p>";
    }
    catch(e)
    {
        document.getElementById('fase_3').innerHTML = e.message;
    }
}
</script>
</head>
<body
onload="runvoice();">
<p>GROCERY DRIVE</p>
<p>SPAGHETTI PASTA</p>
<p>1 EACH</p>
<p>CUSTOMER 24332</p>
<div id="fase_1">
</div>
<div id="fase_2">
</div>
<div id="fase_3">
</div>
<div id="fase_4">
</div>
<object id="MPcontrol"
classid="CLSID:5301F651-5E1D-4D2C-960F-EC55CA020080"
width="0" height="0"></object>
</body>
</html>
As you can see, a full picking dialog can be built with just three
object methods. Generally speaking, a full voice-driven application can be implemented in standard web
scripting with no more than ten methods.
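To give an idea of how the same three calls extend to a whole order, here is a minimal sketch that walks a pick list. The shape of the order lines and the voice parameter are assumptions for illustration: voice stands for a small wrapper of your own around the page functions that returns each response instead of writing it to the page.

```javascript
// Walk every line of an order: go to the location, confirm the item,
// then confirm the destination tote. Returns the responses collected.
function runPickList(voice, lines) {
    var responses = [];
    voice.say("welcome!");
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i];
        responses.push(voice.navigate(line.aisle, line.slot));
        responses.push(voice.identifyItem(line.item));
        responses.push(voice.identifyItem(line.tote));
    }
    voice.say("order complete");
    return responses;
}

// Stand-in voice object for trying the loop without a terminal.
var fakeVoice = {
    say: function (prompt) { return "said " + prompt; },
    navigate: function (aisle, slot) { return aisle + ", " + slot; },
    identifyItem: function (prompt) { return "confirmed " + prompt; }
};

var result = runPickList(fakeVoice, [
    { aisle: "aisle 12", slot: "slot 23",
      item: "spaghetti pasta, 3,45 euro", tote: "tote 23" }
]);
```

Passing the voice object in as a parameter keeps the loop testable on a desktop browser, where the ActiveX control is not available.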
Change the front-end module to fit your application!
The core KFI VIO Server is actually a multi-purpose module. It
has access to every handheld resource, such as the Bluetooth radio, the GPS, the
camera, and so on; voice applications and the front-end web module can also leverage
those resources.
The front-end module (KFI VIOX in the case described above)
is the element usually customized to fit the environment of the main
business application. Beyond the ActiveX and AJAX implementations for web pages, a Java JNI extension DLL for the Cre-Me JVM on Windows Mobile has been tested, giving Java applications access to voice features.
For more information, please email: a.magnoni@kfitr.it
Thanks!
