Wednesday, 6 May 2015

Driving an embedded voice recognition engine using HTML page scripting.



How to drive the Vocollect™ voice engine on enterprise handheld devices to serve your application in minutes.


Foreword

You have probably heard that Vocollect Voice® systems are considered best-in-class for implementing enterprise voice-driven applications. I also know that the power of that system is considered somewhat "difficult" to control, requiring many specialists and a lot of time to implement.
Over the last few years, Vocollect has released new tools to make the implementation engineer's life easier: above all, VoiceArtisan and Voice Interface Objects, which are very powerful and flexible toolkits for building robust voice-driven applications.

Handheld devices

Yes, but Vocollect Voice does its best when used on Vocollect-made devices, which are designed for high-quality audio performance. This was absolutely true until Vocollect released its state-of-the-art Bluetooth headset, the SRX2, and the new series of Vocollect VoiceClient and VoiceCatalyst for handheld devices (MP). Thanks to Vocollect® R&D on the audio streaming protocol over Bluetooth, and to the SoundSense® noise-suppression feature, we now have very good audio on the most common enterprise handhelds from Intermec, Motorola and Honeywell.
Handheld terminals are not the best choice for intensive voice-driven work, where the wearability of the hardware plays an important role. But when voice-driven work takes up just a portion of the working time, a handheld device is the right choice, as it can run classic non-voice applications too. On the other hand, the effort of developing a direct interface between the voice application and the process management system for a simple, non-intensive workflow may discourage IT people...

What follows is a real-life example I recently developed in response to a request from a local software house.

The Need

The challenge was to run a pilot voice application to drive the picking process for web-ordered items in a grocery point of sale. The customer was already running a web-based application on a Motorola handheld device. We had a short timeframe, and the software house that maintains the web application had no time to develop a brand-new interface to serve the Vocollect voice dialogs. KFI did want to run that pilot, as demand for this kind of application is growing. So we took on the technical challenge: the web application, from within the web page, had to drive the basic voice dialogs used to guide the operators to locations, pick items, and put them into the right customer box.

The Solution

Since we are talking about HTML web pages, and the enterprise handheld devices run Windows Embedded Handheld, we decided to implement a few simple JavaScript functions whose task is to launch minimal voice dialog tokens. We settled on a simple "Say" method (the terminal speaks a sentence), a "Navigate" method (drives the operator to a specified location), and an "IdentifyItem" method (the operator reads a three-digit check number or scans a barcode).

We had two possible ways to push commands to an application external to the web page: embedding an ActiveX object in the "voice driving" web page, or using AJAX techniques to communicate with the external software service. We chose the first, even though the AJAX interface was implemented too.
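
For comparison, the AJAX route could look roughly like the sketch below. The endpoint `http://127.0.0.1:8080/vio`, the query parameter names, and the `buildVoiceRequestUrl` and `sendVoiceRequest` helpers are all assumptions made for illustration; the actual interface of the external service is not described in this post.

```javascript
// Hypothetical address of the local voice service (not given in this post).
var VOICE_SERVICE_URL = "http://127.0.0.1:8080/vio";

// Build the request URL for one voice dialog call, e.g. "Say" or "Navigate".
function buildVoiceRequestUrl(method, args) {
    var query = "method=" + encodeURIComponent(method);
    for (var i = 0; i < args.length; i++) {
        query += "&arg" + i + "=" + encodeURIComponent(args[i]);
    }
    return VOICE_SERVICE_URL + "?" + query;
}

// Send the request with a synchronous XMLHttpRequest and return the reply text.
// Synchronous mode mirrors the blocking ActiveX calls; this part needs a browser.
function sendVoiceRequest(method, args) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", buildVoiceRequestUrl(method, args), false);
    xhr.send(null);
    return xhr.responseText;
}
```

With these helpers, a page would call something like `sendVoiceRequest("Navigate", ["aisle 12", "slot 23"])` where the ActiveX version calls a method on the embedded control.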

Powered by Vocollect™

To run the pilot we used an Intermec CN70 handheld terminal, equipped with Vocollect Voice® MP 2.0 and the Vocollect® SRX2 Bluetooth headset. To read barcode data hands-free, a Motorola RS507 ring scanner was used. To build the voice application dialog, we leveraged the flexibility of the Vocollect Voice Interface Objects toolkit (VIO).

How does it work?

The solution is made up of three components:
- The KFI ActiveX DLL object
- The KFI VIO Server
- The Vocollect® VoiceClient MP




The core of the solution is the KFI VIO Server component. It listens for "voice dialog element" requests from the KFI VIOX ActiveX control (embedded in the HTML page), and maps each request to a Vocollect VIO sequence of voice objects, which actually perform the voice dialog. Data returned from the voice dialog is then routed back to the calling ActiveX method and captured by the web page's JavaScript function.
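
The request-to-dialog mapping described above can be pictured as a lookup table. The sketch below is written in JavaScript only to illustrate the idea; the step names (`prompt`, `confirm`, `scanOrSpeak`), the `DIALOG_MAP` table and the `planDialog` function are hypothetical and do not reflect the real VIO object names or the server's actual implementation.

```javascript
// Hypothetical table: each dialog element maps to a function that returns
// the ordered list of voice steps the server would run for that request.
var DIALOG_MAP = {
    Say: function (args) {
        return [{ step: "prompt", text: args[0] }];
    },
    Navigate: function (args) {
        return [
            { step: "prompt", text: "Go to " + args[0] + ", " + args[1] },
            { step: "confirm" }
        ];
    },
    IdentifyItem: function (args) {
        return [
            { step: "prompt", text: args[0] },
            // The operator speaks a three-digit check number or scans a barcode.
            { step: "scanOrSpeak", digits: 3 }
        ];
    }
};

// Resolve a request name to its step sequence, rejecting unknown requests.
function planDialog(request, args) {
    if (!Object.prototype.hasOwnProperty.call(DIALOG_MAP, request)) {
        throw new Error("Unknown voice dialog element: " + request);
    }
    return DIALOG_MAP[request](args);
}
```

Keeping the mapping in one table means a new dialog element only needs a new entry, which fits the idea of a small, fixed set of methods exposed to the page.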
The following is an example of a web page running a simple picking script:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >

<head>

    <title>VIOX Test Page</title>
      
    <script type="text/javascript" language="JavaScript">
      
       function runvoice()
       {
           Say("welcome!");
          
           Navigate("aisle 12", "slot 23");
          
           IdentifyItem("spaghetti pasta, 3,45 euro");
          
           IdentifyItem("tote 23");
       }
        
       function Say(prompt)
       {
             try
             {
                    var myObject = document.getElementById('MPcontrol').object;
                    myObject.callSpeaking(prompt);
                    var myresponse = myObject.vioResponse;
                   
                    document.getElementById('fase_1').innerHTML = "<p>callSpeaking " + myresponse + "</p>";
                   
             }
             catch(e)
             {
                    document.getElementById('fase_1').innerHTML = e.message;
             }

       }  
      
       function Navigate(aisle, slot)
       {
             try
             {
                    var myObject = document.getElementById('MPcontrol').object;
                    myObject.callNavigation(aisle, slot);
                    var myresponse = myObject.vioResponse;
                   
                    document.getElementById('fase_2').innerHTML = "<p>Navigate to " + myresponse + "</p>";
                   
             }
             catch(e)
             {
                    document.getElementById('fase_2').innerHTML = e.message;
             }

       }  
      
       function IdentifyItem(prompt)
       {
             try
             {
                    var myObject = document.getElementById('MPcontrol').object;
                    myObject.callBarcode(prompt);
                    var myresponse = myObject.vioResponse;
                   
                    document.getElementById('fase_3').innerHTML = "<p>Identify " + prompt + ": " + myresponse + "</p>";
                   
             }
             catch(e)
             {
                    document.getElementById('fase_3').innerHTML = e.message;
             }

       }  
      
      
       </script>   
  
   
</head>

<body onload="runvoice();">

       <p>GROCERY DRIVE</p>
       <p>SPAGHETTI PASTA</p>
       <p>1 EACH</p>
       <p>CUSTOMER 24332</p>

       <div id="fase_1">
       </div>
       <div id="fase_2">
       </div>
       <div id="fase_3">
       </div>
       <div id="fase_4">
       </div>
      
       <object id="MPcontrol" classid="CLSID:5301F651-5E1D-4D2C-960F-EC55CA020080" width="0" height="0"></object>
          
</body>
</html>


As you can see, by using just three object methods, a full picking dialog can be built. Generally speaking, by using no more than ten methods, a full voice-driven application can be implemented using standard web scripting.
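
To illustrate, the same three calls extend naturally to a loop over a pick list. This is a minimal sketch under assumptions: the `runPickList` function, the shape of the `lines` array and the `fakeControl` stub are hypothetical, while `callSpeaking`, `callNavigation`, `callBarcode` and `vioResponse` are the control members used in the page above. Passing the control in as a parameter lets the flow be tried on a desktop without the handheld.

```javascript
// Run one pick line per entry: navigate, identify the item, identify the tote.
// `control` is the ActiveX object on the device, or a stub when testing.
function runPickList(control, lines) {
    var log = [];
    control.callSpeaking("welcome!");
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i];
        control.callNavigation(line.aisle, line.slot);
        log.push("Navigate: " + control.vioResponse);
        control.callBarcode(line.item);
        log.push("Item: " + control.vioResponse);
        control.callBarcode(line.tote);
        log.push("Tote: " + control.vioResponse);
    }
    return log;
}

// Stand-in for the real control: each call stores a made-up response string.
var fakeControl = {
    vioResponse: "",
    callSpeaking: function (prompt) { this.vioResponse = "spoken " + prompt; },
    callNavigation: function (aisle, slot) { this.vioResponse = aisle + ", " + slot; },
    callBarcode: function (prompt) { this.vioResponse = "checked " + prompt; }
};

var log = runPickList(fakeControl, [
    { aisle: "aisle 12", slot: "slot 23", item: "spaghetti pasta", tote: "tote 23" }
]);
```

On the device, the stub would be replaced by `document.getElementById('MPcontrol').object`, and the pick lines would come from the web application's own order data.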

Change the front-end module to fit your application!

The core KFI VIO Server is actually a multi-purpose module. It has access to every handheld resource, such as the Bluetooth radio, the GPS, the camera and so on; voice applications and the front-end web module can leverage those resources too.
The front-end module (KFI VIOX, in the case described above) is the element that is usually customized to fit the environment of the main business application. Beyond the ActiveX and AJAX implementations for web pages, a Java JNI extension DLL for the Cre-Me JVM on Windows Mobile has been tested, giving Java applications access to the voice features.

More information?

For more information, please email: a.magnoni@kfitr.it

Thanks!

















