Key features:

  • Voice dialing (contact names or phone numbers),
  • Address book and contact management,
  • Managing short text messages,
  • Reading incoming text messages,
  • Call log management,
  • Easy calendar and alarm handling,
  • Changing system settings (date/time, network...),
  • Launching applications,
  • Support for South Slavic languages.

How it works

Dialog manager (DM)

This module is responsible for the overall behavior of the system. It takes the output of the speech recognizer as its input and performs the appropriate action.

A set of tasks is defined, each with a precise specification of the information required for its execution. Examples include calling a contact, sending a text message, managing the calendar and the call log, starting an application, and changing system settings.
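The idea of tasks with precisely specified required information can be illustrated with a minimal sketch. It assumes a slot-filling design; the task names, slot names, and the `DialogState`, `missing_slots`, and `next_action` helpers are hypothetical and are not taken from the product itself.

```python
from dataclasses import dataclass, field

# Hypothetical task table: task name -> information required before execution.
REQUIRED_SLOTS = {
    "CALL_CONTACT": ["contact"],
    "SEND_SMS": ["contact", "text"],
    "START_APP": ["app_name"],
}


@dataclass
class DialogState:
    """The task being pursued and the information collected so far."""
    command: str
    slots: dict = field(default_factory=dict)


def missing_slots(state):
    """Return the required slots that have not been filled yet, in order."""
    return [s for s in REQUIRED_SLOTS[state.command] if s not in state.slots]


def next_action(state):
    """Ask for the first missing slot, or execute once everything is known."""
    missing = missing_slots(state)
    if missing:
        return ("ASK", missing[0])
    return ("EXECUTE", state.command)
```

With only a contact known, `next_action(DialogState("SEND_SMS", {"contact": "Vesna Petrović"}))` asks for the message text; once the text is supplied, the same call returns an execute action.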

Natural Language Understanding (NLU)

The NLU module converts the user’s query into a form suitable for the dialog manager. For instance, if the user query is recognized as “I want to send an SMS message to Vesna Petrović”, the dialog manager will receive: “command: SEND_SMS; contact: Vesna Petrović”.
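As a rough illustration of this conversion, the sketch below maps a recognized English sentence to a command frame using regular expressions. The patterns, the `CALL_CONTACT` command name, and the `parse` function are assumptions made for the example; the actual NLU module is likely to be considerably more robust and to handle the inflected forms of South Slavic languages.

```python
import re

# Hypothetical patterns, tried in order; each yields a command and a contact.
PATTERNS = [
    ("SEND_SMS", re.compile(
        r"\bsend (?:an? )?(?:sms|text)(?: message)? to (?P<contact>.+)$",
        re.IGNORECASE)),
    ("CALL_CONTACT", re.compile(r"\bcall (?P<contact>.+)$", re.IGNORECASE)),
]


def parse(query):
    """Return a command frame for the query, or None if nothing matches."""
    text = query.strip().rstrip(".!?")
    for command, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return {"command": command,
                    "contact": match.group("contact").strip()}
    return None
```

Applied to the sentence from the text, `parse("I want to send an SMS message to Vesna Petrović")` returns a frame with the command `SEND_SMS` and the contact `Vesna Petrović`.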

Natural Language Generation (NLG)

This module performs the inverse function of NLU: it converts information from the format used by the dialog manager into a natural-language sentence.
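One simple way to picture this step is template filling, sketched below. The dialog act names, the template wording, and the `generate` function are hypothetical; a system for inflected languages would presumably also need to put names into the correct grammatical case, which this sketch does not attempt.

```python
# Hypothetical templates keyed by the dialog manager's output act.
TEMPLATES = {
    "ASK_SLOT": "Please tell me the {slot}.",
    "CONFIRM_SMS": "Sending a text message to {contact}.",
    "CONFIRM_CALL": "Calling {contact}.",
}


def generate(act, **values):
    """Fill the template for the given act with the supplied values."""
    return TEMPLATES[act].format(**values)
```

For example, `generate("CONFIRM_SMS", contact="Vesna Petrović")` produces the sentence "Sending a text message to Vesna Petrović."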

Implementation of speech technologies on mobile platforms

Until recently, speech recognition was limited to small vocabularies and to the PC platform. The vocabulary of this product is significantly larger, and the software is optimized to fit within the resource limitations of portable devices.

As for speech synthesis, our previous solutions were of high quality, but also restricted to the PC. We have now developed a less resource-demanding version compatible with smartphone operating systems. This came at the cost of a slight degradation in synthesis quality, which is acceptable for the target application.

The accuracy of speech recognition

A number of recognizers from renowned manufacturers are not accurate enough, even for major languages, which leads to user frustration and dissatisfaction.

To increase recognition accuracy, the language model used in ASR is tailored specifically to the functionality in use at a given moment. For uncommon words (e.g. infrequent proper nouns), users can always fall back on typing.
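The effect of tailoring recognition to the current functionality, with typing as a fallback, can be approximated by the sketch below. It is a deliberate simplification: instead of a real language model that reweights word probabilities, it filters recognizer hypotheses against a per-context vocabulary. The context names, vocabularies, the second contact name, the 0.6 threshold, and the `recognize` function are all hypothetical.

```python
# Hypothetical per-context vocabularies standing in for tailored language models.
CONTEXT_VOCAB = {
    "MAIN_MENU": {"call", "message", "calendar", "settings"},
    "PICK_CONTACT": {"vesna petrović", "marko marković"},
}


def recognize(hypotheses, context, threshold=0.6):
    """Pick the best-scoring hypothesis that is allowed in this context.

    hypotheses: (text, score) pairs from a hypothetical recognizer.
    Returns ("ACCEPT", text), or ("TYPE", None) to ask the user to type.
    """
    allowed = CONTEXT_VOCAB[context]
    candidates = [(t, s) for t, s in hypotheses if t.lower() in allowed]
    if candidates:
        text, score = max(candidates, key=lambda pair: pair[1])
        if score >= threshold:
            return ("ACCEPT", text)
    return ("TYPE", None)
```

In the main menu, a contact name is ignored even if it scores higher than "call", while an unknown name in the contact-picking context leads to the typing fallback.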