HMI - Natural language User Interfaces


    Natural language user interfaces (LUI or NLUI) are a form of human-machine interaction (HMI) in which linguistic phenomena such as verbs, phrases and clauses act as UI controls for creating, selecting and modifying data in software applications.
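
    As a toy illustration of this idea, the sketch below maps imperative clauses onto data-manipulation actions, the way an NLUI turns language into UI controls. The verbs, patterns and handlers are all invented for the example:

    ```python
    import re

    # Hypothetical verb -> handler table; the actions are illustrative only.
    def create_item(obj): return f"created {obj}"
    def select_item(obj): return f"selected {obj}"
    def delete_item(obj): return f"deleted {obj}"

    HANDLERS = {"create": create_item, "select": select_item, "delete": delete_item}

    # One pattern per supported clause shape: an imperative verb followed
    # by the object it operates on, e.g. "delete the last row".
    COMMAND = re.compile(r"^(create|select|delete)\s+(?:the\s+)?(.+)$", re.IGNORECASE)

    def interpret(utterance: str) -> str:
        """Map a natural-language command onto a UI action."""
        match = COMMAND.match(utterance.strip())
        if match is None:
            return "sorry, I did not understand that"
        verb, obj = match.group(1).lower(), match.group(2)
        return HANDLERS[verb](obj)

    print(interpret("Delete the last row"))  # -> "deleted last row"
    print(interpret("select column B"))      # -> "selected column B"
    ```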

    At TALP, research in NLUI has been undertaken from several points of view:


    HUMAN-MACHINE INTERACTION: Speech and audio processing in multimodal interfaces

    In the field of human-machine interaction, it is important that computers adapt to human needs: they should form an integral part of the way humans communicate without demanding exacting effort from users. This implies a need for multimodal user interfaces with robust perceptive capacities and non-intrusive sensors. The TALP Center has experience with a set of acoustic scene analysis systems that provide a number of perceptive and cognitive functionalities. These rely on speech and audio processing technologies that make it possible to identify speakers, recognize speech, localize and separate acoustic sources, and detect and classify noise.
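
    As a concrete illustration of the source-localization functionality mentioned above, the sketch below estimates the time delay of arrival between two microphones using GCC-PHAT, a standard cross-correlation technique for microphone arrays. It is a minimal NumPy version for illustration, not the system deployed at TALP:

    ```python
    import numpy as np

    def gcc_phat(sig, ref, fs, max_tau=None):
        """Estimate the delay (seconds) of `sig` relative to `ref` with the
        GCC-PHAT generalized cross-correlation."""
        n = len(sig) + len(ref)              # zero-pad to avoid circular wrap
        SIG = np.fft.rfft(sig, n=n)
        REF = np.fft.rfft(ref, n=n)
        R = SIG * np.conj(REF)
        R /= np.abs(R) + 1e-15               # PHAT weighting: keep phase only
        cc = np.fft.irfft(R, n=n)
        max_shift = n // 2
        if max_tau is not None:              # optionally bound the search range
            max_shift = min(int(fs * max_tau), max_shift)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        shift = np.argmax(np.abs(cc)) - max_shift
        return shift / fs

    # Synthetic check: one channel delayed by 25 samples at 16 kHz.
    fs = 16000
    x = np.random.default_rng(0).standard_normal(fs)
    y = np.concatenate((np.zeros(25), x[:-25]))
    print(gcc_phat(y, x, fs))                # ~ 25 / 16000 = 0.0015625 s
    ```

    With delays from several microphone pairs, the source position can then be triangulated, which is the basis of the localization functionality used in smart-room setups.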

    An intelligent room was built in building D5 at the Center as a testing ground for these applications. It is equipped with audio and video equipment and is designed for lecturers to give presentations and seminars. We worked to advance the multimodal approach, specifically the integration of the audio and video platforms, taking advantage of an existing collaboration with the Image Processing Group of the Department of Signal Theory and Telecommunications at the UPC. This research was undertaken under the auspices of the European framework project CHIL and the CICyT ACESCA project.


    HUMAN-MACHINE DIALOG: Computer systems intended to converse coherently with a human

    The development of efficient speech dialog systems involves choosing suitable dialog strategies that ask the right questions and return the information requested by users. The difficulty is that there are no established methods or clear criteria for outlining a good strategy. The criteria applied by the UPC for designing a dialog are based on extremely simple concepts (a minimal sketch built on them follows the list):

    • To ensure users do not get lost.
    • To answer users’ questions directly.
    • To offer users the option of correcting themselves at any time.
    • To avoid misunderstandings.
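
    A minimal frame-filling dialog loop built on these four criteria could look as follows; the slots, prompts and keywords are assumptions made for the example, not taken from a TALP system:

    ```python
    # Toy slot-filling dialog manager illustrating the four design criteria.
    SLOTS = ["origin", "destination", "date"]

    def run_dialog():
        frame = {}
        while len(frame) < len(SLOTS):
            slot = next(s for s in SLOTS if s not in frame)
            # State the current step so users never get lost.
            answer = input(f"[{len(frame) + 1}/{len(SLOTS)}] Please give the {slot}: ").strip()
            if answer.lower() == "help":
                # Answer the user's question directly.
                print("I need origin, destination and date; say 'correct' to change a field.")
                continue
            if answer.lower() == "correct":
                # Let users correct themselves at any time.
                field = input("Which field should I change? ").strip().lower()
                frame.pop(field, None)
                continue
            # Confirm what was heard to avoid misunderstandings.
            if input(f"You said '{answer}' for {slot}. Is that right? ").lower().startswith("y"):
                frame[slot] = answer
        return frame
    ```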


    Besides these basic principles of design, there are two significant factors that condition the development of dialogs: firstly, the range of application scenarios that must be resolved (i.e. the design of a dialog is determined by a system’s scope of application) and secondly, the performance of the speech recognition systems used.

    In view of the above concepts, the dialog systems developed in most cases favor a system-directed style of control. These systems guarantee improved robustness by minimizing the number of mistakes or omissions made by users, in exchange for less user freedom and a loss of naturalness.

    Some of the most common dialog strategies designed to increase robustness and naturalness in these kinds of deterministic systems include the following (the sketch after the list contrasts the two confirmation styles):

    • The use of natural text generators.
    • The use of keywords, such as help, correct and repeat.
    • The use of reinforcement/simplification strategies for recognition, such as requesting only one piece of data per utterance.
    • The use of an automatic help strategy.
    • The use of implicit confirmations in each utterance.
    • The use of explicit confirmations in key questions only. 
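
    The contrast between implicit and explicit confirmation, and the use of explicit confirmation only for key questions, can be sketched as follows; the slot names, confidence threshold and wording are assumptions for the example:

    ```python
    def implicit_confirm(value, next_question):
        # Embed the recognized value in the next question: the user can
        # object immediately, but the dialog keeps moving.
        return f"OK, {value}. {next_question}"

    def explicit_confirm(slot, value):
        # Stop and ask a yes/no question, reserved for key data where a
        # recognition error would be costly.
        return f"Did you say {value} as the {slot}?"

    def confirm(slot, value, confidence, next_question, key_slots=frozenset({"date"})):
        # Explicit confirmation only for key slots or low recognizer
        # confidence; implicit confirmation everywhere else.
        if slot in key_slots or confidence < 0.6:
            return explicit_confirm(slot, value)
        return implicit_confirm(value, next_question)

    print(confirm("origin", "Barcelona", 0.9, "Where are you going?"))
    # -> "OK, Barcelona. Where are you going?"
    print(confirm("date", "May 3rd", 0.9, "Anything else?"))
    # -> "Did you say May 3rd as the date?"
    ```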
