IKTPLUSS-IKT og digital innovasjon

Graph-based Neural Models for Dialogue Management

Alternative title: Grafbaserte nevrale nettverk for dialogstyring

Awarded: NOK 8.0 mill.

Project Number: 300921

Project Period: 2020 - 2024

Spoken language is a natural form of communication for humans. Since childhood, we have all learned how to understand and produce speech in order to interact with one another, and much of our waking life is spent in social interactions mediated through language. In many ways, the human brain is "wired" for spoken dialogue. The versatility of speech for human-computer interaction has given rise to technologies such as virtual assistants (for example Siri, Cortana, Google Home or Amazon Echo), in-car voice control and human-robot interaction. These computer programs are all examples of (spoken) dialogue systems. Inside these systems, a module called the dialogue manager is responsible for deciding what to say or do at a given time (such as responding to the user or executing a movement). To make these decisions, the dialogue manager records what it knows about the interaction in a data structure called the "dialogue state". Dialogue managers are often built with machine learning methods (such as deep neural networks) trained on dialogue data. Current models, however, have two main shortcomings. First, they constrain the dialogue state to a fixed, predefined set of variables, making it difficult to represent rich contexts that evolve over time. Second, they depend on large amounts of data to learn useful behaviours, which is problematic for applications where data is scarce and expensive to obtain. GraphDial investigates an alternative approach to dialogue management that uses graphs as the core representation for the dialogue state. Graphs are well suited to encode rich interactions that include multiple entities (such as places, persons, objects, tasks or utterances) and their relations. GraphDial also investigates the use of weak supervision for dialogue management. Weak supervision is an emerging AI paradigm designed to provide machine learning models with indirect training data extracted from heuristics or domain knowledge.
These graph-based models are validated in practice through the development of a robotic receptionist that can answer various requests, such as when a given employee is available or where a meeting is to take place. The project has so far led to the release of a new dataset (GraphWOZ), along with a novel approach to retrieval-based response generation. This approach takes as its starting point a knowledge graph that represents the current dialogue state and is regularly updated with new observations.
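
To make the idea of a graph-shaped dialogue state concrete, the following is a minimal sketch in Python. It is not taken from the GraphDial code base: the class `DialogueGraph`, the node identifiers and the relation names (`located_in`, `attends`, `mentions`) are all hypothetical and chosen only to mirror the receptionist scenario described above.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueGraph:
    """Toy dialogue state: typed nodes plus labelled, directed edges.

    Hypothetical illustration only, not the project's actual data model.
    """
    nodes: dict = field(default_factory=dict)  # node id -> entity type
    edges: set = field(default_factory=set)    # (source, relation, target)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, source, relation, target):
        # Both endpoints must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("unknown node in edge")
        self.edges.add((source, relation, target))

    def neighbours(self, source, relation):
        """Return the sorted targets reachable from `source` via `relation`."""
        return sorted(t for s, r, t in self.edges
                      if s == source and r == relation)


# Background knowledge: entities and their relations.
state = DialogueGraph()
state.add_node("alice", "person")
state.add_node("room_12", "place")
state.add_node("project_meeting", "event")
state.add_edge("project_meeting", "located_in", "room_12")
state.add_edge("alice", "attends", "project_meeting")

# A new observation (a user utterance) becomes one more node in the graph,
# linked to the entity it mentions.
state.add_node("utt_1", "utterance")
state.add_edge("utt_1", "mentions", "project_meeting")

# Answering "where is the meeting?" becomes a lookup over the graph.
print(state.neighbours("project_meeting", "located_in"))
```

Unlike a fixed set of slots, such a graph can grow with every new utterance or entity, which is the property the project description emphasises.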

The GraphDial project sets out to advance the state of the art in dialogue management. Current approaches to dialogue management often make use of neural models trained on dialogue data using supervised or reinforcement learning. These approaches have led to enhanced performance across a broad range of tasks, but they also have two important shortcomings: (1) they rely on quite restrictive representations of the dialogue state (often based on a fixed number of slots to fill); (2) they depend on large amounts of training data to learn the model parameters. The project will investigate an alternative approach to dialogue management based on the use of probabilistic graphs as the core representation for the dialogue state. Graphs are well suited to capture rich interaction contexts including multiple entities and relations. They also facilitate the use of relational abstractions covering large portions of the state space in a compact and human-readable manner. Another central topic for the project is the use of weak supervision to train neural models for dialogue state tracking and action selection. Weak supervision allows machine learning models to be trained with indirect data extracted from noisy labelling functions or domain knowledge. It is particularly attractive for dialogue management because annotated data is difficult to obtain in most dialogue domains. The project intends to integrate a range of weak supervision signals, including user responses to grounding or clarification acts, heuristic rules and global constraints on the graph structure. To achieve these objectives, GraphDial will feature collaborations with leading researchers in the fields of spoken dialogue systems, statistical relational learning, graph neural networks and human-robot interaction. The project will also be in contact with two Norwegian companies involved in the development of conversational technology to facilitate the dissemination of project results.
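
The notion of noisy labelling functions can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the three keyword heuristics, the label names `ask_location` and `ask_time`, and the use of a plain majority vote to combine them. The project description does not say how labelling functions are aggregated, and weak supervision frameworks typically use more elaborate models than a majority vote.

```python
from collections import Counter

ABSTAIN = None  # a labelling function may decline to vote


# Hypothetical heuristics for a receptionist domain. Each one is cheap to
# write and individually unreliable, which is the point of weak supervision.
def lf_where(utterance):
    return "ask_location" if "where" in utterance.lower() else ABSTAIN


def lf_when(utterance):
    text = utterance.lower()
    return "ask_time" if "when" in text or "what time" in text else ABSTAIN


def lf_room(utterance):
    return "ask_location" if "room" in utterance.lower() else ABSTAIN


LABELLING_FUNCTIONS = [lf_where, lf_when, lf_room]


def weak_label(utterance):
    """Combine the labelling functions by majority vote.

    Returns ABSTAIN when no function fires or when the top labels tie.
    """
    votes = [lf(utterance) for lf in LABELLING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence, no clear winner
    return ranked[0][0]


for text in ["Where is the meeting room?",
             "When is Alice available?",
             "When is the room free?"]:
    print(text, "->", weak_label(text))
```

Labels produced this way could then serve as indirect training data for a dialogue model, with abstentions and conflicts simply left unlabelled.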

Funding scheme: IKTPLUSS-IKT og digital innovasjon