• Aucun résultat trouvé

Visually-Grounded Dialogue Models: Past, Present, and Future

N/A
N/A
Protected

Academic year: 2022

Partager "Visually-Grounded Dialogue Models: Past, Present, and Future"

Copied!
1
0
0

Texte intégral

(1)

Visually-Grounded Dialogue Models:

Past, Present, and Future

Raquel Fern´ andez

University of Amsterdam, The Netherlands raquel.fernandez@uva.nl

Abstract

The past few years have seen an increasing interest in developing neural- network-based agents for visually-grounded dialogue, where the conversation participants communicate about visual content. I will start by discussing how visual grounding can be integrated with traditional task-oriented dialogue system components. Most current work in the field focuses on reporting numeric results solely based on task success. I will argue that we can gain more insight by (i) analysing the linguistic output of alternative systems and (ii) probing the representations they learn. I will also introduce a new dialogue dataset we have developed using a data-collection setup designed to investigate linguistic common ground as it accumulates during visually- grounded interaction.

Références

Documents relatifs

Finally, the guesser model takes the generated dialogue D and the list of objects O and predicts the correct object..

We propose to perform a detailed analysis of DCASE 2020 task 4 sound event detection baseline with regards to several aspects such as the type of data used for training, the

This study suggests that GABAergic neurons produced in the human subpallium can continue to express ASCL after reaching the cortex and that ASCL1(+) GABA neurons present in the

Installé derrière les cagettes au fond du camion et en pleine confiance, peut être que dans mon quartier enfin, je pourrais contre-attaquer, j'ai fuis depuis si longtemps,

There are several modi- fied live vaccines currently on the market for horses, including, but not limited to, vaccines for equine herpes, equine influ- enza, equine encephalitis,

• The goal of this research is to study how a visually grounded neural network segments a continuous flow of speech hoping it could shed light on how children perform the same

We propose to illustrate the relevance of the aforementioned criteria through the reassem- bly of fractured objects. In particular, we deal with a real-world dataset composed of

Tactile paper maps (also called raised-line maps) have long been used to present spatial information to VI people.. They were used both as a learning device during education and as