Abstract
Raffaella Bernardi
May 28, 2018
The dialogue state tracker has received a lot of attention within the dialogue systems community. Its crucial role in representing the context, as well as the common ground of the conversational agents, is well recognized. Recently, visual dialogue models have been proposed which, interestingly, need to ground the dialogue in a visual scene as well. So far, however, little attention has been paid to the interplay between the dialogue state tracker and the other components of a dialogue system. We show that introducing a grounded dialogue state tracker jointly trained with the other components of a dialogue system improves the model's performance on carrying out a task-oriented dialogue by around 10% accuracy, and that the model acquires better linguistic skills. Furthermore, we show that a two-phase training paradigm, which first lets the model learn from human dialogues and then lets it learn its own dialogue policy, brings a further increase of around 10% accuracy. Finally, we experiment with the addition of a dialogue management component jointly trained with the other modules. We show that, by deciding the next action, this component improves the quality of the dialogues, since unnecessary questions are avoided. We evaluate our model on the GuessWhat!? dataset.