Abstract Julian Hough 22 Jan 2018
Although disfluency has been marginalized by mainstream linguistic research, we claim that it is a core part of dialogue content. We support this claim from a computational modelling perspective by showing that repairs and edit terms are amenable to modelling by statistical sequence models, in line with current automatic approaches to other linguistic phenomena. Drawing on work from the DUEL project, we present the joint task of incremental disfluency detection and utterance segmentation on dialogue data, and a simple deep learning system that performs it on both transcripts and speech recognition results. We show how the constraints of the two tasks interact. Our joint-task system outperforms the equivalent individual-task systems, provides competitive results, and is suitable for future use in conversational agents in the psychiatric domain.
Most relevant paper: Hough, J., & Schlangen, D. (2017). Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL).