Difference between revisions of "Language features to look at in Dementia patients"

From IMC wiki
(Computational Features)
Line 154:
 |-
 | Revisions
-| Orimaye et al 2014, Croisile et al 1996
+| Orimaye et al 2014, Onofre de Lire et al 2011
 |-
 | Other-repair
Line 178:
 |rowspan="2"| depth of tree from e.g. Stanford parser? then min/max/mean/std per person per turn
 | Lower syntactic index (CHECK)
-| Croisile et al 1996
+| Onofre de Lire et al 2011
 |-
 | Less complex sentences
-| Croisile et al 1996
+| Onofre de Lire et al 2011
 |-
 |rowspan="3"| Inter-turn repetition
 |rowspan="3"| Mean words per turn repeated from previous turn(s); maybe weighted (inversely) by distance?
 | Repetition
-| ref
+| Croisile et al 1996, Onofre de Lire et al 2011
 |-
 | Repeating questions
Line 197:
 | Mean repeated words per turn; maybe weighted (inversely) by distance?
 | Repetition
-| ref
+| Croisile et al 1996, Onofre de Lire et al 2011
 |-
 |rowspan="2"| Pronoun use

Revision as of 13:24, 13 July 2015

About

This page lists language features to consider when processing patient transcript data. Where possible, it notes which software tools are available to support the analysis of each feature.

Linguistic Features

{|
! Feature !! Possible computational implementation
|-
| Lack of speech initiative / introducing new topics/change topics || LDA or similar to assign topics, then measure change (e.g. KL divergence) in distributions over windows
|-
| Topic shifts / lack of topic maintenance ||
|-
| Lack of contribution to topics || needs explicit topic segmentation - then mean/min/max/std number of contributions per segment
|-
| Paraphasia words in wrong and senseless combinations ||
|-
| Incomplete talk/ conversational discontinuity || transcribed incomplete words "d-"
|-
| Lacking coherence /intelligibility || maybe language model high surprisal?
|-
| Repair || Julian Hough's self-repair classifier, STIR - mean/max repairs per turn; use lexicon of repair indicator words; incomplete words
|-
| Word finding difficulties lexical retrieval ||
|-
| Object naming difficulties ||
|-
| Disfluency ||
|-
| Elaboration difficulty ||
|-
| Use of incomplete words || transcribed incomplete words "d-"
|-
| Comprehension of the speech of others || Chris Howes' other-repair classifier from PPAT?
|-
| Short conversational turns || Mean/min/max/std-dev words per utterance, maybe normalised over whole document
|-
| Less words per turn ||
|-
| Repetition || Mean repeated words per turn; maybe weighted (inversely) by distance?
|-
| Reference errors ||
|-
| More pronoun use || Simple python code to count usage levels of pronouns for Patients / HCPs?
|-
| Impairment in greeting || presence/absence of initial greetings (use manually defined list), with/without pauses, other words?
|-
| Speech outflow ||
|-
| Circumlocutions || Maybe look for hedges and fillers e.g "You know, that thing that does x / that thing with the y that looks like z"
|-
| Slowness to respond || mean/min/max number of inter-turn "Pause"s
|-
| Gist-level processing (summary, main idea, lesson task) || Maybe look at text summarisation of HCP dialogue and see how it matches up to patient utterances after topic modelling is performed on both
|-
| Detail-level processing || ?
|-
| Reduced lexicon richness || see e.g. Hirst papers on author dementia from vocabulary changes in books
|-
| Empty phrases/speech || ratio of pronouns to nouns
|-
| Slow rate of speech || mean/min/max number of inter-turn "Pause"s and intra-turn "#pause"s
|-
| More requestives than assertives || Use a standard e.g. Switchboard-trained DA tagger to give mean/min/max query vs statement; or build simple POS-sequence rule-based version
|-
| Repeating questions ||
|}
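
As a rough illustration of two of the simpler rows above (words per turn, and pronoun counts for Patients vs HCPs), the following Python sketch computes per-speaker statistics. It assumes the transcript is already available as a list of (speaker, utterance) pairs. The `PRONOUNS` word list, the regular-expression tokeniser and the function names are illustrative assumptions, not part of any existing tool.

```python
import re
import statistics
from collections import defaultdict

# Hypothetical, deliberately small list of English personal pronouns.
# A real analysis would more likely rely on a POS tagger.
PRONOUNS = {"i", "me", "you", "he", "him", "she", "her",
            "it", "we", "us", "they", "them"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", text.lower())


def turn_stats(turns):
    """Return per-speaker turn-length and pronoun statistics.

    turns: list of (speaker, utterance) pairs in transcript order.
    """
    lengths = defaultdict(list)
    pronouns = defaultdict(list)
    for speaker, text in turns:
        tokens = tokenize(text)
        lengths[speaker].append(len(tokens))
        pronouns[speaker].append(sum(tok in PRONOUNS for tok in tokens))

    stats = {}
    for speaker, counts in lengths.items():
        stats[speaker] = {
            "mean_words": statistics.mean(counts),
            "min_words": min(counts),
            "max_words": max(counts),
            # Population standard deviation, so a single turn gives 0.
            "std_words": statistics.pstdev(counts),
            "mean_pronouns": statistics.mean(pronouns[speaker]),
        }
    return stats
```

Dividing `mean_pronouns` by a noun count per turn would give the pronoun-to-noun ratio suggested for "Empty phrases/speech", but that needs POS tags, which this sketch does not attempt.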

Computational Features

{|
! Feature !! Method !! Related linguistic phenomena !! Reference
|-
|rowspan="2"| Topic variability ||rowspan="2"| LDA or similar to assign topics, then measure change (e.g. KL divergence) in distributions over windows || Introducing new topics/change topics || ref
|-
| Topic shifts / lack of topic maintenance || Watson et al 1999
|-
| Topic contribution || Segment topics via WindowDiff method (lexical or LDA topics); then mean/min/max/std number of contributions per person per segment || Lack of contribution to topics || ref
|-
|rowspan="2"| Lexical surprisal ||rowspan="2"| train language model on e.g. Switchboard/BNC (use STIR models?), measure surprisal at each word; then mean/min/max/std surprisal per person || Paraphasia words in wrong and senseless combinations || ref
|-
| Lacking coherence /intelligibility || Watson et al 1999
|-
| Lexicon size || Lexical type counts, breadth of lexical probability distribution || Lexicon richness || see e.g. Hirst papers on author dementia from vocabulary changes in books
|-
|rowspan="2"| Incomplete words ||rowspan="2"| Count transcribed incomplete words "d-"; mean/min/max/std rate per person || Incomplete talk/ conversational discontinuity || Watson et al 1999
|-
| Use of incomplete words || ref
|-
| Incomplete turns/contributions || Turn-final transcribed incomplete words? Or maybe parse tree features? || Incomplete talk/ conversational discontinuity || ref
|-
| Predicates || POS-tag, count verb/noun/adj/adv classes; mean/max/min/std per person, also normalised per utterance || Number (not average) of predicates (CHECK) || Orimaye et al 2014
|-
|rowspan="6"| Self-repair ||rowspan="6"| Julian Hough's self-repair classifier, STIR - mean/max/min/std repairs per turn; can also use simpler classifier via STIR's lexicon of repair indicator words || Repair (SISR) || Watson et al 1999
|-
| Word finding difficulties lexical retrieval || Croisile et al 1996
|-
| Object naming difficulties || ref
|-
| Disfluency || ref
|-
| Elaboration difficulty || Watson et al 1999
|-
| Revisions || Orimaye et al 2014, Onofre de Lire et al 2011
|-
|rowspan="2"| Other-repair ||rowspan="2"| Chris Howes' other-repair classifier from PPAT? Mean/max/min/std rate of other-repair per person || Comprehension of the speech of others || ref
|-
| Reference errors || ref
|-
|rowspan="3"| Turn length ||rowspan="3"| Mean/min/max/std-dev words per utterance per person, maybe normalised over whole document and over other person? || Short conversational turns || ref
|-
| Less words per turn || Orimaye et al 2014
|-
| Less complex sentences || Croisile et al 1996
|-
|rowspan="2"| Syntactic complexity ||rowspan="2"| depth of tree from e.g. Stanford parser? then min/max/mean/std per person per turn || Lower syntactic index (CHECK) || Onofre de Lire et al 2011
|-
| Less complex sentences || Onofre de Lire et al 2011
|-
|rowspan="3"| Inter-turn repetition ||rowspan="3"| Mean words per turn repeated from previous turn(s); maybe weighted (inversely) by distance? || Repetition || Croisile et al 1996, Onofre de Lire et al 2011
|-
| Repeating questions || ref
|-
| (See also other-repair features) ||
|-
| Intra-turn repetition || Mean repeated words per turn; maybe weighted (inversely) by distance? || Repetition || Croisile et al 1996, Onofre de Lire et al 2011
|-
|rowspan="2"| Pronoun use ||rowspan="2"| Mean/max/min/std pronouns per turn per person; maybe normalised over number of nouns || More pronoun use || Jarrold et al 2014
|-
| Empty phrases/speech || ref
|-
| || || Reduced lexicon richness || e.g. Hirst
|-
|rowspan="3"| Relative contribution ||rowspan="3"| Number of words and turns per person; also normalised over other person || Reduced number of utterances || Orimaye et al 2014
|-
| Slow rate of speech || ref
|-
| Empty phrases/speech || ref
|-
|rowspan="3"| Pauses ||rowspan="3"| mean/min/max number of inter-turn and intra-turn "Pause"s || Slowness to respond || ref
|-
| Slow rate of speech || ref
|-
| (see also self-repair phenomena) ||
|-
|rowspan="2"| Filled pauses ||rowspan="2"| Count transcribed filled pauses; mean/min/max/std rate per person per turn || Circumlocutions || ref
|-
| (see also self-repair phenomena) ||
|-
|rowspan="2"| Hedges ||rowspan="2"| Define lexicon manually & count (you know, thing etc); mean/min/max/std rate per person per turn || Circumlocutions || ref
|-
| (see also self-repair phenomena) ||
|-
| Greetings || Count of initial greetings (use manually defined list), with/without pauses, other words? || Impairment in greeting || ref
|-
| Backchannels || Define lexicon manually & count; mean/min/max/std rate per person || || ref
|-
| Speech outflow || ? || ||
|-
| Gist-level processing (summary, main idea, lesson task) || Maybe look at text summarisation of HCP dialogue and see how it matches up to patient utterances after topic modelling is performed on both || ||
|-
| Detail-level processing || ? || ||
|-
| More requestives than assertives || Use a standard e.g. Switchboard-trained DA tagger to give mean/min/max query vs statement; or build simple POS-sequence rule-based version || ||
|}
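
One possible reading of the "Inter-turn repetition" row is sketched below in Python. Each word token in a turn is checked against the preceding turns, and a repeated token contributes 1/distance, where distance is the number of turns back to the nearest earlier turn containing it. This is only one way to do the inverse distance weighting the table asks about. The `window` parameter, the tokeniser, the choice to count repeats from any speaker's earlier turns, and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", text.lower())


def inter_turn_repetition(turns, window=3):
    """Return {speaker: mean distance-weighted repeated words per turn}.

    turns:  list of (speaker, utterance) pairs in transcript order.
    window: how many preceding turns to search (hypothetical default).
    """
    history = []                # token sets of earlier turns, oldest first
    scores = defaultdict(list)  # per-speaker list of per-turn scores
    for speaker, text in turns:
        tokens = tokenize(text)
        score = 0.0
        for tok in tokens:
            # Walk back from the most recent turn; distance 1 is the
            # immediately preceding turn.
            for dist, prev in enumerate(reversed(history[-window:]), start=1):
                if tok in prev:
                    score += 1.0 / dist
                    break
        scores[speaker].append(score)
        history.append(set(tokens))
    return {spk: sum(vals) / len(vals) for spk, vals in scores.items()}
```

Restricting `history` to the speaker's own earlier turns, or to the other speaker's turns, would separate self-repetition from echoing. Which of these the table intends is not specified.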