Capturing patterns of linguistic interaction in a parsed corpus
A methodological case study
Sean Wallis
Survey of English Usage
University College London
Capturing linguistic interaction...
• Parsed corpus linguistics
• Intra-structural priming
• Experiments
  – Attributive AJPs before a noun
  – Embedded postmodifying clauses
  – Sequential postmodifying clauses
  – Speech vs. writing
• Conclusions
• The handout explains the analytical method in more detail (so read it later!)
Parsed corpus linguistics
• An example tree from ICE-GB (spoken)
S1A-006 #23
Parsed corpus linguistics
• Three kinds of evidence may be obtained from a parsed corpus
  – Frequency evidence of a particular known rule, structure or linguistic event
  – Coverage evidence of new rules, etc.
  – Interaction evidence of the relationship between rules, structures and events
• This evidence is necessarily framed within a particular grammatical scheme
  – How might we evaluate this grammar?
Intra-structural priming
• Priming effects within a structure
  – Study repeating an additive step in structures
• Consider
  – a phrase or clause that may (in principle) be extended ad infinitum
    • e.g. an NP with a noun head
  – a single additive step applied to this structure
    • e.g. add an attributive AJP before the head
  – Q. What is the effect of repeatedly applying this operation to the structure?
[Figure: tree diagram of an NP with noun head N (ship) and attributive AJPs added one at a time (tall, very green, old)]
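One way to make the question above concrete is to count, for each x, how many NPs contain at least x attributive AJPs. The sketch below assumes the corpus query has already produced one AJP count per NP; the function name and the toy data are hypothetical and are not taken from ICE-GB.

```python
from collections import Counter

def at_least_frequencies(ajp_counts):
    """Return a list F where F[x] is the number of NPs with at least x AJPs."""
    exact = Counter(ajp_counts)          # NPs with exactly x AJPs
    max_x = max(exact) if exact else 0
    freqs = []
    running = 0
    # Accumulate from the longest structures downwards.
    for x in range(max_x, -1, -1):
        running += exact.get(x, 0)
        freqs.append(running)
    freqs.reverse()
    return freqs

# Hypothetical toy data: number of attributive AJPs in each of ten NPs.
toy_counts = [0, 0, 0, 1, 0, 2, 1, 0, 3, 0]
print(at_least_frequencies(toy_counts))  # prints [10, 4, 2, 1]
```

Each F[x] counts the structures in which the additive step was applied at least x times, so successive values can only stay level or fall.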
Experiment 1: analysis of results
• Sequential probability analysis
  – calculate probability of adding each AJP
  – error bars: Wilson intervals
  – probability falls
    • second < first
    • third < second
  – decisions interact
  – Every AJP added makes it harder to add another
[Figure: probability (y-axis, 0.00 to 0.20) at each step (x-axis, 0 to 5), with error bars]
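The sequential probability analysis can be sketched as follows. This is a minimal illustration, assuming that the probability of adding the x-th AJP is estimated as the proportion of NPs with at least x-1 AJPs that also have at least x (the handout gives the actual method). The Wilson score interval is the standard formula; the function names and the frequencies are hypothetical.

```python
import math

def wilson_interval(p, n, z=1.959964):
    """Wilson score interval (lower, upper) for a proportion p observed in n cases."""
    if n == 0:
        return (0.0, 1.0)
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - spread, centre + spread)

def sequential_probabilities(F):
    """For 'at least x' frequencies F, return rows (x, p, lower, upper),
    where p = F[x] / F[x-1] is the estimated probability of step x."""
    rows = []
    for x in range(1, len(F)):
        n = F[x - 1]
        if n == 0:
            break                        # no cases left to extend
        p = F[x] / n
        lower, upper = wilson_interval(p, n)
        rows.append((x, p, lower, upper))
    return rows

# Hypothetical 'at least x' frequencies, NOT the ICE-GB results.
for x, p, lower, upper in sequential_probabilities([1000, 200, 20, 1]):
    print(f"step {x}: p = {p:.3f}  Wilson 95% interval ({lower:.3f}, {upper:.3f})")
```

Because each probability is conditioned on the structures that reached the previous step, the sample size shrinks at every step and the intervals widen accordingly.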
Experiment 1: explanations?
• Feedback loop: for each successive AJP, it is more difficult to add a further AJP
  – logical-semantic constraints
    • tend to say the tall green ship
    • do not tend to say tall short ship or green tall ship
  – communicative economy
    • once speaker said tall green ship, tends to only say ship
  – memory/processing constraints
    • unlikely: this is a small structure, as are AJPs
Experiment 1: speech vs. writing
• Spoken vs. written subcorpora
  – Same overall pattern
  – Spoken data tends to have fewer attributive AJPs
    • Support for communicative economy or memory/processing hypotheses?
  – Significance tests
    • Paired 2x1 Wilson tests (Wallis 2011)
    • first and second observed spoken probabilities are significantly smaller than written
[Figure: probability (y-axis, 0.00 to 0.25) at each step (x-axis, 0 to 5); series: written, spoken]
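A comparison of spoken and written probabilities at the same step can be illustrated with a Newcombe-Wilson interval for the difference between two independent proportions. This is a swapped-in, well-known technique and is not claimed to be the paired 2x1 Wilson test of Wallis (2011); the function names and the counts are hypothetical.

```python
import math

def wilson_interval(p, n, z=1.959964):
    """Wilson score interval (lower, upper) for a proportion p observed in n cases."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - spread, centre + spread)

def newcombe_wilson_difference(p1, n1, p2, n2, z=1.959964):
    """Interval for d = p1 - p2, combining the two Wilson intervals
    (Newcombe's method for independent proportions)."""
    l1, u1 = wilson_interval(p1, n1, z)
    l2, u2 = wilson_interval(p2, n2, z)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return (lower, upper)

# Hypothetical first-step probabilities, NOT the ICE-GB results:
# spoken 150/1000, written 220/1000.
lower, upper = newcombe_wilson_difference(0.15, 1000, 0.22, 1000)
significant = not (lower <= 0 <= upper)   # interval excludes zero
print(f"difference interval ({lower:.3f}, {upper:.3f}), significant: {significant}")
```

If the interval for the difference excludes zero, the two observed probabilities differ significantly at the chosen error level.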
Experiment 2: preverbal AVPs
• Consider adverb phrases before a verb
  – Results very different
    • Probability does not fall significantly between first and second AVP
    • Probability does fall between second and third AVP
  – Possible constraints
    • (weak) communicative
    • (weak) semantic
  – Further investigation needed
[Figure: probability (y-axis, 0.00 to 0.10) at each step (x-axis, 0 to 4)]
Experiment 3: postmodifying clauses
• Another way to specify nouns in English
  – add clause after noun to explicate it
    • the ship [that was in the port]
    • the ship [called Ariadne]
  – may be embedded
    • the ship [that was in the port [we visited last week]]
  – or successively postmodified
    • the ship [called Ariadne][that was in the port]
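The difference between embedding and sequential postmodification can be made explicit with a toy data structure. The classes below are a hypothetical representation for illustration only and do not reflect the ICE-GB annotation format: sequence length counts the clauses attached to one head, while embedding depth follows the chain of clauses that modify a noun inside an earlier clause.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Clause:
    text: str
    nps: List["NP"] = field(default_factory=list)        # NPs inside the clause

@dataclass
class NP:
    head: str
    postmods: List[Clause] = field(default_factory=list)  # clauses after the head

def sequence_length(np: NP) -> int:
    """Number of postmodifying clauses attached to the same head."""
    return len(np.postmods)

def embedding_depth(np: NP) -> int:
    """Longest chain of postmodifying clauses nested inside one another."""
    depths = [
        1 + max((embedding_depth(inner) for inner in clause.nps), default=0)
        for clause in np.postmods
    ]
    return max(depths, default=0)

# the ship [that was in the port [we visited last week]]
embedded = NP("ship", [Clause("that was in the port",
                              [NP("port", [Clause("we visited last week")])])])
# the ship [called Ariadne][that was in the port]
sequential = NP("ship", [Clause("called Ariadne"),
                         Clause("that was in the port", [NP("port")])])

print(sequence_length(embedded), embedding_depth(embedded))      # prints 1 2
print(sequence_length(sequential), embedding_depth(sequential))  # prints 2 1
```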
Experiment 3: (i) embedding
• Probability of adding a further embedded postmodifying clause falls with size
  – All data
    • second < first
    • third < first
  – Spoken
    • second < first
  – Written
    • third < second
• Compare with effect of sequential postmodification of same head
[Figure: probability (y-axis, 0.00 to 0.10) at each step (x-axis, 0 to 4); series: written, spoken, all]
Experiment 3: (ii) sequential
• Probability of sequential postmodification falls and, for spoken data, falls, then rises
  – All data
    • second < first
  – Spoken
    • third > second
  – Option: count conjoins separately or treat as single item
    • Either way, results show similar pattern
  – Negative feedback: the ‘in for a penny’ effect
[Figure: probability (y-axis, 0.00 to 0.15) at each step (x-axis, 0 to 5); series: written, spoken]
Experiment 3: (iii) embed vs. seq
• Embedded vs. sequential postmodification
  – embedding > sequence (second level)
  – It is slightly easier to modify the latest head than a more remote one:
    • semantic constraints?
    • backtracking cost?
  – Third level
    • embedding < sequence (if counting conjoins)
    • long sequences seem to be easier to construct than comparable layers of embedding
[Figure: probability (y-axis, 0.00 to 0.15) at each step (x-axis, 0 to 5); series: embedding, sequential]
Conclusions
• A method for evaluating interactions along grammatical axes
  – General purpose, robust, structural
  – More abstract than ‘linguistic choice’ experiments
  – Depends on a concept of grammatical distance along an axis, based on the chosen grammar
• Method has philosophical implications
  – Grammar viewed as outcome of linguistic choices
  – Linguistics as an evaluable observational science
    • Signature (trace) of language production decisions
  – A unification of theoretical and corpus linguistics?
Potential applications
• Corpus linguistics
  – Optimising existing grammatical framework
    • e.g. coordination, compound nouns
  – Comparing genres/languages/periods
• Theoretical linguistics
  – Comparing different grammars, same language
• Psycholinguistics
  – Search for evidence of language production constraints in spontaneous speech corpora
    • speech and language therapy
    • language acquisition and development
References
Nelson, G., Wallis, S. & Aarts, B. (2002) Exploring natural language. Benjamins.
Pickering, M. & Ferreira, V. (2008) Structural priming. Psychological Bulletin 134, 427–459.
Wallis, S.A. (2011) Comparing χ² tests for separability. Survey of English Usage.
• For explanation of the analysis method see the handout!
• For more detail and a draft of the full paper see http://corplingstats.wordpress.com