Date post: 29-Dec-2015
Upload: frederick-clarke
Information Density and Word Order
Transcript
Page 1: Information Density and Word Order. Why are some word orders more common than others? In the majority of languages (with dominant word order) subjects.

Information Density and Word Order

Why are some word orders more common than others?

• In the majority of languages (with dominant word order) subjects precede objects

• Observed frequency ranking: (SOV, SVO) > VSO > (VOS, OVS) > OSV

Why are some word orders more common than others?

• Genetically encoded bias?

• Single common ancestor (SOV)?

• General linguistic principles
  – Theme-first
  – Verb-object bonding
  – Animate-first

• Great, but why do these principles work?

Uniform information density (UID) hypothesis

• Speakers aim for a constant information transmission rate
  – Slower for unexpected, high-entropy content
  – Faster for predictable, low-entropy content

• The basic word order of a language influences the average transmission rate

• Thus languages that are closer to the UID ideal will be more common than those further from it

Word-order model

• Simple world with
  – 13 objects (O): 5 people and 8 food/drink items
  – 2 relations (R): eat and drink

• Events in this world consist of one relation and two objects: (o1, r, o2)

• Each event occurs with a certain probability P

Word-order model

• Base entropy H0: the observer’s uncertainty before any words are spoken

• After each word, observers adjust their expectations for the following ones, reaching an entropy of zero after the third word of the event

Word-order model

• Each event has an information profile: I1 = H0 − H1, I2 = H1 − H2, I3 = H2

• Here Hn is the entropy remaining after the n-th word; H0, H1, H2 and H3 = 0 form the entropy trajectory of the event

• UID suggests a straight line from base entropy to zero entropy, such that each word conveys 1/3 of the total information
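The entropy trajectory and information profile described above can be sketched in a few lines of Python. This is only an illustration: the four events and their probabilities are made up (the slides do not list the actual event distribution), and the names `EVENTS`, `entropy` and `information_profile` are hypothetical.

```python
import math

# Hypothetical toy events (subject, verb, object) with made-up probabilities;
# the real model uses 13 objects and 2 relations with its own distribution.
EVENTS = {
    ("alice", "eat", "apple"): 0.4,
    ("alice", "drink", "water"): 0.2,
    ("bob", "eat", "bread"): 0.3,
    ("bob", "drink", "water"): 0.1,
}

ROLE_INDEX = {"S": 0, "V": 1, "O": 2}


def entropy(dist):
    """Shannon entropy in bits of a dict of probabilities summing to 1."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def information_profile(event, order, events=EVENTS):
    """Return (I1, I2, I3) for `event` spoken in word order `order`, e.g. "SOV"."""
    remaining = dict(events)
    trajectory = [entropy(remaining)]  # H0, the base entropy
    for role in order:
        i = ROLE_INDEX[role]
        # Keep only events consistent with the word just heard, then renormalise.
        remaining = {e: p for e, p in remaining.items() if e[i] == event[i]}
        total = sum(remaining.values())
        remaining = {e: p / total for e, p in remaining.items()}
        trajectory.append(entropy(remaining))
    # I_n = H_(n-1) - H_n; H3 is zero because the event is fully identified.
    return tuple(trajectory[n] - trajectory[n + 1] for n in range(3))
```

Calling `information_profile(("alice", "eat", "apple"), "SVO")` returns three values that sum to the base entropy `entropy(EVENTS)`. Under the UID ideal each of the three would equal one third of that total.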

Word-order model

• UID deviation score: deviation of toy-world events from the “ideal information profile” according to UID

• Resulting ranking, from most to least UID-compatible: VSO > VOS > SVO > OVS > SOV > OSV
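The slides do not give the formula for the deviation score. As a rough sketch, one plausible choice is the sum of absolute differences between each word's share of the total information and the ideal share of 1/3, scaled so that the worst case equals 1. Both that form and the probability-weighted average in the hypothetical `mean_deviation` are assumptions, not the authors' stated definition.

```python
def uid_deviation(profile):
    """Deviation of an information profile (I1, I2, I3) from the uniform ideal.

    Assumed form: 3/4 * sum_i |I_i / H0 - 1/3|, where H0 = I1 + I2 + I3.
    """
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must carry positive total information")
    ideal = 1 / len(profile)
    raw = sum(abs(i / total - ideal) for i in profile)
    # For three words the worst case (all information in one word) gives raw = 4/3,
    # so the factor 3/4 maps the score onto the range 0..1.
    return 0.75 * raw


def mean_deviation(profiles_with_probs):
    """Probability-weighted mean deviation over (profile, probability) pairs."""
    return sum(p * uid_deviation(profile) for profile, p in profiles_with_probs)
```

With this definition a perfectly uniform profile such as `(1, 1, 1)` scores 0, and a profile that puts everything in the first word, such as `(3, 0, 0)`, scores 1. A word order's overall score would then be the mean over all events in the world.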

Corpus study

• Child-directed speech (English and Japanese corpora)

• Utterances involving singly transitive verbs

• Ignored adjectives, plurality, tense etc.

• English: VSO (0.38), SVO (0.41), VOS (0.48), SOV (0.64), OSV (0.78), OVS (0.79)

• Japanese: SVO (0.66), VSO (0.71), SOV (0.72), VOS (0.72), OSV (0.82), OVS (0.83)

Experiment

• Languages must be optimal with respect to the frequencies of events in the real world

• Judgement tasks for pairs of sentences (which one is more probable?)

• VSO (0.17), SVO (0.18), VOS (0.20), SOV (0.23), OVS (0.23), OVS (0.24).

Discussion

• Object-first word orders are rare

• Object-first word orders have the least uniform information density (the first word carries too much information)

• SOV is less compatible with UID than its frequency in real languages would suggest, perhaps because of other important factors besides UID

• The theme-first (TFP) and animate-first (AFP) principles favor SOV, SVO (highest ranked in the results) and VSO, so UID may provide some justification for at least some word-order rankings

Conclusion

• Findings are consistent with a weaker hypothesis: word order is optimal with respect to how frequently speakers choose to discuss events (not how often these events really occur)

• UID may not explain all of the word-order rankings, but it does explain several aspects of the empirical distribution of word orders

A Noisy Channel Account of Crosslinguistic Word Order Variation

• In 96.3% of studied languages S precedes O

• SVO (English) and SOV (Japanese) are more prevalent than VSO

• People construct sentences from an agent perspective – why SVO/SOV then?

• Innate universal grammar – independent of communicative or performance factors

Why SOV/SVO

• Communication-based explanation

• SOV is the default for human language
  – Preference for S to precede O
  – Preference for the V to appear at the end of the clause

• SVO arises from SOV as a result of communication/memory pressures that sometimes outweigh the second preference

Shannon’s communication theory

• Comprehension and production operate via a noisy channel

• Speakers are under constraints to choose utterances that will ensure maximal meaning recoverability by the listener

• When does word order affect how easily meaning can be recovered?
  – The girl kicks the ball. (people should adhere to SOV)
  – The girl kicks the boy. (potential confusion, perhaps resolved by the position of the noun with respect to the verb)
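One way to read the two example sentences is to imagine that noise deletes one of the two nouns and to ask whether the listener can still tell which role the surviving noun plays. The sketch below encodes that reading. The single-deletion noise model, the assumption that animacy settles the roles in non-reversible events, and the function names are all illustrative choices, not something the slides specify.

```python
def surviving_patterns(order):
    """Map each surviving noun role to the pattern the listener receives.

    `order` is a string such as "SOV". After one noun is lost, the remaining
    noun surfaces as a bare "N" next to the verb "V".
    """
    patterns = {}
    for lost in ("S", "O"):
        kept = [role for role in order if role != lost]
        surface = "".join("V" if role == "V" else "N" for role in kept)
        survivor = "O" if lost == "S" else "S"
        patterns[survivor] = surface
    return patterns


def role_recoverable(order, reversible):
    """True if the surviving noun's role can still be identified after a loss."""
    if not reversible:
        # Assumption: with an inanimate patient, animacy alone identifies the roles.
        return True
    patterns = surviving_patterns(order)
    # Position relative to the verb helps only if the two cases look different.
    return patterns["S"] != patterns["O"]
```

Under these assumptions `role_recoverable("SOV", reversible=True)` is false, because both deletions leave a noun followed by the verb. `role_recoverable("SVO", reversible=True)` is true, because a surviving subject precedes the verb and a surviving object follows it.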

Method

• The study investigates whether gestured word order across languages (English: SVO; Japanese and Korean: SOV) depends on the semantic reversibility of the event. Three factors:
  – Initial bias to SOV
  – Initial bias to native language
  – Communicative or memory pressures

• English – shift to SVO (second and third factors)

• Japanese & Korean – shift to SVO (only due to the third factor)

Method

• Brief silent animations of intransitive/transitive events
  – Participants first verbally described the animations
  – Then hand-gestured the meanings of the events

• Verbal and gesture responses were coded for the relative position of the agent, action, and patient

Experiment 1

• Animate/inanimate patients (reversible or non-reversible sentences)

• More SVO word orders should be produced if reversible

• Results – uniformly SVO for verbal responses
  – Gestured S before O for animate patients
  – Gestured V before O for human patients (as expected)
  – Overwhelmingly gestured SOV for non-reversible events

Experiment 1&2 – Japanese/Korean

• English participants’ results can be explained without resorting to the noisy-channel hypothesis
  – Participants may shift from SOV to native (SVO) due to increased ambiguity in reversible events

• Thus, tested participants with an SOV native language
  – Expected shift to SVO in reversible events

• Experiment 2 – used more complex structures
  – The old woman says that the fireman kicks the girl

Experiment 1&2 – Japanese/Korean

• If participants use native word order (SOV)
  – Then they should gesture both levels of embedded events with the same order: S1 [S2 O2 V2] V1

• In case of reversible events SOV creates maximal potential confusion
  – Then they should gesture using SVO: S1 V1 [S2 V2 O2]
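The two predicted gesture sequences can be generated mechanically from a nested event. The following is a small sketch, assuming the embedded clause fills the object slot of the top-level verb and that the same order applies at both levels. The function name `linearize` and the tuple encoding are hypothetical.

```python
def linearize(event, order):
    """Flatten a nested (subject, verb, object) event into a gesture sequence.

    The object may itself be an event tuple; it is linearized recursively and
    wrapped in brackets. `order` is a permutation of "SVO".
    """
    subject, verb, obj = event
    if isinstance(obj, tuple):
        obj = "[" + " ".join(linearize(obj, order)) + "]"
    parts = {"S": subject, "V": verb, "O": obj}
    return [parts[role] for role in order]


# The Experiment 2 example sentence as a nested event.
sentence = ("old woman", "says", ("fireman", "kicks", "girl"))
print(" ".join(linearize(sentence, "SOV")))  # old woman [fireman girl kicks] says
print(" ".join(linearize(sentence, "SVO")))  # old woman says [fireman kicks girl]
```

In the SOV output both verbs pile up at the end, after three nouns that could all be agents, which is the configuration the slide describes as maximally confusable.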

Experiment 1&2 – Japanese/Korean

• Exp 1 results – native language word order
  – J&K speakers verbalized patient before action (100%)
  – Gestured patient before action for both animate and inanimate patients

• Exp 2 results – shift to SVO
  – J speakers never verbalized SVO; K speakers rarely
  – Both J&K speakers almost always gestured the top-level verb in 2nd position, between the top-level subject and the embedded subject
  – In the embedded clause patients were almost always gestured before the action, but more often in non-reversible events (for both J&K speakers)

• Results predicted by the noisy-channel hypothesis but not by the combination of SOV default and native-language order

Experiment 3

• Alternative explanation of previous results
  – Minimizing syntactic dependency distances
  – Dependency distance: number of words between a syntactic head (verb) and its dependents (subject and object)
  – Shorter dependencies are easier

• Shift from SOV to SVO given that SVO allows for shorter dependency distances
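The dependency-distance argument can be made concrete by counting the words that separate the subject from the verb in each order. This sketch assumes a one-word subject and verb and treats the object phrase as a block of `object_length` words (head noun plus modifiers). Those simplifications and the function name are illustrative, not taken from the study.

```python
def words_between_subject_and_verb(order, object_length):
    """Count the words separating the subject from the verb.

    Assumes a one-word subject and verb, and an object phrase of
    `object_length` words that stays together as one block.
    """
    lengths = {"S": 1, "V": 1, "O": object_length}
    s_pos, v_pos = order.index("S"), order.index("V")
    lo, hi = sorted((s_pos, v_pos))
    # Sum the lengths of whatever constituents sit strictly between S and V.
    return sum(lengths[role] for role in order[lo + 1:hi])


for n in (1, 2, 4):
    print(n, words_between_subject_and_verb("SOV", n),
          words_between_subject_and_verb("SVO", n))
```

In SOV the distance grows with the length of the object phrase, while in SVO it stays at zero. This is why a dependency-distance account would predict more SVO gestures as the patient description gets longer.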

Experiment 3 - method

• Animations of a boy and a girl interacting with one of a set of objects:
  – Circle/star/heart, which was either
  – Spotted/striped (surface); in a box/pail (container); wearing a top/witch’s hat (headwear)
  – Giving/putting/intransitive event

• Participants were to gesture each event and the features of the object

• If participants are sensitive to the distance between agent and verb, SVO gesture order should be more frequent for longer patient descriptions

• No such shift is predicted by the noisy channel – the patient is not a possible agent of the verb, so adding modifiers will not affect the recoverability of who is doing what to whom

Experiment 3 - results

• Gestured patient before action for most events

• Verbalized action before patient for most events

• Even with long productions, participants still gestured patient before action, consistent with the noisy-channel hypothesis and not with the dependency-distance hypothesis

Discussion

• English speakers have a strong SOV preference for non-reversible events even when the inanimate patient has up to 3 features to be gestured

• SOV seems to be the preferred word order in human communication

• For reversible events the preference for SOV disappears in favor of SVO

• Although SOV-natives gesture SOV in simple events, they revert to SVO for more complex ones

• This shift to SVO occurs in order to maximize meaning recoverability

Discussion

• Case marking is often used in SOV
  – Mitigates the confusability of subject and object, helping to retain the default SOV

• If no case marking is used, then SVO shift

• Large majority of SOV languages are case marked, whereas few SVO languages are

• Used location in space as possible case marking in the experiments
  – Of the case-marked gestures most had SOV order

• Animacy-dependent case marking
  – Many languages mark only animate direct objects

• Non-SVO languages have more word-order flexibility than SVO
  – Contain other mechanisms for disambiguation
  – So fixed word orders are mostly SVO

Conclusion

• No need for sophisticated innate machinery to explain word-order variation

• Many aspects of crosslinguistic word-order variance are easily explained by communicative or memory pressures

