Multiword Expressions

◮ Ling 7800-065: Sign-Based Construction Grammar

◮ Instructor: Ivan A. Sag ([email protected])

◮ URL: http://lingo.stanford.edu/sag/LI11-SBCG

1 / 42

Purely Compositional Analysis Won’t Suffice for MWEs

◮ The Overgeneration Problem:
    telephone booth/box, but also *telephone closet
    call/phone/ring up, but also *telephone up

◮ The Idiomaticity Problem:
    The meaning of kick the bucket is unrelated to the meanings of kick, the, or bucket.

◮ Parsing Problems:
    E.g. in (*the) step, on (*the) sale, by and large, No can do.

2 / 42

Purely Non-Compositional Analysis (‘Words with Spaces’) Won’t Suffice for MWEs

◮ The Flexibility Problem:
    look up the tower is ambiguous (“glance up at” vs. “consult a reference about” (the tower))
    look the tower up is unambiguous (“consult a reference . . . ”)

◮ The Lexical Proliferation Problem:
    Light verb constructions often come in families: take a walk, take a hike, take a trip/flight...
    Individual listing results in considerable loss of generality and lack of prediction.

3 / 42

A Taxonomy of MWEs [adapted from Bauer (1983)]

◮ Lexicalized Phrases:

    Fixed Expressions
    Semi-Fixed Expressions
    Syntactically Flexible Expressions

◮ Institutionalized Phrases:
    Compositional phrases co-occurring with markedly high frequency (in a given context).

4 / 42

Fixed Expressions

◮ by and large, in short, kingdom come, every which way

◮ ad hoc (cf. ad nauseam, ad libitum, ad hominem,...), Palo Alto (cf. Los Altos, Alta Vista,...), etc.

5 / 42

Fixed Expressions are Fully Lexicalized

◮ They undergo neither morphosyntactic variation (cf. *in shorter) nor internal modification (cf. *in very short).

◮ A simple words-with-spaces representation is sufficient.
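A words-with-spaces treatment can be pictured as a lexicon lookup that merges a listed word sequence into one token. The following is only a minimal sketch under that assumption: the `FIXED_EXPRESSIONS` table, the underscore-joined entry names, and the `tokenize` function are all hypothetical illustrations, not part of any analysis in these slides.

```python
# Hypothetical lexicon: each fixed expression is listed whole, as one entry.
FIXED_EXPRESSIONS = {
    ("by", "and", "large"): "by_and_large",
    ("in", "short"): "in_short",
    ("kingdom", "come"): "kingdom_come",
    ("every", "which", "way"): "every_which_way",
    ("ad", "hoc"): "ad_hoc",
}
MAX_LEN = max(len(key) for key in FIXED_EXPRESSIONS)


def tokenize(words):
    """Greedy longest match: merge listed word sequences into single tokens."""
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate span first, down to two words.
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            entry = FIXED_EXPRESSIONS.get(tuple(words[i:i + n]))
            if entry:
                out.append(entry)
                i += n
                break
        else:
            # No listed expression starts here: keep the word as it is.
            out.append(words[i])
            i += 1
    return out
```

Because the entries are matched as exact sequences, a modified string such as "in very short" is simply not recognized as the expression, which mirrors the absence of internal modification noted above.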

6 / 42

Semi-Fixed Expressions

◮ kicks/kick/kicked/kicking the bucket (“die”), part(s) of speech, perjure him*(self)/them*(selves)

◮ adhere to strict constraints on word order and composition,

◮ but undergo some degree of lexical variation, e.g. in the form of inflection, variation in reflexive form, and determiner selection.

◮ can be treated as a word complex which is lexically variable at particular positions.
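One way to picture "a word complex which is lexically variable at particular positions" is a fixed-length pattern in which each position lists the forms allowed there. This is a sketch under that assumption only; the pattern encoding and the `matches` helper are hypothetical.

```python
# Each pattern is a list of positions; each position is the set of forms allowed there.
KICK_FORMS = {"kick", "kicks", "kicked", "kicking"}
KICK_THE_BUCKET = [KICK_FORMS, {"the"}, {"bucket"}]
PART_OF_SPEECH = [{"part", "parts"}, {"of"}, {"speech"}]


def matches(pattern, words):
    """True if words fill the pattern position by position, with nothing inserted."""
    return len(words) == len(pattern) and all(
        word in slot for word, slot in zip(words, pattern)
    )
```

Here the verb position varies with inflection, while word order and the remaining positions stay fixed, so "kicked the bucket" matches but "kicked a bucket" and "kicked the old bucket" do not.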

7 / 42

Semi-Fixed Expressions: U.S. Sports Team Names

◮ the (Oakland) Raiders

◮ an/the [[(Oakland) Raiders] player]

◮ the [Raiders and 49ers].

◮ the league-leading (Oakland) Raiders.

◮ an [[(Oakland) Raider] spokesman]

◮ *the (Oakland) 49ers

8 / 42

Syntactically-Flexible Expressions: Semantically Decomposable Idioms

◮ take advantage (of), pull strings, keep tabs on, jump on (the) bandwagon

◮ Syntactic Flexibility:
    Strings had been pulled to get Sandy the job.
    It was the close tabs they kept on our parents that upset us most.

◮ Internal Quantifiability:
    The FBI kept closer tabs on Kim than they kept on Sandy.
    They took more advantage of the situation than they should have.

9 / 42

◮ Internal Modifiability:
    Many Californians jumped on the bandwagon that Perot had started.
    She left no legal stone unturned.

◮ Nunberg, Sag and Wasow (1994):
    Only semantically decomposable idioms are flexible (in English) and only semantically decomposable idioms allow internal quantification and modification.

Flexibility is highly variable.

10 / 42

The Transformational Myth about Flexible Idioms

◮ Parts of idioms (e.g. pull and strings) are uniformly inserted in underlying phrase markers.

◮ Transformational operations define the space in which parts of idioms may be separated, e.g.:

    Strings were pulled (to get the job).
    Strings seem to have been pulled (to get Chris the job).
    What strings did Pat pull to get Chris the job?

11 / 42

McCawley’s (1981) Paradox

◮ If the parts of idioms are uniformly inserted in underlying phrase markers, then one of the following examples should be ill-formed:

    ◮ Pat pulled [[the strings] [that got Chris the job]].
    ◮ [[The strings] [that Pat pulled ]] got Chris the job.

◮ Both are well-formed; therefore no uniform set of assumptions about lexical insertion and the transformational analysis of relative clauses predicts the observed data.

12 / 42

A Further Problem

◮ Even if some solution could be found to McCawley’s paradox, the fundamental finding of Nunberg et al. (1994) would still remain mysterious.

◮ Why should semantic decomposability correlate with the ability to undergo transformational movement? (Transformations are semantically blind.)

13 / 42

Where to Draw the Line?

◮ No soap opera worth its bubbles would spill all the beans in one episode if it could dribble them out over many. (Riehemann 2001)

◮ Microsoft released more details on its Zune player and service, but the cat still has its back paws stuck in the bag. Microsoft’s official announcement of its much-hyped Zune music player came today, just ... [ABC News - Sep 14, 2006]

14 / 42

◮ Syntactic Variability 1 (after Riehemann 2001)

    ◮ modification: Diana spilled the royal beans.
    ◮ open slots: lose X’s way
    ◮ passive: The beans were spilled.
    ◮ raising (control): The hatchet appears to have been buried. (The piper wants to be paid.)
    ◮ topicalization: The other beans, she’ll probably spill later.

15 / 42

◮ Syntactic Variability 2

    ◮ distribution over several clauses: (The McCawley Paradox)

    ◮ pronominal reference and ellipsis:
        I thought the hatchet had been buried, but it appears not to have been __.
        They thought the cat was out of the bag, but it wasn’t __.

16 / 42

Further Considerations 1

◮ properties shared between idiomatic and literal words:kick, kicked, kicking,...

◮ no literal interpretation: close up shop, tend shop, ...

◮ restricted flexibility: caught in the middle, taken aback, fit to be tied, caught short, written in stone... (passive only)

17 / 42

Further Considerations 2

◮ idiom families: lose one’s mind (marbles, wits); get off one’s ass (tush(ie), rear (end), butt, duff, tuchus...); throw someone to the dogs (lions, wolves,...)...

18 / 42

◮ Syntactically ‘Deep’ Dependencies

    ◮ Adjectives and Specifiers: bark up the wrong tree, give me some skin...
    ◮ Adverbs and Adjuncts: to put it mildly, skate on thin ice, ...
    ◮ Headless Idioms: get/set/start/keep/have the ball rolling, up the creek without a paddle,...
    ◮ What’s X Doing Y?: What’s this fly doing in my soup?

19 / 42

Decomposable Idioms

◮ The relationship between words in decomposable idioms can be captured using a partially semantic mechanism [Nunberg et al. (1994)].

◮ Flat semantic representations like the MRS representations proposed by Copestake et al. (1995, 2006) are especially well suited to this.

◮ cat out of the bag can be described in terms of the following semantic relationships, where i_cat and i_bag are the meanings corresponding to the idiomatic senses of cat “secret” and bag “hiding place”:

    [ i_cat(x) ∧ i_bag(y) ∧ out(x, y) ]

20 / 42

◮ Every dog chased some cat.

◮ top h0
    h1 : every(x, h3, h2), h3 : dog(x), h7 : cat(y), h5 : some(y, h7, h6), h4 : chase(e, x, y)

◮ Let h0 = h1 and h2 = h5 and h6 = h4. (Every dog has wide scope.)

◮ Let h0 = h5 and h6 = h1 and h2 = h4. (Some cat has wide scope.)
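The two sets of handle equations can be checked mechanically. The sketch below is an illustration only, not an implementation of MRS: it stores the elementary predications from the slide in a hypothetical `EPS` table keyed by label, and a hypothetical `resolve` function follows each "hole = label" equation to spell out the scoped formula.

```python
# Elementary predications from the slide, keyed by their label.
# Quantifiers take (variable, restriction handle, body handle).
EPS = {
    "h1": ("every", "x", "h3", "h2"),
    "h3": ("dog", "x"),
    "h5": ("some", "y", "h7", "h6"),
    "h7": ("cat", "y"),
    "h4": ("chase", "e", "x", "y"),
}


def resolve(handle, plugs):
    """Follow hole = label equations, then spell out the formula at that label."""
    while handle in plugs:
        handle = plugs[handle]
    pred, *args = EPS[handle]
    # Arguments starting with "h" are handles; the rest are variables.
    parts = [resolve(a, plugs) if a.startswith("h") else a for a in args]
    return f"{pred}({', '.join(parts)})"


every_wide = {"h0": "h1", "h2": "h5", "h6": "h4"}
some_wide = {"h0": "h5", "h6": "h1", "h2": "h4"}

print(resolve("h0", every_wide))
# every(x, dog(x), some(y, cat(y), chase(e, x, y)))
print(resolve("h0", some_wide))
# some(y, cat(y), every(x, dog(x), chase(e, x, y)))
```

The list of predications is the same in both readings; only the plugging of holes differs, which is the sense in which the representation is flat.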

21 / 42

Idiomatic Constructions

◮ cat out of bag :=
    [ SEM [ RELS 〈 h1:i_cat_rel(x), h2:i_bag_rel(y), h3:out_rel(x,y), ... 〉 ] ].

◮ i_cat :=
    [ SEM [ RELS 〈 h1:i_cat_rel(x) 〉 ] ] & / cat_n1.

◮ i_bag :=
    [ SEM [ RELS 〈 h2:i_bag_rel 〉 ] ] & / bag_n1.

22 / 42

Two Fundamental Problems:

◮ We need to ensure that all elements are present.

◮ We need to make sure that idiom chunks don’t appear elsewhere.

    *(i) Sandy was out of the bag.
    *(i) Kim objected to those strings.
    *(i) We liked the tabs.
    *(i) The strings were offensive.

23 / 42

Locality Problem 5: Control in Serbo-Croatian (Zec 1987), Halkomelem Salish, ...

NP_i promise [COMP he_i/*j VP]

NP persuade NP_i [COMP he_i/*j VP]

24 / 42

Locality Problem 6: English Idioms with Pronominal Genitives

◮ He_i lost [his_i/*her_j marbles].

◮ They_i kept [their_i/*our_j cool].

◮ I_i/*Kim/*You lost [my_i way].

25 / 42

External Argument (XARG)

The feature XARG is used to specify a distinguished element (e.g. subject, possessor, or object) within a given phrase. The value of XARG is either a sign or else the distinguished element none (cf. other analyses in HPSG).

26 / 42

[ FORM 〈 his, book 〉
  SYN [ CAT [ noun, XARG [1] ]
        VAL 〈 〉 ] ]

    [1] [ FORM 〈 his 〉
          SYN NP[GEN +] ]

    [ FORM 〈 book 〉
      SYN [ CAT [ noun, XARG [1] ]
            VAL 〈 〉 ] ]

27 / 42

[ FORM 〈 lose 〉
  SYN [ CAT verb
        VAL 〈 NP_i, NP[ LID way-rel, XARG NP[pro]_i ] 〉 ] ]

28 / 42

Problem 7: Modifier Transparency in Idioms

Kim took [unfair [advantage]] of the situation.

Kim spilled [the [political [beans]]].

Solution: LID is passed up from head-daughter to mother when a modifier is present.

Related to this are transparent nouns:

Kim took [the [kind [of [unfair [advantage of the situation]]]] that was typical of the bourgeoisie of that era].

Hilary would keep [that [kind [of [a [promise]]]]].

29 / 42

Problem 9: Semantically Decomposable Idioms

◮ Each idiomatic verb requires the presence of the appropriate selected idiom chunk.

    (Keep tabs on/*of...; *Keep advantage of; Pull strings/*twine...)

◮ We must ensure that the idiom chunks occur only in the presence of an appropriate selector.

    (*The tabs bothered me.; *We objected to their umbrage...)

30 / 42

Syntactically-Flexible Expressions: Semantically Decomposable Idioms

◮ take advantage (of), pull strings, keep tabs on, jump on (the) bandwagon

◮ Syntactic Flexibility:
    Strings had been pulled to get Sandy the job.
    It was the close tabs they kept on our parents that upset us most.

◮ Internal Quantifiability:
    The FBI kept closer tabs on Kim than they kept on Sandy.
    They took more advantage of the situation than they should have.

31 / 42

◮ Internal Modifiability:
    Many Californians jumped on the bandwagon that Perot had started.
    She left no legal stone unturned.

◮ Nunberg, Sag and Wasow (1994):
    Only semantically decomposable idioms are flexible (in English) and only semantically decomposable idioms allow internal quantification and modification.

Flexibility is highly variable.

32 / 42

The Transformational Treatment of Flexible Idioms

◮ Parts of idioms (e.g. pull and strings) are uniformly inserted in underlying phrase markers.

◮ Transformational operations define the space in which parts of idioms may be separated, e.g.:

    Strings were pulled (to get the job).
    Strings seem to have been pulled (to get Chris the job).
    What strings did Pat pull to get Chris the job?

33 / 42

McCawley’s (1981) Paradox

◮ If the parts of idioms are uniformly inserted in underlying phrase markers, then one of the following examples should be ill-formed:

    Pat pulled [[the strings] [that got Chris the job]].

    [[The strings] [that Pat pulled ]] got Chris the job.

◮ Both are well-formed; therefore no uniform set of assumptions about lexical insertion and the transformational analysis of relative clauses predicts the observed data.

34 / 42

A Further Problem

◮ Even if some solution could be found to McCawley’s paradox, the fundamental finding of Nunberg et al. (1994) would still remain mysterious.

◮ Why should semantic decomposability correlate with the ability to undergo transformational movement? (Transformations are semantically blind.)

35 / 42

◮ spill the beans,

◮ keep tabs on,

◮ pull strings,...

36 / 42

strans-v-lxm
[ form 〈 pull 〉
  arg-st 〈 [ syn NP[ ]
             sem [ind i] ],
           [ syn NP[lid i-strings-fr]
             sem [ind j] ] 〉
  syn [cat [lid X]]
  sem [ frames 〈 X:[ pullingstrings-fr
                     agent i
                     entity j ] 〉 ] ]

[ form 〈 strings 〉
  arg-st 〈 〉
  syn [cat [lid X]]
  sem [ ind s
        frames 〈 X:[ i-strings-fr
                     entity i ] 〉 ] ]

37 / 42

◮ They’re pulling strings to get you the job.

◮ We have pulled strings more than once.

◮ We pulled strings to get invited.

◮ He pulls strings whenever he can.

38 / 42

Kay and Sag, to appear

◮ specify that frames are classified as idiomatic frames (of type i-frame) or canonical frames (of type c-frame)

◮ idiomatic predicators (e.g. pullstrings-fr, spillbeans-fr) are classified as c-frames.

◮ The i-frame analysis is motivated by the basic fact that an idiomatic argument (e.g. strings in its idiomatic sense) can only appear in construction with the right governor (e.g. pull in its appropriate idiomatic sense).

◮ The reason why examples like these only allow a nonidiomatic interpretation, and are therefore hard to contextualize, is that the listemes for the verbs in these sentences select arguments whose lid value must be of type c-frame:

    ◮ Leslie found the strings that got Pat the job.
    ◮ We resented their tabs.
    ◮ The beans impressed us.

39 / 42

◮ The motivation for classifying idiomatic predicators as c-frames is that, in spite of their idiomatic meanings, they project phrases (typically VPs or Ss) that freely appear in nonidiomatic environments.

That is, their distribution shows none of the restrictions that idiomatic arguments must obey, e.g.

◮ I think [Kim spilled the beans].

◮ They tried to [pull strings to get Lee the job].

◮ [With [my kids [keeping tabs on the stock market]]], I can finally think of retiring.

◮ [Taking care of homeless animals] is rewarding.
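The selection logic behind the i-frame/c-frame split can be sketched as a type check. This is only an illustration under simplifying assumptions: frame types are modeled as Python classes, and the `SELECTS` table, the class names `StringsFr` and `IStringsFr`, and the `licenses` helper are hypothetical stand-ins for what listemes specify.

```python
class Frame: pass
class CFrame(Frame): pass            # canonical frames (c-frame)
class IFrame(Frame): pass            # idiomatic frames (i-frame)

class StringsFr(CFrame): pass        # literal 'strings' (hypothetical name)
class IStringsFr(IFrame): pass       # idiomatic 'strings' (i-strings-fr)
class PullStringsFr(CFrame): pass    # idiomatic predicator, classified as a c-frame

# Hypothetical table: the lid type each governor requires of the relevant argument.
SELECTS = {
    "pull (idiomatic)": IStringsFr,  # needs the idiom chunk as its object
    "find": CFrame,                  # ordinary verbs select c-frame arguments
    "think": CFrame,                 # so do verbs embedding a clause
}


def licenses(governor, argument_lid):
    """True if the argument's lid type satisfies the governor's selection."""
    return issubclass(argument_lid, SELECTS[governor])
```

Under these assumptions `licenses("find", IStringsFr)` is false, so idiomatic strings cannot be the object of find, while `licenses("think", PullStringsFr)` is true, so a clause headed by idiomatic pull embeds freely.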

40 / 42

◮ In this analysis, both idiomatic arguments and idiomatic predicators correspond to the idiomatic meanings of the parts of an MWE: pullstrings-fr might be glossed as ‘exert’ and i-strings-fr as ‘influence’.

◮ Because the idiomatic meaning is distributed over the parts in this way, it is possible to modify or quantify these parts using the very same analysis that is responsible for the modification and quantification of nonidiomatic expressions.

41 / 42

Syntactic Flexibility:

◮ *i The bucket had been kicked many times in that community.

◮ *i It was the bucket(s) that they had kicked that upset us most.

◮ *i Europeans will kick more buckets this year than last.

◮ *i Many Californians were kicking the bucket that the Georgetown kool aid made available to them.

42 / 42

Kick the bucket: two listemes

strans-v-lxm
[ form 〈 kick 〉
  syn [cat [vf ¬pas]]
  arg-st 〈 [ syn NP[ ]
             sem [ind i] ],
           [ syn [ cat [lid i-bucket-fr]
                   mkg the ] ] 〉
  sem [ frames 〈 [ death-fr
                   protagonist i ] 〉 ] ]

cn-lxm
[ form 〈 bucket 〉
  syn [cat [lid i-bucket-fr]]
  sem [ ind none
        frames 〈 〉 ] ]

43 / 42

◮ Although lexemes licensed by this bucket listeme have i-bucket-fr as their lid value, they have no frame on their frames list and an ind value of none.

◮ Hence the bucket contributes nothing to the semantic composition of the sentence and provides nothing for a modifier to modify or for a quantifier to restrict.

◮ This can predict the absence of idiomatic readings for such modifications and quantifications.

◮ The failure of idiomatic kick to passivize can be accounted for by the constraint requiring that the vf value not be pas or by positing an intransitive verb lexeme type.

◮ This could be replaced by a less stipulative account, should one be properly motivated.
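The contrast between idiomatic strings and idiomatic bucket can be sketched as follows. This is a toy illustration, assuming that a modifier adds a frame predicated of the noun's index; the dictionary encoding, the `make_noun` and `modify` helpers, and the frame names `political-fr` and `awful-fr` are hypothetical.

```python
def make_noun(form, lid, ind, frames):
    """A toy noun: its lid, its index (or None), and its list of frames."""
    return {"form": form, "lid": lid, "ind": ind, "frames": list(frames)}


# Idiomatic 'strings' has an index and a frame; idiomatic 'bucket' has neither.
I_STRINGS = make_noun("strings", "i-strings-fr", "j", [("i-strings-fr", "j")])
I_BUCKET = make_noun("bucket", "i-bucket-fr", None, [])


def modify(noun, modifier_frame):
    """Add a modifier frame predicated of the noun's index, if there is one."""
    if noun["ind"] is None:
        return None  # nothing for the modifier to modify
    return noun["frames"] + [(modifier_frame, noun["ind"])]
```

With these assumptions, `modify(I_STRINGS, "political-fr")` yields a two-frame list, while `modify(I_BUCKET, "awful-fr")` returns `None`, matching the absence of an idiomatic reading for internally modified bucket.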

44 / 42

◮ Note that although the idiomatic bucket provides no semantic argument for an internal modifier, the idiomatic bucket may nonetheless be modified by metalinguistic elements, which do not make reference specifically to the common noun’s meaning or index.

◮ Thus we find contrasts like the following:

    ◮ *i Kim kicked the awful bucket.
    ◮ *i They kicked the bucket that they knew was inevitable.
    ◮ They kicked the proverbial bucket.

◮ The buck stops here

45 / 42

A Persistent Default Analysis (Sag 2006)

[ PHON 〈 pull 〉
  SYN [ VAL 〈 [ SYN NP_i ],
              [ LID i_strings_rel
                SYN NP_j ] 〉 ]
  SEM [ RELS 〈 h0:i_pull_rel(i, j) 〉 ] ]

46 / 42

[ PHON 〈 pulled, some, strings 〉
  SYN [ VAL 〈 NP_i 〉 ]
  SEM|RELS 〈 h0:i_pull_rel(i, j), h3:some_rel(j, h1, h2), h1:i_strings_rel(j) 〉 ]

    [ PHON 〈 pulled 〉
      SYN [ VAL 〈 NP, NP 〉 ]
      SEM|RELS 〈 h0:i_pull_rel(i, j) 〉 ]

    [ PHON 〈 some, strings 〉
      SYN NP_j
      LID i_strings_rel
      SEM|RELS 〈 h3:some_rel(j, h1, h2), h1:i_strings_rel(j) 〉 ]

        [ PHON 〈 some 〉
          SYN D ]

        [ PHON 〈 strings 〉
          SYN NP
          LID i_strings_rel
          SEM|RELS 〈 h1:i_strings_rel(j) 〉 ]

47 / 42

Motivation for Persistent Defaults

◮ Kim baked. [bread, cake, etc.; not ham, etc.]

◮ Sandy drinks. [alcohol]

◮ They’ve eaten. [a meal]

◮ They climbed all day. [upward]

48 / 42

Lascarides, Copestake, Briscoe, and Asher 1996, etc.

[ PHON 〈 bake 〉
  SYN [ VAL 〈 [ SYN NP_i ],
              [ SYN NP_j
                SEM|RELS /p i_flour_bsd_rel ] 〉 ]
  SEM [ RELS 〈 h0:bake_rel(i, j) 〉 ] ]

49 / 42

relation
    . . .
    strings_rel
        i_strings_rel
        l_strings_rel

50 / 42

[ PHON 〈 strings 〉
  SYN [ CAT [ noun
              LID [0] [strings_rel /p l_strings_rel] ]
        VAL 〈 〉 ]
  SEM [ INDEX i
        RELS 〈 h0:[0](i) 〉 ] ]
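The behavior of a persistent default on LID can be sketched roughly as follows. This is a simplification, not the default unification of Lascarides et al.: the noun's LID is constrained to a general type with a default subtype, a governing verb may impose a more specific compatible subtype, and otherwise the default survives. The `SUBTYPES` table and the `resolve_lid` function are hypothetical.

```python
# Hypothetical fragment of the relation hierarchy: strings_rel has two subtypes.
SUBTYPES = {"strings_rel": {"i_strings_rel", "l_strings_rel"}}


def resolve_lid(general, default, selected=None):
    """Return the LID a noun ends up with.

    general:  the noun's hard LID constraint (e.g. strings_rel)
    default:  the persistent default subtype (e.g. l_strings_rel)
    selected: the LID a governing verb imposes, or None if it imposes none
    """
    if selected is None or selected == general:
        return default                      # nothing overrides the default
    if selected in SUBTYPES[general]:
        return selected                     # a compatible hard constraint wins
    raise ValueError(f"{selected} is incompatible with {general}")
```

On this sketch, strings selected by idiomatic pull resolves to `i_strings_rel`, while strings in any other environment keeps the literal default `l_strings_rel`.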

51 / 42

[ PH 〈Pat,pulled,the,strings,that,got,Kim,the,job〉
  SYN S ]

    [ PH 〈Pat〉
      SYN NP ]

    [ PH 〈pulled,the,strings,that,got,Kim,the,job〉
      SYN VP ]

        [ PH 〈pulled〉
          SYN V ]

        [ PH 〈the,strings,that,got,Kim,the,job〉
          SYN NP[LID i_strings_rel] ]

            [0] [ PH 〈the,strings〉
                  SYN NP[LID i_strings_rel] ]

            [ PH 〈that,got,Kim,the,job〉
              SYN|GAP 〈 [0] 〉 ]

                that

                [ PH 〈got,Kim,the,job〉
                  SYN|GAP

52 / 42

[ PH 〈the,strings,that,Pat,pulled,got,Chris,the,job〉
  SYN S ]

    [ PH 〈the,strings,that,Pat,pulled〉
      SYN NP[LID i_strings_rel] ]

        [0] [ PH 〈the,strings〉
              SYN NP[LID i_strings_rel] ]

        [ PH 〈that,Pat,pulled〉
          SYN RC[GAP 〈 [0] 〉] ]

            that

            [ PH 〈Pat,pulled〉
              SYN S[GAP 〈 [0] 〉] ]

                [ PH 〈Pat〉
                  SYN NP ]

                [ PH 〈pulled〉
                  SYN VP[GAP 〈 [0] 〉] ]

    [ PH 〈got,Chris,the,job〉
      SYN VP ]

53 / 42

Idiom Parts without a Literal Sense

[ PHON 〈 umbrage 〉
  SYN [ CAT [ noun
              LID [0] i_umbrage_rel /p † ]
        VAL 〈 〉 ]
  SEM [ INDEX i
        RELS 〈 h0:[0](i) 〉 ] ]

54 / 42

Syntactically-Flexible Expressions: Semantically Decomposable Idioms

◮ take advantage (of), pull strings, keep tabs on, jump on (the) bandwagon

◮ Syntactic Flexibility:
    Strings had been pulled to get Sandy the job.
    It was the close tabs they kept on our parents that upset us most.

◮ Internal Quantifiability:
    The FBI kept closer tabs on Kim than they kept on Sandy.
    They took more advantage of the situation than they should have.

55 / 42

◮ Internal Modifiability:
    Many Californians jumped on the bandwagon that Perot had started.
    She left no legal stone unturned.

◮ Nunberg, Sag and Wasow (1994):
    Only semantically decomposable idioms are flexible (in English) and only semantically decomposable idioms allow internal quantification and modification.

56 / 42

Unresolved Issues

◮ Can Binding Theory be Purely Local?

◮ Are Local Analyses Adequate for All Kinds of Idiomaticity? (See Sailer and Richter’s talk on Friday)

◮ Are There Other Non-Local Phenomena that Pose Difficulties?

57 / 42

Conclusions

◮ The Head Feature Principle, the Nonlocal Feature Principle and other principles of HPSG/SBCG provide a theory of the extension of local domains.

◮ Particular proposals about features (GAP, LID, XARG, etc.), together with those principles, constitute hypotheses about what information is systematically transmitted outside of local domains.

◮ HPSG/SBCG provides a comfortable home for such analyses.

◮ Other frameworks (to the best of my knowledge) have not as yet provided explicit hypotheses about how such theoretical issues are to be resolved.

58 / 42

