Computational Morphology and its Implications for Theoretical Morphology Richard Sproat University...


Computational Morphology and its Implications for Theoretical Morphology

Richard Sproat

University of Illinois at Urbana-Champaign

PASCAL MorphoChallenge

Venice

April 12, 2006


“Item-and-arrangement” versus “Item-and-process”

• Charles Hockett (1954), “Two models of grammatical description”:
– Item-and-arrangement: words are composed of morphemes that are put together by a kind of “word syntax”
– Item-and-process: words are built up via the application of rules that add phonological and morphosyntactic information

Stump’s classification

                 Lexical              Inferential
Incremental      Lieber               Steele
Realizational    Halle & Marantz      Stump, Beard’s LMBM

hoot+s[3sg]

Ø → s / hoot[3sg]

hoots = 3sg because of -s

-s is introduced due to 3sg

Affix is a lexical entry that introduces morphosyntactic features

Affix introduced because of morphosyntactic features

Computational morphology

• Nearly all morphological operations can be expressed in terms of regular relations.
– Only possible exception is reduplication

• Regular relations are relations over pairs of strings that can be constructed solely by the operations of:
– Concatenation: if R, S are regular relations then so is R·S
– Union: if R, S are regular relations then so is R∪S
– Kleene closure: if R is a regular relation then so is R* (0 or more instances of R concatenated with itself)

• Regular relations are closed under composition: if R, S are regular relations, then so is R○S

• Implemented with finite-state transducers
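The operations listed above can be tried out on small finite relations. The following is a minimal sketch, assuming a relation is represented as a Python set of (input, output) string pairs; the function names are illustrative only, and `star` is cut off at a fixed number of repetitions because the true Kleene closure is infinite. A real system would use finite-state transducers rather than explicit sets.

```python
from itertools import product

def concat(R, S):
    """Concatenation R·S: pair up inputs and outputs position by position."""
    return {(a + c, b + d) for (a, b), (c, d) in product(R, S)}

def union(R, S):
    """Union R∪S: any pair that is in R or in S."""
    return R | S

def star(R, max_reps=2):
    """Bounded approximation of R*: 0..max_reps copies of R concatenated."""
    result = {("", "")}
    layer = {("", "")}
    for _ in range(max_reps):
        layer = concat(layer, R)
        result |= layer
    return result

def compose(R, S):
    """Composition R○S: feed the output side of R into the input side of S."""
    return {(a, d) for (a, b) in R for (c, d) in S if b == c}

R = {("a", "b")}
S = {("b", "c")}
print(compose(R, S))
print(concat(R, S))
print(star(R))
```

Each function returns another set of string pairs, which is the closure property in miniature: the result of combining relations is again a relation of the same kind.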

Transducers and composition (Johnson, 1972; Koskenniemi, 1983; Kaplan & Kay, 1994; Mohri & Sproat, 1996)

• Consider 3-letter alphabet {a,b,c}

• Given a rule a → b, the equivalent transducer is:

Another rule

b → c / _ b

The two rules composed

b → c / _ b

a → b
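The effect of composing the two rules can be sketched with plain string functions. This is only an illustration under assumptions: the rules are taken to be a → b and b → c / _ b, the order of application (a → b first) is assumed, and regular-expression substitution stands in for the transducers shown on the slides.

```python
import re

def rule_a_to_b(s):
    """a → b, with no context restriction."""
    return s.replace("a", "b")

def rule_b_to_c_before_b(s):
    """b → c / _ b; the lookahead inspects the input string, so the rule
    applies simultaneously to every b that is followed by b."""
    return re.sub(r"b(?=b)", "c", s)

def compose(*rules):
    """Apply the rules left to right, feeding each output to the next rule."""
    def composed(s):
        for rule in rules:
            s = rule(s)
        return s
    return composed

both = compose(rule_a_to_b, rule_b_to_c_before_b)
print(both("aab"))  # a → b gives "bbb", then b → c / _ b gives "ccb"
```

The composed function behaves like a single rule from input strings to output strings, with no trace of the intermediate form.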

Composition and morphology

• Composition is the most general computational mechanism that handles morphological operations (Roark and Sproat, 2006)

• Affixation (which is more typically handled using concatenation) can also be handled using composition

• Composition and the other closure properties of regular relations imply that there is no fundamental difference between morphological theories.

Affixation as composition

[Figure: transducer for affixation as composition; arc labels: “Any string over the alphabet”, “Insert”]
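A minimal sketch of the idea, assuming a relation is modelled as a Python function from a string to the set of strings it is related to. The names `identity_on`, `suffix` and `compose` are hypothetical helpers, and the example reuses the hoot / -s[3sg] illustration from the classification slide.

```python
def identity_on(words):
    """Relation that accepts exactly the given base forms and returns them unchanged."""
    return lambda s: {s} if s in words else set()

def suffix(affix, feature):
    """Relation that copies any input string and inserts the affix and its feature at the end."""
    return lambda s: {s + affix + feature}

def compose(f, g):
    """Composition f○g: every output of f is passed through g."""
    return lambda s: {u for t in f(s) for u in g(t)}

base = identity_on({"hoot"})
third_sg = suffix("s", "[3sg]")
hoots = compose(base, third_sg)

print(hoots("hoot"))
```

Here the affix is not a string concatenated to the base but a relation composed with it, which leaves room for the affix relation to check or rewrite the base as well.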

Is this Rube-Goldbergesque?

• No! Because many affixes either impose requirements on their base or modify their base.

• Cf. Yowlumne (aka Yawelmani) (Archangeli, 1984)

Yowlumne gerundial -inay

• -inay requires the template CVC(C)

Composing the base with 1 will modify the base and add [+GER]

CVC(C)

Some morphological operations

• Subsegmental morphology
• Truncation
• Infixation
• Root-and-pattern morphology
• Reduplication
• Morphomic requirements (Aronoff, 1994)

• All of these can be handled using composition

German diminutives

Koasati truncation (Lombardi & McCarthy, 1991)

Two kinds of infixation

• Extrametrical infixation
– E.g. Bontoc

• Positively circumscribed infixation
– E.g. Ulwa

Bontoc infixation (Seidenadel, 1907)

Ulwa infixation (CODIUL, 1989)

Root & pattern morphology (McCarthy 1979)

Root & pattern morphology

Root & pattern morphology: related approaches

• Beesley & Karttunen (2000) propose an approach using compile-replace plus merge

• Kiraz (2000) proposes a multitape solution

• But all of these are equivalent to composition

d V V r V S

d u u r i s

Surface form is a regular expression
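The two lines above can be read as two steps of slot filling: the root consonants d, r, s are merged into a CVVCVC template, and the vowels are then filled in. The sketch below is a simplification under assumptions: the template string "CVVCVC" and the per-slot vowel sequence "uui" are supplied by hand (vowel spreading is not modelled), and `fill` is a hypothetical helper, not one of the cited approaches.

```python
def fill(template, slot, fillers):
    """Replace each occurrence of `slot` in `template`, left to right,
    with the next symbol from `fillers`; leftover slots stay unfilled."""
    it = iter(fillers)
    return "".join(next(it, slot) if ch == slot else ch for ch in template)

template = "CVVCVC"
with_root = fill(template, "C", "drs")   # consonant slots filled by the root
surface = fill(with_root, "V", "uui")    # vowel slots filled by the vocalism

print(with_root)
print(surface)
```

Each step is a function from strings to strings, so the two steps can be chained in the same way as the rule compositions shown earlier.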

Reduplication: Gothic (Wright 1910)

• Prefix a syllable of the form (A)Cai to the stem, where C is a consonant position and A is an optional appendix

• Copy the onset of the stem to the C position. If there is a pre-onset appendix /s/, copy this to the appendix position

Bambara reduplication (Culy, 1985)

This is apparently beyond the power of finite-state methods.

Factoring reduplication

• Prosodic constraints

• Copy verification transducer C

Gothic index transducer

Factoring reduplication

• Then reduplication in Gothic can be modeled as:

α ○ C

• More generally, one can model reduplication as the following composition, where P implements the prosodic constraints, C the copy constraints, and A optional phonological adjustments:

P ○ C ○ A
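The factoring into P, C and A can be illustrated as a generate-and-filter pipeline. This is a rough sketch under stated assumptions, not the transducers from the slides: only stems with a single-consonant onset are handled (the /s/ appendix is ignored), the consonant inventory is a made-up list, A is taken to be a no-op that only spells out the result, and the stem `hait` is used purely as a toy input.

```python
CONSONANTS = "bdfghjklmnpqrstwz"  # illustrative inventory, an assumption

def P(stem):
    """Prosodic step: propose a prefix of shape C+'ai' for every consonant C.
    This over-generates on purpose; copying is checked separately."""
    return {(c, stem) for c in CONSONANTS}

def C(candidates):
    """Copy verification: keep only candidates whose prefix consonant
    matches the first segment of the stem."""
    return {(c, stem) for c, stem in candidates if stem[:1] == c}

def A(candidates):
    """Phonological adjustments: none here, just spell out prefix + stem."""
    return {c + "ai" + stem for c, stem in candidates}

def reduplicate(stem):
    """The composition P ○ C ○ A, applied to a single stem."""
    return A(C(P(stem)))

print(reduplicate("hait"))
```

Keeping the copy check in its own component means that non-exact copies could be accommodated by relaxing C or by adding adjustments in A, without touching P.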

Other approaches

• Walther (2000a, 2000b) proposes a special kind of transducer involving
– Repeat arcs: move backwards in a string and repeat
– Skip arcs: skip over portions of the string

• Cohen-Sygal & Wintner (forthcoming) introduce finite-state registered automata, extending FSAs with registers

• These methods generally seem to presume exact copies

Non-exact copies

• Dakota (Inkelas & Zoll, 1999):

Non-exact copies

• Basic and modified stems in Sye (Inkelas & Zoll, 1999):

“they will fall all over”

Morphological Doubling Theory (Inkelas & Zoll, 1999)

• In contradistinction to the more common “correspondence” theory:
– Reduplication involves doubling at the morphosyntactic level
– Phonological doubling is thus expected, but not required

Gothic reduplication under Morphological Doubling Theory

• Composition also elegantly accounts for other phenomena such as prosodic circumscription (McCarthy and Prince, 1990) or morphomic requirements (Aronoff, 1994).

• Composition of regular relations can model rules

• It can also model affixation

• It doesn’t matter if you describe affixation as lexical-incremental or inferential-realizational

Morphomic requirements (Aronoff, 1994)

Latin 3rd Stem

So?

• 3rd stem is not morphologically uniform:
– It differs across different verb classes and some verbs have idiosyncratic third stems

• It is not semantically coherent:
– Forms that require the 3rd stem are a motley crew

• Yet there is clearly a notion of 3rd stem:
– If you tell me the 3rd stem of a verb, I can tell you how the agentive noun, the supine, the perfect participle … are formed

• 3rd stem has a purely morphological function

3rd stem is just prosodically induced affixation

• Assume we have a transducer T that forms the 3rd stem of a verb:
– Of course, T will have to allow for a lot of idiosyncratic changes

Σ* >3st:ε Σ*

Summary so far

• Most or all morphological operations can be handled with composition

• We wish to show next that this fact, along with general properties of regular languages and relations, allows us to dispense with distinctions between morphological theories.

Return to Stump (2001)

• In Roark & Sproat (2006) we reanalyze Stump’s analyses of:
– Sanskrit nominal declensions
– Swahili verbal conjugations
– Breton double plurals

• All of which purport to show the need for an inferential-realizational account.

• Here we will consider:
– A simple example from Beard & Volpe’s analysis of English agentive nominals
– A quick overview of the Sanskrit case.

English Agentive Nominals (cf. Beard & Volpe, 2005)

• read-er, stand-ee, correspond-ent, record-ist, cook

• ε → ent / [+ent][+noun,+agentive] __ $

• Call the set of all agentive rules R

• We can define a new ‘metarule’ R′ that is the union of all rules in R:

R′ = ⋃ {r : r ∈ R}

Feature [+noun,+agentive]

• Presumably this is also introduced by rule: call this rule M

• Then given a base B, the base with that feature specification added is given by B○M

• Then the appropriate suffixed form is given by [B○M]○R′

• But this can be written, by associativity, as B○[M○R′]

• Finally, [M○R′] can be precomposed; call this R′′
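The associativity step can be checked on a toy version of the analysis. The sketch below rests on several assumptions: relations are modelled as Python functions from a string to a set of strings, each base carries a class diacritic such as [+ent] (only [+ent] appears in the rule above; [+er], [+ee] and [+ist] are added by analogy), and the tags are left in the output rather than erased.

```python
FEATURE = "[+noun,+agentive]"

def M(s):
    """Rule M: add the morphosyntactic feature to the base."""
    return {s + FEATURE}

def make_rule(suffix):
    """Rule inserting `suffix` after [+suffix][+noun,+agentive] at the end of the string."""
    context = f"[+{suffix}]{FEATURE}"
    return lambda s: {s + suffix} if s.endswith(context) else set()

RULES = [make_rule(x) for x in ("er", "ee", "ent", "ist")]

def R_prime(s):
    """Metarule R′: the union of all agentive rules."""
    out = set()
    for rule in RULES:
        out |= rule(s)
    return out

def compose(f, g):
    """Composition f○g: every output of f is passed through g."""
    return lambda s: {u for t in f(s) for u in g(t)}

def B(base):
    """The base as an identity relation on a single string."""
    return lambda s: {s} if s == base else set()

R_double_prime = compose(M, R_prime)      # [M○R′], precomposed once

b = "correspond[+ent]"
left = compose(compose(B(b), M), R_prime)(b)   # [B○M]○R′
right = compose(B(b), R_double_prime)(b)       # B○[M○R′]
print(left == right, right)
```

Both groupings give the same set of forms, so the precomposed relation can be applied to a base directly, introducing the feature and the affix in a single step.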

So what?

• R′′:
– Introduces the morphosyntactic feature [+noun,+agentive]
– Introduces the affixal morphology as appropriate to the base

• In short, R′′ encodes a lexical-incremental model of morphology.

Sanskrit declensions

Issues with Sanskrit

• Nouns have two or three stems – strong, middle and (optionally) weakest

• A different series of stem alternations cross-cuts this: guna, vrddhi, and zero:
– “foot”: pād-, pad-, pd-

– strong stems may be guna or vrddhi

– middle stems may be zero, or a lexeme-specific stem

– weakest stems may be zero or lexeme-specific stem

guna zero

vrddhi lexeme-class particular lexeme-class particular

Further issues

• Stump argues for the Indexing Autonomy Hypothesis:
– A stem’s index is independent of the form used for the stem
– Sanskrit nominal declensions are morphomic in Aronoff’s sense

• Also involved are rules of referral whereby a particular form is systematically used to represent more than one slot in the paradigm.
– For example, in Latin the ablative and dative plural in nominal paradigms are identical no matter what form is used for the particular paradigm

• So we have several layers of complexity here, which would seem to make an “item-and-arrangement” approach impossible

Computational analysis

Refactoring

But this is just an item-and-arrangement analysis

Summary

• Theoretical distinctions between different approaches to morphology seem to relate to the issue of how cleanly one can describe a given phenomenon.

• But it is not clear that they relate to important differences in underlying mechanisms.

Why morphological theory?

• Morphology has tended to develop highly articulated theories that are (often) intended to represent the morphological component of some putative ‘language faculty’.

• Need a set of mechanisms to account for complex morphological systems – e.g. Sanskrit.

• Need to account for observed universals
– These might relate to built-in predispositions, but equally well might relate to historical change; cf. Blevins (2004)

• Linguistic phenomena are complex: how can children learn them?
– Clearly relates to learning mechanisms

Whither morphological theory?

• Assumptions underlying linguistic theory have not changed much in the last 50 years
– Arguments against statistical learning methods are based on antiquated notions of what statistical methods are capable of

• Meanwhile there have been significant advances in machine learning over the past 10-20 years.

• Some of this has made it into computational linguistics in the form of grammar induction methods (cf. Klein and Manning, 2004; Smith 2006)

Morphological theory redux

• Computational arguments (above) suggest there may not be as much difference between morphological theories as people like to think

• Recent work on induction of morphology suggests that we need to revisit our assumptions.

• Issues of the future will likely be:
– What historical mechanisms explain the observed patterns across the world’s languages?
– What general learning mechanisms can account for children’s learning of morphology?
