[1].Handling Structural Divergences and Recovering Dropped Arguments
in a Korean/English Machine Translation System
[2].Learning to express motion events
in English and Korean: The influence of language-specific lexicalization patterns
2004 Fall
Presented by Yeongmi Jeon
Handling Structural Divergences and Recovering Dropped Arguments
in a Korean/English Machine Translation System
Chung-hye Han, Martha Palmer
(IRCS/CIS, UPenn)
Benoit Lavoie, Richard Kittredge,
Tanya Korelsky, Myunghee Kim
(CoGenTex, Inc.)
Owen Rambow
(ATT Labs-Research)
Nari Kim
(Konan Technology, Inc.)
AMTA ’2000
Oct. 12 - 14, 2000
Outline of the Talk
• Linguistic issues
• System overview
• Deep Syntactic Structure (DSyntS)
• Parser output conversion
• Handling structural divergences: Transfer
• Dropped argument recovery
Linguistic Issues in Korean/English MT-1
Word Order
SOURCE: chuka kongkwupmul-eul 103 ceonwiciweontaetae-eke saryeongpu-ka cueossta
GLOSS: additional supply-Acc 103rd forward support battalion-Dat headquarters-Nom gave
TARGET: Headquarters gave 103rd forward support battalion additional supplies.
OUTPUT: Headquarters gave an additional supply to a 103rd forward support battalion.
Linguistic Issues in Korean/English MT-2
Dropped arguments and Morphology
SOURCE: IBP hwail-eul keomsaekhaci moshaess-tamyeon cikeum tasi ponaekessta.
GLOSS: IBP file-Acc retrieve could_not- if now again will_send
TARGET: If (NP1) could not retrieve IBP file, (NP2) will send again (NP3) now.
OUTPUT: If one can not retrieve an IBP file, one will send it again now.
Overview of the System
Deep Syntactic Structure-1
• Dependency structure based on Meaning-Text Theory (Mel'čuk 1988).
• Nodes are labeled by lexemes.
• Directed arcs with dependency relation labels: I, II, III, ATTR.
Critical to the success of translation!!!
• Grammatical information is represented as features on the node labels.
• Well suited to MT:
Abstracts away from superficial grammatical differences between languages, such as linear order and the usage of function words.
DSyntS-2: example ‘John often eats beans.’
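The figure for this slide did not survive, so the following is only a minimal sketch of how the DSyntS for 'John often eats beans.' might be represented, assuming the description on the previous slide: nodes labeled by lexemes, arcs labeled I, II, ATTR, and grammatical information held as node features. The `Node` class, the feature names (`class`, `tense`, `number`) and their values are illustrative assumptions, not the system's actual notation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A DSyntS node: a lexeme plus grammatical features (hypothetical layout)."""
    lexeme: str
    features: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # list of (relation, Node)

    def add(self, relation, child):
        """Attach a dependent under a labeled arc (I, II, III, ATTR)."""
        self.children.append((relation, child))
        return self

    def show(self, relation="ROOT", depth=0):
        """Return the tree as indented lines, one node per line."""
        feats = " ".join(f"{k}:{v}" for k, v in sorted(self.features.items()))
        line = "  " * depth + f"{relation} {self.lexeme}"
        if feats:
            line += f" [{feats}]"
        lines = [line]
        for rel, child in self.children:
            lines.extend(child.show(rel, depth + 1))
        return lines


# Assumed analysis: the verb governs its actants (I, II) and the adverb (ATTR).
# Linear order and function words are absent; plural is a feature on 'bean'.
eat = Node("eat", {"class": "verb", "tense": "pres"})
eat.add("I", Node("John", {"class": "proper_noun"}))
eat.add("II", Node("bean", {"class": "common_noun", "number": "pl"}))
eat.add("ATTR", Node("often", {"class": "adverb"}))

print("\n".join(eat.show()))
```

Because word order is not stored, the same structure could serve as the target of transfer from a verb-final Korean sentence.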
Predicate-Argument Lexicon-1: English
• Subcategorization information for verbs and adjectives.
Critical for recovery of dropped arguments!!!
Predicate-Argument Lexicon-2: Korean
• Arguments are listed with case or adverbial postpositions.
-case postpositions: nominative, accusative.
-adverbial postpositions: {e-Ke}(‘to’), {Ro} (‘to’), {e-Seo} (‘from’).
Critical for conversion!!!
Conversion-1
Generic dependency structure (Yoon et al. 1997)
→ MTT-based DSyntS
-STEP 1: Rewriting feature labels.
Conversion-2
-STEP 2: Making dependency relationships more explicit.
Korean predicate-argument lexicon is used as a guide.
Conversion-3
-STEP 3: Promoting features to lexemes and vice versa.
Conversion-4: from Korean Parser Output to DSyntS
Transfer-1
• Based on DSyntS grammars that are independently motivated by source and target languages.
• Transfer rules relate DSyntS subtrees.
• Map source DSyntS subtrees to target DSyntS subtrees.
• Use of variables allows generalization of rule application.
• Features on DSyntS nodes constrain rule application.
Transfer-2
• Simplest case: The related subtrees are reduced to a single node.
• Structural divergence is represented in the transfer lexicon by including contextual information in the related subtrees.
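The two cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the system's rule formalism: the dictionary-based tree layout, the `$X` variable syntax, and both lexicon entries (`haksaeng` for 'student', and the light-verb construction `kongpu` + `hata` for 'study') are hypothetical examples chosen for the sketch.

```python
def node(lex, deps=(), **feats):
    """Build a DSyntS-like node: lexeme, features, labeled dependents."""
    return {"lex": lex, "feats": feats, "deps": list(deps)}


def match(pattern, tree, bindings):
    """Return True if the source pattern matches the tree, binding variables."""
    if isinstance(pattern, str):          # a variable such as "$X"
        bindings[pattern] = tree
        return True
    if pattern["lex"] != tree["lex"]:
        return False
    # Every dependent named in the pattern must be found under the same relation.
    for rel, sub in pattern["deps"]:
        if not any(r == rel and match(sub, child, bindings)
                   for r, child in tree["deps"]):
            return False
    return True


def build(template, bindings, rules):
    """Instantiate the target subtree, transferring bound subtrees recursively."""
    if isinstance(template, str):
        return transfer(bindings[template], rules)
    deps = [(rel, build(sub, bindings, rules)) for rel, sub in template["deps"]]
    return node(template["lex"], deps, **template["feats"])


def transfer(tree, rules):
    """Apply the first matching rule; otherwise copy the node and recurse."""
    for source, target in rules:
        bindings = {}
        if match(source, tree, bindings):
            # Simplification: dependents not named in the rule are dropped.
            return build(target, bindings, rules)
    deps = [(rel, transfer(child, rules)) for rel, child in tree["deps"]]
    return node(tree["lex"], deps, **tree["feats"])


rules = [
    # Simplest case: the related subtrees are single nodes.
    (node("haksaeng"), node("student")),
    # Divergence: a two-node source subtree maps to a single target verb.
    (node("hata", [("I", "$X"), ("II", node("kongpu"))]),
     node("study", [("I", "$X")])),
]

korean = node("hata", [("I", node("haksaeng")), ("II", node("kongpu"))])
english = transfer(korean, rules)
print(english["lex"], [(rel, d["lex"]) for rel, d in english["deps"]])
```

The variable `$X` is what lets one rule cover any subject; a fuller version would also test node features before a rule is allowed to apply.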
Transfer-3: Multi-word
Transfer of predicative adjectives
Transfer-4: from Inflection to a Lexeme
Transfer-5: More Complex Example
Korean complex NP whose head noun is lexicalized as an auxiliary noun {Keos} in the context of a copular English to-infinitive.
Transfer-6: from Korean DSyntS to English DSyntS
Argument Recovery-1
• Dropped arguments must be recovered in order to obtain grammatical English sentences.
• Add default pronouns for missing arguments using grammatical and lexical knowledge.
- English predicate-argument lexicon is critical.
• This is performed just before English realization, by preprocessing the English DSyntS obtained from transfer.
Argument Recovery-2: Rules
• Insertion of Missing Actant I:
• Determining whether pronouns are animate or not:
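The rule bodies for this slide are not preserved, so the sketch below only illustrates the general idea stated on the previous slide: consult an English predicate-argument lexicon, and add a default pronoun for each required actant that is missing. The lexicon format, the animacy labels, and the choice of 'one' and 'it' as defaults (taken from the sample output earlier in the talk) are assumptions made for illustration.

```python
# Hypothetical English predicate-argument lexicon: for each verb, the
# required actants and whether a default filler should be animate.
LEXICON = {
    "retrieve": {"I": "animate", "II": "inanimate"},
    "send": {"I": "animate", "II": "inanimate"},
}

# Assumed default pronouns, following the sample output
# 'If one cannot retrieve an IBP file, one will send it again now.'
DEFAULT_PRONOUN = {"animate": "one", "inanimate": "it"}


def leaf(lex):
    """A node with no dependents."""
    return {"lex": lex, "deps": []}


def recover_arguments(tree):
    """Insert default pronouns for actants the lexicon requires but the tree lacks."""
    present = {rel for rel, _ in tree["deps"]}
    for rel, animacy in LEXICON.get(tree["lex"], {}).items():
        if rel not in present:
            tree["deps"].append((rel, leaf(DEFAULT_PRONOUN[animacy])))
    for _, child in tree["deps"]:
        recover_arguments(child)
    return tree


# English DSyntS after transfer, before recovery: both actants of 'send' dropped.
send = {"lex": "send", "deps": [("ATTR", leaf("again")), ("ATTR", leaf("now"))]}
recovered = [(rel, d["lex"]) for rel, d in recover_arguments(send)["deps"]]
print(recovered)

# 'retrieve' already has its object, so only actant I is inserted.
retrieve = {"lex": "retrieve", "deps": [("II", leaf("file"))]}
recovered_retrieve = [(rel, d["lex"]) for rel, d in recover_arguments(retrieve)["deps"]]
print(recovered_retrieve)
```

Running recovery on the English side, after transfer, means the English lexicon alone decides which arguments are obligatory.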
Argument Recovery-3: Before
‘If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.’
Argument Recovery-4: After
‘If one cannot retrieve an IBP file, one will send it again now.’
Conclusion and Future Work
• Transfer based on predicate-argument structures of each language.
Allows us to use off-the-shelf parsers.
• The development of a TreeBank for a Korean-English parallel corpus.
• Use syntactically annotated corpus for automatic extraction of transfer rules.
• Explicit annotation of empty arguments as well as the incorporation of a discourse model for a more principled recovery of implicit arguments.
Current Status
• Parallel corpus: military language training manual
-50,000 word tokens, 3800 word types, 5000 sentences.
• Predicate-argument lexicon
-1000 entries.
• Transfer lexicon
-4000 entries.
• Grammatical analysis
-simple clause (declaratives, imperatives, interrogatives),
-complex clause (subordination, coordination),
-scrambling, empty argument, adjective phrase,
-noun phrase (compound nouns, NP modifiers, relative clauses, complex noun phrases),
-verb phrase (auxiliary verbs, light verbs, compound verbs),
-negation, copular sentence, adverb modification, etc.
Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns
Soonja Choi and Melissa Bowerman
Outline of the Talk
• Introduction
• Semantic components of a motion event
• English:
-Conflation of Motion with Manner or Cause
• Korean: Mixed conflation pattern
-Spontaneous motion
-Caused motion
Introduction-1
• Encoding of motion events
-provides core structuring principles for many meanings
-differs across languages
• Language acquisition
-two sources: nonlinguistic knowledge and the semantic organization of the language
-goal: to find out how the two interact during language acquisition
Introduction-2
• 4 basic components of a (dynamic) motion event
-Motion, Figure, Ground, Path
• Additional components
-Manner, Cause, Deixis
• Fundamental typological differences [Talmy] in how a motion event is expressed
-3 patterns
1> [Motion + [Manner|Cause]] - [Path]
2> [Motion + Path] - [Manner|Cause]
3> [Motion + Figure] - [Path] - [Manner|Cause]
English
• Usual pattern = [Motion + [Manner | Cause]] - [Path]
[Motion + Manner]
The rock SLID/ROLLED/BOUNCED down (the hill)
[Motion + Cause]
The wind BLEW the napkin off the table
[Motion + Deixis]: (towards vs. away from the speaker)
John CAME/WENT into the room
• The same verb conflations in both intransitive and transitive sentences
• Path is marked in the same way in both intransitive and transitive sentences
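As a rough formalization of Talmy's three patterns, one can ask which semantic components the main verb conflates with Motion. The function below is only a sketch under that assumption; the function name and the set-based encoding are hypothetical, and the [Motion + Deixis] verbs (come, go) are deliberately left unclassified because they fall outside the three patterns as listed.

```python
def talmy_pattern(verb_components):
    """Return 1, 2 or 3 for Talmy's conflation patterns, or None if none fits.

    verb_components is the set of semantic components expressed by the main
    verb, e.g. {"Motion", "Manner"} for English 'roll'.
    """
    if "Motion" not in verb_components:
        return None
    if verb_components & {"Manner", "Cause"}:
        return 1   # [Motion + [Manner|Cause]] - [Path]
    if "Path" in verb_components:
        return 2   # [Motion + Path] - [Manner|Cause]
    if "Figure" in verb_components:
        return 3   # [Motion + Figure] - [Path] - [Manner|Cause]
    return None


# 'The rock ROLLED down the hill': the verb carries Manner, 'down' carries Path.
rolled = talmy_pattern({"Motion", "Manner"})
# 'The wind BLEW the napkin off the table': the verb carries Cause.
blew = talmy_pattern({"Motion", "Cause"})
print(rolled, blew)
```

Both English examples land in pattern 1, whether the sentence is intransitive or transitive.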
Korean-1: Basic
• Different encoding patterns for transitive and intransitive verbs
• Path markers are also verbs: no dedicated system of morphemes
- <cf.> prepositions or particles in English
- 3 locative case endings: suffixed to a Ground nominal, they function like prepositions
-EY “at, to”, -LO “toward”, -EYSE “from”
• Basic word order: subject-object-verb
• Verb phrase: one or more “full” verbs
- The final verb bears all the inflectional suffixes
- Compound verb: verbs connected by a “connecting” suffix
Korean-2: Spontaneous motion
• Main verb: usually KATA “go” or OTA “come”
• Pattern = [Manner] - [Path] - [Motion+Deixis]
• Path verbs
- Do not express posture changes
<cf.> up, down in English for changes of both location and posture
• Posture changes are expressed with monomorphemic verbs
- ANCTA “sit down”, NWUPTA “lie down”
- [Path]-[posture verbs]: serialized events
OLLA ANCTA “get on to a higher surface and sit down”
Korean-2: Spontaneous motion verb-1
Korean-2: Spontaneous motion verb-2
Korean-3: Caused Motion Verbs-1
Korean-3: Caused Motion Verbs-2
Korean-3: Caused motion-1
• Pattern = [Motion+Path]
• Path
- Different forms - Different meanings: Require finer distinction in actions
<ex.> KKITA/PPAYTA
Path category “putting in/on/together”:
result in a fitting relationship = KKITA
loose = NEHTA
surface contact = NOHTA, PWUTHITA
- Incorporate aspects of Figure and Ground also
: different verbs for different Figures or Grounds
Korean-3: Caused motion-2
• Deixis
- No deictic transitive verb
<cf.> take, bring in English; KATA, OTA in Korean intransitives
- Special encoding
take = KACY-E "have" - KATA "go"
bring = KACY-E "have" - OTA "come"
Korean-3: Caused motion-3
• [Manner|Cause]-[Path]
- Possible but less frequent than in English
- Reason = Different restrictions on obligatory information
English: Path should be spelled out completely
John threw his keys TO his desk ( x )
John threw his keys ONTO his desk ( o )
Korean: Path can often be omitted
if Manner or Cause is supplied;
if the relationship between Figure and Ground can be easily inferred, locative case endings are sufficient
Conclusion
English
- The same verb conflation patterns in both spontaneous motion expressions and caused motion expressions
- Encodes Path separately, with the same markers for both kinds of motion
Korean
- Different lexicalization patterns for spontaneous and caused motion
- Path markers (verbs) are different for the two kinds of motion and have narrower usage ranges