Analysis for Spoken Language Translation Using
Phrase-Level Parsing and Domain Action Classification
Chad Langley
Language Technologies Institute, Carnegie Mellon University
June 9, 2003
Outline
• Interlingua-Based Machine Translation
• NESPOLE! MT System Overview
• Interchange Format Interlingua
• Hybrid Analysis Approach
• Evaluation
  – Domain Action Classification
  – End-to-End Translation
• Summary
Interlingua-Based Machine Translation

[Diagram: analyzers for many source languages (Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Spanish) map input into a shared interlingua, and generators for the same languages map the interlingua back out, so any analyzer can be paired with any generator.]
Interlingua-Based MT at Carnegie Mellon

• Long line of research in interlingua-based machine translation of spontaneous conversational speech
  – C-STAR I (appointment scheduling)
  – Enthusiast (passive Spanish→English)
  – C-STAR II (travel planning)
  – LingWear (wearable tourist assistance)
  – Babylon (handheld medical assistance)
  – NESPOLE! (travel & tourism and medical assistance)
NESPOLE! Overview

• Human-to-human speech-to-speech machine translation over the Internet
• Domains:
  – Travel & Tourism
  – Medical Assistance
• Languages:
  – English – Carnegie Mellon University
  – German – Universität Karlsruhe
  – Italian – ITC-irst
  – French – Université Joseph Fourier
• Additional Partners:
  – AETHRA Telecommunications
  – APT Trentino Tourism Board
NESPOLE! Architecture
• Mediator connects users to translation servers
• Language-specific servers for each language exchange Interchange Format to perform translation
NESPOLE! Language Servers
• Analysis Chain: Speech → Text → IF
• Generation Chain: IF → Text → Speech
• Connect the source language analysis chain to the target language generation chain to translate
NESPOLE! User Interface

[Screenshot of the NESPOLE! user interface]
Interchange Format Overview
• Interchange Format (IF) is a shallow semantic interlingua for task-oriented domains
• Captures speaker intention rather than literal meaning
• Abstracts away from language-specific syntax and predicate-argument structure
• Represents utterances as sequences of Semantic Dialogue Units (SDUs)
Interchange Format Representation
• An IF representation consists of four parts:
  1. Speaker
  2. Speech Act
  3. Concepts
  4. Arguments

  speaker : speech_act +concept* (arguments*)

• The Domain Action combines the domain-independent speech act with the domain-dependent concepts (speech act + concepts = Domain Action)
Interchange Format Specification
• Defines the sets of speech acts, concepts, and arguments
  – 72 speech acts + 3 “prefix” speech acts
  – 144 concepts
  – 227 top-level arguments
• Defines constraints on how components can be combined
  – Domain actions are formed compositionally based on the constraints for combining speech acts and concepts
  – Arguments must be licensed by at least one element of the domain action
Example
“Hello. I would like to take a vacation in Val di Fiemme.”

Input: hello i would like to take a vacation in val di fiemme

IF:
c:greeting (greeting=hello)
c:give-information+disposition+trip
    (disposition=(who=i, desire),
     visit-spec=(identifiability=no, vacation),
     location=name-val_di_fiemme_area)

ENG: Hello! I want to travel for a vacation at Val di Fiemme.
ITA: Salve. Io vorrei una vacanza in Val di Fiemme.
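For concreteness, here is a minimal Python sketch of how an IF like the second SDU above could be represented; the class and field names are hypothetical, not from the NESPOLE! code.

    from dataclasses import dataclass, field


    @dataclass
    class InterchangeFormat:
        """One SDU's IF representation (hypothetical class, for illustration)."""
        speaker: str              # e.g. "c" for client, "a" for agent
        speech_act: str           # domain-independent, e.g. "give-information"
        concepts: list            # domain-dependent, e.g. ["disposition", "trip"]
        arguments: dict = field(default_factory=dict)

        @property
        def domain_action(self) -> str:
            """Speech act and concept sequence joined with '+'."""
            return "+".join([self.speech_act] + self.concepts)

        def __str__(self) -> str:
            args = ", ".join(f"{name}={value}" for name, value in self.arguments.items())
            return f"{self.speaker}:{self.domain_action} ({args})"


    sdu2 = InterchangeFormat(
        speaker="c",
        speech_act="give-information",
        concepts=["disposition", "trip"],
        arguments={
            "disposition": "(who=i, desire)",
            "visit-spec": "(identifiability=no, vacation)",
            "location": "name-val_di_fiemme_area",
        },
    )
    print(sdu2)  # c:give-information+disposition+trip (disposition=(who=i, desire), ...)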
Why Hybrid Analysis?
• Goal: A portable and robust analyzer for task-oriented IF-based speech-to-speech MT
• Previous IF-based MT systems used full semantic grammars to parse complete DAs
  – Useful for parsing spoken language in restricted domains
  – Difficult to port to new domains
• Continue to use semantic grammars to parse small domain-independent DAs and phrase-level arguments
• Train classifiers to identify DAs
Hybrid Analysis Approach
Use a combination of grammar-based phrase-level parsing and machine learning to produce interlingua (IF) representations
Hybrid Analysis Approach: Example

Input: hello i would like to take a vacation in val di fiemme

1. Argument parsing labels phrases with greeting=, disposition=, visit-spec=, and location=
2. Segmentation splits the utterance into two SDUs (SDU1: “hello”; SDU2: the rest)
3. Domain action classification assigns greeting to SDU1 and give-information+disposition+trip to SDU2

Output IF:
c:greeting (greeting=hello)
c:give-information+disposition+trip
    (disposition=(who=i, desire),
     visit-spec=(identifiability=no, vacation),
     location=name-val_di_fiemme_area)
Argument Parsing
• Parse utterances using phrase-level grammars
• SOUP Parser: Stochastic, chart-based, top-down robust parser designed for real-time analysis of spoken language
• Separate grammars are used, organized by the type of phrases each grammar is intended to cover
Grammars
• Argument grammar
  – Identifies arguments defined in the IF

    s[arg:activity-spec=]
      (*[object-ref=any] *[modifier=good] [biking])

  – Covers "any good biking", "any biking", "good biking", "biking", plus synonyms for all 3 words
• Pseudo-argument grammar
  – Groups common phrases with similar meanings into classes

    s[=arrival=] (*is *usually arriving)

  – Covers "arriving", "is arriving", "usually arriving", "is usually arriving", plus synonyms
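Since "*" marks an optional element, the phrases a right-hand side covers can be enumerated mechanically. A small hypothetical Python sketch (not the SOUP parser; nonterminal expansion and synonym lists are omitted):

    from itertools import product


    def covered_phrases(elements):
        """Enumerate the token sequences matched by a rule right-hand side.
        `elements` is a list of (token, optional) pairs; optional elements
        (written with a '*' prefix in the grammar) may be present or absent."""
        options = [[token, None] if optional else [token]
                   for token, optional in elements]
        for combination in product(*options):
            yield " ".join(tok for tok in combination if tok is not None)


    # s[arg:activity-spec=] (*[object-ref=any] *[modifier=good] [biking])
    rule = [("any", True), ("good", True), ("biking", False)]
    print(list(covered_phrases(rule)))
    # ['any good biking', 'any biking', 'good biking', 'biking']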
Grammars (continued)

• Cross-domain grammar
  – Identifies simple domain-independent DAs

    s[greeting]
      ([greeting=first_meeting] *[greet:to-whom=])

  – Covers "nice to meet you", "nice to meet you donna", "nice to meet you sir", plus synonyms
• Shared grammar
  – Contains low-level rules accessible by all other grammars
Segmentation

• Goal: Split utterances into Semantic Dialogue Units so Domain Actions can be assigned
• Potential SDU boundaries occur between argument parse trees and/or unparsed words
• An SDU boundary is present if there is a parse tree from the cross-domain grammar on either side of a potential boundary position
• Otherwise, a memory-based classifier determines whether an SDU boundary is present (a sketch follows)
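A minimal sketch of this two-stage boundary decision; the chunk representation and classifier interface are hypothetical stand-ins, not the NESPOLE! code.

    def sdu_boundaries(chunks, classify):
        """Decide, for each potential boundary between adjacent chunks,
        whether an SDU boundary is present. Each chunk is an argument
        parse tree or an unparsed word; here a chunk is modeled as a dict
        with a 'grammar' key ('cross-domain', 'argument', or None for an
        unparsed word). `classify` is the fallback memory-based classifier."""
        decisions = []
        for left, right in zip(chunks, chunks[1:]):
            if "cross-domain" in (left["grammar"], right["grammar"]):
                decisions.append(True)  # a cross-domain parse forces a boundary
            else:
                decisions.append(classify(left, right))
        return decisions


    # e.g. a boundary is forced after the cross-domain "hello" parse
    chunks = [{"grammar": "cross-domain", "text": "hello"},
              {"grammar": "argument", "text": "i would like"}]
    print(sdu_boundaries(chunks, classify=lambda l, r: False))  # [True]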
Segmentation Classifier

• The segmentation classifier is a memory-based classifier implemented using TiMBL
• Input: 10 features based on word and parse information surrounding a potential boundary
• Output: Binary decision about the presence of an SDU boundary
• Training Data: Potential SDU boundaries extracted from utterances manually annotated with SDU boundaries and parsed with the phrase-level grammars
Segmentation Features

• Preceding parse label (A-1)
• Probability a boundary follows A-1 (P(A-1•))
• Preceding word (w-1)
• Probability a boundary follows w-1 (P(w-1•))
• Number of words since the last boundary
• Number of argument parse trees since the last boundary
• Following parse label (A1)
• Probability a boundary precedes A1 (P(•A1))
• Following word (w1)
• Probability a boundary precedes w1 (P(•w1))
Segmentation Features (continued)

• Probability features are estimated using counts from the training data:
  – P(A-1•) = C(A-1•) / C(A-1)
  – P(w-1•) = C(w-1•) / C(w-1)
  – P(•A1) = C(•A1) / C(A1)
  – P(•w1) = C(•w1) / C(w1)
• “hello i would like to take a vacation in val di fiemme” yields 3 segmentation training examples, since potential boundaries occur only between parse trees and unparsed words, not between every word pair:
  – 1 positive (between “hello” and “i”)
  – 2 negative (between “to” and “take”; between “vacation” and “in”)
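A minimal sketch of the count-based estimates above, assuming the training data has been reduced to (item, boundary-follows) pairs; the function name is hypothetical.

    from collections import Counter


    def boundary_follow_probs(examples):
        """Estimate P(item •), the probability that an SDU boundary follows
        a word or parse label, from annotated training data. `examples` is
        a list of (item, boundary_follows) pairs; the P(• item) table is
        built the same way from what precedes each item."""
        total, followed = Counter(), Counter()
        for item, boundary_follows in examples:
            total[item] += 1            # C(item)
            if boundary_follows:
                followed[item] += 1     # C(item •)
        return {item: followed[item] / total[item] for item in total}


    # In the example utterance, "hello" is followed by an SDU boundary:
    probs = boundary_follow_probs([("hello", True), ("to", False),
                                   ("vacation", False)])
    print(probs["hello"])  # 1.0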
Evaluation: Segmentation

• Data: English and German in the Travel & Tourism and Medical Assistance domains
• TiMBL parameters: IB1 (k-NN) algorithm with Gain Ratio feature weighting, k=5, and unweighted voting
• Evaluated using 20-fold cross-validation with “in-turn” examples
                          English   German    English   German
                          Travel    Travel    Medical   Medical
Accuracy                  92.64%    93.04%    96.23%    93.26%
Training Examples         35690     46170     42187     7792
In-Turn Examples          23844     31234     27873     5522
Turn Boundary Examples    11846     14936     14314     2270
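For reference, a minimal sketch of the IB1 configuration used here: k-NN over discrete features with Gain Ratio feature weighting and unweighted majority voting. This is an illustration of the algorithm family, not TiMBL itself; in particular, TiMBL's handling of numeric features is ignored.

    import math
    from collections import Counter


    def entropy(labels):
        n = len(labels)
        return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


    def gain_ratio(feature_values, labels):
        """Gain ratio of one discrete feature: information gain divided by
        the feature's split information."""
        n = len(labels)
        by_value = {}
        for v, c in zip(feature_values, labels):
            by_value.setdefault(v, []).append(c)
        info_gain = entropy(labels) - sum(
            len(subset) / n * entropy(subset) for subset in by_value.values())
        split_info = entropy(feature_values)
        return info_gain / split_info if split_info else 0.0


    def knn_classify(train, query, k=5):
        """IB1-style k-NN: weighted overlap distance with gain-ratio
        feature weights, then unweighted majority voting over the k
        nearest training examples. `train` is a list of
        (feature_tuple, label) pairs."""
        weights = [gain_ratio([x[i] for x, _ in train], [y for _, y in train])
                   for i in range(len(query))]

        def dist(x):
            return sum(w for w, xi, qi in zip(weights, x, query) if xi != qi)

        nearest = sorted(train, key=lambda ex: dist(ex[0]))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


    train = [((1, 0), True), ((0, 1), False), ((1, 1), True)]
    print(knn_classify(train, (1, 0), k=1))  # True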
Domain Action Classification

• Goal: Identify the DA for each SDU using TiMBL memory-based classifiers
• Split DA classification into two subtasks (Speech Act and Concept Sequence)
  – Reduces the number of classes for each classifier
  – Allows for different approaches and/or feature sets for each task
  – Allows for DAs that did not occur in the data
• Also classify the complete DA directly
DA Classification Data

Corpus Information:
                     English   German    English   German
                     Travel    Travel    Medical   Medical
SDUs                 8289      8719      3664      2294
Domain Actions       972       1001      462       286
Speech Acts          70        70        50        43
Concept Sequences    615       638       305       179
Vocabulary Size      1946      2815      1694      1112

Most Frequent DAs, SAs, and Concept Sequences:
    English Travel           German Travel            English Medical                   German Medical
DA  19.2% acknowledge        19.7% acknowledge        25.1% give-information+exp+h-s    27.2% acknowledge
SA  41.4% give-information   40.7% give-information   59.7% give-information            35.3% give-information
CS  38.9% No concepts        40.3% No concepts        35.0% +experience+health-status   47.3% No concepts
DA Classifiers

• SA, CS, and DA classifiers implemented using the TiMBL memory-based learner
• Input: Binary features indicating the presence or absence of argument and pseudo-argument labels in the phrase-level parse (200-300 features; sketched below)
  – The CS classifier also uses the corresponding SA
• Output: Best class (SA, CS, or DA)
• Training Data: SDUs manually annotated with IF representations and parsed with the argument parser
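A sketch of the feature extraction and the two-stage SA-then-CS classification; the classifier objects are trivial stand-ins for the TiMBL classifiers, and the helper names are hypothetical.

    def parse_label_features(parse_labels, label_inventory):
        """Binary feature vector: which argument and pseudo-argument labels
        appear in the SDU's phrase-level parse. `label_inventory` is the
        fixed set of grammar labels (200-300 in the systems above)."""
        present = set(parse_labels)
        return [1 if label in present else 0 for label in label_inventory]


    def classify_da(parse_labels, label_inventory, sa_clf, cs_clf):
        """Two-stage DA classification: predict the speech act first, then
        pass it to the concept sequence classifier as an extra feature."""
        features = parse_label_features(parse_labels, label_inventory)
        speech_act = sa_clf.predict(features)
        concept_seq = cs_clf.predict(features + [speech_act])
        return speech_act + concept_seq


    class _Fixed:
        """Trivial stand-in classifier that always returns one label."""
        def __init__(self, label):
            self.label = label
        def predict(self, features):
            return self.label


    inventory = ["greeting=", "disposition=", "visit-spec=", "location="]
    print(classify_da(["disposition=", "visit-spec=", "location="], inventory,
                      _Fixed("give-information"), _Fixed("+disposition+trip")))
    # give-information+disposition+trip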
Evaluation: DA Classification

• Data: English and German in the Travel & Tourism and Medical Assistance domains
• TiMBL parameters: IB1 (k-NN) algorithm with Gain Ratio feature weighting, k=1
• 20-fold cross-validation

                           English   German    English   German
                           Travel    Travel    Medical   Medical
Speech Act                 69.82%    67.57%    77.73%    68.61%
Concept Sequence           69.59%    67.08%    64.71%    69.93%
Domain Action (SA + CS)    49.63%    46.50%    51.53%    50.91%
Domain Action (direct)     49.69%    46.51%    51.56%    51.18%
Comparison of Learning Approaches
• Learning Approaches
  – Memory-Based Learning (TiMBL)
  – Decision Trees (C4.5)
  – Neural Networks (SNNS)
  – Naïve Bayes (Rainbow)
• Important Considerations
  – Accuracy
  – Speed of training and classification
  – Accommodation of discrete and continuous features from multiple sources
  – Production of a ranked list of classes
  – Online server mode
Comparison of Learning Approaches (continued)

• 20-fold cross-validation setup
• All classifiers used the same feature set (grammar labels)
• SNNS may perform slightly better, but TiMBL is preferred when all factors are taken into account

Speech Act classifier accuracy:
          English   German
TiMBL     69.82%    67.57%
C4.5      70.41%    67.90%
SNNS      71.52%    67.61%
Rainbow   51.39%    46.00%

Domain Action classifier accuracy:
          English   German
TiMBL     49.69%    46.51%
C4.5      48.90%    46.58%
SNNS      49.39%    46.21%
Rainbow   39.74%    38.32%

Concept Sequence classifier accuracy:
          English   German
TiMBL     69.59%    67.08%
C4.5      68.47%    66.45%
SNNS      71.35%    68.67%
Rainbow   51.64%    51.50%
Adding Word Information
• Grammar label unigrams do not exploit the strengths of naïve Bayes classification
• Test naïve Bayes classifiers (Rainbow) trained on word bigrams

Rainbow accuracy with word bigrams:
                    English   German
Domain Action       48.59%    48.09%
Speech Act          79.00%    77.46%
Concept Sequence    56.87%    57.77%

• Words provide useful information for the task, especially for Speech Act classification
Adding Word Information (continued)

• Add word-based features to the TiMBL classifiers (a mutual-information sketch follows the tables below):
  1. Binary features for the top 250 words sorted by mutual information
  2. Probabilities computed by Rainbow

Words+Parse SA classifier accuracy:
                   English   German
TiMBL + words      78.59%    75.98%
TiMBL + Rainbow    81.25%    78.93%

Words+Parse DA classifier accuracy:
                   English   German
TiMBL + words      56.48%    54.98%

DA accuracy of SA+CS classifiers:
                               English   German
TiMBL SA + TiMBL CS            49.63%    46.50%
TiMBL+Rainbow SA + TiMBL CS    57.74%    53.93%
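A sketch of one standard way to rank words by mutual information with the class label, as in item 1 above; the exact selection procedure used with TiMBL is an assumption, and all names here are hypothetical.

    import math
    from collections import Counter


    def mutual_information(sdus, labels, word):
        """Mutual information between the binary event 'word occurs in the
        SDU' and the class label, estimated from counts. `sdus` is a list
        of token collections, `labels` the corresponding classes."""
        n = len(sdus)
        occurs = [word in sdu for sdu in sdus]
        joint = Counter(zip(occurs, labels))
        p_w = Counter(occurs)
        p_c = Counter(labels)
        mi = 0.0
        for (w, c), count in joint.items():
            p_wc = count / n
            mi += p_wc * math.log2(p_wc / ((p_w[w] / n) * (p_c[c] / n)))
        return mi


    # Hypothetical usage: rank the vocabulary and keep the top 250 words
    # as binary features alongside the grammar label features.
    # top250 = sorted(vocab, key=lambda w: mutual_information(sdus, sas, w),
    #                 reverse=True)[:250]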
Using the IF Specification
• Use knowledge from the IF specification during DA classification
  – Ensure that only legal DAs are produced
  – Guarantee that the DA and arguments combine to form a valid IF representation
• Strategy: Find the best DA that licenses the most arguments
  – Trust the parser to reliably label arguments
  – Retaining detailed argument information is important for translation
Using the IF Specification (continued)

• Check whether the best speech act and concept sequence form a legal DA
• If not, test alternative combinations of speech acts and concept sequences from the ranked set of possibilities
• Select the best combination that licenses the most arguments
• Drop arguments not licensed by the best DA
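A sketch of this fallback search; `is_legal(sa, cs)` and `licenses(da, arg)` are stand-ins for checks against the IF specification, and the tie-breaking by classifier rank is an assumption.

    from itertools import product


    def fallback_da(ranked_sas, ranked_css, arguments, is_legal, licenses):
        """Search ranked speech acts and concept sequences (best first) for
        the legal DA that licenses the most arguments; ties go to the
        higher-ranked combination."""
        best_da, best_kept = None, []
        for sa, cs in product(ranked_sas, ranked_css):
            if not is_legal(sa, cs):
                continue  # only legal DAs may be produced
            da = sa + cs
            kept = [arg for arg in arguments if licenses(da, arg)]
            if best_da is None or len(kept) > len(best_kept):
                best_da, best_kept = da, kept
        # Arguments not licensed by the chosen DA are dropped.
        return best_da, best_kept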
Evaluation: IF Specification Fallback

• Test set contained 292 SDUs from 151 utterances
• 182 SDUs required classification
• 4% had illegal DAs
• 29% had illegal IFs
• Mean arguments per SDU: 1.47

Changed by the fallback strategy:
Speech Act          5%
Concept Sequence    26%
Domain Action       29%

Arguments dropped per SDU:
Without fallback    0.38
With fallback       0.07
End-to-End Translation

• Speech input through text output
  – Reflects the combined performance of speech recognition, analysis, and generation
• Travel & Tourism domain
• English-to-English and English-to-Italian
  – Test set: 232 SDUs (110 utterances) from 2 unseen dialogues
• German-to-German and German-to-Italian
  – Test set: 356 SDUs (246 utterances) from 2 unseen dialogues
• Analyzer used the Segmentation, Speech Act, and Concept Sequence classifiers with the IF specification fallback strategy
End-to-End Translation (continued)

• Each SDU was graded by 3 human graders as very good, good, bad, or very bad
• Acceptable = very good + good; Unacceptable = bad + very bad
• Majority vote among the 3 graders (i.e., a translation was considered acceptable if it received at least 2 Acceptable grades)
• Speech recognition hypotheses were also graded as if they were paraphrases produced by the translation system
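The acceptability vote reduces to a one-liner; a minimal sketch with a hypothetical function name:

    def acceptable(grades):
        """Majority vote over the 3 graders: a translation is acceptable if
        at least 2 grades are 'very good' or 'good'."""
        return sum(grade in ("very good", "good") for grade in grades) >= 2


    print(acceptable(["very good", "bad", "good"]))  # True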
End-to-End Translation (Travel & Tourism)

Speech Recognition Word Accuracy Rates:
English WAR    German WAR
56.4%          51.0%

Acceptable end-to-end translation for English travel input:
                                  English Output   Italian Output
SR Hypotheses                     66.7%            --
Translation from SR Hypotheses    50.4%            50.2%

Acceptable end-to-end translation for German travel input:
                                  German Output    Italian Output
SR Hypotheses                     61.6%            --
Translation from SR Hypotheses    53.4%            51.7%
Work in Progress

• Evaluation of end-to-end translation for the Medical Assistance domain
• Evaluation of portability from the Travel & Tourism domain to the Medical Assistance domain
• Data ablation studies
Summary

• I described an effective method for identifying domain actions that combines phrase-level parsing and machine learning.
• The hybrid analysis approach is fully integrated into the NESPOLE! English and German MT systems.
• Automatic classification of domain actions is feasible despite the large number of classes and the relatively sparse, unevenly distributed data.
  – <10,000 training examples
  – The most frequent classes have >1000 examples
  – Many classes have only 1-2 examples
Summary (continued)

• Word and argument information can be effectively combined to improve domain action classification performance.
• Preliminary indications are that the approach is quite portable.
  – The English and German NESPOLE! systems were ported from Travel & Tourism to Medical Assistance.
  – Annotation: ~125 person-hours
  – Grammar development: ~140 person-hours
DA Classification Data (backup)

[Chart: cumulative coverage (0-100%) of the 100 most frequent DAs, SAs, and CSs in the English Travel data, plotted by frequency rank from 1 to 100, with one curve each for DA, SA, and CS.]