
POSTECH Dialog-Based Computer Assisted Language Learning System

Date post: 04-Jan-2016
Page 1: POSTECH Dialog-Based Computer Assisted Language Learning  System

POSTECH Dialog-Based Computer Assisted Language Learning System

Intelligent Software Lab. POSTECH

Prof. Gary Geunbae Lee

Page 2: POSTECH Dialog-Based Computer Assisted Language Learning  System

Contents

• Introduction
• Methods
  • DB-CALL System: Example-based Dialog Modeling, Feedback Generation, Translation Assistance, Comprehension Assistance
  • Language Learner Simulation: User Simulation, Grammar Error Simulation
• Discussion

Page 3: POSTECH Dialog-Based Computer Assisted Language Learning  System

RESEARCH BACKGROUND

BACKGROUND

• Globalization makes English more important as a world language
• Native-speaker tutors are extremely expensive
• Most language learning software is dedicated to pronunciation practice
• Dialog-Based Computer-Assisted Language Learning (DB-CALL) can be an excellent solution

ISSUES

• A DB-CALL system should be able to understand students’ poor, non-native expressions
• A DB-CALL system should have high domain scalability to support various practical scenarios
• A DB-CALL system should provide educational functionality that helps students improve their linguistic ability

Page 4: POSTECH Dialog-Based Computer Assisted Language Learning  System

PREVIOUS WORKS ON DB-CALL

Let’s Go (CMU, 02-04)
• Provides bus schedule information for non-native CMU students
• Adapts the acoustic model and the language model to non-native speakers
• Edit-distance-based corrective feedback

Page 5: POSTECH Dialog-Based Computer Assisted Language Learning  System

PREVIOUS WORKS ON DB-CALL

SPELL (Edinburgh, 05)
• Restaurant domain
• Scenario-based virtual space
• Incorporates mal-rules into the ASR grammar

Page 6: POSTECH Dialog-Based Computer Assisted Language Learning  System

PREVIOUS WORKS ON DB-CALL

DEAL (KTH, 07)
• Trade domain
• Limited dialog management based on a finite-state network
• When learners get stuck, the system provides hints

Page 7: POSTECH Dialog-Based Computer Assisted Language Learning  System

POSTECH DB-CALL System

[Figure: system overview. A Crawler feeds a Description Extractor and a Parallel Sentence Extractor. The extracted data are expression examples paired with descriptions (Example 1 / Description 1, Example 2 / Description 2, …) and parallel-sentence records with <parallel>, <Alignment Info>, and <Additional> fields. The ESL Dialog Tutoring screen shows the tutor-user dialog, the user input, an expression/description panel, a Korean/English expression panel, and a “Try this expression” suggestion.]

Page 8: POSTECH Dialog-Based Computer Assisted Language Learning  System

DB-CALL System

Page 9: POSTECH Dialog-Based Computer Assisted Language Learning  System

1. Example-based Dialog Modeling

Page 10: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION Spoken Dialog System

Applications Human-Robot Interface, Telematics, Tutoring, ...

Page 11: POSTECH Dialog-Based Computer Assisted Language Learning  System

PROBLEM & GOAL

PROBLEM: How to determine the next system action
• Knowledge-based approach: plan recipe / ISU rule / agenda
• Data-driven approach
  • Statistical approach: supervised learning based on state approximation, reinforcement learning based on MDP/POMDP
  • Example-based approach

GOAL: To develop a simple and practical approach to dialog modeling for multi-domain dialog systems

Page 12: POSTECH Dialog-Based Computer Assisted Language Learning  System

IDEA

Dialog State Space

Domain = Building_Guidance
Dialog Act = WH-QUESTION
Main Goal = SEARCH-LOC
ROOM-TYPE = 1 (filled), ROOM-NAME = 0 (unfilled), LOC-FLOOR = 0, PER-NAME = 0, PER-TITLE = 0
Previous Dialog Act = <s>, Previous Main Goal = <s>
Discourse History Vector = [1,0,0,0,0]
Lexico-semantic Pattern = ROOM_TYPE 이 어디 지 ? (“Where is ROOM_TYPE?”)
System Action = inform(Floor)

Dialog Corpus

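Turn #1 (Domain = Building_Guidance)
USER: 회의 실 이 어디 지 ? (“Where is the meeting room?”)
  [Dialog Act = WH-QUESTION] [Main Goal = SEARCH-LOC] [ROOM-TYPE = 회의실 (“meeting room”)]
SYSTEM: 3 층에 교수회의실 , 2 층에 대회의실 , 소회의실이 있습니다 . (“The faculty meeting room is on the 3rd floor; the large and small meeting rooms are on the 2nd floor.”)
  [System Action = inform(Floor)]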

Dialog Example

Indexed by using semantic & discourse features

Selecting the example having the most similar state:

$e^* = \operatorname{argmax}_{e_i \in E} S(e_i, h)$

Lee et al., (2006), A Situation-based Dialogue Management using Dialogue Examples, IEEE ICASSP
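The selection idea on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `DialogExample` fields, the toy examples other than the first one, and the similarity function (a weighted agreement of slot and discourse-history vectors standing in for S) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DialogExample:
    dialog_act: str
    main_goal: str
    slots: tuple      # 1 = filled, 0 = unfilled
    history: tuple    # discourse history vector
    system_action: str

def similarity(example, state, w_slots=0.5):
    """Hypothetical S(e_i, h): weighted agreement of slot and history vectors."""
    def agree(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    return (w_slots * agree(example.slots, state["slots"])
            + (1 - w_slots) * agree(example.history, state["history"]))

def select_example(examples, state):
    """Return e* = argmax S(e_i, h) over examples sharing dialog act and main goal."""
    candidates = [e for e in examples
                  if e.dialog_act == state["dialog_act"]
                  and e.main_goal == state["main_goal"]]
    if not candidates:
        return None
    return max(candidates, key=lambda e: similarity(e, state))

# First entry mirrors the slide; the other two are invented for contrast.
EXAMPLES = [
    DialogExample("WH-QUESTION", "SEARCH-LOC", (1, 0, 0, 0, 0), (1, 0, 0, 0, 0), "inform(Floor)"),
    DialogExample("WH-QUESTION", "SEARCH-LOC", (0, 1, 0, 0, 0), (0, 1, 0, 0, 0), "inform(RoomNumber)"),
    DialogExample("REQUEST", "GUIDE-LOC", (0, 1, 0, 0, 0), (1, 1, 0, 0, 0), "guide(Room)"),
]

state = {"dialog_act": "WH-QUESTION", "main_goal": "SEARCH-LOC",
         "slots": (1, 0, 0, 0, 0), "history": (1, 0, 0, 0, 0)}
best = select_example(EXAMPLES, state)
print(best.system_action)
```

Filtering on dialog act and main goal before scoring keeps the argmax restricted to examples from a comparable situation, which is one plausible reading of "indexed by using semantic & discourse features".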

Page 13: POSTECH Dialog-Based Computer Assisted Language Learning  System

ALGORITHM

• Query Generation: making an SQL statement from the discourse history and the SLU results.
• Example Search: searching the example DB for dialog examples that are semantically close to the current dialog state.
• Example Selection: selecting the best example, i.e. the one that maximizes the utterance similarity measure based on lexical and discourse information.
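The query generation and search steps can be illustrated with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the table schema, the column names, and the relaxation order (dropping the last constraint first) are hypothetical and are not taken from the slides.

```python
import sqlite3

def build_query(state, keys):
    """Build a parameterized SELECT from the chosen state features."""
    sql = "SELECT utterance, system_action FROM examples"
    if keys:
        sql += " WHERE " + " AND ".join(f"{k} = ?" for k in keys)
    return sql, [state[k] for k in keys]

def search_with_relaxation(conn, state, keys):
    """Run the full query; if nothing matches, drop the last constraint and retry."""
    keys = list(keys)
    while True:
        sql, params = build_query(state, keys)
        rows = conn.execute(sql, params).fetchall()
        if rows or not keys:
            return rows, keys
        keys.pop()  # relax: give up the least important feature (assumed order)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE examples (dialog_act TEXT, main_goal TEXT, room_type INTEGER, "
    "prev_dialog_act TEXT, utterance TEXT, system_action TEXT)"
)
conn.executemany("INSERT INTO examples VALUES (?, ?, ?, ?, ?, ?)", [
    ("WH-QUESTION", "SEARCH-LOC", 1, "<s>", "Where is ROOM_TYPE?", "inform(Floor)"),
    ("WH-QUESTION", "SEARCH-PHONE", 0, "WH-QUESTION", "What is the phone number?", "inform(Phone)"),
])

state = {"dialog_act": "WH-QUESTION", "main_goal": "SEARCH-LOC",
         "room_type": 1, "prev_dialog_act": "WH-QUESTION"}
rows, used = search_with_relaxation(
    conn, state, ["dialog_act", "main_goal", "room_type", "prev_dialog_act"])
print(rows, used)
```

Here the exact query fails because no stored example has the same previous dialog act, so the search falls back to a partial match on the remaining three features. Column names are interpolated into the SQL text, so they must come from trusted code, never from user input.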

[Figure: EBDM architecture. Noisy input (from ASR/SLU) passes through Query Generation, Example Search, and Example Selection to NLG; the other boxes are the Example DB, Content DB, Discourse History, Relaxation Strategy, and System Template.]

Page 14: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Real user evaluation: 10 undergraduates

Evaluation metrics
• STR (Success Turn Rate): # of successful turns / # of total turns
• TCR (Task Completion Rate): # of successful dialogs / # of total dialogs
• AvgUserTurn: average number of user turns per dialog

Lee et al., (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM

System               #Dialogs  AvgUserTurn  STR(%)  TCR(%)
Car Navigation       50        4.54         86.25   92.00
Weather Information  50        4.46         89.01   94.00
EPG                  50        4.50         83.99   90.00
Chatbot              50        5.60         64.31   -
Multi-domain         15        6.08         78.77   86.67

Page 15: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Lee et al., (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM

Example match rate of each dialog system (%)

System               Exact match  Partial match  No example
Car Navigation       50.22        44.49          5.29
Weather Information  69.49        25.00          5.51
EPG                  58.33        37.22          4.45
Chatbot              50.71        14.29          35.00
Multi-domain         69.23        24.62          6.15

Page 16: POSTECH Dialog-Based Computer Assisted Language Learning  System

ROBUST DIALOG MANAGEMENT

PROBLEM: How to overcome errors in the real world

ROBUST DIALOG MANAGEMENT
• Error handling: recovering from ASR/SLU errors by interacting with the user at the conversational level
• N-best support: estimating the current state under uncertainty

[Figure: the ASR, SLU, and DM stages in sequence, with errors (+ERROR) entering along the way, and the countermeasures at each stage. ASR: noise reduction, adaptation, N-best & lattice & CN. SLU: robust parsing, data-driven approach. DM: error handling, N-best support.]

Lee et al., (2008), Robust management with n-best hypotheses using dialog examples and agenda, ACL

Page 17: POSTECH Dialog-Based Computer Assisted Language Learning  System

GOAL & IDEA

To increase the robustness of EBDM with prior knowledge

1) Error handling
• If the system knows what the user will do next, it can generate help dynamically (Dynamic Help Generation)

[Figure: agenda graph with the nodes LOCATION, OFFICE PHONE NUMBER, ROOM ROLE, and GUIDE, marking the FOCUS NODE and its NEXT_TASK links.]

AgendaHelp
S: Next, you can do the subtask 1) Asking the room’s role, or 2) Asking the office phone number, or 3) Selecting the desired room for navigation.

UtterHelp
S: Next, you can say 1) “What is it?”, or 2) “What’s the phone number of [ROOM_NAME]?”, or 3) “Let’s go there.”
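Dynamic help generation from an agenda graph might look like the following sketch. The graph edges, the choice of LOCATION as the focus node, and the per-subtask help strings are assumptions reconstructed from the help messages on this slide; the actual graph structure is not given in the transcript.

```python
# Hypothetical agenda graph: node -> subtasks that may follow it.
AGENDA = {
    "LOCATION": ["ROOM ROLE", "OFFICE PHONE NUMBER", "GUIDE"],
    "ROOM ROLE": ["OFFICE PHONE NUMBER", "GUIDE"],
    "OFFICE PHONE NUMBER": ["GUIDE"],
    "GUIDE": [],
}

TASK_HELP = {
    "ROOM ROLE": "Asking the room's role",
    "OFFICE PHONE NUMBER": "Asking the office phone number",
    "GUIDE": "Selecting the desired room for navigation",
}

UTTER_HELP = {
    "ROOM ROLE": "What is it?",
    "OFFICE PHONE NUMBER": "What's the phone number of [ROOM_NAME]?",
    "GUIDE": "Let's go there.",
}

def numbered(items):
    """Format items as '1) a, or 2) b, or 3) c'."""
    return ", or ".join(f"{i}) {item}" for i, item in enumerate(items, 1))

def agenda_help(focus):
    """Describe the subtasks reachable from the focus node."""
    tasks = [TASK_HELP[n] for n in AGENDA[focus]]
    return "Next, you can do the subtask " + numbered(tasks) + "." if tasks else ""

def utter_help(focus):
    """Suggest example utterances for the subtasks reachable from the focus node."""
    says = [f'"{UTTER_HELP[n]}"' for n in AGENDA[focus]]
    return "Next, you can say " + numbered(says) if says else ""

print(agenda_help("LOCATION"))
print(utter_help("LOCATION"))
```

Because the help text is derived from the graph at run time, moving the focus node automatically changes which subtasks and utterances are offered.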

Page 18: POSTECH Dialog-Based Computer Assisted Language Learning  System

GOAL & IDEA

To increase the robustness of EBDM with prior knowledge

2) N-best support
• If the system knows which subtask is more probable next, it can rescore the N-best hypotheses (h1 ~ hn)

[Figure: agenda graph with the nodes LOCATION, OFFICE PHONE NUMBER, FLOOR, and ROOM NAME, with the hypotheses h1 to h4 attached to nodes.]

Previous system turn
Subtask: LOCATION
System utterance: The director’s room is Room No. 201.
System action: Inform(RoomNumber)

N-best user utterance                            Subtask              P(hi|S)
U1 (h1) What are office rooms in this building?  ROOM NAME            0.2
U2 (h2) What is the floor?                       FLOOR                0.4
U3 (h3) Where is it?                             LOCATION             0.3
U4 (h4) What is the phone number?                OFFICE PHONE NUMBER  0.5 (more probable)
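The rescoring idea can be sketched as follows. The subtask probabilities P(hi|S) come from the table above; everything else is an assumption made for illustration: the ASR confidence values are invented, and the linear interpolation with `weight` is one simple way to combine the two scores, not necessarily the rule used in the cited paper.

```python
def rescore(nbest, subtask_prior, weight=0.5):
    """Sort hypotheses by an interpolation of ASR confidence and subtask prior."""
    def score(hyp):
        prior = subtask_prior.get(hyp["subtask"], 0.0)
        return (1 - weight) * hyp["asr_conf"] + weight * prior
    return sorted(nbest, key=score, reverse=True)

# P(hi|S) values from the slide, given the previous subtask LOCATION.
PRIOR = {"ROOM NAME": 0.2, "FLOOR": 0.4, "LOCATION": 0.3, "OFFICE PHONE NUMBER": 0.5}

# asr_conf values are hypothetical; they only encode the original n-best order.
NBEST = [
    {"id": "h1", "text": "What are office rooms in this building?", "subtask": "ROOM NAME", "asr_conf": 0.30},
    {"id": "h2", "text": "What is the floor?", "subtask": "FLOOR", "asr_conf": 0.27},
    {"id": "h3", "text": "Where is it?", "subtask": "LOCATION", "asr_conf": 0.23},
    {"id": "h4", "text": "What is the phone number?", "subtask": "OFFICE PHONE NUMBER", "asr_conf": 0.20},
]

ranked = rescore(NBEST, PRIOR)
print([h["id"] for h in ranked])
```

With these numbers the fourth recognition hypothesis moves to the top because its subtask is the most probable continuation; setting `weight=0` falls back to the original ASR ranking.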

Page 19: POSTECH Dialog-Based Computer Assisted Language Learning  System

ALGORITHM

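[Figure: processing flow. From the user input, ASR produces the n-best word hypotheses w1 … wn and SLU produces the corresponding u1 … un; EBDM maps them to s1 … sn. Discourse interpretation uses the agenda graph (nodes V1 to V9) and the focus stack to choose the best node (Argmax Node), and the best example ej* among e1 … ek is then chosen (Argmax Example) to give the action am*.]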

Page 20: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENT SET-UP

Simulated user evaluation
• Test set: 1000 simulated dialogs (<20 user turns)
• Domain: intelligent robot for building guidance
• Using 5-best recognition hypotheses

Evaluation metrics
• TCR: # of successful dialogs / # of total dialogs
• AvgUserTurn: average number of user turns per dialog
• AvgScore: 20 * TCR + (-1) * AvgUserTurn
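The three metrics can be computed as in this small sketch. The dialog records are invented, and TCR is assumed to be a fraction in [0, 1] rather than a percentage, which is consistent with the score range plotted on the following slides.

```python
def evaluate(dialogs):
    """Compute TCR, AvgUserTurn, and AvgScore = 20 * TCR - AvgUserTurn."""
    total = len(dialogs)
    tcr = sum(d["success"] for d in dialogs) / total
    avg_turns = sum(d["user_turns"] for d in dialogs) / total
    return {"TCR": tcr, "AvgUserTurn": avg_turns, "AvgScore": 20 * tcr - avg_turns}

# Hypothetical simulated dialogs.
dialogs = [
    {"success": True, "user_turns": 4},
    {"success": True, "user_turns": 6},
    {"success": False, "user_turns": 8},
    {"success": True, "user_turns": 6},
]
result = evaluate(dialogs)
print(result)
```

The score rewards completed tasks and penalizes long dialogs, so a system that succeeds quickly scores higher than one that succeeds after many turns.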

Page 21: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Lee et al., (2009), Hybrid Approach to Robust Dialog Management using Agenda and Dialog Examples, CSL, (Submitted)

[Figure: average score (y-axis, 3 to 17) versus WER (%) (x-axis, 0 to 50) for the four methods below.]

Legend  Method
P-E     Using only examples
P-ER    Using examples + recovery
P-EA    Using examples + agenda graph
P-EAR   Using examples + agenda graph + recovery

The average score of different methods

Page 22: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Lee et al., (2009), Hybrid Approach to Robust Dialog Management using Agenda and Dialog Examples, CSL, (Submitted)

[Figure: average score (y-axis, 2 to 18) versus n-best size (x-axis: 1, 2, 5, 10, 15, 20, 30, 50, 100), with one curve each for WER0, WER10, and WER20.]

The average score of the P-EAR system according to n-best size

Page 23: POSTECH Dialog-Based Computer Assisted Language Learning  System

DEMO VIDEO PC demo

Page 24: POSTECH Dialog-Based Computer Assisted Language Learning  System

DEMO VIDEO Robot demo

Page 25: POSTECH Dialog-Based Computer Assisted Language Learning  System

2. Feedback Generation

Page 26: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

Recast Feedback

Tutoring process
Tutor: What is the purpose of your trip?
User: My purpose business
Tutor: Sorry, I don’t understand. What did you say? (Clarification Request)
Try this expression: “I am here on business” (Recast Feedback)
User: I am here on business (Learner Uptake)

Page 27: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

Expression Suggestion

Tutoring process
Tutor: What is the purpose of your trip?
(TIMEOUT)
Tutor: Sorry, I can’t hear you.
Try this expression: “I am here on business” (Expression Suggestion)
User: I am here on business (Learner Uptake)

Page 28: POSTECH Dialog-Based Computer Assisted Language Learning  System

PROBLEMS

• How to recognize user intentions despite numerous errors in their utterances
  • The mal-rule-based technique used in previous studies does not work for low-level learners because their utterances contain multiple errors
  • Some utterances even seem to have a meaning that differs from what the learner intended to say
    Intended meaning: When does the bus leave?
    Learner’s utterance: Which time I have to leave?
• How to choose appropriate user intentions to suggest when a timeout expires
  • The system should take the dialog context into consideration, as human tutors do
• Performing intention-based soft pattern matching to generate correct feedback

Page 29: POSTECH Dialog-Based Computer Assisted Language Learning  System

METHODS

• Context-aware & level-specific intention recognition
• Intention-based pattern matching

[Figure: the Intention Recognizer combines level-specific utterance models (Level 1 to Level N, each built from that level’s data) with a dialog-state-based model to map the learner’s utterance and the dialog state to the learner’s intention. The Dialog Manager runs an example search over the Example Expression DB, pattern-matches the retrieved example expressions to produce feedback, and updates the dialog state.]
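The last stage, choosing a recast from the example expressions of the recognized intention, might be sketched as below. Intention recognition itself is assumed to have already happened; the intention labels, the tiny expression DB, and the use of `difflib.SequenceMatcher` over word sequences as the "soft" matcher are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical example expression DB keyed by intention.
EXPRESSIONS = {
    "inform(purpose=business)": ["I am here on business", "I am here for a business trip"],
    "inform(purpose=sightseeing)": ["I am here for sightseeing"],
}

def recast(learner_utterance, intention):
    """Return a suggestion built from the closest example expression, or None."""
    candidates = EXPRESSIONS.get(intention, [])
    if not candidates:
        return None
    learner = learner_utterance.lower().split()

    def word_similarity(expression):
        return SequenceMatcher(None, learner, expression.lower().split()).ratio()

    best = max(candidates, key=word_similarity)
    if best.lower().split() == learner:
        return None  # the learner already produced a stored expression
    return f'Try this expression: "{best}"'

print(recast("My purpose business", "inform(purpose=business)"))
```

Matching within a single intention keeps the feedback on topic even when the learner's wording shares few words with any stored expression.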

Page 30: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENT SET-UP

Primitive data set
• Immigration domain
• 192 dialogs, 3517 utterances (18.32 utt/dialog)

Annotation
• Each utterance was manually annotated with the speaker’s intention and component slot-values
• Each utterance was automatically annotated with the discourse information

Page 31: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

[Figure: results for the Utterance Model and the Hybrid Model.]

Page 32: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

[Figure: results for four settings: Level-spec Hybrid, Level-spec Utterance, Level-ignore Hybrid, and Level-ignore Utterance.]

Page 33: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Page 34: POSTECH Dialog-Based Computer Assisted Language Learning  System

Demo: POSTECH DB-CALL initial version 2008

Page 35: POSTECH Dialog-Based Computer Assisted Language Learning  System

3. Translation Assistance

Page 36: POSTECH Dialog-Based Computer Assisted Language Learning  System

Architecture and example format

[Figure: parallel sentence examples are extracted from the Web and analyzed. The ESL dialog system and other applications send a query expression to the search engine through a function-call interface.]

Example format:

<parallel><source>~~~~~~~</source><target>~~~~~~~~</target></parallel>
<Alignment><s2t>~~~~~~~~</s2t><t2s>~~~~~~~~</t2s><composition>~~~~</composition></Alignment>
<Additional><url>~~~~~~</url></Additional>

Page 37: POSTECH Dialog-Based Computer Assisted Language Learning  System

Building Bilingual Examples

Word alignment
• Widely used in statistical machine translation: IBM Models 1~5, symmetrization heuristics
• Word alignment gives the correspondence of each word/phrase in a given bilingual example
• Example word alignment (GIZA++)
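To make the idea of word alignment concrete, here is a minimal sketch of IBM Model 1 trained with EM. It is not GIZA++ and omits the NULL word, the higher IBM models, and symmetrization; the three-sentence German-English corpus is a standard textbook toy and is not data from this system.

```python
from collections import defaultdict

def ibm_model1(corpus, iterations=10):
    """Estimate lexical translation probabilities t(e | f) with EM."""
    target_vocab = {e for _, es in corpus for e in es}
    t = defaultdict(lambda: 1.0 / len(target_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for fs, es in corpus:                 # E-step: collect expected counts
            for e in es:
                norm = sum(t[(e, f)] for f in fs)
                for f in fs:
                    delta = t[(e, f)] / norm
                    count[(e, f)] += delta
                    total[f] += delta
        for (e, f), c in count.items():       # M-step: renormalize per source word
            t[(e, f)] = c / total[f]
    return t

def align(fs, es, t):
    """Link each target word to its most probable source word."""
    return [(e, max(fs, key=lambda f: t[(e, f)])) for e in es]

corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = ibm_model1(corpus)
print(align(["das", "Haus"], ["the", "house"], t))
```

Even with three sentence pairs, co-occurrence statistics are enough for EM to separate "the" from "house"; real systems run the same idea over large parallel corpora in both directions and then merge the two alignments.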

Page 38: POSTECH Dialog-Based Computer Assisted Language Learning  System

4. Comprehension Assistance

Page 39: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

English Expression-Description Example Suggestion System
• When the user asks about an unfamiliar English expression, the system presents its description to aid understanding

[Figure: expression-description pairs from an ESL podcast website are stored in an expression-description DB. The Description Suggestion System detects expressions in sentences from the dialog system and recommends their descriptions.]

Page 40: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

Expression-Description Pair Extraction System
• To present an expression example and its description, the system extracts expression-description pairs from an ESL podcast site

Phrase        Description
routine test  … we mean it’s a normal, regular test that the doctor runs many, many different times with different patients, not a special test.
Treatment     “Treatment” is another word for what the doctor gives you or does to you to help you.

Page 41: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXAMPLE

[script]

[description]

Page 42: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXAMPLE

[script]

[description]

Page 43: POSTECH Dialog-Based Computer Assisted Language Learning  System

Language Learner Simulation

Page 44: POSTECH Dialog-Based Computer Assisted Language Learning  System

1. User Simulation

Page 45: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

User simulation for spoken dialog systems
• Developing a “simulated user” who can replace real users

Applications
• Automated evaluation of spoken dialog systems
  • Detecting potential flaws
  • Predicting the overall behavior of the system
• Learning dialog strategies in a reinforcement learning framework

Page 46: POSTECH Dialog-Based Computer Assisted Language Learning  System

PROBLEM & GOAL

PROBLEM: How to model a real user
• User intention simulation
• User surface simulation
• ASR channel simulation

GOAL
• Natural simulation
• Diverse simulation
• Controllable simulation

Page 47: POSTECH Dialog-Based Computer Assisted Language Learning  System
Page 48: POSTECH Dialog-Based Computer Assisted Language Learning  System

IDEA – User Intention Simulation

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

• A dialog is a sequence of behaviors, and user intentions in particular are sequential
• User intention simulation should take various kinds of discourse information into account: discourse factors, knowledge, events, …

[Figure: alternating User and Sys turns.]

Page 49: POSTECH Dialog-Based Computer Assisted Language Learning  System

User Intention Simulation: Linear Conditional Random Field Model

• Assumption: a user utterance has only one intention
• UI: user intention state, State = [dialog_act, main_goal, named_entities]
• DI: previous discourse information (system response + discourse history)

[Figure: linear-chain CRF over turns, with one UI node and one DI node per turn.]

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 50: POSTECH Dialog-Based Computer Assisted Language Learning  System

ALGORITHM

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 51: POSTECH Dialog-Based Computer Assisted Language Learning  System

User Surface Simulation

PROBLEM: How to generate a user surface utterance that expresses a given user intention

APPROACH: two-phase user utterance generation
• Phase 1: candidate generation
• Phase 2: rescoring

[Figure: in phase 1 the user utterance model generates candidate utterances by simulation; in phase 2 rescoring yields the selected utterances.]

Page 52: POSTECH Dialog-Based Computer Assisted Language Learning  System

Phase 1: Generation

• Generation is conditioned on Dialog_Act × Main_Goal
• Structure tags: component slot names + part-of-speech tags
• S: a member of the given structure tag space
• W: a member of the given vocabulary space

[Figure: a chain of structure tags S1 … S5 linked by structure tag transitions; each tag Si generates a word Wi with an emission probability.]

Page 53: POSTECH Dialog-Based Computer Assisted Language Learning  System

Phase 2: Rescoring

PROBLEM: Rescoring and selecting the good utterances
• Criteria: human-like utterances, natural word transitions

APPROACH: Structure and Word interpolated BLEU score (SWB score)
• Notice that evaluating system-generated utterances is the same task in utterance simulation as in machine translation

SWB = β * Structure_Sequence_BLEU + (1 - β) * Word_Sequence_BLEU, where 0 ≤ β ≤ 1

We set β to 0.2 because Korean is an agglutinative language and is therefore relatively unconstrained by structural grammar.
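The SWB interpolation can be sketched as follows. The `bleu` function here is a deliberately simplified single-reference BLEU (clipped unigram and bigram precision with a brevity penalty), which is an assumption; the paper's exact BLEU configuration is not given in the slides, and the tag and word sequences are invented.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions times brevity penalty."""
    if not candidate:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_sum += math.log(overlap / total) / max_n
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_sum)

def swb(cand_tags, ref_tags, cand_words, ref_words, beta=0.2):
    """SWB = beta * structure-sequence BLEU + (1 - beta) * word-sequence BLEU."""
    return beta * bleu(cand_tags, ref_tags) + (1 - beta) * bleu(cand_words, ref_words)

# Hypothetical structure tags and words for one utterance.
tags = ["ROOM_TYPE", "PARTICLE", "WH", "VERB"]
words = ["meeting-room", "SUBJ", "where", "is"]
print(swb(tags, tags, words, words))
```

With β = 0.2, a candidate whose tag sequence matches the reference but whose words are all different scores only 0.2, which reflects the slide's choice to weight word-level naturalness more heavily than structure.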

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 54: POSTECH Dialog-Based Computer Assisted Language Learning  System

ALGORITHM

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 55: POSTECH Dialog-Based Computer Assisted Language Learning  System

ASR Channel Simulation

PROBLEM: How to simulate the ASR channel
• Knowledge-based approach
• Statistical approach
  • It is difficult to collect speech data for the target domain
• WER-controllable simulation

APPROACH: Linguistic-knowledge-based simulation
• Step 1: Determining error positions
• Step 2: Generating error types for the error-marked words
• Step 3: Generating ASR errors (substitutions, deletions, insertions)
• Step 4: Rescoring and selecting the erroneous utterance
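Steps 1 to 3 can be sketched as a WER-controlled noise channel. This is a simplified illustration under assumptions: error positions are drawn independently per word, the error-type weights are placeholders rather than the distribution used in the paper, substitutions pick a random vocabulary word instead of a phonetically similar one, and step 4 (rescoring) is omitted.

```python
import random

ERROR_TYPES = ("substitution", "deletion", "insertion")

def simulate_asr(words, wer, vocab, rng, type_weights=(0.6, 0.2, 0.2)):
    """Corrupt each word with probability `wer` using a sampled error type."""
    output = []
    for word in words:
        if rng.random() >= wer:          # step 1: is this an error position?
            output.append(word)
            continue
        kind = rng.choices(ERROR_TYPES, weights=type_weights)[0]  # step 2
        if kind == "substitution":       # step 3: realize the error
            others = [v for v in vocab if v != word] or [word]
            output.append(rng.choice(others))
        elif kind == "insertion":
            output.extend([rng.choice(vocab), word])  # random word before the mark
        # deletion: append nothing
    return output

vocab = ["where", "is", "the", "meeting", "room", "floor", "phone"]
clean = ["where", "is", "the", "meeting", "room"]
noisy = simulate_asr(clean, wer=0.3, vocab=vocab, rng=random.Random(7))
print(noisy)
```

Raising `wer` increases the expected share of corrupted words, which is what makes the simulated channel controllable for the experiments at different WER settings.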

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 56: POSTECH Dialog-Based Computer Assisted Language Learning  System

Error Type Distribution

Determining error types
• Based on the results of English speech recognition
• We assume that Korean speech recognition generally has a similar error distribution

Greenberg et al., 2000

Page 57: POSTECH Dialog-Based Computer Assisted Language Learning  System

Error Generation

• Insertion error: insert a random word before the insertion error mark
• Deletion error: delete the marked word
• Substitution error: based on a sequence alignment algorithm
  • Syllable- and phone-based alignment
  • Selecting some candidates from a dictionary
  • Dynamic programming alignment algorithm: Needleman and Wunsch (1970)
  • Get the similarity score

Similarity = α * Syllable_Alignment_Score + (1 - α) * Phoneme_Alignment_Score, where 0 ≤ α ≤ 1

Vowel Confusion Matrix example
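The similarity score for substitution candidates can be sketched with a plain Needleman-Wunsch alignment. The match, mismatch, and gap scores are flat placeholder values (the slides indicate a confusion matrix is used instead), and the romanized syllable and phoneme sequences are invented for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two sequences."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def similarity(candidate, target, alpha=0.5):
    """alpha * syllable alignment score + (1 - alpha) * phoneme alignment score."""
    syllable = nw_score(candidate["syllables"], target["syllables"])
    phoneme = nw_score(candidate["phonemes"], target["phonemes"])
    return alpha * syllable + (1 - alpha) * phoneme

# Hypothetical romanized entries.
target = {"syllables": ["gang", "nam"], "phonemes": ["g", "a", "ng", "n", "a", "m"]}
gangbuk = {"syllables": ["gang", "buk"], "phonemes": ["g", "a", "ng", "b", "u", "k"]}
busan = {"syllables": ["bu", "san"], "phonemes": ["b", "u", "s", "a", "n"]}

ranked = sorted([gangbuk, busan], key=lambda c: similarity(c, target), reverse=True)
print(["".join(c["syllables"]) for c in ranked])
```

A candidate that shares a syllable and several phonemes with the target ranks above an unrelated word, so it is the more plausible recognition error to substitute.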

Page 58: POSTECH Dialog-Based Computer Assisted Language Learning  System
Page 59: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENT SET-UP

Korean car navigation dialog system
• SLU: Jeong and Lee (2006)
• DM: Lee et al. (2009)

• Word error rate: 0.0 ~ 0.4
• 5000 dialog samples at each WER setting

Page 60: POSTECH Dialog-Based Computer Assisted Language Learning  System

Intention

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 61: POSTECH Dialog-Based Computer Assisted Language Learning  System

D-BLEU (Discourse BLEU) is a metric that measures the naturalness of simulated dialogs in terms of n-gram precision, following the BLEU metric calculation.

Intention

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 62: POSTECH Dialog-Based Computer Assisted Language Learning  System

Utterance

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 63: POSTECH Dialog-Based Computer Assisted Language Learning  System

ASR channel

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 64: POSTECH Dialog-Based Computer Assisted Language Learning  System

ASR channel

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 65: POSTECH Dialog-Based Computer Assisted Language Learning  System

Overall prediction

Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.

Page 66: POSTECH Dialog-Based Computer Assisted Language Learning  System

2. Grammar Error Simulation

Page 67: POSTECH Dialog-Based Computer Assisted Language Learning  System

INTRODUCTION

Language learner simulation requires us to devise grammar error simulation on top of general user simulation

[Figure: the Language Learner Simulator (User Intention Simulator, User Utterance Simulator, Grammar Errors Simulator, ASR Errors Simulator) interacts with the Dialog System (Non-native ASR, SLU, Dialog Manager, System Utterance Generator, TTS).]

Page 68: POSTECH Dialog-Based Computer Assisted Language Learning  System

REALISTIC ERROR

Correct sentence: He wants to go to a movie theater

He wants to to a movie theater
VS.
He want go to movie theater

Page 69: POSTECH Dialog-Based Computer Assisted Language Learning  System

PROBLEMS

How to incorporate expert knowledge about the error characteristics of Korean language learners into the statistical model
• Subject-verb agreement errors
• Omission errors of the preposition of prepositional verbs
• Omission errors of articles
• Etc.

Page 70: POSTECH Dialog-Based Computer Assisted Language Learning  System

MARKOV LOGIC NETWORK

Sungjin Lee, Gary Geunbae Lee. Realistic grammar error simulation using markov logic. ACL 2009

Page 71: POSTECH Dialog-Based Computer Assisted Language Learning  System

METHOD

The generation procedure involves three steps:
• Step 1 (Inference): generating a probability over error types for each word through MLN inference
• Step 2 (Sampling): determining an error type for each word by sampling from the generated probabilities
• Step 3 (Realization): creating an ill-formed output sentence by realizing the chosen error types

Input: He wants to go to a movie theater

Word     v_agr_sub  prp_lex_del  at_del  none   Sampled type
He       0.000      0.000        0.000   0.921  none
wants    0.371      0.000        0.000   0.449  v_agr_sub
to       0.000      0.284        0.000   0.604  prp_lex_del
go       0.000      0.000        0.000   0.866  none
to       0.000      0.269        0.000   0.605  none
a        0.000      0.000        0.355   0.506  at_del
movie    0.000      0.000        0.000   0.781  none
theater  0.000      0.000        0.000   0.798  none

Output: He want go to movie theater
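Steps 2 and 3 can be sketched as follows, using the per-word probabilities shown on this slide as given input (the MLN inference of step 1 is not reproduced). The realization rules are toy assumptions: deletions drop the word, and the agreement error simply strips a final "s". Since each row does not sum to 1, the sketch relies on `random.choices` treating the values as relative weights.

```python
import random

ERROR_TYPES = ("v_agr_sub", "prp_lex_del", "at_del", "none")

WORDS = "He wants to go to a movie theater".split()

# Probabilities per word in ERROR_TYPES order, as listed on the slide.
PROBS = [
    (0.000, 0.000, 0.000, 0.921),
    (0.371, 0.000, 0.000, 0.449),
    (0.000, 0.284, 0.000, 0.604),
    (0.000, 0.000, 0.000, 0.866),
    (0.000, 0.269, 0.000, 0.605),
    (0.000, 0.000, 0.355, 0.506),
    (0.000, 0.000, 0.000, 0.781),
    (0.000, 0.000, 0.000, 0.798),
]

def sample_types(probs, rng):
    """Step 2: draw one error type per word, using each row as relative weights."""
    return [rng.choices(ERROR_TYPES, weights=row)[0] for row in probs]

def realize(words, types):
    """Step 3: apply the chosen error type to each word (toy rules)."""
    output = []
    for word, kind in zip(words, types):
        if kind in ("prp_lex_del", "at_del"):
            continue                     # omit the preposition or the article
        if kind == "v_agr_sub" and word.endswith("s"):
            word = word[:-1]             # break subject-verb agreement
        output.append(word)
    return " ".join(output)

slide_types = ["none", "v_agr_sub", "prp_lex_del", "none", "none", "at_del", "none", "none"]
print(realize(WORDS, slide_types))
print(realize(WORDS, sample_types(PROBS, random.Random(0))))
```

Realizing the error types sampled on the slide reproduces its ill-formed output; drawing new samples yields other sentences whose errors stay confined to the positions where the model assigns non-zero probability.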

Page 72: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENT SET-UP

Data sets
• NICT JLE Corpus
• The 167 error-annotated files were divided into 3 level groups:
  • Beginner (levels 1-4): 2,905
  • Intermediate (levels 5-6): 3,296
  • Advanced (levels 7-9): 2,752

Evaluation
• 10-fold cross-validation was performed for each group
• The validation results were added together across the rounds

Page 73: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS Advanced

DKL(Real || Proposed)=0.068 vs. DKL(Real || Baseline)=0.122

Page 74: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS Intermediate

DKL(Real || Proposed)=0.075 vs. DKL(Real || Baseline)=0.142

Page 75: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS Beginner

DKL(Real || Proposed)=0.075 vs. DKL(Real || Baseline)=0.092
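The D_KL figures on these result slides compare a real distribution with a simulated one; a lower value means the simulation is closer to the real data. A minimal sketch of the computation follows. The distributions are invented toy values over error types, the natural logarithm is assumed, and the `eps` floor for missing events is an illustrative choice rather than the paper's smoothing method.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum over x of P(x) * log(P(x) / Q(x)), in nats."""
    return sum(px * math.log(px / max(q.get(x, 0.0), eps))
               for x, px in p.items() if px > 0)

# Hypothetical error-type distributions.
real = {"v_agr_sub": 0.30, "prp_lex_del": 0.20, "at_del": 0.50}
proposed = {"v_agr_sub": 0.28, "prp_lex_del": 0.24, "at_del": 0.48}
baseline = {"v_agr_sub": 0.10, "prp_lex_del": 0.50, "at_del": 0.40}

print(kl_divergence(real, proposed), kl_divergence(real, baseline))
```

Note that the measure is asymmetric: the real distribution is the reference P in all three comparisons reported on the slides.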

Page 76: POSTECH Dialog-Based Computer Assisted Language Learning  System

EXPERIMENTAL RESULTS

Human judgment
• Evaluated 100 randomly chosen sentences: 50 each from the real and the simulated data
• The order of the test sentences was shuffled so that the human judges did not know whether a sentence was real or simulated
• Two-level scale (0: Unrealistic, 1: Realistic)

Sungjin Lee, Gary Geunbae Lee. Realistic grammar error simulation using markov logic. ACL 2009

Page 77: POSTECH Dialog-Based Computer Assisted Language Learning  System

Q & A

