
HUL 455

Evaluation of Hindi-English MT systems: Challenges and Solutions

A Presentation by: Sajeed Mahaboob, 2011ME1111


MACHINE TRANSLATION

Translation can be defined as the act or process of translating, especially from one language into another. MT investigates the use of computer software to translate text or speech from one language (SL) to another language (TL). It is an automated system.


It analyzes text in the Source Language (SL), processes it, and produces equivalent text in the Target Language (TL). It should work without human intervention. MT systems are supposed to break the language barrier.


Methods and Strategies
Direct Method
Transfer Method
Interlingual Method


Direct Method
The majority of MT systems of the 1950s and 1960s were based on this approach. Such a system is designed in all details specifically for one particular pair of languages and produces word-by-word matches between the SL and TL.


Transfer Method
Two stages built on underlying representations of both the SL and TL texts. The first stage converts SL texts into SL transfer representations; the second stage converts these into TL transfer representations.


Interlingual Method
Converts SL texts into semantico-syntactic representations common to more than one language. From such interlingual representations, texts are generated in other languages.


MT in India: Why Do We Need It?
India is a multilingual country where the spoken language changes roughly every 50 miles. 22 official languages and approximately 2000 dialects are spoken. State governments carry out their official work in their respective regional languages. Translating documents manually is very time-consuming and costly.


English-Hindi MT Systems
MANTRA MT (1997)
Developed for information preservation: text available in one Indian language is made accessible in another Indian language with the help of this system. It uses an XTAG-based super tagger and a light dependency analyzer to analyze the input English text. The system produces several outputs corresponding to a given input.


MANTRA MT (1999)
It translates English text into Hindi in the specific domain of personnel administration, which includes gazette notifications, office orders, office memorandums and circulars. It uses the Tree Adjoining Grammar (TAG) formalism to represent English and Hindi grammar, and tree transfer for translating from English to Hindi. The system was tested on the translation of administrative documents such as appointment letters, notifications and circulars issued in the central government from English to Hindi.


English-Hindi Translation System
A system based on the transfer-based translation approach, which uses grammatical rules of the source and target languages and a bilingual dictionary for translation. The translation module consists of pre-processing, an English tree generator, post-processing of the English tree, generation of the Hindi tree, post-processing of the Hindi tree, and output generation. The domain of the system was weather narration.

Evaluation of Hindi-English MT Systems
Low accuracy, fluency and acceptability of the output of any machine translation system adversely affect its reliability and usage. The evaluation task can ascertain how and in what ways the results of these systems are lacking. Evaluation is one of the most important parts of the development of MT systems, and one cannot claim an MT system's success without evaluation. The need and demand for evaluating an MT system is always a high priority. Here, we evaluate the output of the Hindi-English language pair through two MT systems: Bing and Google.


Google MT/Translator is based on statistical and machine learning approaches trained on parallel corpora. It runs for 73 language pairs.

Bing (Microsoft) MT is also based on statistical and machine learning approaches trained on parallel corpora. It additionally uses language-specific rule-based components to decode and encode sentences from one language to another (linguistically informed statistical machine translation). Bing MT runs for 44 language pairs.


Evaluation Strategies
Evaluation strategies are mainly divided into two sections: (a) automatic evaluation and (b) manual or human evaluation. Automatic evaluation of any MT system is very difficult and is not as effective as human metrics. Several tested MT evaluation measures are frequently used, for example BLEU, mWER, mPER and NIST. Human evaluation metrics are considered time-consuming and costly, but they are the best strategies for improving an MT system's accuracy. It is a common scenario that more than one valid translation of a sentence exists; at this level a human translator-cum-evaluator can judge the output correctly.
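For concreteness, below is a minimal sketch of computing one such automatic metric (sentence-level BLEU) with NLTK; the reference and hypothesis tokens are illustrative only and are not taken from the slides.

```python
# A minimal sketch of automatic evaluation with sentence-level BLEU using NLTK.
# The reference/hypothesis sentences here are illustrative, not from the study.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["heat", "the", "non", "stick", "frying", "pan"]]        # human reference translation(s)
hypothesis = ["a", "non", "stick", "frying", "pan", "and", "heat"]    # MT output to be scored

# Smoothing avoids a zero score when some higher-order n-grams never match.
score = sentence_bleu(reference, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.3f}")
```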


Challenges During Evaluation
Sentences from the health and cuisine domains of the ILCI corpora are used for evaluating the MT systems. These sentences are entered into each system in bulk, the output is crawled, and discrepancies are marked. In the resulting English output, several problems are noted, particularly with respect to gender agreement, structural mapping, Named Entity Recognition (NER) and plural marker morphemes.


During the evaluation process, the following kinds of challenges are encountered:
1. Tokenization
2. Morph issues
3. Structural/grammatical differences
4. Errors with gender agreement
5. Parser issues


Tokenization

(i) With/Without Punctuation (BO = Bing output, GO = Google output):
(a) She goes by. (BO) / He is. (GO)
(b) He is (BO) / He is (GO)
Manual Translation: She goes.
Examples (a) and (b) above show how the use of a punctuation mark can significantly affect the translation. This variation in results is seen only in Bing; Google is consistent.


(ii) Transliteration Issue:
(b) A naun-stick frying pan and heat (BO)
A Non - stick frying pan and heat (GO)
Manual Translation: Heat the non-stick frying pan.


Morph Issue
(i) Unknown words:
One minute into the match and put chuare (BO)
Mix and cook one minute, add Cuare (GO)
Manual Translation: Put date palm, stir and cook for a minute.


(ii) Error with paradigm fixation:
Cancer is a group of more than 1000 berryman (BO)
Cancer is a group of more than 1000 illnesses (GO)

Cancer is a group of more than 1,000 diseases (BO)
Cancer is a group of more than 1000 illnesses (GO)
Manual Translation: Cancer is a group of more than 1000 diseases.


Structural/Grammatical Differences
What is the VIP? (BO)
VIP what is it? (GO)
Manual Translation: What is the VIP?

Errors with Gender Agreement
She goes by. (BO)
He is. (GO)
Manual Translation: She goes.


Parser Issues
Due to the weakness of the muscles of the eye lens cannot read or change their size does proximity to work while the light rays have it 40 years behind the retina and above in age (BO)
NO OUTPUT (GO)


A human evaluation strategy has been adopted to evaluate the Bing (Microsoft) and Google MT (Hindi-English) outputs.

Methodology of MT testing:

For testing the MT systems, 1,000 sentences were used. Their outputs were then distributed among three different human evaluators, who marked the MT outputs using comprehensibility and fluency approaches.
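As a small illustration of this setup, the sketch below splits a batch of MT outputs evenly among three evaluators; the round-robin assignment and placeholder sentences are assumptions, since the slides do not describe how the sentences were divided.

```python
# Sketch of distributing MT outputs among three evaluators.
# Round-robin assignment is an assumption; the slides do not specify the split.
mt_outputs = [f"sentence_{i}" for i in range(1, 1001)]   # placeholders for the 1,000 outputs

evaluators = {1: [], 2: [], 3: []}
for i, sentence in enumerate(mt_outputs):
    evaluators[(i % 3) + 1].append(sentence)

print({k: len(v) for k, v in evaluators.items()})        # -> {1: 334, 2: 333, 3: 333}
```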


Instructions for Evaluators:

Read the target-language translated output first.
Judge each sentence for its comprehensibility.
Rate it on a scale of 0 to 4.
Read the original source sentence only to verify the faithfulness of the translation (for reference only).
Do not read the source-language sentence first.
If the rating needs revision, change it to the new rating.


Guidelines for Evaluation (on a 5-point scale, 0-4):

The following score is to be given to each output sentence:
(A) For comprehensibility:
4 = all meaning
3 = most meaning
2 = much meaning
1 = little meaning
0 = none

(B) For fluency:
4 = Flawless or perfect (like someone who knows the language)
3 = Good: comprehensible but has a few errors (like someone speaking Hindi but getting all its genders wrong)
2 = Non-native: comprehensible but has quite a few errors (like someone who can speak your language but makes lots of errors; you can still make sense of what is being said)
1 = Disfluent: some parts make sense but it is not comprehensible overall (like listening to a language that has a lot of words borrowed from your language; you understand those words but nothing more)
0 = Incomprehensible or nonsense (the sentence does not make any sense at all, as if someone were speaking to you in a language you do not know)

Evaluation Method

The weighted sum of the scores is S1 + S2 + ... + SN, where Si is the score of the i-th sentence. For instance, if N = 10 and the scores obtained for the 10 sentences are S1=3, S2=3, S3=2, S4=1, S5=4, S6=0, S7=0, S8=1, S9=0, S10=0, this gives the following histogram:
Number of sentences with score 4 = 1
Number of sentences with score 3 = 2
Number of sentences with score 2 = 1
Number of sentences with score 1 = 2
Number of sentences with score 0 = 3
Weighted sum = 14, which produces:
Comprehensibility = 40% (because 4 out of 10 sentences have a score of 2, 3, or 4)
Fluency = 14/10 = 1.4 (on a scale of 0-4), i.e. 35% of the maximum possible score of 100.
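The same arithmetic as a small sketch (the function name and structure are mine, not from the slides): comprehensibility is the share of sentences rated 2 or higher, fluency is the mean rating on the 0-4 scale.

```python
# Sketch of the comprehensibility/fluency arithmetic described above.
def summarize_scores(scores):
    n = len(scores)
    weighted_sum = sum(scores)
    comprehensibility = 100.0 * sum(1 for s in scores if s >= 2) / n
    fluency = weighted_sum / n              # on the 0-4 scale
    fluency_pct = 100.0 * fluency / 4       # on the 0-100 scale
    return comprehensibility, fluency, fluency_pct

# The worked example from the slide: 10 sentences with these ratings.
scores = [3, 3, 2, 1, 4, 0, 0, 1, 0, 0]
c, f, f_pct = summarize_scores(scores)
print(f"Comprehensibility = {c:.0f}%, Fluency = {f:.1f} ({f_pct:.0f}%)")
# -> Comprehensibility = 40%, Fluency = 1.4 (35%)
```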


Table 1: Score Table to Compute Comprehensibility

Table 2: Score Table to Compute Fluency



Hence, we have evaluated the Bing and Google MT systems. When we examined and evaluated these systems, we found many errors. The fluency of both systems was found to be very low, but the output was almost comprehensible. On comparison, Google was found to be better than Bing MT in comprehensibility.

Suggestions
While giving the input sentences, tokenize them and avoid using the full-stop marker in final position (see the sketch below). Both MT systems should improve their morph dictionaries through corpus data and add linguistic rules for paradigm fixation (how to analyze inflectional and derivational categories); if the MT systems are trained with a large number of words and sentences, the parsing issues might also be resolved. Then these systems will improve and the errors will decrease to some extent. Following these steps, the fluency as well as the comprehensibility of the Bing and Google MT systems can be increased.
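A minimal sketch of that preprocessing step, assuming a simple regex tokenizer and treating a sentence-final Devanagari danda or full stop as the punctuation to drop; the sample sentence is illustrative, not taken from the slides.

```python
# Sketch of input normalization before sending a sentence to the MT system:
# tokenize, then drop a sentence-final danda (।) or full stop, which the
# slides report can change Bing's output.
import re

def normalize_for_mt(sentence: str) -> str:
    # Simple regex tokenization: words vs. single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    if tokens and tokens[-1] in {"।", "."}:
        tokens = tokens[:-1]
    return " ".join(tokens)

# Illustrative Hindi input (not the slides' original sentence), meaning "She goes."
print(normalize_for_mt("वह जाती है।"))   # -> "वह जाती है"
```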


References
http://www.shodhganga.inflibnet.ac.in
http://www.navbharattimes.indiatimes.com
http://www.academia.edu
Lecture slides

Thanks for your patience.

