Page 1: Pair Hidden Markov Model for Named Entity Matching

Pair Hidden Markov Model for Named Entity Matching

Peter Nabende, Jörg Tiedemann, John Nerbonne
Department of Computational Linguistics

Center for Language and Cognition Groningen,

University of Groningen, Netherlands

{p.nabende, j.tiedemann, j.nerbonne}@rug.nl

Page 2: Pair Hidden Markov Model for Named Entity Matching

Introduction
• Three types of named entities: entity names, temporal expressions, and number expressions
  – Entity names refer to organization, person, and location names
• Many entity names exist across different languages, necessitating proper handling
• Bilingual lexicons cover only a tiny percentage of entity names
• An MT system would perform poorly on unseen entity names that have translations or transliterations in a target language
• Beyond MT, similarity measurement between cross-lingual entity names is important for CLIR and CLIE applications

Page 3: Pair Hidden Markov Model for Named Entity Matching

Recent Work on Named Entity Matching
• Approaches divide into those that use phonetic information and those that do not
• Lam et al. (2007) argue that many NE translations involve both semantic and phonetic clues
  – Their approach is formulated as a bipartite weighted graph matching problem
• Hsu et al. (2006) measure the similarity between two transliterations by comparing physical sounds
  – A Character Sound Comparison (CSC) method is used, involving the construction of a speech sound similarity database and a recognition stage
• Pouliquen et al. (2006) compute similarity between pairs of names using letter n-gram similarity, without using phonetic transliterations
• We propose a pair-HMM, which has been used successfully for Dutch dialect similarity measurement by Wieling et al. (2007) and for word similarity by Mackay and Kondrak (2005)

Page 4: Pair Hidden Markov Model for Named Entity Matching

pair-HMM
• The pair-HMM belongs to the family of models called Hidden Markov Models (HMMs)
• The pair-HMM originates from work on biological sequence analysis by Durbin et al. (1998)
• The difference from standard HMMs lies in the observation of a pair of sequences, i.e. a pairwise alignment, instead of a single sequence (Fig. 1 and Fig. 2)

Fig. 1: An instantiation of a standard HMM
Fig. 2: An instantiation of a pair-HMM

Page 5: Pair Hidden Markov Model for Named Entity Matching

pair-HMM used in previous NLP Tasks

[Figure not reproduced in transcript: state diagram with a match state M emitting symbol pairs xi:yj, gap states X (emitting xi) and Y (emitting yj), and an END state, with transition probabilities expressed in terms of the shared parameters δ, ε, λ, τM, and τXY.]

Fig. 3: pair-HMM used in previous work (Wieling et al., 2007; Mackay and Kondrak, 2005)

Page 6: Pair Hidden Markov Model for Named Entity Matching

Proposed pair-HMM

[Figure not reproduced in transcript: the same state diagram, but with state-specific parameters (δx, δy, εx, εy, λx, λy, τx, τy, and τm) replacing the shared δ, ε, λ, and τ parameters of Fig. 3.]

Fig. 4: Diagram illustrating proposed modifications to parameters of the pair-HMM
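As a sketch of how the parameterization in Fig. 4 could be represented, the following container holds one probability per transition, with separate gap parameters for the X and Y states. All numeric values are hypothetical placeholders, not trained estimates:

```python
from dataclasses import dataclass

@dataclass
class PairHMMParams:
    """Transition parameters of the modified pair-HMM (values are placeholders)."""
    delta_x: float = 0.05  # P(M -> X)
    delta_y: float = 0.05  # P(M -> Y)
    eps_x: float = 0.1     # P(X -> X)
    eps_y: float = 0.1     # P(Y -> Y)
    lam_x: float = 0.02    # P(X -> Y)
    lam_y: float = 0.02    # P(Y -> X)
    tau_m: float = 0.05    # P(M -> END)
    tau_x: float = 0.05    # P(X -> END)
    tau_y: float = 0.05    # P(Y -> END)

    def p_m_to_m(self) -> float:
        # Remaining probability mass from M: 1 - delta_x - delta_y - tau_m
        return 1.0 - self.delta_x - self.delta_y - self.tau_m

    def p_x_to_m(self) -> float:
        # Remaining probability mass from X: 1 - eps_x - tau_x - lam_x
        return 1.0 - self.eps_x - self.tau_x - self.lam_x
```

The self-transitions of M, X, and Y are derived rather than stored, so the outgoing mass from each state always sums to one.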

Page 7: Pair Hidden Markov Model for Named Entity Matching

Name Matching using pair-HMM
• The pair-HMM is used to compute similarity scores for two input strings
• The similarity scores can be used for different purposes
  – In this paper, for identifying pairs of highly similar strings
• The model uses initial, transition, and emission parameters whose values can be determined through a training process
• The example on the next slide illustrates the different parameters required for computing the similarity scores

Page 8: Pair Hidden Markov Model for Named Entity Matching

Name Matching using pair-HMM

  English:        p  e  t  e  r
  Russian:        п  ё  т  _  р
  State sequence: M  M  M  X  M  END

TABLE 1: Illustration of an alignment between representations of the same name, "peter"/"пётр", in different languages

score = P(start→M) · P(e_M(p:п)) · P(M→M) · P(e_M(e:ё)) · P(M→M) · P(e_M(t:т)) · P(M→X) · P(e_X(e)) · P(X→M) · P(e_M(r:р)) · P(M→END)

• The equation illustrates the parameters needed to calculate the score for the alignment above
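The score product for this alignment can be sketched directly in code. The probability values below are hypothetical placeholders (in the paper they come from training), but the structure of the product follows the state sequence M M M X M END:

```python
# Hypothetical transition and emission probabilities, for illustration only.
trans = {("start", "M"): 0.9, ("M", "M"): 0.85, ("M", "X"): 0.05,
         ("X", "M"): 0.83, ("M", "END"): 0.05}
emit_M = {("p", "п"): 0.02, ("e", "ё"): 0.01,
          ("t", "т"): 0.03, ("r", "р"): 0.02}
emit_X = {"e": 0.04}  # X emits a source symbol against a gap

# Alignment "peter" : "пётр" with state sequence M M M X M END
score = (trans[("start", "M")] * emit_M[("p", "п")]
         * trans[("M", "M")]   * emit_M[("e", "ё")]
         * trans[("M", "M")]   * emit_M[("t", "т")]
         * trans[("M", "X")]   * emit_X["e"]
         * trans[("X", "M")]   * emit_M[("r", "р")]
         * trans[("M", "END")])
```

Each factor alternates a transition probability with the emission probability of the symbol (or symbol pair) produced in the state just entered.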

Page 9: Pair Hidden Markov Model for Named Entity Matching

Parameter estimation for pair-HMMs
• Arribas-Gil et al. (2006) reviewed different parameter estimation approaches for pair-HMMs:
  – numerical maximization approaches, and the Expectation Maximization (EM) algorithm with its variants (Stochastic EM, Stochastic Approximation EM)
• An EM approach using the Baum-Welch algorithm had already been implemented and is retained for the pair-HMMs adapted in this work

Page 10: Pair Hidden Markov Model for Named Entity Matching

pair-HMM training software
• Wieling et al.'s (2007) pair-HMM training software was adapted
• The software was modified to support the use of different alphabets
• Alphabets are generated automatically from the data set to be used for training
  – For the English-Russian dataset, we obtained 76 symbols for the English alphabet and 61 symbols for the Russian alphabet
  – For the English-French dataset, we had 57 symbols for both languages
• Another modification to Wieling's version was converting the software to use fewer input files containing the names used for training
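The automatic alphabet generation described above can be sketched as follows. The pair list and its contents are illustrative assumptions, not the actual GeoNames/Wikipedia training data:

```python
def build_alphabets(pairs):
    """Collect the distinct symbols seen on each side of the training name pairs."""
    source_alphabet, target_alphabet = set(), set()
    for src, tgt in pairs:
        source_alphabet.update(src)  # add each character of the source name
        target_alphabet.update(tgt)  # add each character of the target name
    return sorted(source_alphabet), sorted(target_alphabet)

# Tiny illustrative sample (the real data held hundreds of pairs)
pairs = [("peter", "пётр"), ("kraków", "краков")]
en_alphabet, ru_alphabet = build_alphabets(pairs)
```

Deriving the alphabets from the data itself is what allows the same software to handle English-Russian and English-French without hard-coded symbol tables.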

Page 11: Pair Hidden Markov Model for Named Entity Matching

pair-HMM Training Data
• Training data comprises pairs of names from two different languages
• English-French and English-Russian name pairs were obtained from the GeoNames data dump and the Wikipedia data dump
  – Full names with spaces in between were not used as-is; they were split and paired with their corresponding matches in the other language
• For the entity name matching task, 850 distinct English-French pairs of names and 5902 distinct English-Russian pairs of names were extracted
  – For English-French, 600 pairs were used for training (282 iterations)
  – For English-Russian, 4500 pairs were used for training (848 iterations)
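The splitting rule for full names described above can be sketched as follows. Discarding pairs whose token counts differ is an assumption about the original procedure, made here so every output pair has a well-defined counterpart:

```python
def split_name_pairs(raw_pairs):
    """Split multi-word names on whitespace and pair tokens positionally."""
    out = []
    for src, tgt in raw_pairs:
        src_tokens, tgt_tokens = src.split(), tgt.split()
        if len(src_tokens) == len(tgt_tokens):
            # Pair each token with the token at the same position
            out.extend(zip(src_tokens, tgt_tokens))
    return out

# Illustrative input, not the actual data
raw = [("new york", "нью йорк"), ("london", "лондон")]
training_pairs = split_name_pairs(raw)
```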

Page 12: Pair Hidden Markov Model for Named Entity Matching

pair-HMM scoring algorithms
• Two algorithms implemented in the pair-HMM have been used for scoring:
  – Forward algorithm
    • Takes all possible alignments into account to calculate the probability of the observation sequence given the model
  – Viterbi algorithm
    • Considers only the best alignment when calculating the probability of the observation sequence given the model
  – There are also log versions of both algorithms, which compute the log of the probability of the observation sequence
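A minimal sketch of the Forward algorithm for a three-state (M, X, Y) pair-HMM, following the recurrences of Durbin et al. (1998): the M cell sums over all states that could have emitted the previous pair, while X and Y advance along only one string. All parameter values below are rough placeholders rather than trained estimates:

```python
def forward(x, y, a, e_m, e_x, e_y, tau):
    """P(x, y | model), summed over all alignments."""
    n, m = len(x), len(y)
    # f[s][i][j]: probability of emitting x[:i], y[:j] and ending in state s
    f = {s: [[0.0] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    f["M"][0][0] = 1.0  # start in M by convention
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:  # M consumes one symbol from each string
                f["M"][i][j] = e_m(x[i-1], y[j-1]) * sum(
                    a[s + "M"] * f[s][i-1][j-1] for s in "MXY")
            if i > 0:            # X consumes from x only (gap in y)
                f["X"][i][j] = e_x(x[i-1]) * sum(
                    a[s + "X"] * f[s][i-1][j] for s in "MXY")
            if j > 0:            # Y consumes from y only (gap in x)
                f["Y"][i][j] = e_y(y[j-1]) * sum(
                    a[s + "Y"] * f[s][i][j-1] for s in "MXY")
    # Terminate: weight each end state by its probability of moving to END
    return sum(tau[s] * f[s][n][m] for s in "MXY")

# Placeholder parameters; each state's outgoing mass (incl. END) sums to 1.
a = {"MM": 0.8, "MX": 0.05, "MY": 0.05,
     "XM": 0.7, "XX": 0.2, "XY": 0.0,
     "YM": 0.7, "YX": 0.0, "YY": 0.2}
tau = {"M": 0.1, "X": 0.1, "Y": 0.1}
e_m = lambda c, d: 0.05 if c == d else 0.001  # stand-in match emission
e_x = e_y = lambda c: 0.02                    # stand-in gap emissions
p = forward("peter", "petr", a, e_m, e_x, e_y, tau)
```

The Viterbi variant replaces each `sum(...)` with `max(...)`, keeping only the best alignment; the log versions carry the same recurrences in log space to avoid underflow on long names.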

Page 13: Pair Hidden Markov Model for Named Entity Matching

Evaluation Measures
• Two measures have been considered for evaluating the pair-HMM algorithms

Average Reciprocal Rank (ARR)

Equations for Average Rank (AR) and ARR follow from Voorhees and Tice (2000), where N is the number of test queries and R(i) is the rank at which the correct match for query i is retrieved:

AR = \frac{1}{N} \sum_{i=1}^{N} R(i)

ARR = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{R(i)}

• The computation of ARR, however, depends on the complexity of the evaluation set

  English   French     Rank R(i)
  klausen   chiusa     5
  kraków    cracovie   1

TABLE 2: Ranking example after using the Forward-log algorithm
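The AR and ARR equations above can be sketched directly in code:

```python
def average_rank(ranks):
    """AR: mean of the ranks R(i) of the correct matches."""
    return sum(ranks) / len(ranks)

def average_reciprocal_rank(ranks):
    """ARR: mean of the reciprocal ranks 1/R(i); higher is better."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Ranks from Table 2: 'klausen'/'chiusa' ranked 5th, 'kraków'/'cracovie' 1st
ranks = [5, 1]
ar, arr = average_rank(ranks), average_reciprocal_rank(ranks)
```

ARR rewards correct matches near the top of the ranking much more than AR does: moving a match from rank 5 to rank 1 changes its reciprocal-rank contribution from 0.2 to 1.0.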

Page 14: Pair Hidden Markov Model for Named Entity Matching

Evaluation Measures

Cross Entropy (CE)

Used to compare the effectiveness of different language models; useful when we do not know the actual probability distribution that generated the data.

For the pair-HMM, CE is specified by:

H(p, m) = \lim_{l_e \to \infty} -\frac{1}{l_e} \sum_{x \in V_1,\, y \in V_2} p(x_1{:}y_1, \ldots, x_{l_e}{:}y_{l_e}) \log m(x_1{:}y_1, \ldots, x_{l_e}{:}y_{l_e})

which is approximated by:

H(m) = -\frac{1}{n} \sum \frac{1}{l_e} \log m(x_1{:}y_1, \ldots, x_{l_e}{:}y_{l_e})

where V_1 and V_2 are the two alphabets, l_e is the length of an aligned pair, and n is the number of test pairs.
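The approximation above can be sketched as an average, over n held-out name pairs, of the per-symbol negative log probability the model assigns. Taking the alignment length l_e as the longer of the two names, and using a stand-in scorer in place of the pair-HMM forward-log score, are both assumptions for illustration:

```python
import math

def cross_entropy(pairs, model_log_prob):
    """Approximate H(m): mean of -log m(pair) / l_e over the test pairs."""
    total = 0.0
    for x, y in pairs:
        le = max(len(x), len(y))  # alignment length l_e (assumption)
        total += -model_log_prob(x, y) / le
    return total / len(pairs)

# Hypothetical scorer: pretend each aligned position costs log(1/50)
fake_score = lambda x, y: max(len(x), len(y)) * math.log(1.0 / 50)
ce = cross_entropy([("peter", "пётр")], fake_score)
```

A lower cross entropy means the model assigns higher probability to the held-out pairs, which is how the Viterbi and Forward variants are compared in Table 5.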

Page 15: Pair Hidden Markov Model for Named Entity Matching

Results: ARR Results

  Algorithm    ARR (N = 164)
  Viterbi-log  0.8099
  Forward-log  0.8081

TABLE 3: ARR results for English-French data

  Algorithm    ARR (N = 966)
  Viterbi      0.8536
  Forward      0.8546
  Viterbi-log  0.8359
  Forward-log  0.8355

TABLE 4: ARR results for English-Russian data

Page 16: Pair Hidden Markov Model for Named Entity Matching

Results
• ARR results show no significant difference between the accuracy of the two algorithms

Cross Entropy Results

  Algorithm  name-pair CE (n = 1000)
  Viterbi    32.2946
  Forward    32.2009

TABLE 5: Cross-entropy results for English-Russian data

• There is no significant difference in the accuracy of the Viterbi and Forward algorithms

Page 17: Pair Hidden Markov Model for Named Entity Matching

Conclusion
• A pair-HMM has been introduced for application in matching entity names
• The evaluation carried out so far is not sufficient to give critical information regarding the performance of the pair-HMM
• The results show no significant differences between the Viterbi and Forward algorithms
• However, the ARR results from the experiments are encouraging
• It is feasible to use the pair-HMM in the generation of transliterations

Page 18: Pair Hidden Markov Model for Named Entity Matching

Future Work
• It would be interesting to create other structures associated with the pair-HMM, for example to incorporate contextual information
• The pair-HMM needs to be evaluated against other models
  – Alignment-based discriminative string similarity, as proposed by Bergsma and Kondrak (2007) for cognate identification, will be considered

Page 19: Pair Hidden Markov Model for Named Entity Matching

THANKS !

Questions ?


Page 20: Pair Hidden Markov Model for Named Entity Matching

References
1. W. Lam, S-K. Chan and R. Huang, "Named Entity Translation Matching and Learning: With Application for Mining Unseen Translations," ACM Transactions on Information Systems, vol. 25, issue 1, article 2, 2007.
2. C-C. Hsu, C-H. Chen, T-T. Shih and C-K. Chen, "Measuring Similarity between Transliterations against Noise and Data," ACM Transactions on Asian Language Information Processing, vol. 6, issue 2, article 5, 2005.
3. M. Wieling, T. Leinonen and J. Nerbonne, "Inducing Sound Segment Differences using Pair Hidden Markov Models," in J. Nerbonne, M. Ellison and G. Kondrak (eds.), Computing and Historical Phonology: 9th Meeting of the ACL Special Interest Group for Computational Morphology and Phonology Workshop, Prague, pp. 48-56, 2007.
4. W. Mackay and G. Kondrak, "Computing Word Similarity and Identifying Cognates with Pair Hidden Markov Models," Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL), pp. 40-47, Ann Arbor, Michigan, 2005.
5. R. Durbin, S.R. Eddy, A. Krogh and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.
6. A. Arribas-Gil, E. Gassiat and C. Matias, "Parameter Estimation in Pair-hidden Markov Models," Scandinavian Journal of Statistics, vol. 33, issue 4, pp. 651-671, 2006.
7. E.M. Voorhees and D.M. Tice, "The TREC-8 Question Answering Track Report," in Proceedings of the Eighth Text REtrieval Conference (TREC-8), 2000.
8. C-J. Lee, J.S. Chang and J-S.R. Juang, "A Statistical Approach to Chinese-to-English Back-Transliteration," in Proceedings of the 17th Pacific Asia Conference, 2003.
9. S. Bergsma and G. Kondrak, "Alignment-Based Discriminative String Similarity," in Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 656-663, Prague, Czech Republic, June 2007.
