Post on 08-Jan-2016
Intelligent Database Systems Lab
National Yunlin University of Science and Technology (國立雲林科技大學)
Advisor : Dr. Hsu
Presenter : Chien Shing Chen
Authors: Wei-Hao Lin and Hsin-Hsi Chen
Foreign Name Backward Transliteration in Chinese-English Cross-Language Image Retrieval
Proceedings of the 2003 Workshop of the Cross-Language Evaluation Forum, Norway, August 2003.
Outline
Motivation
Objective
Introduction
Backward Transliteration
Query Translation
Experimental Result
Conclusions
Personal Opinion
Motivation
How to retrieve multimedia data precisely is an important research issue.
People without strong language skills can still easily judge the relevance of the retrieved images.
IR systems must handle proper-noun transliteration appropriately to achieve better performance.
Objective
Adopt a text-based approach to the Chinese-English cross-language image retrieval problem.
Introduction
[Figure: system flow diagram. Recoverable labels: Input, Chinese, English, Phoneme, IPA, MI, F-2-H-F, Similarity score + MI, Similarity score + F-2-H-F, and the numbers 1 to 4.]
F-2-H-F: First-two-highest-frequency
MI: Mutual Information
Similarity Measurement: Dynamic Programming
Dynamic programming is used to find the optimal alignment between two phoneme strings.
Alignments are scored with a similarity scoring matrix M.
Example strings:
S1 (j h u g oU)
S2 (v k uo)
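The alignment step can be illustrated with a minimal sketch of global alignment by dynamic programming (Needleman-Wunsch style). This is an assumption about the general technique, not the authors' implementation: the function name `align_phonemes`, the gap and mismatch penalties, and every value in `TOY_M` are hypothetical and stand in for the learned similarity scoring matrix M.

```python
from typing import Dict, List, Tuple

GAP = "-"

# Hypothetical similarity scores; the real matrix M is not given on the slide.
TOY_M: Dict[Tuple[str, str], int] = {
    ("j", "v"): 1,
    ("g", "k"): 2,
    ("oU", "uo"): 2,
}


def align_phonemes(
    s1: List[str],
    s2: List[str],
    sim: Dict[Tuple[str, str], int],
    gap_penalty: int = -1,
    mismatch: int = -1,
) -> Tuple[int, List[Tuple[str, str]]]:
    """Return the best global alignment score and the aligned phoneme pairs."""

    def pair_score(a: str, b: str) -> int:
        # Pairs missing from the matrix get the (assumed) mismatch penalty.
        return sim.get((a, b), mismatch)

    n, m = len(s1), len(s2)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Border cells: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap_penalty
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap_penalty

    # Fill the table: each cell is the best of substitute, delete, insert.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(s1[i - 1], s2[j - 1]),
                score[i - 1][j] + gap_penalty,
                score[i][j - 1] + gap_penalty,
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + pair_score(s1[i - 1], s2[j - 1])):
            pairs.append((s1[i - 1], s2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap_penalty:
            pairs.append((s1[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, s2[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


best, alignment = align_phonemes(["j", "h", "u", "g", "oU"], ["v", "k", "uo"], TOY_M)
print(best, alignment)
```

With these toy scores the two unmatched phonemes of S1 are aligned against gaps, so the alignment score reflects both the similar pairs and the cost of skipping phonemes.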
Candidate Filter
A transliterated word and its original word contain the same phonemes, and the order of the phonemes is the same.
After retrieval, the top-ranked candidate words are taken as the appropriate candidates for the transliterated word.
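One way to picture the filtering idea is the sketch below, which ranks candidate words by how many phonemes they share with the query in the same order (a longest common subsequence) and keeps the top few. This is an illustrative assumption, not the ranking function from the paper: the names `lcs_length` and `filter_candidates`, the normalization, and the toy phoneme spellings are all hypothetical, and both sides are assumed to be already mapped into one shared phoneme inventory.

```python
from typing import Dict, List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence (shared phonemes, same order)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def filter_candidates(
    query: List[str],
    candidates: Dict[str, List[str]],
    top_k: int = 2,
) -> List[str]:
    """Keep the top_k candidates that share ordered phonemes with the query."""
    scored = []
    for word, phonemes in candidates.items():
        shared = lcs_length(query, phonemes)
        if shared:  # candidates with no shared phoneme are dropped
            # Normalize by the longer sequence so long words are not favored.
            scored.append((shared / max(len(query), len(phonemes)), word))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [word for _, word in scored[:top_k]]


# Toy data with made-up phoneme spellings.
toy_candidates = {
    "hugo": ["h", "u", "g", "o"],
    "hug": ["h", "u", "g"],
    "gohu": ["g", "o", "h", "u"],
    "anna": ["a", "n", "a"],
}
print(filter_candidates(["h", "u", "g", "o"], toy_candidates))
```

The cheap filter only narrows the candidate list; a finer-grained similarity score can then be computed on the few candidates that survive.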
Candidate Filter
x: Chinese phoneme
y: English phoneme
Query Translation
We adopted the following two methods to select appropriate translations:
CO model: adopts MI to measure the co-occurrence strength between words.
First-two-highest-frequency: the two translations with the highest occurrence frequency in the English image captions are taken as the target-language query terms.
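The two selection methods can be sketched roughly as follows. The sketch assumes pointwise mutual information computed from caption-level co-occurrence, and it measures frequency as the number of captions containing a word; both are assumptions, since the slides do not give the exact formulas. The function names and the toy captions are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import List, Tuple


def build_counts(captions: List[str]) -> Tuple[Counter, Counter]:
    """Count, per caption, which words occur and which word pairs co-occur."""
    word_df: Counter = Counter()
    pair_df: Counter = Counter()
    for caption in captions:
        words = set(caption.lower().split())
        word_df.update(words)
        pair_df.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_df, pair_df


def mutual_information(x: str, y: str, word_df: Counter,
                       pair_df: Counter, n: int) -> float:
    """Pointwise MI: log2( P(x, y) / (P(x) * P(y)) ) over n captions."""
    joint = pair_df[frozenset((x, y))]
    if joint == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2(joint * n / (word_df[x] * word_df[y]))


def first_two_highest_frequency(translations: List[str],
                                word_df: Counter) -> List[str]:
    """Keep the two candidate translations seen in the most captions."""
    ranked = sorted(translations, key=lambda w: (-word_df[w], w))
    return ranked[:2]


# Toy English captions, invented for illustration only.
toy_captions = [
    "old church tower",
    "church tower by the sea",
    "fishing boat at sea",
    "church hall",
]
word_df, pair_df = build_counts(toy_captions)
print(mutual_information("church", "tower", word_df, pair_df, len(toy_captions)))
print(first_two_highest_frequency(["chapel", "church", "tower", "hall"], word_df))
```

MI uses the context supplied by the other query terms, while the frequency-based method needs only counts from the caption collection, which makes it simpler but blind to context.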
Query Translation
150 distinct Chinese query terms
In total, 16 of the 150 query terms are unknown words.
These terms include 7 person names and 5 location names that were transliterated from foreign names.
Conclusions
Text-based image retrieval and query translation were adopted in the experiments.