
AN INDEXING SYSTEM FOR PHOTOS BASED ON SHOOTING POSITION AND ORIENTATION WITH GEOGRAPHIC DATABASE

Kiyoko Iwasaki, Kazumasa Yamazawa and Naokazu Yokoya

Graduate School of Information Science, Nara Institute of Science and Technology

8916-5 Takayama, Ikoma, Nara, 630-0192, Japan
E-mail: { kiyoko-i, yamazawa, yokoya }@is.naist.jp

ABSTRACT

With the spread of digital cameras, taking photos has become an everyday affair. However, there are few methods or systems for managing photos simply, and a huge amount of photo data remains unorganized. One way to manage photos is to add words that describe their contents, but entering such indexes manually requires much time and effort. It is also difficult to add the indexes that a user intends automatically. This paper proposes a semi-automatic photo indexing system that enables users to generate indexes simply and browse a photo library efficiently. Index candidates are acquired by map database retrieval and by relevant word extraction using web retrieval, both based on shooting position and orientation information. We have implemented a prototype indexing system based on the proposed method and have carried out some experiments.

1. INTRODUCTION

In recent years, a number of methods for managing photos based on shooting position information have been investigated [1–3]. Exif [4] specifies formats for photo metadata, including captions, camera parameters such as focal length, and shooting position such as latitude/longitude. Since cellular phones with both a camera and a GPS device are already in practical use, photos with shooting position information will become more common. Shooting position information can be associated with place names and facility names by matching against a map database, so that users can browse and retrieve photos by keywords based on the shooting position [3]. Such conventional works use only prepared map databases. They often output indexes that the user did not intend, and the user is then required to input the intended indexes manually.

In this paper, we propose a semi-automatic photo indexing system that enables users to make location-based indexes for digital photographs easily. Place and facility names corresponding to the subject position, which is estimated from the shooting position and orientation, are acquired as index candidates from a map database prepared in advance. If there are no appropriate candidates in the database, new candidates are obtained by relevant word extraction using web retrieval. The map database is updated by adding the indexes selected by users as feedback, so that it can subsequently present more appropriate candidates.

In Section 2, we describe the proposed indexing system. Section 3 describes a prototype system and some experiments. Finally, Section 4 summarizes the present work.

2. INDEXING PHOTOS BASED ON SHOOTING POSITION AND ORIENTATION

2.1. Overview of photo indexing system

Figure 1 shows the flowchart of the proposed method. First, a user takes a photo in JPEG format, together with shooting position and orientation information, using a camera to which sensors such as a GPS device, a gyro sensor and a digital compass are attached. For the shooting position, the latitude/longitude and the altitude are acquired by the GPS device. For the shooting orientation, the angle of elevation and the direction are acquired by the gyro sensor and the compass. Camera parameters such as focal length and F-number are also acquired from the Exif header of the JPEG files. To acquire index candidates based on the subject position of the photo, the subject position is estimated from the shooting position and orientation and the camera parameters.

Next, the place and facility names corresponding to the subject position are acquired from a map database prepared in advance. The acquired names are shown to the user as a list of index candidates, and the user selects from the list an index that is appropriate for the context of the photo. When no appropriate candidate is found, for example when the user wants a more detailed name than those stored in the map database, new candidates are obtained by relevant word extraction via the Internet. The user selects the word that is most relevant to the desired word. The system extracts relevant words using web retrieval with the selected word and shows them to the user as new index candidates. The index selected by the user is stored in a personal photo database together with the photo and metadata such as the shooting date, time and position. Moreover, the index is used as feedback to the map database, which is then expected to present more appropriate candidates.

Fig. 1. Flow diagram of photo indexing. [Figure omitted. Boxes shown: shooting photo; estimating subject position; acquiring index candidates from map database; selecting index, with two outcomes (selection of index, reacquisition of index candidates); extracting relevant words by web retrieval; annotating photo with index; updating map database.]

Fig. 2. Subject position estimation. [Figure omitted. Labels shown: WGS84 ellipsoid; horizontal coordinate system (z: zenith, n: north, e: east); shooting position; subject position; a: angle of direction; b: angle of elevation; d: distance between shooting and subject position; h: ellipsoidal altitude.]

2.2. Subject position estimation

Figure 2 shows the relation between the shooting position and the subject position. First, the shooting position, whose latitude/longitude and altitude are based on WGS84 (the GPS coordinate system), is defined as the origin of the horizontal coordinate system. In this coordinate system, the subject position is estimated from the angle of direction acquired by the compass, the angle of elevation acquired by the gyro sensor and the distance to the subject acquired from the Exif data. The subsequent processes use the latitude/longitude of the subject position projected onto the WGS84 coordinate system.
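The paper does not give explicit formulas for this step, so the following is only a minimal sketch of one way it could be computed. It assumes that the angle of direction a is measured clockwise from north, that the angle of elevation b is measured upward from the horizontal plane, and that the distance d is short enough for a local tangent-plane approximation on the WGS84 ellipsoid. The function name and parameter names are hypothetical.

```python
import math

# WGS84 ellipsoid constants
WGS84_A = 6378137.0                    # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def estimate_subject_position(lat_deg, lon_deg, h_m,
                              direction_deg, elevation_deg, distance_m):
    """Return (lat, lon) in degrees of the estimated subject position.

    Assumptions (not stated in the paper): direction is clockwise from
    north, elevation is upward from the horizontal plane, and the offset
    is small compared with the Earth's radius.
    """
    a = math.radians(direction_deg)
    b = math.radians(elevation_deg)

    # Offset of the subject in the horizontal (east, north) plane.
    east = distance_m * math.cos(b) * math.sin(a)
    north = distance_m * math.cos(b) * math.cos(a)

    # Radii of curvature of the ellipsoid at the shooting latitude.
    lat = math.radians(lat_deg)
    w = math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    prime_vertical = WGS84_A / w
    meridian = WGS84_A * (1.0 - WGS84_E2) / w ** 3

    # Convert metric offsets to angular offsets (small-offset approximation).
    dlat = north / (meridian + h_m)
    dlon = east / ((prime_vertical + h_m) * math.cos(lat))
    return lat_deg + math.degrees(dlat), lon_deg + math.degrees(dlon)
```

For longer distances or higher accuracy, a full conversion through Earth-centered coordinates would be more appropriate than this approximation.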

2.3. Index candidate acquisition from map database

Index candidates are first acquired automatically by querying a map database with the estimated subject position. The initial map database consists of place data and facility data included in commercially available map software, and it is updated by using the indexes selected by a user as feedback. Each entry in the database is composed of its name, position (latitude/longitude) and frequency of user selection. The index candidates are presented to the user in ascending order of the distance between the estimated subject position and the position of the entry.

Fig. 3. Relevant word extraction diagram. [Figure omitted. Boxes shown: keyword selected by user; acquiring top M pages of web search results; extracting nouns as word_i; acquiring number of hit pages including word_i by web search; estimating approximate number of hit pages including both keyword and word_i; calculation of relevance_i; sorting word_i; relevant words. Here i = 1, 2, …, N, where N is the number of extracted nouns.]
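The distance-ordered candidate list of Section 2.3 could be sketched as follows. This is an illustration under assumptions rather than the prototype's implementation: the `MapEntry` record, the function names, the `limit` parameter and the use of the haversine great-circle distance are all hypothetical choices, and the paper does not state which distance measure is used.

```python
import math
from dataclasses import dataclass


@dataclass
class MapEntry:
    """One map database entry: name, position and selection frequency."""
    name: str
    lat: float
    lon: float
    freq: int = 0


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371008.8):
    """Great-circle distance in metres on a sphere of mean Earth radius."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def index_candidates(entries, subject_lat, subject_lon, limit=10):
    """Return entry names in ascending order of distance to the subject."""
    ranked = sorted(
        entries,
        key=lambda e: haversine_m(subject_lat, subject_lon, e.lat, e.lon),
    )
    return [e.name for e in ranked[:limit]]
```

The `freq` field is carried along but not used for ranking here, because the paper orders candidates by distance only.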

2.4. Relevant word extraction using web retrieval

When the index candidates acquired from the map database are not appropriate, new candidates are obtained by relevant word extraction. Sato et al. [5] proposed a relevant word extraction method using web retrieval. Because that method aims to apply the extracted words to a dictionary, its main purpose is to extract words accurately, and processing time is not considered. Our system, on the other hand, must take processing time into account to realize interactive indexing.

Figure 3 shows the flowchart of relevant word extraction. The user selects a word related to the desired index from the candidates shown. The selected word is used as the input for extraction and is hereafter referred to as the “keyword”. First, our system requests the URL list of relevant web pages by web retrieval with the keyword and acquires the top M web pages of the list. Next, the system removes the HTML tags from the web pages, extracts the words, and obtains their parts of speech. Our system selects the nouns from the extracted words as the appropriate words.
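The tag removal and noun selection steps could look roughly like the sketch below, which uses only the Python standard library. It is an assumption-laden illustration: the paper does not name its HTML parser or part-of-speech tagger, so the noun test is passed in as a hypothetical `is_noun` callable, and the simple regular-expression tokenizer only suits whitespace-delimited text (Japanese pages would need a morphological analyzer instead).

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def strip_tags(html):
    """Return the visible text of an HTML document."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def extract_nouns(pages, is_noun):
    """Return unique nouns from the given pages, in first-seen order.

    `pages` holds the HTML of the top M search results; `is_noun` is a
    hypothetical stand-in for a real part-of-speech tagger.
    """
    nouns = []
    for html in pages:
        for word in re.findall(r"\w+", strip_tags(html)):
            if is_noun(word) and word not in nouns:
                nouns.append(word)
    return nouns
```

Fetching the pages themselves is left out, since it depends on the search service in use.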

The indicator of relevance between a keyword and an extracted noun is given by Eq. (1) in the present study. The larger the value of Eq. (1), the greater the relevance. The extracted nouns are sorted in descending order of relevance and are presented to the user as new index candidates. The value of $\mathit{hit}_{key \cap word_i}$ is estimated approximately so that $\mathit{relevance}_i$ can be computed in practical processing time.

$$
\mathit{relevance}_i
= \frac{\mathit{hit}_{key \cap word_i}}{\mathit{hit}_{key \cup word_i}}
= \frac{\mathit{hit}_{key \cap word_i}}{\mathit{hit}_{key} + \mathit{hit}_{word_i} - \mathit{hit}_{key \cap word_i}}
\quad (i = 1, 2, \ldots, N), \tag{1}
$$

where
$\mathit{relevance}_i$: relevance of $word_i$,
$\mathit{hit}_{key}$: number of hit pages including the keyword,
$\mathit{hit}_{word_i}$: number of hit pages including $word_i$,
$\mathit{hit}_{key \cap word_i}$: number of hit pages including both the keyword and $word_i$,
$\mathit{hit}_{key \cup word_i}$: number of hit pages including either the keyword or $word_i$,
$N$: number of extracted nouns.

Fig. 4. Update of map database using selected index. [Figure omitted. Labels shown: index selected by user; new index? (yes / no); registration; source of index? (existing map data / relevant word extraction); update frequency; update position and frequency.]
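Eq. (1) is the ratio of the intersection to the union of the two hit sets, and it can be sketched in a few lines. The function names and the dictionary layout below are hypothetical, and the hit counts are supplied by the caller because the way the prototype approximates the joint hit count is not detailed in the paper.

```python
def relevance(hit_key, hit_word, hit_both):
    """Eq. (1): hits containing both terms divided by hits containing either."""
    hit_either = hit_key + hit_word - hit_both
    if hit_either <= 0:
        return 0.0  # no pages at all; treat as irrelevant
    return hit_both / hit_either


def rank_words(hit_key, word_hits):
    """Sort words by descending relevance to the keyword.

    `word_hits` maps each extracted noun to a pair
    (hit_word, hit_both) of page counts.
    """
    scored = [
        (word, relevance(hit_key, hit_word, hit_both))
        for word, (hit_word, hit_both) in word_hits.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, with illustrative counts `rank_words(100, {"a": (50, 25), "b": (100, 80)})`, word `b` is ranked first because 80 / 120 exceeds 25 / 125.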

2.5. Feedback on map database based on user selection

Figure 4 shows the flowchart of updating the map database by using the selected indexes as feedback. The positions of indexes acquired from existing map data, such as commercially available map software, are reliable. On the other hand, the positions of indexes obtained from relevant word extraction may contain errors that depend on the sensors used at shooting time. A new index is registered in the map database with the estimated subject position, and the position of an existing index is updated by Eqs. (2)-(4). As feedback is repeated, the positions corresponding to each index are averaged.

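$$
\mathit{lat}_{new} = (\mathit{lat}_{prev} \times \mathit{freq}_{prev} + \mathit{lat}_{sbj}) / \mathit{freq}_{new}, \tag{2}
$$

$$
\mathit{lon}_{new} = (\mathit{lon}_{prev} \times \mathit{freq}_{prev} + \mathit{lon}_{sbj}) / \mathit{freq}_{new}, \tag{3}
$$

$$
\mathit{freq}_{new} = \mathit{freq}_{prev} + 1, \tag{4}
$$

where
$\mathit{lat}_{new}, \mathit{lon}_{new}$: new lat/lon corresponding to the index,
$\mathit{lat}_{prev}, \mathit{lon}_{prev}$: previous lat/lon corresponding to the index,
$\mathit{lat}_{sbj}, \mathit{lon}_{sbj}$: lat/lon of the subject position,
$\mathit{freq}_{prev}, \mathit{freq}_{new}$: frequency of user selection.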
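Eqs. (2)-(4) amount to an incremental mean of the subject positions observed for an index. A minimal sketch follows; the function name is hypothetical, and it assumes that the previous frequency counts the positions already averaged into the stored lat/lon.

```python
def update_index_position(lat_prev, lon_prev, freq_prev, lat_sbj, lon_sbj):
    """Apply Eqs. (2)-(4) and return (lat_new, lon_new, freq_new)."""
    freq_new = freq_prev + 1                                  # Eq. (4)
    lat_new = (lat_prev * freq_prev + lat_sbj) / freq_new     # Eq. (2)
    lon_new = (lon_prev * freq_prev + lon_sbj) / freq_new     # Eq. (3)
    return lat_new, lon_new, freq_new
```

With illustrative values, `update_index_position(10.0, 20.0, 1, 12.0, 22.0)` gives `(11.0, 21.0, 2)`, the midpoint of the stored and the newly estimated positions.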

3. EXPERIMENTS

3.1. Prototype system

We have implemented a prototype system based on the proposed method and have made indexes for photos taken in real environments.

Fig. 5. Prototype system. [Figure omitted. Components shown: sensor-attached camera, consisting of a GPS device (eTrex Summit, Garmin), a gyro sensor (InertiaCube2, INTERSENSE) and a camera (EOS Kiss Digital, Canon); client PC; server; map database; network (wireless LAN, PHS, etc.). Data shown: shooting position and orientation; index candidates; index.]

Fig. 6. Indexing user interface. [Screenshot omitted. Labels shown: photo; index candidates; decision; reacquisition.]

Figure 5 shows the structure of the prototype system. It is composed of a camera, a GPS device and a gyro sensor with a compass. Photos and their shooting positions and orientations are acquired by the camera and the sensors. The acquired data is stored in a client PC. When indexing, a user selects photos with their shooting position and orientation on a user interface, which has input forms built on a web browser. The selected photos, together with their shooting positions and orientations, are sent over the network from the client PC to the server, which holds the map database. The server generates index candidates for the received photos and returns them to the client PC.

Figure 6 shows a screenshot of the user interface for indexing. The user selects an index from the drop-down list of index candidates. When the list contains the intended index, the user selects it and pushes the decision button to send the data to the server. When there are no appropriate candidates, the user selects a word related to the intended index and pushes the reacquisition button to request new candidates from the server.

The map database in the server consists of the facility data provided by the Japanese government and the data in commercially available map software. In relevant word extraction, the system uses the Google API [6] as a search engine and acquires the top 10 pages of search results.

3.2. Indexing for photos

Figure 7 shows examples of photos taken at Yakushiji, a temple in Nara, Japan. The initial map database has an entry for “Yakushiji” but does not contain the names of any buildings within Yakushiji. We have carried out the indexing in order from Figure 7(a) to Figure 7(c).

Fig. 7. Photos to be indexed. [Photos omitted. Panels: (a) Kondo 1; (b) Kondo 2; (c) Genjosanzoin.]

Figure 7(a) shows a photo whose subject is the building “Kondo”. Suppose that the user aims to attach the index “Kondo” to the photo. Since the server has the initial map database, “Yakushiji” is presented to the user as one of the index candidates, as shown in Figure 8(a). The user selects “Yakushiji”, which is relevant to “Kondo”, from the list of index candidates and pushes the reacquisition button. The server extracts the words relevant to “Yakushiji”, and the extracted words are shown to the user as new index candidates (see Figure 8(b)). Since the list includes “Kondo” (ranked 19th), the user selects it as the index for the photo. After that, the map database is updated with the index as feedback. The number of index candidates obtained by the relevant word extraction is 402 in this case, and almost all of the building names in Yakushiji are included in the top 100 words.

Figure 7(b) shows a photo of the same “Kondo”, taken from a different shooting position. As shown in Figure 8(c), “Kondo” is included in the list of index candidates because it was stored in the map database during the previous photo indexing. Hence indexing has become easier. As for Figure 7(c), “Genjosanzoin” (ranked 5th) is selected from the list of index candidates shown in Figure 8(d) and is attached to the photo.

4. CONCLUSIONS

In this paper, we have proposed a semi-automatic photo indexing system that enables users to make indexes easily and browse a photo library efficiently. In experiments, we have confirmed the feasibility of adding appropriate indexes to photos based on shooting position and orientation. We have also confirmed that the system can present more appropriate index candidates from the database after it is updated during the indexing process.

Future work includes improving the matching between the position data in the map database and the estimated subject positions. The current system acquires index candidates in order of the distance between a subject position and the position data in the map database. We will consider the sensor accuracy, the depth of field and the area of the subject. Further future work concerns experiments over a wide area with various users. To carry out such experiments, simpler devices such as hand-helds will be needed.

Fig. 8. Selection of index. [Screenshots omitted. Panels: (a) Kondo 1: acquisition from map database (label shown: “Yakushiji”); (b) Kondo 1: acquisition from relevant word extraction (label shown: “Kondo”); (c) Kondo 2: acquisition from map database (label shown: “Kondo”); (d) Genjosanzoin: acquisition from map database (label shown: “Genjosanzoin”).]

5. REFERENCES

[1] K. Toyama, R. Logan, A. Roseway, and P. Anandan, “Geographic location tags on digital images,” in Proc. 11th ACM Int. Conf. on Multimedia, pp. 156–166, 2003.

[2] D. D. Spinellis, “Position-annotated photographs: A geotemporal web,” IEEE Pervasive Computing, vol. 2, no. 2, pp. 72–79, 2003.

[3] M. Naaman, Y. J. Song, A. Paepcke, and H. Garcia-Molina, “Automatic organization for digital photographs with geographic coordinates,” in Proc. 2004 Joint ACM/IEEE Conf. on Digital Libraries, pp. 53–62, 2004.

[4] Japan Electronics and Information Technology Industries Association (JEITA), “Exchangeable image file format for digital still cameras: Exif version 2.2,” 2002.

[5] S. Sato and Y. Sasaki, “Automatic collection of related terms from the web,” in Proc. 41st Annual Meeting of the Association for Computational Linguistics, pp. 121–124, 2003.

[6] “Google Web API.” http://api.google.com/.

