Source: ceur-ws.org/Vol-1963/paper615.pdf

Answering Visuo-semantic Queries with IMGpedia

Sebastián Ferrada, Benjamin Bustos, and Aidan Hogan

Center for Semantic Web Research, Department of Computer Science, Universidad de Chile

{sferrada,bebustos,ahogan}@dcc.uchile.cl

Abstract. IMGpedia is a linked dataset that provides a public SPARQL endpoint where users can answer queries that combine the visual similarity of images from Wikimedia Commons and semantic information from existing knowledge-bases. Our demo will show example queries that capture the potential of the current data stored in IMGpedia. We also plan to discuss potential use-cases for the dataset and ways in which we can improve the quality of the information it captures and the expressiveness of its queries.

1 Introduction

Wikimedia Commons¹ is a large-scale dataset that contains about 30 million freely usable media files (image, audio and video), many of which are used within Wikipedia articles and galleries; it also contains meta-data about each file, such as its author, licensing, and the articles where the file is used. Using this information, DBpedia Commons [5] automatically extracts the meta-data of the media files of Wikimedia Commons pages and presents the resulting corpus as a linked dataset.

To complement DBpedia Commons, we have created IMGpedia [2]: a linked dataset that contains different feature descriptors for 14.8 million images from Wikimedia Commons. IMGpedia also provides similarity relations among the images, as well as references to DBpedia [3] when an image is used in the Wikipedia article of an entity. This dataset thus enables people to perform visuo-semantic queries, that is, queries that combine image similarity and semantic criteria.

We first introduce IMGpedia. We then show examples of queries that IMGpedia supports, as will be shown in the demo session. Finally, we address the challenges and the future directions of the project, which we also plan to discuss in the session.

2 IMGpedia

IMGpedia is a linked dataset [2] that contains three different visual descriptors for each of the 14.8 million images of Wikimedia Commons. These descriptors capture the following features of each image as high-dimensional vectors: brightness distribution, border orientations and color layout. IMGpedia provides static similarity relations among the images: for each image and for each descriptor, the dataset contains the 10 nearest neighbors, that is, the 10 images whose descriptors are closest to that of the image under the Manhattan distance.
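As an illustration of what a "10 nearest neighbors under the Manhattan distance" relation means, the following minimal sketch computes exact neighbors by brute force over toy descriptors. It is not the pipeline used to build IMGpedia (the dataset was built with the approximate FLANN library [4]); the function names, file names and vector values are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same dimension")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(
    descriptors: Dict[str, Sequence[float]], k: int = 10
) -> Dict[str, List[Tuple[str, float]]]:
    """Exact brute-force k nearest neighbors of every image, excluding itself."""
    result = {}
    for name, vec in descriptors.items():
        candidates = (
            (manhattan(vec, other), other_name)
            for other_name, other in descriptors.items()
            if other_name != name
        )
        # nsmallest keeps the k pairs with the smallest distance.
        closest = heapq.nsmallest(k, candidates)
        result[name] = [(n, d) for d, n in closest]
    return result


# Toy 3-dimensional descriptors (hypothetical values, not real IMGpedia data).
toy = {
    "a.jpg": [0.0, 0.0, 0.0],
    "b.jpg": [1.0, 0.0, 0.0],
    "c.jpg": [1.0, 2.0, 0.0],
    "d.jpg": [5.0, 5.0, 5.0],
}
neighbors = nearest_neighbors(toy, k=2)
```

Brute force compares every pair of images, so its cost grows quadratically with the number of images; at the scale of 14.8 million images an index or an approximate method becomes necessary.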

This information is described in RDF using a custom vocabulary that combines novel terms with terms from established vocabularies and appropriate RDFS/OWL definitions (see our extended paper accepted for the ISWC Resources Track for more details [2]). Additionally, IMGpedia contains links to DBpedia entities and to DBpedia Commons in order to obtain further metadata related to the image. Currently we provide ∼12 million links to DBpedia: an image is linked to an entity if the image appears in the Wikipedia article about that entity.

¹ http://commons.wikimedia.org

IMGpedia is publicly available as an RDF dump² and as a SPARQL endpoint³.

3 Visuo-semantic Queries

Using SPARQL federation over the IMGpedia and DBpedia datasets, we are able to answer visuo-semantic queries, that is, queries that combine visual similarity (e.g. images similar to a given picture of La Moneda Palace, in Santiago) with queries about semantic facts (e.g. obtain a list of governmental palaces in Europe). Hence, an example of a visuo-semantic query would be to obtain the depictions of the European governmental palaces that are similar to La Moneda Palace. In this section we show some examples of queries that can be answered using IMGpedia.

First, IMGpedia can answer image similarity queries, since it provides static similarity relations among images. Our extended paper [2] gives an example of this kind of query, looking for images similar to one of Hopsten Marktplatz in Germany, together with its results.

We can also perform semantic image retrieval⁴. In Listing 1 we request the images of the paintings made in the 16th century that are currently being displayed at the Louvre. In Figure 1 we show the results.

Listing 1: Query to retrieve images of paintings from the 16th century that are displayed at the Louvre.

    SELECT ?url ?label WHERE {
      SERVICE <http://dbpedia.org/sparql> {
        ?res a yago:Wikicat16th-centuryPaintings ;
          dcterms:subject dbc:Paintings_of_the_Louvre ;
          rdfs:label ?label .
        FILTER(LANG(?label) = 'en')
      }
      ?img imo:appearsIn ?res ;
        imo:fileURL ?url .
    }

Finally, IMGpedia can answer visuo-semantic queries. In our extended paper [2] we show a visuo-semantic query that requests the images of museums that are similar to any image of a European cathedral on Wikipedia. In Listing 2 we show a SPARQL query that requests images of museums that are similar to images appearing in articles categorized as Roman Catholic cathedrals in Europe, using the property path dcterms:subject/skos:broader* to navigate sub-categories. In Figure 2 we show a sample of the retrieved results.

Listing 2: Federated visuo-semantic query requesting images of museums that are similar to images related to European cathedrals.

    SELECT DISTINCT ?urls ?urlt WHERE {
      SERVICE <http://dbpedia.org/sparql> {
        ?sres dcterms:subject/skos:broader* dbc:Roman_Catholic_cathedrals_in_Europe
      }
      ?source imo:appearsIn ?sres ;
        imo:similar ?target ;
        imo:fileURL ?urls .
      ?target imo:appearsIn ?tres ;
        imo:fileURL ?urlt .
      SERVICE <http://dbpedia.org/sparql> {
        ?tres dcterms:subject ?sub .
        FILTER(CONTAINS(STR(?sub), "Museum"))
      }
    }
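To make the role of the property path dcterms:subject/skos:broader* concrete, the sketch below mimics its semantics over a toy in-memory category graph: an article matches if one of its subject categories reaches the target category in zero or more skos:broader steps. This only illustrates the path semantics and is not how a SPARQL engine evaluates the query; apart from Roman_Catholic_cathedrals_in_Europe, all category names and function names are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def broader_closure(start: str, broader: Dict[str, Set[str]]) -> Set[str]:
    """Categories reachable from `start` via zero or more skos:broader steps."""
    seen = {start}  # zero steps: the start category itself counts
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for parent in broader.get(current, ()):
            if parent not in seen:  # the seen-set also guards against cycles
                seen.add(parent)
                queue.append(parent)
    return seen


def matches(subjects: Set[str], target: str, broader: Dict[str, Set[str]]) -> bool:
    """True if any dcterms:subject of the article leads to `target`."""
    return any(target in broader_closure(s, broader) for s in subjects)


TARGET = "Roman_Catholic_cathedrals_in_Europe"

# Toy skos:broader edges (child -> parents); hypothetical category names.
toy_broader = {
    "Cathedrals_X": {"Cathedrals_Y"},
    "Cathedrals_Y": {TARGET},
    "Museums_Z": {"Museums"},
}

in_subcategory = matches({"Cathedrals_X"}, TARGET, toy_broader)
direct = matches({TARGET}, TARGET, toy_broader)
unrelated = matches({"Museums_Z"}, TARGET, toy_broader)
```

Because the star allows zero steps, articles placed directly in the target category match as well as those in any of its sub-categories.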

² http://imgpedia.dcc.uchile.cl/dumps
³ http://imgpedia.dcc.uchile.cl/sparql
⁴ Currently this cannot be done in DBpedia Commons since they do not extract the appearsIn relation.


Fig. 1: Images of the Wikipedia articles about paintings from the 16th century displayed at the Louvre. [Image grid; panel captions: The Beggars, by Bruegel; The Wedding at Cana, by Veronese; St. John the Baptist, by Leonardo; St. John the Baptist, by Leonardo; The Wedding at Cana, by Veronese; Bacchus, by Leonardo; Madonna with the Blue Diadem, by Raphael; Mona Lisa, by Leonardo (three images); Ship of Fools, by Bosch.]

Fig. 2: Images of Roman Catholic cathedrals in Europe that have a similar image relating to a museum. [Image pairs; panel captions: Basilica of St. John L. and Nat. Hist. Museum of Helsinki; Cathedral of St. Mary and Dumbarton House Museum; Cathedral of St. Mary and Museum of Fine Arts; Linköping Cathedral and 1st Church in Georgia Museum; Essen Cathedral Plans and Plans of an Ancient Greek Mmt.; Cath. of St. Mary & St. Boniface and Café Florian.]

4 Future Extensions

IMGpedia is a novel resource. We plan to demo the first release of the dataset by showing the different kinds of queries that it is able to answer. However, we also wish to discuss plans to extend and improve the dataset and are interested in collecting feedback from the ISWC community.⁵ We are currently working on the following tasks towards improving the quality and usability of the data:

⁵ An issue-tracker is also available at http://github.com/scferrada/imgpedia/issues for feedback, feature requests, suggestions, etc.


– Provide links to Wikidata: Categories on DBpedia are not flexible enough for some visuo-semantic queries. We are interested in creating links with Wikidata [6] to see if this would enable new/better visuo-semantic queries.

– Compare similarity methods: IMGpedia was built using FLANN [4] to compute the similarity relations. However, other approximate algorithms or indexing techniques can be used. Hence we are studying and comparing the different ways to provide the similarity links.

– Include modern descriptors: The visual descriptors used in IMGpedia are rather classic techniques. We want to explore how image similarity would behave using more modern descriptors. One such descriptor is DeCAF7 [1], which is based on the neural network classification of the image.

– Explore more relations among images: IMGpedia currently only provides similarity relations between images. We will explore whether there are other relations that are worth including, such as contains if one image forms part of another, or sameObject if two images capture the same object but with different perspectives or scales.

– Provide user-interfaces: Currently IMGpedia can be accessed through a dump, a SPARQL endpoint, or through dereferencing Linked Data IRIs. We also plan to investigate interfaces that will help users interact more intuitively with the IMGpedia dataset.

Aside from extensions and improvements to IMGpedia, we are interested in finding additional use-cases for the dataset. We believe that many applications can be built upon IMGpedia. We can use pre-trained neural networks to classify the dataset's images and provide the results as further context. We can also train our own network using the classes or categories of the related DBpedia/Wikidata resources to label the images and see if these provide an improved classification. More generally, we hope that IMGpedia may become a test-bed dataset for further work at the intersection of the Semantic Web and Multimedia.

Acknowledgments This work was supported by the Millennium Nucleus Center for Semantic Web Research, Grant No. NC120004 and Fondecyt, Grant No. 11140900.

References

1. Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.: DeCAF: A deep convolutional activation feature for generic visual recognition. In: International Conference on Machine Learning. pp. 647–655 (2014)

2. Ferrada, S., Bustos, B., Hogan, A.: IMGpedia: a linked dataset with content-based analysis of Wikimedia images. In: The Semantic Web-ISWC 2017 (to appear)

3. Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web Journal (2014)

4. Muja, M., Lowe, D.G.: Fast approximate nearest neighbors with automatic algorithm configuration. In: VISAPP. pp. 331–340. INSTICC Press (2009)

5. Vaidya, G., Kontokostas, D., Knuth, M., Lehmann, J., Hellmann, S.: DBpedia Commons: Structured multimedia metadata from the Wikimedia Commons. In: The Semantic Web-ISWC 2015, pp. 281–289. Springer (2015)

6. Vrandečić, D., Krötzsch, M.: Wikidata: A free collaborative knowledgebase. Comm. ACM 57, 78–85 (2014)
