
Searching in the City of Knowledge - IBM

Date post: 31-Jan-2021
IBM - Dublin Research Lab: Searching in the City of Knowledge, SIGIR Tutorial. Veli Bicer, Vanessa Lopez, IBM Research
Transcript
  • IBM - Dublin Research Lab

    Searching in the City of Knowledge

    SIGIR Tutorial

    Veli Bicer, Vanessa Lopez

    IBM Research

  • IBM - Dublin Research Lab

    About the Tutorial

  • IBM - Dublin Research Lab

    Scope of the Tutorial

  • IBM - Dublin Research Lab

    Part I

    Beyond Local Search

    Veli Bicer

    IBM Research

  • IBM - Dublin Research Lab

    Outline

    • A Planet of Smarter Cities

    • City Data and Information

    • Making City Search Smarter

  • IBM - Dublin Research Lab

    A Planet of Smarter Cities

    “Cities have the capability of providing something for everybody, only because, and only when, they are created by everybody.”

    Jane Jacobs

  • IBM - Dublin Research Lab

    A planet of smarter cities: In 2007, for the first time in history, the majority of the world’s population (3.3 billion people) lived in cities. By 2050, city dwellers are expected to make up 70% of Earth’s total population, or 6.4 billion people.

  • IBM - Dublin Research Lab

    IBM Research Worldwide

    China, Watson, Almaden, Austin, Tokyo, Haifa, Zurich, India, Dublin, Melbourne, Brazil

  • IBM - Dublin Research Lab

    Smarter Cities Technology Centre

    • Instrumented, Interconnected, Intelligent

    • Dublin Test Bed: Energy, Movement, Water

    • Seed Projects

    • Real World Insight | Data Sets | Devices

    • Optimization, Predictive Modelling, Forecasting, Simulation

    • Solutions that Sustain Economic Development

    • Driving New Economic Models

    • Significant Collaborative R&D

    • Skills Development & Growth

    • Competitive Advantage

    • Collaboration and Access to Local, Regional & Worldwide Network

    • SMEs | MNCs | Universities | Public Sector | VC Community

    • Intelligent Urban and Environmental Analytics and Systems

    • Smart City Solutions

    • Integrated Cross Domain Solutions

    • City Fabric

  • IBM - Dublin Research Lab

    Many Visions of what a Smarter City might be

    • A “mission control” for infrastructure

    • A showcase for urban planning concepts

    • A totally “wired” city

    • A self-sufficient, sustainable eco-city

  • IBM - Dublin Research Lab

    City Data and Information

    “The country places and the trees don’t teach me anything, but the people in the

    city do”

    Socrates

  • IBM - Dublin Research Lab

    City of Data and Information: Many Areas

    • Large, open and continuous data environment from heterogeneous domains:

    Transportation, Social Media, Energy Management, City Management, Region, Supply Chain, Food System, HealthCare, Water Management

    and even more…

  • IBM - Dublin Research Lab

    City Data Trends

    [Figure: timeline of city data activity. Axes: Activity vs. Time]

    • 1993, SEC Online

    • 2004, USG announces e-Gov 2.0

    • 2009, Data.gov.uk, Data.gov (US)

    • 2010, Amazon, Google & MSoft

    • 2011+, Gov 3.0: City as an Enterprise....

    Phases:

    • Content Factual & Static

    • Content Structure Innovation

    • Aggregation & Efforts to create linkage based on Semantic Web

    • Innovation based on Collaboration & Social Innovation

    • Ecosystem increasingly focused on long-term sustainability

    Callouts:

    • >350 ‘Open City Data Catalogs’ (data.gov)

    • >25 Billion Triples on Linked Data Cloud

    • 35 Cities in Open Data Hackday, 12/2010

    • Publicdata.eu – LOD2 for Citizen study due 2014

  • IBM - Dublin Research Lab

    Some Traffic-related Data Sets from Dublin

    • Big data

    • Heterogeneous data

    • Static, Continuous data

    • Not all open yet

    • Not linked yet

    • Noisy data (inconsistent, imprecise)

  • IBM - Dublin Research Lab

    POWERED by

    Open Innovation Portal: www.dublinked.ie

  • IBM - Dublin Research Lab

    Dublinked - outcomes

    • Publish and put into context (100s of datasets, 1000s of files)

    • Create innovation ecosystem

    Waste Collection

    Property management

    Environment

    Demographics

    Business & Retail

    Commercial valuations and rates

    Tourism

    Transport & Access

    Crime

    Heritage

    Mapping

    Housing

    Water

    Fault Reporting

    Events

    Health

    Planning

    Pool resources

    Share results

  • IBM - Dublin Research Lab

    More on city data will be covered in

    Part II

  • IBM - Dublin Research Lab

    Making City Search Smarter

    “We cannot afford merely to sit down and deplore the evils of city life as inevitable, when cities are constantly growing,

    both absolutely and relatively. We must set ourselves vigorously about the task of improving them; and this task

    is now well begun.”

    Theodore Roosevelt

  • IBM - Dublin Research Lab

    City Search: Not a revolution, but evolution

    [Figure: search evolution shown as “genetic drift” from Web Search to Local Search to City Search. The “genes” are Information Need, Search Relevance and Information Source.]

    Web Search

    – Information Need: Ordinary Web users want to locate information (e.g. documents) on the Web

    – Search Relevance: Models relevance using IR models based on content, Web popularity, clickthrough data etc.

    – Information Source: Utilizes Web-based information such as document content, Web graph, search engine logs, etc.

    Local Search

    – Information Need: Mainly targets queries to locate businesses within a geographic area

    – Search Relevance: Extends Web-based relevance with spatiotemporal relevance

    – Information Source: Enhanced with location and time information collected from mobile sensors, IP address etc.

    City Search

    – Information Need: Concerns all types of (complex) queries encountered in everyday city life, e.g. city events

    – Search Relevance: Relevance is highly dimensional and context-dependent, and leverages more city-specific information than Web information

    – Information Source: City data provides a unique source of information to understand city context

  • IBM - Dublin Research Lab

    What do people search for?

    A recent user study of public displays in urban areas, conducted to understand the information needs of citizens [Kukka, PUC, 2013]

    Conducted in Oulu, Finland

    Diversity of the information

    Need for city awareness

    Differences to Web queries [Spink et al, 2001]

  • IBM - Dublin Research Lab

    What do people search for?

    Top 8 categories according to user scores [Kukka, PUC, 2013]

  • IBM - Dublin Research Lab

    Local search is on the rise

    Percentage of local search traffic in major search engines

    According to [Zhang et al, GIR’06], 83.77% of Yahoo! geo-queries contain a city name

    Source: Chitika, 2012

  • IBM - Dublin Research Lab

    Local Queries: Geotagging

    People tend to use geotags

    Source: Search Engine Land, 2011

  • IBM - Dublin Research Lab

    Search Topics

    Not every city is the same

    Source: Chitika, 2013

  • IBM - Dublin Research Lab

    Time matters

    Source: Chitika, 2013, Lane et al, Ubicomp’10

  • IBM - Dublin Research Lab

    Distance

    • Users’ “distance sensitivity” is relative to the type of business considered

    Source: Berberich, SIGIR’11
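One way to picture category-dependent distance sensitivity is a distance decay whose rate changes with the business type. The sketch below is only an illustration under stated assumptions: the exponential form, the category names and the scale values in `DISTANCE_SCALE_KM` are hypothetical and are not taken from the cited study.

```python
import math

# Hypothetical per-category distance scales in km (assumed values).
# A larger scale means users tolerate longer trips for that kind of business.
DISTANCE_SCALE_KM = {"coffee": 0.5, "restaurant": 2.0, "furniture": 10.0}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def distance_score(category, distance_km):
    """Exponential decay in (0, 1]; the decay rate depends on the category."""
    scale = DISTANCE_SCALE_KM.get(category, 2.0)  # assumed default scale
    return math.exp(-distance_km / scale)


user = (53.3498, -6.2603)   # approximate Dublin city centre
place = (53.2890, -6.2420)  # an illustrative point a few km to the south
d = haversine_km(*user, *place)
coffee = distance_score("coffee", d)
furniture = distance_score("furniture", d)
print(f"distance={d:.1f} km coffee={coffee:.4f} furniture={furniture:.4f}")
```

At the same distance the furniture store keeps a much higher score than the coffee shop, which is the behaviour the slide describes. A real ranker would presumably fit such scales from logs rather than hard-code them.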

  • IBM - Dublin Research Lab

    Relevance

    • Still not awake? Coffee? ☺

    • Costa? Starbucks?

    – Does the distance really matter?

  • IBM - Dublin Research Lab

    Relevance

    • Local vs. Web popularity

    • What else?

    • More information → more relevance to the user!!

    Source: Foursquare, July 2013

  • IBM - Dublin Research Lab

    Relevance

    • More Semantics

    – “skate ramp park Dundrum Town Center”

    • The information is not contained in one data source

    • Need information about parks, POI, location etc.

    [Figure: graph combining dublinked and dbpedia data; labels: park, hasSkate, yes, dtc, dundrum, located]
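A minimal sketch of why the skate park query needs more than one source: the point of interest is resolved in one dataset, and the parks are filtered in another. Every identifier and fact below is a hypothetical stand-in (including the direction of the `located` edges); neither dataset's real schema is shown in the slides.

```python
# Two tiny in-memory triple sets standing in for separate sources.
# All subjects, predicates and values are hypothetical illustrations.
DUBLINKED = [
    ("park_1", "type", "park"),
    ("park_1", "hasSkate", "yes"),
    ("park_1", "located", "dundrum"),
    ("park_2", "type", "park"),
    ("park_2", "hasSkate", "no"),
    ("park_2", "located", "dundrum"),
]
DBPEDIA = [
    ("dtc", "label", "Dundrum Town Center"),
    ("dtc", "located", "dundrum"),
]


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the triple set."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(triples, predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the triple set."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def skate_parks_near(poi_label):
    # Step 1: resolve the point of interest by its label in the first source.
    pois = subjects(DBPEDIA, "label", poi_label)
    # Step 2: collect the areas those points of interest are located in.
    areas = set()
    for poi in pois:
        areas |= objects(DBPEDIA, poi, "located")
    # Step 3: in the second source, keep parks with a skate ramp in those areas.
    parks = subjects(DUBLINKED, "type", "park") & subjects(DUBLINKED, "hasSkate", "yes")
    return sorted(p for p in parks if objects(DUBLINKED, p, "located") & areas)


result = skate_parks_near("Dundrum Town Center")
print(result)
```

Neither triple set can answer the query alone: the label lives in one source and the skate ramp attribute in the other, so the shared area identifier is what links them.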

  • IBM - Dublin Research Lab

    Relevance

    • Need to buy new “furniture”?

  • IBM - Dublin Research Lab

    Relevance

    • Dublin Trips Data:

    – Journey times throughout the city

    – Real-time data, updated every minute

    – Historical data is available for every day since 9/7/2012

    – Mined from the SCATS-based (Sydney Coordinated Adaptive Traffic System) intelligent transportation system for 500+ sites around Dublin

    • Accessible from:

    – http://dublinked.ie/datastore/datasets/dataset-215.php

    • Visualization

    – http://www.dublinked.ie/traffic/

  • IBM - Dublin Research Lab

    Relevance

    • More transportation data

    – Public Transport Route Networks

    • http://dublinked.ie/datastore/datasets/dataset-258.php

    – Dublin Bus GPS Data

    • http://dublinked.com/datastore/datasets/dataset-304.php

    – Dublin Bus GTFS data

    • http://dublinked.ie/datastore/datasets/dataset-254.php

    – Accessible Parking Places

    • http://dublinked.com/datastore/datasets/dataset-049.php

    – Roads and Streets in Dublin City

    • http://dublinked.com/datastore/datasets/dataset-123.php

  • IBM - Dublin Research Lab

    Relevance

    Buying your dream house

    • Finding the houses?

    • Is the price reasonable?

    • How is the neighborhood?

    Perfect match!!

  • IBM - Dublin Research Lab

    Relevance

    • Property Register Index : ~52000 property sales

    Available at http://kdeg.cs.tcd.ie/propertyPriceMap/

  • IBM - Dublin Research Lab

    Relevance

    • More city data:

    – Amenities & Recreation

    • http://dublinked.ie/datastore/by-category/amenities-recreation.php

    – Schools

    • http://dublinked.com/datastore/datasets/dataset-099.php

    – Key developing areas

    • http://dublinked.ie/datastore/datasets/dataset-134.php

    – Air pollution monitoring data

    • http://dublinked.ie/datastore/datasets/dataset-185.php

  • IBM - Dublin Research Lab

    Relevance

    • The “perfect” Irish weather ☺

    • Choosing the best city activity

    depends on the weather

    – Indoor vs. Outdoor
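The indoor vs. outdoor point can be sketched as a context-dependent re-ranking step. The activities, base scores and the rain penalty below are assumed values chosen only for illustration; nothing here comes from the tutorial's data.

```python
# Hypothetical activities tagged as indoor or outdoor (illustrative data only).
ACTIVITIES = [
    {"name": "museum visit", "setting": "indoor", "base_score": 0.6},
    {"name": "park walk", "setting": "outdoor", "base_score": 0.8},
    {"name": "cinema", "setting": "indoor", "base_score": 0.5},
]


def rank_activities(activities, is_raining, penalty=0.5):
    """Rank activity names by score, down-weighting outdoor ones when it rains."""
    def score(activity):
        if is_raining and activity["setting"] == "outdoor":
            return activity["base_score"] * penalty  # assumed multiplicative penalty
        return activity["base_score"]

    return [a["name"] for a in sorted(activities, key=score, reverse=True)]


dry = rank_activities(ACTIVITIES, is_raining=False)
wet = rank_activities(ACTIVITIES, is_raining=True)
print(dry)
print(wet)
```

On a dry day the park walk ranks first; with rain it drops below both indoor options, so the same query yields a different ordering depending on the city context.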

  • IBM - Dublin Research Lab

    Information Sources

    Web Context

    • Web documents

    • User Queries

    • Clickthrough Data

    • Hyperlinks

    City Context

    • Structured City Data

    • City-specific Web Documents

    • Sensor Data

    • Social Media, Check-ins

    • Road Network

    • Transportation Data

    • City Events

    • Regional Information

    • Municipality

    • Crime and safety

    and much more…

  • IBM - Dublin Research Lab

    Wrap Up

    Cities and City Data

    • The majority of the world’s population lives in cities

    • Cities are dynamic entities combining people, systems, infrastructure, businesses

    • More and more city data is becoming available, enabling more insight

    • City data is heterogeneous, multi-domain, noisy and big

    What is next in this tutorial?

    • Managing City Data

    – Characteristics and types of city data

    – Semantic processing and lifting

    – Demos

    • Searching City Data

    – Challenges for IR community

    – A review of related approaches

    – Future directions

    City Search

    • City search as an evolution of search into the city context

    • Characterized by specific information needs of people in everyday city life

    • Drift in search relevance from the Web context to the city context

    • Drift in information sources used to drive the search process

  • IBM - Dublin Research Lab

    References

    • Marty Himmelstein. Local search: The internet is the yellow pages. IEEE Computer, 2005.

    • Klaus Berberich, Arnd C. Konig, Dimitrios Lymberopoulos, Peixiang Zhao. Improving local search ranking through external logs. SIGIR, 2011.

    • Hannu Kukka, Vassilis Kostakos, Timo Ojala, Johanna Ylipulli, Tiina Suopajarvi, Marko Jurmu, Simo Hosio. This is not classified: everyday information seeking and encountering in smart urban spaces. Personal and Ubiquitous Computing, 2013.

    • A. Spink, D. Wolfram, M. B. Jansen, T. Saracevic. Searching the web: The public and their queries. Journal of the American Society for Information Science and Technology, 52(3), 226-234, 2001.

    • Wei Vivian Zhang, Benjamin Rey, Eugene Stipp, Rosie Jones. Geomodification in Query Rewriting. GIR, 2006.

