
#AAG2015 presentation on OSM attribute inconsistency and semantic heterogeneity

Date post: 15-Jul-2015
Upload: gicycle
An intrinsic approach for the detection and correction of attributive inconsistencies and semantic heterogeneity in OSM data

Martin Loidl, Stefan Keller

AAG Annual Meeting, Workshop OpenStreetMap Studies, Chicago, April 24th 2015
Transcript

Page 2: Problem Statement

OSM bottom-up community approach

Rudimentary data model and attribute structure (tagging scheme k = v)

Attributes: recommendations ≠ conventions ≠ formalized standard

No restriction of tag usage and definition

Image source: http://www.openstreetmap.es

Page 3: Attributive Inconsistencies

Within one way:

highway = motorway
name = Kennedy Expressway
bicycle = yes

Within a succession of ways (e.g. street):

highway = motorway
name = Kennedy Expressway
ref = I 90

highway = motorway
name = Fisher Freeway
ref = I 90

highway = motorway
name = Kennedy Expressway
ref = I 90
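Both cases on this slide can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation. It assumes that each way is a plain dictionary of tags and that the ways of a street are already ordered along the route. The function names and the rule that motorways are closed to cyclists are illustrative assumptions.

```python
from typing import Dict, List

Tags = Dict[str, str]


def has_access_contradiction(tags: Tags) -> bool:
    """Within one way: flag a motorway that is tagged as open to bicycles.

    Assumption: motorways are closed to cyclists in the region of interest.
    """
    return tags.get("highway") == "motorway" and tags.get("bicycle") in {"yes", "designated"}


def find_value_breaks(ways: List[Tags], key: str = "name") -> List[int]:
    """Within a succession of ways: return the indices of ways whose value
    for `key` differs from both neighbours while the neighbours agree."""
    breaks = []
    for i in range(1, len(ways) - 1):
        before = ways[i - 1].get(key)
        current = ways[i].get(key)
        after = ways[i + 1].get(key)
        if before is not None and before == after and current != before:
            breaks.append(i)
    return breaks


# Tag sets copied from the slide.
single_way = {"highway": "motorway", "name": "Kennedy Expressway", "bicycle": "yes"}
street = [
    {"highway": "motorway", "name": "Kennedy Expressway", "ref": "I 90"},
    {"highway": "motorway", "name": "Fisher Freeway", "ref": "I 90"},
    {"highway": "motorway", "name": "Kennedy Expressway", "ref": "I 90"},
]

print(has_access_contradiction(single_way))  # True
print(find_value_breaks(street))             # [1]
```

A flagged index is only a candidate: a street name can legitimately change along a route, so each hit would still need to be reviewed.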

Page 4: Semantic Heterogeneity

Different (correct) description for one and the same entity

Specific to crowd-sourced data (≠ authoritative data follow strict specifications)

highway = cycleway
foot = designated
width = 3

highway = path
bicycle = designated
foot = yes

highway = footway
bicycle = designated
surface = asphalt

Page 5: Relevance

Considering attributive inconsistencies and semantic heterogeneity is relevant for:

Visualization (data rendering)

Descriptive statistics (classification)

Spatial analysis (e.g. routing)

Improve results through:

Harmonization (remove semantic heterogeneities)

Correction through estimation (gaps, inconsistencies)

Page 6: Data Quality

Spatial data quality

Standards (e.g. ISO 19157, a harmonization of multiple preceding standards) and an extensive body of literature are of limited use for OSM data

Quality assessment of OSM data

Primarily focusing on positional accuracy and geometrical completeness

Reference data set and/or descriptive statistics

Comparatively little work on attribute quality

References shown on the slide: Haklay 2010; Hochmair et al. 2015; Barron et al. 2014

Page 7: Quality Assessment

Why an intrinsic approach?

An extrinsic approach requires a reference data set, which ideally has:

Same geographical coverage

Same data model and attribute structure [Koukoletsos et al. (2012): multi-stage process to deal with it to a certain extent]

Quality of the reference data set (authoritative data doesn't necessarily imply better data!)

Data often created for very different purposes

Figure: Elsbethen (Austria), authoritative data and OSM data

Page 8: Intrinsic Approach

Exclusively based on respective data set (data-centered approach)

Makes use of:

Redundancy

Inherent logic, functionally related attributes

Translation into query statements

highway = *
surface = *
tracktype = *

Page 9: Case Study Area

4,600 km² in the Austrian-Bavarian border region

~22,600 km total network length

Rural and urban areas

Data preparation

Extraction from the OSM database (April 1st 2015)

Conversion to a topologically correct graph (edge-node) in a GeoDB

Page 10: Major Road Network

Major road = motorway, primary, secondary (incl. links)

Consistent for road category (highway = *)

Makes features mappable = primary intent/purpose of OSM

Attributes incomplete (n = 11,951 segments)

name = *: 64.6%

surface = *: 22.93% [can be estimated: asphalt]

maxspeed = *: 72.19%

lanes = *: 57.86%

Rather an issue of completeness than of inconsistency and heterogeneity
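Per-attribute figures like the ones on this slide can be derived with a simple count. The sketch below computes, for each key, the share of segments on which the key is present. The transcript does not state whether the slide reports this share or its complement, and the function name and sample segments are made up for illustration.

```python
from typing import Dict, Iterable, List


def tag_completeness(segments: List[Dict[str, str]], keys: Iterable[str]) -> Dict[str, float]:
    """Return, per key, the percentage of segments with a non-empty value."""
    total = len(segments)
    shares = {}
    for key in keys:
        present = sum(1 for tags in segments if (tags.get(key) or "").strip())
        shares[key] = round(100.0 * present / total, 2) if total else 0.0
    return shares


# Hypothetical sample, not taken from the case study data.
sample = [
    {"highway": "primary", "name": "B 1", "maxspeed": "100"},
    {"highway": "primary", "name": "B 1"},
    {"highway": "secondary", "maxspeed": "70", "surface": "asphalt"},
    {"highway": "motorway", "name": "A 1", "maxspeed": "130", "lanes": "2"},
]

print(tag_completeness(sample, ["name", "surface", "maxspeed", "lanes"]))
# {'name': 75.0, 'surface': 25.0, 'maxspeed': 75.0, 'lanes': 25.0}
```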

Page 11: Local Road Network

Majority of ways in OSM

Differences in terms of attribute quality (existence, consistency etc.)

Relevant e.g. for active modes of transport (cycling, hiking etc.)

In many cases more extensive (spatial coverage, attribute details) than authoritative data

Page 12: Attributive Inconsistencies

Define a set of logical/legal contradictions

Connect to corresponding tags

Tag specification according to the Wiki

Query the data set for contradictions

Result shown on the slide: approx. 1 in 1,000

("tracktype" = 'grade3' or "tracktype" = 'grade4' or "tracktype" = 'grade5')
and "surface" = 'asphalt'
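The filter expression above can be tried out on a small table. The following sketch runs it with Python's built-in sqlite3 module. The table name `ways`, its column layout and the sample rows are assumptions made for illustration; the slides only state that the network was stored in a GeoDB.

```python
import sqlite3
from typing import Iterable, List, Optional, Tuple

Row = Tuple[int, Optional[str], Optional[str], Optional[str]]

# WHERE clause taken from the slide; the surrounding SELECT is an assumption.
CONTRADICTION_QUERY = """
SELECT id FROM ways
WHERE ("tracktype" = 'grade3' OR "tracktype" = 'grade4' OR "tracktype" = 'grade5')
  AND "surface" = 'asphalt'
ORDER BY id
"""


def find_contradictions(rows: Iterable[Row]) -> List[int]:
    """Load (id, highway, tracktype, surface) rows and return matching ids."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(
            "CREATE TABLE ways (id INTEGER PRIMARY KEY, highway TEXT, tracktype TEXT, surface TEXT)"
        )
        connection.executemany("INSERT INTO ways VALUES (?, ?, ?, ?)", rows)
        return [row[0] for row in connection.execute(CONTRADICTION_QUERY)]
    finally:
        connection.close()


# Hypothetical sample rows.
sample_rows = [
    (1, "track", "grade1", "asphalt"),  # consistent: solid track, paved
    (2, "track", "grade4", "asphalt"),  # contradiction: soft track, paved
    (3, "track", "grade5", "gravel"),   # not matched by this rule
    (4, "track", None, "asphalt"),      # tracktype missing, not matched
]

print(find_contradictions(sample_rows))  # [2]
```

Rows with a missing tracktype are not matched, so this rule detects contradictions but says nothing about gaps.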

Page 13: Spatial Particularities

Distribution of inconsistencies:

Regional diversity (national laws?)

Spatial clusters (local mapper/communities?)

Example:

highway = residential
maxspeed = 80
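One way to look for regional diversity is to compute the share of flagged segments per group. The sketch below is an illustration only: the 50 km/h threshold, the `region` field and the sample data are assumptions, and legal limits for residential roads differ between countries.

```python
from collections import Counter
from typing import Dict, List


def is_implausible_residential_speed(tags: Dict[str, str], limit_kmh: float = 50.0) -> bool:
    """Flag residential roads whose numeric maxspeed exceeds the assumed limit."""
    if tags.get("highway") != "residential":
        return False
    try:
        speed = float(tags.get("maxspeed") or "")
    except ValueError:
        # Missing or non-numeric values such as "walk" are not judged here.
        return False
    return speed > limit_kmh


def flagged_share_by_group(segments: List[Dict[str, str]], group_key: str = "region") -> Dict[str, float]:
    """Return the fraction of flagged segments for every value of `group_key`."""
    totals, flagged = Counter(), Counter()
    for tags in segments:
        group = tags.get(group_key, "unknown")
        totals[group] += 1
        if is_implausible_residential_speed(tags):
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Hypothetical sample; the first entry uses the tag combination from the slide.
sample = [
    {"highway": "residential", "maxspeed": "80", "region": "A"},
    {"highway": "residential", "maxspeed": "30", "region": "A"},
    {"highway": "residential", "maxspeed": "50", "region": "B"},
    {"highway": "primary", "maxspeed": "100", "region": "B"},
]

print(flagged_share_by_group(sample))  # {'A': 0.5, 'B': 0.0}
```

Grouping by last editor instead of region would be a comparable way to look for mapper-specific clusters.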

Page 14: Correction of Inconsistencies

Correction without ground truthing = estimation

Quality of estimation depends on number of functionally related attributes
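Estimation from functionally related attributes can be sketched as a majority vote. The code below fills a missing target attribute with the most frequent value among segments that share the same values for a set of predictor attributes. The choice of predictors, the `min_support` threshold and the sample data are assumptions for illustration, not the method used in the study.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def estimate_missing(
    segments: List[Dict[str, str]],
    target: str = "surface",
    predictors: Sequence[str] = ("highway", "tracktype"),
    min_support: int = 2,
) -> Dict[int, str]:
    """Return {segment index: estimated value} for segments lacking `target`."""
    votes = defaultdict(Counter)
    for tags in segments:
        if tags.get(target):
            group = tuple(tags.get(p) for p in predictors)
            votes[group][tags[target]] += 1

    estimates = {}
    for index, tags in enumerate(segments):
        if tags.get(target):
            continue  # value present, nothing to estimate
        counter = votes.get(tuple(tags.get(p) for p in predictors))
        if counter:
            value, count = counter.most_common(1)[0]
            if count >= min_support:
                estimates[index] = value
    return estimates


# Hypothetical sample.
sample = [
    {"highway": "primary", "surface": "asphalt"},
    {"highway": "primary", "surface": "asphalt"},
    {"highway": "primary", "surface": "asphalt"},
    {"highway": "primary"},                        # surface missing
    {"highway": "track", "tracktype": "grade5"},   # no comparable segments
]

print(estimate_missing(sample))  # {3: 'asphalt'}
```

The more predictor attributes are available and filled in, the narrower the comparison groups become, which matches the slide's point that estimation quality depends on the number of functionally related attributes.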

Page 15: Heterogeneity

How to map a mixed foot-/cycleway in OSM?

Image source: http://www.stadt-salzburg.at

Page 16: Heterogeneity

How to map a mixed foot-/cycleway in OSM?

Co-existence vs. “tag war”

Credibility and reputation (Flanagin & Metzger 2008)

("highway" = 'footway' and ("bicycle" = 'designated' or "bicycle" = 'yes' or "bicycle" = 'official'))
OR
("highway" = 'cycleway' and ("foot" = 'designated' or "foot" = 'yes'))
OR
("highway" = 'path' and ("foot" = 'designated' or "foot" = 'official') and ("bicycle" = 'designated' or "bicycle" = 'official'))
OR
("highway" = 'track' and ("foot" = 'designated' or "foot" = 'official') and ("bicycle" = 'designated' or "bicycle" = 'official'))

Segment counts shown on the slide: 669, 1,202, 2,655 and 73
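The four alternatives of the query can also be written as a small classifier that reports which tagging variant a way uses. This is a direct translation of the expression on the slide into Python. The function name and the dictionary representation of a way are assumptions.

```python
from typing import Dict, Optional


def mixed_way_variant(tags: Dict[str, str]) -> Optional[str]:
    """Return the highway value under which a way qualifies as a mixed
    foot-/cycleway according to the slide's query, or None if it does not."""
    highway = tags.get("highway")
    foot = tags.get("foot")
    bicycle = tags.get("bicycle")

    if highway == "footway" and bicycle in {"designated", "yes", "official"}:
        return "footway"
    if highway == "cycleway" and foot in {"designated", "yes"}:
        return "cycleway"
    if (
        highway in {"path", "track"}
        and foot in {"designated", "official"}
        and bicycle in {"designated", "official"}
    ):
        return highway
    return None


# Tag sets that appear elsewhere in the presentation.
examples = [
    {"highway": "cycleway", "foot": "designated", "bicycle": "designated", "segregated": "no"},
    {"highway": "path", "foot": "designated", "bicycle": "designated"},
    {"highway": "footway", "bicycle": "designated", "surface": "asphalt"},
    {"highway": "track", "foot": "yes", "bicycle": "yes", "tracktype": "grade2"},
]

print([mixed_way_variant(tags) for tags in examples])
# ['cycleway', 'path', 'footway', None]
```

The last example, a track tagged with foot = yes and bicycle = yes, is not matched because the track alternative requires designated or official access. Whether such ways should count depends on the purpose of the analysis.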

Page 17: Heterogeneity

Different (correct) views on same entity

highway = cycleway
surface = asphalt
ref = BGL 3
foot = designated
bicycle = designated
segregated = no
Last editor: j_cook

highway = path
surface = asphalt
foot = designated
bicycle = designated
Last editor: pyram

Page 18

highway = track
name = Treppelweg
surface = gravel
tracktype = grade2
foot = yes
bicycle = yes
width = 3

highway = path
name = Treppelweg
surface = gravel
tracktype = grade2
foot = designated
bicycle = designated
width = 3

Image source: http://www.bing.com/maps

Page 19: Harmonization of Heterogeneity

Define derived attributes that fit best for actual purpose

Loidl & Zagel (2014)

Page 20: Harmonization of Heterogeneity

OSMAXX

Extracts OSM data

Data cleaning (capital letters etc.) and harmonization (generalization)

Conversion to GIS formats

For visualization and geospatial analysis
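The cleaning and generalization steps named on this slide can be illustrated with a short sketch. It is not OSMAXX code: the set of keys treated as case-insensitive and the surface grouping are hypothetical choices made for this example.

```python
from typing import Dict, Optional

# Assumption: values of these keys are enumerations, so case carries no meaning.
CASE_INSENSITIVE_KEYS = {"highway", "surface", "tracktype", "foot", "bicycle"}

# Assumption: a coarse grouping that might suit a particular analysis.
SURFACE_GROUPS = {
    "asphalt": "paved",
    "concrete": "paved",
    "gravel": "unpaved",
    "ground": "unpaved",
}


def clean_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Trim whitespace, lowercase keys, and lowercase enumerated values.

    Free-text values such as names keep their capitalization.
    """
    cleaned = {}
    for key, value in tags.items():
        new_key = key.strip().lower()
        new_value = value.strip()
        if new_key in CASE_INSENSITIVE_KEYS:
            new_value = new_value.lower()
        if new_key and new_value:
            cleaned[new_key] = new_value
    return cleaned


def generalize_surface(tags: Dict[str, str]) -> Optional[str]:
    """Map a cleaned surface value to a coarser class, or None if unknown."""
    return SURFACE_GROUPS.get(tags.get("surface", ""))


raw = {"Highway": "Track ", "name": "Treppelweg", "surface": "Gravel"}
cleaned = clean_tags(raw)
print(cleaned)                      # {'highway': 'track', 'name': 'Treppelweg', 'surface': 'gravel'}
print(generalize_surface(cleaned))  # unpaved
```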

Page 21: Wrap-Up

Inconsistency = quality issue

Can be detected with an intrinsic approach

Heterogeneity = depends on purpose

Definition of derived attributes

Implement assessment routines during editing or in post-processing?

Tag recommender system during editing (Vandecasteele & Devillers 2014)

Probabilistic approach and/or functionally related attributes

Prevent contradictions

Data tuning in post-processing allows specification for the actual purpose

Combination of prevent, detect, repair (Herzog et al. 2007)

Data model issue social complexity of OSM (Spielmann 2014)

@gicycle_

gicycle.wordpress.com

