Inside Overpass API - State of the Map 2013

Date post: 25-May-2015
Upload: osmfstateofthemap
Description:
Presented by Roland Olbricht at State of the Map 2013. For the video of this presentation, see http://lanyrd.com/2013/sotm/scpkhk/. The full schedule is available at http://wiki.openstreetmap.org/wiki/State_Of_The_Map_2013.

Overpass API has become the most used database for extracting OSM data over the web, yet it has remained highly available. This high availability is ensured by its load management. In the first part of the talk we discuss statistics on the historic and current usage of Overpass API. In the second part we present the mechanisms used for this load management and explain how to assess the footprint of a query. Finally, we advise on how to get responses to your queries as fast as possible.
Transcript
Page 1: Inside Overpass API - State of the Map 2013

Inside Overpass API

Roland Olbricht

at SOTM 2013 in Birmingham

Page 2: Inside Overpass API - State of the Map 2013

Overview

1. The server as a whole

2. Processing of requests

3. The query statement pipeline

Page 3: Inside Overpass API - State of the Map 2013

1. The server as a whole

Page 4: Inside Overpass API - State of the Map 2013

4000 to 6000 unique IPs per day

150'000 to 250'000 requests per day

10 GB to 30 GB result data per day

Page 5: Inside Overpass API - State of the Map 2013

Statistics of 2013-08-30

Download size per IP    # Unique IPs
> 1 GB                  4
100 MB – 1 GB           20
10 MB – 100 MB          150
1 MB – 10 MB            384
100 KB – 1 MB           595
10 KB – 100 KB          1104
1 KB – 10 KB            1189
< 1 KB                  764

Download size per request (values in the order they appear on the slide): 884'037, 8'513'484, 61'999, 20'245, 4'740, 1'696, 921, 339

Page 6: Inside Overpass API - State of the Map 2013

Share resources across 10^7!

=> [timeout:...]: The server keeps track of "free time units". The server accepts a client request if its timeout is below half of the free server time units.

Client request            Server state
                          240000 free time units
With timeout 180?         accepted: 239820 free time units
With timeout 86400?       accepted: 153420 free time units
With timeout 86400?       rejected because 153420/2 < 86400: 153420 free time units
With timeout 180?         accepted: 153240 free time units
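
The admission rule on this slide can be modelled in a few lines. The following is a minimal sketch, not the server's actual implementation: the class name `TimeBudget` and its methods are hypothetical. It assumes only what the slide states, namely that a request is accepted when its timeout is below half of the free time units and that an accepted request reserves its timeout. Returning the units in `release` is an additional assumption.

```python
class TimeBudget:
    """Toy model of the server's pool of free time units (hypothetical names)."""

    def __init__(self, free_units):
        self.free_units = free_units

    def try_admit(self, timeout):
        """Accept the request if its timeout is below half of the free units."""
        if timeout < self.free_units / 2:
            self.free_units -= timeout  # reserve the declared runtime
            return True
        return False

    def release(self, timeout):
        """Assumption: reserved units return to the pool when a request ends."""
        self.free_units += timeout


# Replay the four client requests from the slide.
budget = TimeBudget(240000)
decisions = [budget.try_admit(t) for t in (180, 86400, 86400, 180)]
print(decisions, budget.free_units)
```

The second request with timeout 86400 is the only one refused, which matches the slide: long timeouts are the first to be turned away when the pool shrinks, while short ones still pass.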

Page 7: Inside Overpass API - State of the Map 2013

Share resources across 10^7!

Short allowed runtime: high priority

Long allowed runtime: low priority

Since June 2012:
all requests with [timeout:...] < 180 accepted
requests with longer timeout occasionally rejected

Page 8: Inside Overpass API - State of the Map 2013

2. Processing of requests

Page 9: Inside Overpass API - State of the Map 2013

The bottleneck ...

... is disk I/O.

almost completely idle

peaks often near 100%

Page 10: Inside Overpass API - State of the Map 2013

"out" vs "out skel" vs "out meta"

Diagram columns: Request, Disk time, Memory

Request:

node
  [name="Aston Business School"];
out;

Memory after the query statement:

(node 1473072867, lat = 52.4867839, lon = -1.8884618)

Result of "out;":

(node 1473072867, lat = 52.4867839, lon = -1.8884618, amenity=bicycle_parking, bcc_ref=433, bicycle_parking=stands, capacity=10, covered=yes, name=Aston Business School)

Page 11: Inside Overpass API - State of the Map 2013

"out" vs "out skel" vs "out meta"

Diagram columns: Request, Disk time, Memory

Request:

node
  [name="Aston Business School"];
out skel;

Memory after the query statement:

(node 1473072867, lat = 52.4867839, lon = -1.8884618)

Result of "out skel;":

(node 1473072867, lat = 52.4867839, lon = -1.8884618)

Page 12: Inside Overpass API - State of the Map 2013

"out" vs "out skel" vs "out meta"

Diagram columns: Request, Disk time, Memory

Request:

node
  [name="Aston Business School"];
out meta;

Memory after the query statement:

(node 1473072867, lat = 52.4867839, lon = -1.8884618)

Result of "out meta;":

(node 1473072867, version = 2, timestamp = ..., …, lat = 52.4867839, lon = -1.8884618, amenity=bicycle_parking, bcc_ref=433, bicycle_parking=stands, capacity=10, covered=yes, name=Aston Business School)

Page 13: Inside Overpass API - State of the Map 2013

"out" vs "out skel" vs "out meta"

Diagram columns: Request, Disk time, Memory

Request:

node
  [name="Aston Business School"];
out meta;

Memory after the query statement:

(node 1473072867, lat = 52.4867839, lon = -1.8884618)

Every statement takes disk time.

Internally, we only store skeletons.
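
To make the difference between the three output modes concrete, here is a toy model in Python. It is a sketch under stated assumptions, not the server's code: the dictionaries `SKELETONS`, `TAGS` and `META` and the function `render` are hypothetical stand-ins, and the counter only tallies lookups in the additional stores, so it does not measure real disk time. The node data is taken from the slides; the timestamp is elided there and is therefore left out.

```python
# Hypothetical in-memory stand-ins for separately stored data.
SKELETONS = {1473072867: {"lat": 52.4867839, "lon": -1.8884618}}
TAGS = {
    1473072867: {
        "amenity": "bicycle_parking",
        "bcc_ref": "433",
        "bicycle_parking": "stands",
        "capacity": "10",
        "covered": "yes",
        "name": "Aston Business School",
    }
}
META = {1473072867: {"version": 2}}  # timestamp is elided on the slide

MODES = ("out skel", "out", "out meta")


def render(node_id, mode):
    """Return (element, extra_lookups) for one node in the given output mode."""
    if mode not in MODES:
        raise ValueError(f"unknown output mode: {mode!r}")
    # The skeleton is what the query statement left in memory.
    element = {"id": node_id, **SKELETONS[node_id]}
    extra_lookups = 0
    if mode != "out skel":
        element["tags"] = dict(TAGS[node_id])  # tags are fetched on output
        extra_lookups += 1
    if mode == "out meta":
        element.update(META[node_id])  # version etc. are fetched on output
        extra_lookups += 1
    return element, extra_lookups


for mode in MODES:
    element, lookups = render(1473072867, mode)
    print(mode, sorted(element), lookups)
```

In this model "out skel" needs nothing beyond the skeleton, which is why it is a natural choice when only ids and coordinates are needed.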

Page 14: Inside Overpass API - State of the Map 2013

3. The query statement pipeline

Page 15: Inside Overpass API - State of the Map 2013

The query statement is a pipeline

Ids:
Planning decisions
Collect ids of potential results
Copy from memory if possible

Raw data:
derive geo index from query
lookup geo index by ids
fetch all skeletons

Filtering:
cheap filtering
filter by key conditionals
expensive filtering

More conditions are better than fewer.
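
The filtering part of the pipeline can be illustrated with a short Python sketch. It only demonstrates the general idea of running cheap checks before expensive ones so that the expensive ones see fewer candidates. All names here (`run_query`, `in_bbox`, `name_starts_with_aston`, `NODES`) are hypothetical, and treating a bounding-box test as cheap and a regular expression as expensive is an assumption made for the example, not a statement about the server.

```python
import re

# Hypothetical candidate nodes; only the first is taken from the slides.
NODES = [
    {"id": 1473072867, "lat": 52.487, "lon": -1.889, "name": "Aston Business School"},
    {"id": 2, "lat": 48.1, "lon": 11.5, "name": "Aston Hall"},
    {"id": 3, "lat": 52.485, "lon": -1.885, "name": "Car park"},
]


def in_bbox(node):
    """Cheap check (assumption): a few float comparisons."""
    return 52.48 <= node["lat"] <= 52.49 and -1.89 <= node["lon"] <= -1.88


def name_starts_with_aston(node):
    """Expensive check (assumption): a regular expression on a tag value."""
    return re.match(r"Aston\b", node["name"]) is not None


def run_query(candidates, cheap_filters, expensive_filters):
    """Filter candidates, counting how often an expensive check is evaluated."""
    results = []
    expensive_calls = 0
    for element in candidates:
        if not all(check(element) for check in cheap_filters):
            continue  # rejected before any expensive work is done
        passed = True
        for check in expensive_filters:
            expensive_calls += 1
            if not check(element):
                passed = False
                break
        if passed:
            results.append(element)
    return results, expensive_calls


with_bbox = run_query(NODES, [in_bbox], [name_starts_with_aston])
without_bbox = run_query(NODES, [], [name_starts_with_aston])
print(with_bbox[1], without_bbox[1])
```

With the extra bounding-box condition, the expensive check runs on fewer elements, which is the sense in which more conditions are better than fewer.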

Page 16: Inside Overpass API - State of the Map 2013

The query statement pipeline:

node[name="Aston Business School"];

Pipeline stages:
Planning decisions, collect ids of potential results, copy from memory if possible
derive geo index from query, lookup geo index by ids, fetch all skeletons
cheap filtering, filter by key conditionals, expensive filtering

Stages that take disk time:

Collect ids of potential results:
(node 1473072867)

lookup geo index by ids:
(Idx 0x42f00f00)

fetch all skeletons:
(node 1473072867, lat=52.487, lon=-1.889)

Page 17: Inside Overpass API - State of the Map 2013

The query statement pipeline:

node[amenity=bicycle_parking];

Pipeline stages:
Planning decisions, collect ids of potential results, copy from memory if possible
derive geo index from query, lookup geo index by ids, fetch all skeletons
cheap filtering, filter by key conditionals, expensive filtering

Stages that take disk time:

Collect ids of potential results:
(node 1000, …, node …, node …, node 1473072867, node …, node …) [~ 80'000 objects]

lookup geo index by ids: ~80'000 disk seeks
(Idx 0x1, 0x2, 0x3, …, ...)

fetch all skeletons: ~30'000 disk seeks
(node 1, lat=..., lon=..., …, (node 1473072867, lat=52.487, lon=-1.889), ...)

Page 18: Inside Overpass API - State of the Map 2013

The query statement pipeline:

node[amenity=bicycle_parking](52.48, -1.89, 52.49, -1.88);

Pipeline stages:
Planning decisions, collect ids of potential results, copy from memory if possible
derive geo index from query, lookup geo index by ids, fetch all skeletons
cheap filtering, filter by key conditionals, expensive filtering

Stages that take disk time:

Collect ids of potential results:
(node 1000, …, node …, node …, node 1473072867, node …, node …) [~ 80'000 objects]

derive geo index from query:
(Idx 0x42f00f00)

fetch all skeletons:
(node 1473072867, lat=52.487, lon=-1.889)

derive geo index from query

Page 19: Inside Overpass API - State of the Map 2013

The query statement pipeline:

node[name="Aston Business School"](51.0, -3.0, 60.0, 3.0);

Pipeline stages:
Planning decisions, collect ids of potential results, copy from memory if possible
derive geo index from query, lookup geo index by ids, fetch all skeletons
cheap filtering, filter by key conditionals, expensive filtering

Stages that take disk time:

Collect ids of potential results:
(node 1473072867)

derive geo index from query:
(Idx 0x42000000, …, Idx 0x42ffffff)

fetch all skeletons: ~3'000 disk seeks
(node 1473072867, lat=52.487, lon=-1.889)
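
The two bounding boxes used on the pipeline slides differ enormously in size, which a quick calculation makes visible. The helper `bbox_area_deg2` below is hypothetical and deliberately crude: it measures the box in square degrees and assumes the usual Overpass order (south, west, north, east). The number of geo index entries a box covers is not claimed to be strictly proportional to this area; the figure is only a rough indication of footprint.

```python
def bbox_area_deg2(south, west, north, east):
    """Area of a bounding box in square degrees (rough footprint indicator)."""
    if north < south or east < west:
        raise ValueError("expected (south, west, north, east)")
    return (north - south) * (east - west)


# Bounding box from the bicycle_parking example.
small = bbox_area_deg2(52.48, -1.89, 52.49, -1.88)
# Bounding box from the "Aston Business School" example.
large = bbox_area_deg2(51.0, -3.0, 60.0, 3.0)

print(f"small: {small:.6f} deg^2, large: {large:.1f} deg^2")
print(f"ratio: {large / small:,.0f}")
```

The large box spans 9 by 6 degrees, whereas the small one spans roughly 0.01 by 0.01 degrees, so the large box covers several hundred thousand times the area.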

Page 20: Inside Overpass API - State of the Map 2013

Summary

Be bold: the server takes care of large queries.

Select the right "out" mode for performance and for quick testing.

Use all available information, in particular small bounding boxes and specific search conditionals.
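
As an illustration of this advice, the following Python sketch assembles a query string that combines an explicit timeout, a tag condition, a small bounding box and a chosen output mode. The function `build_query` and its parameters are hypothetical helpers for this example. The sketch only builds the text of the query and does not send it anywhere; values containing double quotes or backslashes are rejected rather than escaped, to keep it simple.

```python
def build_query(key, value, bbox, timeout, out_mode="out skel"):
    """Build an Overpass QL node query as a string (hypothetical helper).

    bbox is (south, west, north, east); timeout is in seconds.
    """
    if out_mode not in ("out skel", "out", "out meta"):
        raise ValueError(f"unknown output mode: {out_mode!r}")
    for text in (key, value):
        if '"' in text or "\\" in text:
            raise ValueError("quotes and backslashes are not handled here")
    south, west, north, east = bbox
    return (
        f"[timeout:{timeout}];"
        f'node["{key}"="{value}"]({south},{west},{north},{east});'
        f"{out_mode};"
    )


query = build_query(
    "amenity",
    "bicycle_parking",
    bbox=(52.48, -1.89, 52.49, -1.88),
    timeout=60,  # below 180, the threshold named in the talk
)
print(query)
```

Switching `out_mode` to "out" or "out meta" is then a one-argument change once the quick test with "out skel" returns the expected elements.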

Page 21: Inside Overpass API - State of the Map 2013

Thank you for your attention

