Date post: | 16-Apr-2017 |
Category: |
Technology |
Upload: | sk53 |
Seeing the Light
Local Government Open Data
Jerry Clough – SK53
Maps Matter (www.sk53-osm.blogspot.com)
NaPTAN : Bus Stops
• Data imported 2009
• Never cross-checked systematically
• Duplicate stops (survey & NaPTAN)
• Name Changes
  – Pub => Other landmark
• Mainly used for adding street names
• Not updated
Nottingham Open Data
• Grit Bins
• Disabled Parking
• Motorcycle Parking
• Leisure Centres
• Libraries
• Local Nature Reserves
• Planning Applications
• Polling Stations
• School Crossing Patrols
• CCTV
• Places of Worship
• Streetlights
• Schools
• Public Rights of Way
• Tram routes
• Bus Stops
• Childcare
• Pedestrian Crossings
• Food Hygiene Scheme
• Licensed Premises
• Illuminated Road Signs
CCTV
Licensed Premises
• Not just Pubs & Restaurants
  – at least 2 Florists
• Licenses for
  – Alcohol (on and off site)
  – Dancing
  – Music (live & recorded)
  – Boxing & Wrestling
Licensed Premises : Data-driven Survey
Food Hygiene Ratings
• Addresses
• Partial geolocation
  – postcode
• Business Types
  – Pub/Bar/Nightclub
  – Supermarket
  – Café/Restaurant
  – Other Retail
• Covers at least 50-60% of retail outlets
• Usually current
  – Typical inspection interval 6-12 months
Streetlights : OSM Accuracy
Many streets were traced from unaligned Yahoo imagery; the streetlight positions make these quick to recognise.
Streetlights : Unadopted Roads
Streetlights : Paths
Streetlights : Named Streets
Streetlights : Addresses
Achievements (so far)
• Tram line construction tracked closely
  – Allows better tracking of:
    • Road closures
    • Construction areas
• Licensed Premises
  – 94% reconciled
    • Up from ~40% in March
• Food Hygiene
  – 72% reconciled (1759/2433)
• Retail Premises
  – 95% of all shops in city now mapped
• Postcodes mapped
  – 500+ added
  – 100% increase
• Addresses
  – Several ‘000 added
• Images
  – 8000 collected for mapping
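Error Rates : Licensed Premises
| PC Outer | Data | Y | X | G | I | Mapped Total | N | (blank) | ? | Not Mapped Total | D | N/a | Not applicable Total | Grand Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NG1 | No. | 347 | 5 | 49 | 3 | 404 | | 12 | 9 | 21 | 4 | 11 | 15 | 440 |
| NG1 | Pct | 78.86% | 1.14% | 11.14% | 0.68% | 91.82% | 0.00% | 2.73% | 2.05% | 4.77% | 0.91% | 2.50% | 3.41% | 100.00% |
| NG11 | No. | 36 | 1 | 2 | | 39 | 7 | 6 | 1 | 14 | 1 | | 1 | 54 |
| NG11 | Pct | 66.67% | 1.85% | 3.70% | 0.00% | 72.22% | 12.96% | 11.11% | 1.85% | 25.93% | 1.85% | 0.00% | 1.85% | 100.00% |
| NG2 | No. | 57 | | 6 | | 63 | 2 | 3 | | 5 | 2 | 8 | 10 | 78 |
| NG2 | Pct | 73.08% | 0.00% | 7.69% | 0.00% | 80.77% | 2.56% | 3.85% | 0.00% | 6.41% | 2.56% | 10.26% | 12.82% | 100.00% |
| NG3 | No. | 62 | 2 | 7 | | 71 | 5 | | 3 | 8 | | 1 | 1 | 80 |
| NG3 | Pct | 77.50% | 2.50% | 8.75% | 0.00% | 88.75% | 6.25% | 0.00% | 3.75% | 10.00% | 0.00% | 1.25% | 1.25% | 100.00% |
| NG5 | No. | 104 | 1 | 2 | | 107 | 4 | 1 | | 5 | 1 | | 1 | 113 |
| NG5 | Pct | 92.04% | 0.88% | 1.77% | 0.00% | 94.69% | 3.54% | 0.88% | 0.00% | 4.42% | 0.88% | 0.00% | 0.88% | 100.00% |
| NG6 | No. | 70 | 3 | 1 | | 74 | 4 | 3 | 1 | 8 | | | | 82 |
| NG6 | Pct | 85.37% | 3.66% | 1.22% | 0.00% | 90.24% | 4.88% | 3.66% | 1.22% | 9.76% | 0.00% | 0.00% | 0.00% | 100.00% |
| NG7 | No. | 233 | 2 | 27 | | 262 | 3 | 4 | 2 | 9 | | 9 | 9 | 280 |
| NG7 | Pct | 83.21% | 0.71% | 9.64% | 0.00% | 93.57% | 1.07% | 1.43% | 0.71% | 3.21% | 0.00% | 3.21% | 3.21% | 100.00% |
| NG8 | No. | 118 | | 5 | 1 | 124 | 2 | 2 | | 4 | | 1 | 1 | 129 |
| NG8 | Pct | 91.47% | 0.00% | 3.88% | 0.78% | 96.12% | 1.55% | 1.55% | 0.00% | 3.10% | 0.00% | 0.78% | 0.78% | 100.00% |
| NG9 | No. | 4 | | | | 4 | 2 | 1 | | 3 | | | | 7 |
| NG9 | Pct | 57.14% | 0.00% | 0.00% | 0.00% | 57.14% | 28.57% | 14.29% | 0.00% | 42.86% | 0.00% | 0.00% | 0.00% | 100.00% |
| Total | No. | 1031 | 14 | 99 | 4 | 1148 | 29 | 32 | 16 | 77 | 8 | 30 | 38 | 1263 |
| Total | Pct | 81.63% | 1.11% | 7.84% | 0.32% | 90.89% | 2.30% | 2.53% | 1.27% | 6.10% | 0.63% | 2.38% | 3.01% | 100.00% |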
Mapped: Y = On OSM, X = Surveyed, not added, G = Surveyed, gone (no longer present), I = Imaginary (surveyed, never present)
Not Mapped: N = Known to exist, not surveyed yet, (blank) = Status not known, not surveyed, ? = Surveyed, existence hard to determine
Not applicable: D = Duplicate data, N/a = Open spaces and other non-addressed POIs
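As a check on how the Pct rows relate to the No. rows: each percentage is the count divided by that row's grand total. The short sketch below recomputes the group shares for the Total row from the counts given in the table. The variable and function names are illustrative only and are not part of the original analysis.

```python
# Group totals from the "Total No." row of the Licensed Premises table.
totals = {"mapped": 1148, "not_mapped": 77, "not_applicable": 38}
grand_total = 1263


def pct(count, total):
    """Share of the total as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


# The three groups should account for every record in the data set.
assert sum(totals.values()) == grand_total

shares = {group: pct(count, grand_total) for group, count in totals.items()}
for group, share in shares.items():
    print(f"{group}: {share:.2f}%")
```

Note that the "mapped" group includes premises surveyed and found to be gone (G) or never present (I), so the share actually on OSM is the Y column alone: 1031 of 1263.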
Conclusions
• Non-import approaches to Open Data can be highly productive
  – Smaller focussed data sets are easier to cope with:
    • Pubs, Places of Worship, not Bus Stops or Streetlights
  – Side benefits considerable
    • On-the-ground surveys extended to many parts of the city
      – Many additional images to assist interpretation of aerial imagery
    • Addresses already available for POIs, shorter surveys needed (= increased overall coverage)
    • Postcode coverage
    • House numbers can be collected on small scale
      – Valuable because additional numbers can be interpolated from OD sources
• Open Data requires interpretation
  – Original purpose often at odds with mapping
  – Error rates ~ 5%
• Good quality ancillary information really helps
  – Aerial imagery
  – Postcode centroids (open data) give approximate location
• Ordnance Survey OGL is major barrier for some data sets
• Conflation and Change Detection not easily automated
  – Necessary for data maintenance
  – Necessary for large data sets
Data Matching : what’s needed for Conflation
• Point sources initially
• Multiple (weighted) matching criteria
  – Geographical co-ordinates
    • Precise
    • Fuzzy (postcode centroids)
    • Areas
  – POI Type
    • Fuzzy
      – Bar/pub/restaurant
  – Name
• Fuzzy Matching of names
  – Levenshtein distances inadequate
    – “Sycamore Primary School” vs “Sycamore Academy”
    – “Robin Hood” vs “Robin Hood and Little John”
  – Tokenise ?
• Building Blocks
  – Nominatim
  – OSL Musical Chairs (ris)
  – Library of Congress (chippy, schuyler)
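The matching criteria listed on this slide can be illustrated with a small sketch. It is not the author's implementation: the function names, the list of generic words, the type grouping, the weights, and the 200 m distance cutoff are all assumptions chosen for illustration. It compares names by token overlap rather than edit distance, and combines name, POI type, and distance into one weighted score.

```python
import math
import re

# Hypothetical list of generic words that say little about identity.
GENERIC = {"the", "and", "school", "primary", "academy", "inn", "pub"}

# Hypothetical grouping of POI types treated as near-equivalent.
TYPE_GROUPS = [{"bar", "pub", "restaurant"}]


def tokens(name):
    """Lower-case a name, split on non-alphanumerics, drop generic words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return {w for w in words if w not in GENERIC}


def name_score(a, b):
    """Overlap coefficient: shared tokens / size of the smaller token set."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


def type_score(t1, t2):
    """1.0 for identical types, 0.5 for types in the same group, else 0.0."""
    if t1 == t2:
        return 1.0
    if any(t1 in g and t2 in g for g in TYPE_GROUPS):
        return 0.5
    return 0.0


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate distance in metres (equirectangular), fine at city scale."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return 6371000 * math.hypot(x, y)


def match_score(a, b, max_dist=200.0, weights=(0.5, 0.2, 0.3)):
    """Weighted sum of name, type and proximity scores, each in [0, 1]."""
    d = distance_m(a["lat"], a["lon"], b["lat"], b["lon"])
    geo = max(0.0, 1.0 - d / max_dist)  # falls linearly to 0 at max_dist
    w_name, w_type, w_geo = weights
    return (w_name * name_score(a["name"], b["name"])
            + w_type * type_score(a["type"], b["type"])
            + w_geo * geo)
```

With these assumptions, both example pairs from the slide score 1.0 on name alone, because the overlap coefficient treats one name containing the other as a full match. Whether that is desirable for “Robin Hood” vs “Robin Hood and Little John” is a judgement call; a Jaccard measure (shared tokens over all tokens) would score that pair 0.5 instead. A larger `max_dist` would suit records located only by postcode centroid.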