Date post: | 27-Jan-2015 |
Category: |
Business |
Upload: | anders-quitzau-ibm |
IBM Open Data
© 2013 IBM Corporation
What is Open Data?
Open data is:
• Data that is generally provided at no cost and with no license constraint
• Machine readable and often people readable
• Accessed via files or APIs, especially for linked data. The data may be static or dynamic.
Why is Open Data being used?
• Transparency: to build stakeholder understanding and confidence by increasing data visibility both inside and outside the organization
• Cost efficiency: to reduce the cost and effort of developing, operating and delivering services (both IT and civic services)
• Economic: to enable innovation by small agile businesses to develop new insight, new services and new products (e.g. tools for ‘co-creation’, innovative systems of engagement)
Who?
• Government agencies and the research community
• Private industry looking for a competitive advantage
http://visual.ly/open-data-movement
“$7bn worth of items on eBay through APIs”
“The API which has easily 10 times More traffic than the website,
Open data today: 162 governments, 43 countries, > 1 million data sets
Potential revenue to EU governments could be around €27 bn (Gartner)
Open Data is big and growing
McKinsey & Company estimates the economic value of big, open liquid health data at about $350 billion annually
Open weather data collected by the National Oceanic and Atmospheric Administration has an estimated economic value of $10 billion annually
And in Denmark, the Open Data movement is emerging
And IBM is participating…
Helsinki ManyEyes
Nantes Intelligent
Transportation
Open data portal to facilitate data-driven innovation in an urban environment
Maintains a catalog of open data provided by city governments with visualization tools and collaboration
Uses open data to create high-resolution weather forecasts
Web site with Cognos software that can be used to visualize open data
TJW Research project to build a semantic model for smart cities
Developed strategies for creating visualizations that enable citizens to make use of and benefit from open data
Intelligent transportation is using Open Data from Nantes to feed Traffic Prediction
CoOp Research project with the City of Minneapolis. Uses internal and open data to improve Traffic, Public Safety and Planning KPIs
Buildings – UK public buildings – example
New York City OpenData – Building Footprints
Dublinked – Open data as a catalyst for innovation and economic development
• Open Data from public sector for public (and restricted) consumption
• Innovation network of academic, public & commercial research
• Sandbox for experiment and test plus potential roll-out
Dublinked Site Update
• New data added almost daily
• Increasing spread of contribution from Authorities
• Live data is most downloaded
• NUIM now managing members directly
Content is growing nicely; our focus has shifted to differentiated function. How can we make Dublinked the most advanced Data Ecosystem in the World?
Information layers on: traffic sensors, way finding, buildings, air quality, noise, cycling, road works, complaints...
Fire Brigade, Waste, Water, Transport, Utilities
Parks, Planning, Environment, Heritage, Litter, Maintenance
Arts, Culture, Sports & Recreation
Housing, Social and Community Services
Dublin city data
Next steps: live data
Integrate live streaming data (via IBM System S)
DCC has Real Time data on Dublin Bus Fleet
Allows IBM access to data and provide server space.
1000 vehicles provide location information every 20–30 seconds
Real-time data goes “stale” very quickly and becomes historic data.
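As a rough worked example of the figures above, the sketch below computes how many location updates a 1000-vehicle fleet would send per day at 20 and 30 second intervals, and shows one possible way to decide when a reading has gone stale. The 60-second cut-off and the function names are illustrative assumptions, not details of the Dublin feed.

```python
from datetime import datetime, timedelta, timezone

VEHICLES = 1000  # fleet size quoted on the slide


def updates_per_day(vehicles: int, interval_seconds: float) -> int:
    """Number of location updates per day at a fixed reporting interval."""
    seconds_per_day = 24 * 60 * 60
    return int(vehicles * seconds_per_day / interval_seconds)


# Slow end (every 30 s) and fast end (every 20 s) of the quoted range.
low = updates_per_day(VEHICLES, 30)
high = updates_per_day(VEHICLES, 20)

# Assumed cut-off: a reading older than this is treated as historic data.
STALE_AFTER = timedelta(seconds=60)


def is_stale(reading_time: datetime, now: datetime) -> bool:
    """True when a reading is too old to count as real-time."""
    return now - reading_time > STALE_AFTER


now = datetime(2013, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
fresh = now - timedelta(seconds=25)  # within one reporting cycle
old = now - timedelta(minutes=5)     # several cycles missed

print(low, high)
print(is_stale(fresh, now), is_stale(old, now))
```

Under these assumptions the fleet produces between roughly 2.9 and 4.3 million updates per day, which is why a streaming platform rather than periodic file uploads is considered for this feed.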
Linked Open Data
All data is now published using linked data standards, i.e. the data is structured in a way that it can be interlinked and become more useful. This allows Dublinked to share information in a way that can be read automatically by any computer via the internet. This enables data from different sources to be connected and queried.
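To illustrate the idea of connecting data from different sources, here is a minimal sketch that models facts as (subject, predicate, object) triples in plain Python. All URIs under example.org, the predicate names and the sample facts are made up for illustration; a real deployment would use an RDF store and published vocabularies rather than Python lists.

```python
# Each fact is a (subject, predicate, object) triple; identifiers are URIs.
# Everything under example.org is hypothetical.
BUS = "http://example.org/transport/"
AIR = "http://example.org/environment/"
GEO = "http://example.org/place/"

# Two datasets published by different providers.
transport_triples = [
    (BUS + "stop/101", BUS + "locatedIn", GEO + "A"),
    (BUS + "stop/202", BUS + "locatedIn", GEO + "B"),
]
air_quality_triples = [
    (AIR + "sensor/7", AIR + "locatedIn", GEO + "A"),
    (AIR + "sensor/7", AIR + "no2Level", "high"),
]


def subjects_located_in(triples, place):
    """Subjects whose 'locatedIn' predicate points at the given place URI."""
    return sorted(
        s for s, p, o in triples if p.endswith("locatedIn") and o == place
    )


def co_located(place):
    """Join both datasets on the shared place URI."""
    return {
        "stops": subjects_located_in(transport_triples, place),
        "sensors": subjects_located_in(air_quality_triples, place),
    }


result = co_located(GEO + "A")
print(result)
```

The join works only because both datasets refer to the place by the same URI, which is the point of publishing with linked data standards.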
Benefits seen - Dublinked
• Internal Efficiency Benefits
• Demonstrate that city administrators are doing a reasonable, good job
• Demonstrate that you are complying with regulatory requirements and reporting
• Compare yourself with other organisations, cities, countries etc
• Grow new businesses and jobs
Open Data Start-up Mypp.ie
Linked Open Data - Continued
Metadata is linked to external websites, like DBpedia, when appropriate. This removes ambiguity when talking about things (e.g. “Apple” – the fruit or the manufacturer of iPads?) and can allow cross-site query capability.
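The “Apple” example can be sketched as follows: each catalog record carries a link to a concept URI instead of a bare keyword, so a query for the fruit never returns records about the company. The two DBpedia resource URIs are assumed to be the ones for the fruit and the company; the record titles and the `about` field name are hypothetical.

```python
# Concept URIs (assumed DBpedia identifiers for the fruit and the company).
FRUIT = "http://dbpedia.org/resource/Apple"
COMPANY = "http://dbpedia.org/resource/Apple_Inc."

# Hypothetical catalog records; "about" holds the linked concept.
records = [
    {"title": "Orchard yields by county", "about": FRUIT},
    {"title": "Tablet sales by quarter", "about": COMPANY},
    {"title": "Cider producers", "about": FRUIT},
]


def titles_about(uri):
    """Titles of all records linked to exactly this concept URI."""
    return [r["title"] for r in records if r["about"] == uri]


print(titles_about(FRUIT))
print(titles_about(COMPANY))
```

A keyword search for "apple" would match all three records; matching on the URI returns only the intended ones, and any other site using the same URIs can be queried the same way.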
Linked Open Data - Continued
Metadata is linked to external vocabularies, like IPSV, when appropriate. This allows cross-linking to other open data sites that use the same vocabularies.
Online mapping of files with geospatial fields
Files that contain geospatial information such as longitude and latitude can be mapped automatically.
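One way such automatic mapping could work is sketched below: look for latitude and longitude columns by name in a CSV file and, if both are present, convert the rows to GeoJSON points that a web map can display. The accepted column names and the sample data are assumptions for illustration; this is not the Dublinked implementation.

```python
import csv
import io

# Column names treated as coordinates (an assumed, non-exhaustive list).
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}


def find_column(fieldnames, candidates):
    """Return the first header whose lowercased name is a known candidate."""
    for name in fieldnames:
        if name.strip().lower() in candidates:
            return name
    return None


def csv_to_geojson(text):
    """Convert CSV text to a GeoJSON FeatureCollection, or None if unmappable."""
    reader = csv.DictReader(io.StringIO(text))
    headers = reader.fieldnames or []
    lat_col = find_column(headers, LAT_NAMES)
    lon_col = find_column(headers, LON_NAMES)
    if lat_col is None or lon_col is None:
        return None
    features = []
    for row in reader:
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (TypeError, ValueError):
            continue  # skip rows without usable coordinates
        props = {k: v for k, v in row.items() if k not in (lat_col, lon_col)}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


sample = "name,Latitude,Longitude\nStop A,53.35,-6.26\nStop B,n/a,-6.30\n"
collection = csv_to_geojson(sample)
print(collection)
```

Rows with missing or non-numeric coordinates are skipped rather than failing the whole file, since open data uploads often contain gaps.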
Example ReasonableCity
Problem
How can we provide City decision makers with explanations and diagnoses for events by applying machine reasoning techniques to a fusion of massive, rich, complex and dynamic data? How can we move from explanation to prediction?
Challenges
• Identifying relevant data and information
• Capturing and representing anomalies
• Correlating knowledge on heterogeneous data sources
• Advanced fusion of data
Goals
• Identification of the nature and cause of changes
• Explaining logical connection of knowledge across space and time
• Move from explanation to prediction
Anomaly Detected: Delayed buses, congested roads
Detection to Diagnosis?
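The detection step alone can be sketched with a simple statistical baseline. The example below flags a route when its latest delay is far above its own history, using a z-score. This is a generic technique chosen for illustration, not the machine reasoning approach of the ReasonableCity project, and the route names, delay values and threshold are invented.

```python
from statistics import mean, stdev


def find_anomalies(history, latest, threshold=3.0):
    """Return routes whose latest delay is more than `threshold` standard
    deviations above the mean of their historical delays."""
    anomalies = {}
    for route, delays in history.items():
        if route not in latest or len(delays) < 2:
            continue  # not enough data to estimate a baseline
        mu = mean(delays)
        sigma = stdev(delays)
        if sigma == 0:
            continue  # constant history, z-score undefined
        z = (latest[route] - mu) / sigma
        if z > threshold:
            anomalies[route] = z
    return anomalies


# Hypothetical delays in minutes per route.
history = {
    "route_1": [2, 3, 2, 4, 3, 2, 3],
    "route_2": [5, 6, 5, 7, 6, 5, 6],
}
latest = {"route_1": 15, "route_2": 6}

flagged = find_anomalies(history, latest)
print(sorted(flagged))
```

Detection of this kind only says that something is unusual. Diagnosis, the question posed on the slide, requires correlating the flagged event with other sources such as road works or events data.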
There is very broad interest in Dublinked
Next steps: Gov2Gov and beyond
Waste Collection
Property management
Environment
Demographics
Business & Retail
Commercial valuations and rates
Tourism
Transport & Access
Crime
Heritage
Mapping
Housing
Water
Fault Reporting
Events
Health
Planning
Pool resources Share results
Emerging Business Models, examples
• Premium Product / Service. HospitalRegisters.com
• Freemium Product / Service. public transportation in urban areas.
• Open Source. OpenCorporates and OpenPolis
• Infrastructural Razor & Blades. Public Data Sets on Amazon Web Service
• Demand-Oriented Platform. DataMarket and Infochimps
• Supply-Oriented Platform. Socrata and Microsoft Open Government Data Initiative
• Free, as Branded Advertising. IBM City Forward, IBM Many Eyes or Google Public Data Explorer
• White-Label Development. This business model has not consolidated yet, but some embryonic attempts seem to be particularly promising.
Source: Business Model & Policy Innovation Unit at the Istituto Superiore Mario Boella
Dublinked membership fee structure
Next steps
Dublinked & Social Program Data:
– Release non-sensitive social data to Dublinked
– For example, Housing Stock data
– Research on anonymisation & privacy
Research for Smarter Social Programs (outside Dublinked)
– Multilevel analysis of high-cost/high-need regions and citizens
  • Identify regions where service providers are overloaded or inaccessible
  • Recommend alternative service providers to balance demand
  • Forecast regions that will become problematic
– Identifying patterns to provide recommendations for interventions
  • Identify patterns for those that are chronically homeless
  • Highlight those that are identified as returning to recommend intervention
– Forecasting future requirements and emerging ‘hot spots’
  • Are risk factors, such as substance abuse, propagated through the social graph?
  • Are social costs propagated through the social or spatial graph?
How to open up data
• Collect the data together into a single file or set of files.
• Resolve any licensing conflicts; for instance, different parts of the same dataset may be owned by different parties.
• Choose an open data license, such as the Open Data Commons Attribution License. Alternatives, including share-alike licenses, are available from the Open Data Commons site.
• Upload the data to a publicly-accessible part of your website. Registration should not be required to access the data.
• Include the following information along with the data:
• License details (from step 3)
• Technical details of the format that the data is stored in. Note that this does not need to be a standard format, as long as you can explain it to users, but it should not be in formats that require proprietary software to use (such as XLS), or in non-machine-readable formats (such as PDF).
• Details of when the data was last updated, and will be updated next.
• Provenance of the data, including details of original creator.
• Methodology information, such as how data was collected, calibrated, and transformed prior to upload.
(credit: http://opendatahandbook.org)
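The information to publish alongside the data can be captured in a small machine-readable record. The sketch below shows one possible shape for such a record and a check that flags missing items or discouraged formats. The field names, the sample values and the `check_metadata` helper are illustrative assumptions, not a standard schema.

```python
import json

# Items the checklist above asks publishers to include (field names assumed).
REQUIRED_FIELDS = (
    "license",
    "format",
    "last_updated",
    "next_update",
    "provenance",
    "methodology",
)

# Formats the checklist advises against (proprietary or not machine readable).
DISCOURAGED_FORMATS = {"xls", "pdf"}


def check_metadata(record):
    """Return a list of problems found in a dataset metadata record."""
    problems = [f"missing: {name}" for name in REQUIRED_FIELDS
                if not record.get(name)]
    fmt = str(record.get("format", "")).lower()
    if fmt in DISCOURAGED_FORMATS:
        problems.append(f"format not recommended: {fmt}")
    return problems


# Hypothetical example record.
record = {
    "license": "Open Data Commons Attribution License",
    "format": "csv",
    "last_updated": "2013-01-01",
    "next_update": "2013-04-01",
    "provenance": "Example City Council (original creator)",
    "methodology": "Collected from roadside sensors; hourly averages.",
}

print(json.dumps(record, indent=2))
print(check_metadata(record))
print(check_metadata({"format": "XLS"}))
```

Publishing the record as a JSON file next to the data keeps the license, provenance and update schedule readable by both people and programs.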
Open data is ‘just’ data, but there are a few new considerations
Most current Open Data sets are not that large, but they are growing
To date Open Data has been mostly static, but real-time data streams are growing (e.g. COSM)
Open Data is by its very nature diverse in content, format and semantics, and can be linked
Open Data can come from anywhere, with no guarantees about quality or authenticity
Open Data is part of a spectrum of visibility. Most organizations have a mix of data from private to open
Volume: Data at Rest
Terabytes to exabytes of existing data to process
Velocity: Data in Motion
Streaming data, milliseconds to seconds to respond
Variety: Data in Many Forms
Structured, unstructured, text, multimedia
Veracity: Data in Doubt
Uncertainty due to data inconsistency & incompleteness, ambiguities, latency, deception, model approximations
Open Data represents a value migration from the data itself to higher value analytics and service
Value: Data of Many Values
Large range of data values, from free (data philanthropy) to high-value monetization
Visibility: Data in the Open
Open data is generally open to anyone, which raises issues of privacy, security and provenance
Big Data + Open Data + Co-Creation = New opportunities
Big Data
The technology: conceptual roadmap for the system
IBM Connections Content Sharing & Collaboration
IBM IOC Interaction with Industry Solutions
Dublin City
Enterprise Platform
IBM Enterprise Cloud Scalable compute, storage & network infrastructure
Provider 1…N
Open REST Web Services API
Catalog & Navigation Search & Query
Privacy & Security
Knowledge Representation & Reasoning
Publication & Annotation
Visualization & Analytics
Enterprise Citizen IBM Products & Services
Robust models to organize and represent resources and their context
Scalable privacy and security of resources
Automated assimilation and sharing of resources
Compose resources for development, mash-up & visualization
Challenges include ..
IBM Research
Partners & People
Key
Represent knowledge efficiently for continuous machine reasoning and diagnosis
Open Innovation Portal
IBM has the technologies to deliver a scalable and robust Open Data platform
Value Proposition:
• IBM has the strength & depth of technology to help clients exploit Open Data as a Strategic Information Asset
• We can bring a robust data lifecycle management & governance (trust, privacy, ...) approach to Open Data. This is key!
IBM has the technologies needed, e.g.:
• Information Server for data quality
• Vivisimo (for linked data access) & Initiate for federated Master Data Management
• Guardium for privacy management and Optim for application archive
• InfoSphere Streams for Data-in-Motion
• BigInsights Hadoop-based analysis
• New Cloud-hosted spreadsheet functionality
Having opened up – now what?
• Tell the world!
  • Understanding your audience
  • Post your material on third-party sites
  • Making your communications more social-media friendly
  • Social media
• Engage the public:
  • Unconferences, Meetups and Barcamps
  • Making things! Challenges, requests, hackdays, prizes and prototypes
(credits: http://opendatahandbook.org)
Anders Quitzau, Innovation Executive – [email protected]
Peter Lange, CTO IBM Smarter Cities – [email protected]