
SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Date post: 23-Dec-2015
Upload: dapaas
Description:
A presentation by Dr Titi Roman on the DaPaaS platform and data challenges within the oil and gas industry.
Data-as-a-Service for Open Data: The DaPaaS Approach @ SPE Oslo, February 10, 2015, http://dapaas.eu/, Dumitru Roman, SINTEF, Norway
Transcript
Page 1: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Data-as-a-Service for Open Data: The DaPaaS Approach

@ SPE Oslo

February 10, 2015

http://dapaas.eu/

Dumitru Roman, SINTEF, Norway

Page 2: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Outline

• Oil Spill Pilot Example

• Open (Linked) Data

• DaPaaS Platform

– Data transformation-as-a-Service


Page 3: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Oil spills…

• May have severe ecological and socio-economic effects

• Understanding fate and effects of oil spills is key to risk mitigation and response management


Page 4: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

10/02/2015


Page 5: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Oil Spill Pilot Example

• A number of oil drift models have been developed – typically oriented towards specific regions

• Adapting an oil drift model to another region is often difficult, due to differences in the structure and semantics of the driving data

• Moreover, coupling oil drift models to other models (e.g. for biological effects) is difficult

• Stakeholders: SINTEF (with its oil drift model OSCAR); other organizations that may be involved include oil companies, coast guards, consulting companies, researchers and the general public

Page 6: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Oil Spill Contingency And Response (OSCAR)

• Example challenges:

– If I want to move my oil drift model to a new geographic region, how do I find appropriate data sources? And how can I trust that the data works with my model?

– Given a real oil spill, where do I find up-to-date forecast data for my oil drift model? How can I incorporate that data into my modelling workflow, and how can I trust that it works with my model?

– How can I look at biological effect models that are not “hardwired” into my model?

OSCAR (developed by SINTEF MET)

Page 7: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Data is there (often as open data)

…but difficult to access, reuse, integrate, …

Page 8: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Agenda

• Oil Spill Pilot Example

• Open (Linked) Data

• DaPaaS Platform

– Data transformation-as-a-Service


Page 9: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Open Data

• Open Data Movement: make data available (primarily government data)

– Businesses and citizens can develop new ideas, services and applications

– Can support (government) transparency and accountability

Source: McKinsey http://www.mckinsey.com/insights/business_technology/open_data_unlocking_innovation_and_performance_with_liquid_information

Gartner:

By 2016, the use of "open data" will continue to increase – but slowly, and predominantly limited to Type A enterprises.

By 2017, over 60% of government open data programs that do not effectively use open data internally will be scaled back or discontinued.

By 2020, enterprises and governments will fail to protect 75% of sensitive data and will declassify and grant broad/public access to it.

Source: Gartner http://training.gsn.gov.tw/uploads/news/6.Gartner+ExP+Briefing_Open+Data_JUN+2014_v2.pdf

Page 10: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Lots of open datasets on the Web…

• A large number of datasets have been published as open data in recent years

• Many kinds of data: cultural, science, finance, statistics, transport, environment, …

• Popular formats: tabular (e.g. CSV, XLS), HTML, XML, JSON, …


Page 11: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

…but few actually used

• Applications utilizing open and distributed datasets have been rather few, e.g.

Open Data Portal    Datasets     Applications
data.gov            ~ 110 000    ~ 350
publicdata.eu       ~ 50 000     ~ 80
data.gov.uk         ~ 20 000     ~ 350
data.norge.no       ~ 300        ~ 40

• Challenges include:

– Lack of resources: unreliable data access

– Lack of expertise: not easily available to organisations

– Technical/organizational
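To put the portal figures on this slide in proportion, the short Python sketch below computes the number of applications per 1000 published datasets. The numbers are the approximate values from the table; the function and variable names are illustrative only.

```python
# Approximate figures taken from the portal table on this slide:
# portal name -> (number of datasets, number of applications).
portals = {
    "data.gov": (110_000, 350),
    "publicdata.eu": (50_000, 80),
    "data.gov.uk": (20_000, 350),
    "data.norge.no": (300, 40),
}


def apps_per_thousand(datasets: int, applications: int) -> float:
    """Return the number of applications per 1000 published datasets."""
    return 1000 * applications / datasets


ratios = {name: apps_per_thousand(d, a) for name, (d, a) in portals.items()}

for name, ratio in ratios.items():
    print(f"{name:15s} {ratio:7.1f} applications per 1000 datasets")
```

On these figures, the largest portals have only a handful of applications per 1000 datasets, and even the smallest portal has roughly one application for every seven or eight datasets.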

Page 12: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Open Data as Tabular Data…and its problems

– Records organized in silos of collections

– Very few links within and/or across collections

– The nature of the relationships among records/collections is hidden

– Difficult to integrate/query

– …

Data is trapped!

[Figure: tabular datasets on publicdata.eu and data.gov.uk]

Page 13: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Linked Data is great for Open Data

• Linked Data is a great means of representing and integrating data from disparate sources on the Web

• Where Linked Data can take Open Data:

– Free data from silos

– Seamless interlinking of data

– Understand the data and expose hidden relationships

– New ways to query and interact with data
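A minimal sketch of what interlinking can look like in practice, assuming data is represented as (subject, predicate, object) triples. All URIs under http://example.org/ and both sample sources are hypothetical and are not taken from the talk. Because the two sources use the same URI for the region, integrating them is a plain set union, and a query can follow the link from one source into the other.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two hypothetical sources that refer to the same region URI.
source_a: set[Triple] = {
    (EX + "spill/1", EX + "locatedIn", EX + "region/north-sea"),
}
source_b: set[Triple] = {
    (EX + "region/north-sea", EX + "label", "North Sea"),
}


def match(graph: set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that match the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def region_labels(graph: set[Triple], spill: str) -> list[str]:
    """Follow locatedIn, then label, starting from a spill URI."""
    labels = []
    for _, _, region in match(graph, s=spill, p=EX + "locatedIn"):
        for _, _, label in match(graph, s=region, p=EX + "label"):
            labels.append(label)
    return labels


# Integration step: the shared URI makes a set union sufficient here.
merged = source_a | source_b

print(region_labels(merged, EX + "spill/1"))
```

Neither source can answer the question on its own; the label only becomes reachable once the triples are merged.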


Page 14: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

…but has been ignored by the mainstream

• Difficult to make it accessible to people

– Publishers

– Developers

– Data workers

• We are packaging Linked Data to make it more approachable to the open data community


Page 15: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Agenda

• Oil Spill Pilot Example

• Open (Linked) Data

• DaPaaS Platform

– Data transformation-as-a-Service


Page 16: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

DaPaaS – one package, 3 audiences

• Data Publisher: helping to publish open data

• Application Developer: giving better, easier tools

• End-User Data Consumer: reaching through data and applications

Page 17: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

DaPaaS: a means of making Open (Linked) Data easier to use

• A platform/hosting: to make it easy for publishers to put data on the web, and developers to publish their applications

• A portal: to help advertise the availability of data and applications, and to entice new users

• Tool-supported data transformation methodology: to make it easy for people with Excel knowledge to publish large amounts of high quality data

• APIs with high-quality documentation: for processing large amounts of data reliably in order to create interactives, visualisations and transformations

Make Linked Data more accessible to everyone!

Page 18: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

DaPaaS – Data Value Chain

• End-user Data Consumer

– Browse/Search Datasets & Apps Catalogue

– App execution

• App Developer

– Browse/Search Datasets Catalogue

– App deployment and metadata creation

• Data Publisher

– Dataset and Metadata creation

– Data import and transformation

– Data exploration

– Data-driven portal configuration

– Data export

– Browse/Search Datasets Catalogue

Page 19: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Enablers


DaPaaS platform

Grafter, Grafterizer (Graphical Tool & DSL)

RDF database-as-a-service

PLUQI Open Data Visualization-as-a-service (Rainbow)

RDF DDP

Page 20: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Grafter

• Grafter is a Clojure library, a DSL and a suite of tools for data transformation and processing

– Clojure is a functional programming language and a dialect of Lisp

• Primarily used for handling data conversions from:

– tabular data formats to tabular data formats

– tabular data formats to RDF Linked Data format

• Open Source

– Eclipse Public License (EPL)

– http://github.com/dapaas/grafterizer


Page 21: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Tabular data (spreadsheet) to RDF Linked Data (graph)

1. Specify a pipeline of tabular transformations for data cleaning and transformation.

2. Create the graph fragments, resulting in the generation of an RDF graph.
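The two steps above can be sketched in Python as follows. This is not Grafter's API (Grafter is a Clojure library); it is only an illustration of the idea, under the assumption that a pipeline is a left-to-right composition of table-to-table functions and that a graph fragment is a per-row template producing triples. The helper names, the http://example.org/ namespace and the incomplete third input row are hypothetical.

```python
from functools import reduce
from typing import Any, Callable, Iterator

Row = dict[str, Any]
Table = list[Row]
Step = Callable[[Table], Table]

BASE = "http://example.org/"  # hypothetical namespace


def pipeline(*steps: Step) -> Step:
    """Step 1: compose tabular transformations, applied left to right."""
    return lambda table: reduce(lambda acc, step: step(acc), steps, table)


def drop_incomplete(table: Table) -> Table:
    """Remove rows that contain an empty cell."""
    return [row for row in table if all(v != "" for v in row.values())]


def parse_count(column: str) -> Step:
    """Build a step that turns strings like '~ 110 000' into integers."""
    def step(table: Table) -> Table:
        return [
            {**row, column: int(row[column].replace("~", "").replace(" ", ""))}
            for row in table
        ]
    return step


def graph_fragments(table: Table) -> Iterator[tuple[str, str, Any]]:
    """Step 2: instantiate a triple template once per prepared row."""
    for row in table:
        subject = BASE + "portal/" + row["portal"]
        yield (subject, BASE + "datasetCount", row["datasets"])


raw = [
    {"portal": "data.gov", "datasets": "~ 110 000"},
    {"portal": "data.norge.no", "datasets": "~ 300"},
    {"portal": "unknown", "datasets": ""},  # hypothetical incomplete row
]

prepare = pipeline(drop_incomplete, parse_count("datasets"))
triples = list(graph_fragments(prepare(raw)))

for triple in triples:
    print(triple)
```

Keeping the cleaning steps separate from the triple template means the same pipeline can be re-run unchanged when a new version of the input table arrives.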


Page 33: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Grafterizer

• GitHub:

– github.com/dapaas/grafterizer

• License:

– Eclipse Public License (EPL)

• Partner collaboration

– Grafterizer developed by SINTEF


Page 34: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Use Case: Data Transformation

• Import raw tabular data

• Clean up and transform data using Grafterizer

[Figure: Raw Data → Transform → Prepared Data]

Page 35: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Use Case: Mapping to RDF

• Import prepared data

• Define ontology mapping using Grafterizer

• Generate RDF graph

[Figure: Prepared Data → Map (ontology mapping to Ontology X) → Generate RDF → RDF Graph]

Page 36: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Use Case: Transformation and Mapping to RDF

• Import raw data

• Clean up and transform using Grafterizer

• Define ontology mapping using Grafterizer

• Generate RDF Graph

[Figure: Raw Data → Transform → Prepared Data → Map (ontology mapping to Ontology X) → Generate RDF → RDF Graph]

Page 37: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Example: Transformation and Mapping to RDF

Name     Sex    Age
Alice    f      "34"
Bob      m      "63"

Transform and generate RDF
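The transcript does not preserve the RDF that the slide generates from this table, so the following Python sketch shows only one plausible outcome. It assumes the age strings are converted to integers, the sex codes are expanded, and the result is written as N-Triples using the FOAF vocabulary. The subject namespace http://example.org/person/ and the choice of FOAF are assumptions, and the sketch does not escape special characters in literals.

```python
import csv
import io

# The table from this slide, written as CSV; Age is a quoted string.
TABLE = 'Name,Sex,Age\nAlice,f,"34"\nBob,m,"63"\n'

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
BASE = "http://example.org/person/"  # hypothetical subject namespace
SEX_CODES = {"f": "female", "m": "male"}  # assumed meaning of the codes


def to_ntriples(text: str) -> list[str]:
    """Clean each row, then emit one N-Triples line per statement."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        # Transformation: expand the sex code and parse the age string.
        name = row["Name"]
        sex = SEX_CODES[row["Sex"]]
        age = int(row["Age"])
        # Mapping: one subject URI per row, one triple per column.
        subject = f"<{BASE}{name.lower()}>"
        lines += [
            f"{subject} <{RDF_TYPE}> <{FOAF}Person> .",
            f'{subject} <{FOAF}name> "{name}" .',
            f'{subject} <{FOAF}gender> "{sex}" .',
            f'{subject} <{FOAF}age> "{age}"^^<{XSD_INTEGER}> .',
        ]
    return lines


ntriples = to_ntriples(TABLE)
print("\n".join(ntriples))
```

Each table row becomes four statements about one resource, and the age is emitted as a typed integer literal rather than as the original string.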

Page 38: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Portal demo

• Portal

– http://dapaas.ontotext.com/demo/

• PLUQI application

– http://dapaas-apps.ontotext.com/PLUQI/home

• Data-driven portal

– http://dapaas.ontotext.com/demo/pages/ddp.jsp?portal-test-1

• Grafterizer

– http://dapaas.ontotext.com/demo/pages/add_dataset.jsp


Page 39: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

Summary

• Open Data can be useful in the Oil & Gas domain (example: Oil Spill Pilot)

• Lots of open datasets, but very few actually used (e.g. low number of applications using them)

• Linked Data is a promising technology for Open Data, but difficult to use for publishers, developers, data workers

• DaPaaS – emerging solution (as-a-Service) for making Open (Linked) Data more accessible

– Platform, portal, methodology, APIs

– (Repeatable) Data Transformation is a core aspect of DaPaaS

– Public release expected this year – stay tuned!


Page 40: SPE Oslo [Big Data Solutions & Analytics in Upstream Oil and Gas Industry]

http://dapaas.eu

@dapaasproject

Thank you!


Contact: [email protected]
