
Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Date post: 11-May-2015
Upload: issgc-summer-school
Speaker: Elias Theocharopoulos
web: www.omii.ac.uk email: [email protected] Sessions 43 & 44: Accessing data using a common interface: OGSA-DAI as an example. Elias Theocharopoulos and Tilaye Alemu. ISSGC ‘09 – Sophia Antipolis – Tuesday, 14th July 2009
Transcript
Page 1: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Sessions 43 & 44: Accessing data using a common interface: OGSA-DAI as an example

Elias Theocharopoulos and Tilaye Alemu

ISSGC ‘09 – Sophia Antipolis – Tuesday, 14th July 2009

Page 2: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Overview

• The problem: Sharing data in a grid
• What is OGSA-DAI?
• Data-centric workflows
• Key OGSA-DAI terms
• The OGSA-DAI client toolkit
• Use cases and extensibility points
• Pros and cons

Page 3: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

The problem: Sharing and accessing data in a grid

Page 4: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Distributed data resources

Page 5: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

How about a central server?

[Diagram: a client sends an FR query to a central server and receives FR data.]

Page 6: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Central server pros and cons

Pros:
• Access to up-to-date data
• Single point of access
• Data in common format
• Database can handle joins

Cons:
• Initial overhead in terms of time, effort and cost
• Keeping data up to date
• Loss of control by data providers
o Assuming they even let go
• Security and trust

Page 7: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

How about providing direct access?

[Diagram: the client sends UK, ES and IA queries directly to the data providers, receives UK, ES and IA data, and has to translate and join the results itself.]

Page 8: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Direct access pros and cons

Pros:
• Access to up-to-date data
• Fast access
• Data providers retain control

Cons:
• Fat clients
• Heterogeneity and inconsistency
o Data
o Databases
o Connection
o Security
• Security overheads for data providers
o Manage firewalls and usernames/passwords for multiple clients
• Hard to use in grid/web service workflows

Page 9: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

How about providing a ZIP on the web?

[Diagram: the UK, ES and IA data are each published as a ZIP file on the web; the client fetches each ZIP with HTTP GET, then unzips, translates and joins the data.]

Page 10: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

ZIP on the web pros and cons

Pros:
• Fast access
• Data providers retain control

Cons:
• Very large downloads even if client only needs subset
• Providers have to select and ZIP their data
• Client has to install data into a local database
• Static snapshot

Page 11: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Sharing distributed heterogeneous resources with OGSA-DAI

[Diagram: the client sends an FR query to OGSA-DAI. OGSA-DAI issues UK, ES and IA queries to the data providers, translates and joins the UK, ES and IA data, and returns FR data to the client.]

Page 12: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Motivation

• Grid is about sharing resources

• Need to share structured data resources
o Relational Database
o XML Database
o Indexed File

Page 13: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

What is OGSA-DAI?

• Open Grid Services Architecture Data Access and Integration
• A framework that executes workflows
• Workflows are data-centric
• Workflow components are designed for data access, integration, transformation and delivery
• Can access heterogeneous data resources
• Web service interface
• Intended as a toolkit for building higher-level application-specific data services

Page 14: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI’s vision

• Sharing data resources to enable collaboration
• Data access
o Structured data in distributed heterogeneous data resources
• Data integration
o e.g. expose multiple databases to users as a single virtual database
• Data transformation
o e.g. expose data in schema X to users as data in schema Y
• Data delivery
o To where it’s needed by the most appropriate means
o e.g. web service, e-mail, HTTP, FTP, GridFTP

Page 15: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI and data-centric workflows

Page 16: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI workflow

• Executes workflows

• Workflows contain activities
o Well-defined functional units
o Data goes in, something is done, data comes out
o Equivalent to programming language methods
• Workflows are submitted by clients
o To an OGSA-DAI web service

Page 17: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

An OGSA-DAI workflow - a simple analogy

[Diagram: the client's query is written in French: SELECT Pays,Capital FROM Pays. The workflow processes it along two branches and then joins the results.]

• English branch
o Convert query from French to English: SELECT Country, Capital FROM Countries
o Run SQL query: (UK, London), (France, Paris)
o Convert data from English to French: (Grande-Bretagne, Londres), (France, Paris)
• Spanish branch
o Convert query from French to Spanish: SELECT País, Capital FROM Países
o Run SQL query: (España, Madrid), (Italia, Roma)
o Convert data from Spanish to French: (l'Espagne, Madrid), (l'Italie, Rome)
• Join the data

Pays             Capital
Grande-Bretagne  Londres
France           Paris
l'Espagne        Madrid
l'Italie         Rome

Page 18: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

How it appears to the client

[Diagram: the client submits workflow(SELECT Pays,Capital FROM Pays) to OGSA-DAI and receives the following result.]

Pays             Capital
Grande-Bretagne  Londres
France           Paris
l'Espagne        Madrid
l'Italie         Rome

Page 19: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

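Data integration with OGSA-DAI workflows

• Across OGSA-DAI services

[Diagram: two OGSA-DAI servers, one in front of DB1 and one in front of DB2, run Workflow 1 and Workflow 2. Activities shown: SQLQuery (DB1), SQLQuery (DB2), Deliver to OGSA-DAI, Receive from OGSA-DAI, JOIN, Deliver. Output label: Data.]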

Page 20: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Key OGSA-DAI terms: activities, resources, workflows

Page 21: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Activity

• An activity is a named unit of functionality

o A well defined workflow unit
o Pluggable
o Composable

• An activity can have
o 0 or more named inputs
o 0 or more named outputs

• Blocks of data flow from an activity’s output into another activity’s input

Page 22: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Activity (cont.)

• Example activities include
o Execute an SQL query
o ZIP a batch of data
o List the files in a directory
o Execute an XSL transform on an XML document
o Deliver data to an FTP server

Page 23: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Activity (cont.)

• Activity Connections
o All required inputs must be connected
o All outputs must be connected
o Optional inputs
• Inputs
o Literal
o Streamed
o Types
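
The connection rules above can be sketched in Java. This is a minimal, self-contained model and not the OGSA-DAI API: the class `ActivitySketch`, its methods and the optional input name `passiveMode` are hypothetical, chosen only to show how required inputs, optional inputs and outputs might be checked before a workflow is submitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an activity; NOT an OGSA-DAI class.
class ActivitySketch {
    private final String name;
    // input name -> is the input required?
    private final Map<String, Boolean> inputRequired = new LinkedHashMap<>();
    // input name -> what feeds it (a literal or an upstream output)
    private final Map<String, String> inputSource = new LinkedHashMap<>();
    // output name -> has something been connected to it?
    private final Map<String, Boolean> outputConnected = new LinkedHashMap<>();

    ActivitySketch(String name) {
        this.name = name;
    }

    ActivitySketch input(String inputName, boolean required) {
        inputRequired.put(inputName, required);
        return this;
    }

    ActivitySketch output(String outputName) {
        outputConnected.put(outputName, false);
        return this;
    }

    // Literal input: a constant supplied by the client.
    void addLiteral(String inputName, String value) {
        checkInputExists(inputName);
        inputSource.put(inputName, "literal:" + value);
    }

    // Streamed input: fed by a named output of another activity.
    void connect(String inputName, ActivitySketch upstream, String outputName) {
        checkInputExists(inputName);
        if (!upstream.outputConnected.containsKey(outputName)) {
            throw new IllegalArgumentException(upstream.name + " has no output " + outputName);
        }
        upstream.outputConnected.put(outputName, true);
        inputSource.put(inputName, upstream.name + "." + outputName);
    }

    // Returns one message per violated rule; an empty list means the activity is fully wired.
    List<String> validate() {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Boolean> in : inputRequired.entrySet()) {
            if (in.getValue() && !inputSource.containsKey(in.getKey())) {
                problems.add(name + ": required input '" + in.getKey() + "' is not connected");
            }
        }
        for (Map.Entry<String, Boolean> out : outputConnected.entrySet()) {
            if (!out.getValue()) {
                problems.add(name + ": output '" + out.getKey() + "' is not connected");
            }
        }
        return problems;
    }

    private void checkInputExists(String inputName) {
        if (!inputRequired.containsKey(inputName)) {
            throw new IllegalArgumentException(name + " has no input " + inputName);
        }
    }
}
```

In this model an optional input that is left unset does not produce a validation problem, whereas an unconnected output always does.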

Page 24: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Connecting activities - examples

Page 25: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Data grouping: Lists

• Special blocks are used to mark the beginning and the end of a list.

• A list groups related data as one unit.

• For example ReadFromFileActivity can dynamically take any number of filenames as input.

o Without a way to group the output byte arrays we would have no way to differentiate between the binary data of filenames f1 and f2.

o Streaming is preserved because each file yields a series of byte arrays that are forwarded to the downstream activities.

[Diagram: ReadFromFileActivity takes the filenames f1, f2 as input and outputs one list of byte arrays per file: [byte[]…], [byte[]…].]
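
To illustrate how list markers let a consumer tell the bytes of f1 from the bytes of f2, here is a small Java sketch. It is not OGSA-DAI code: the sentinel objects `LIST_BEGIN` and `LIST_END` and the class `ListMarkerSketch` are assumptions standing in for the special blocks described above, and nested lists are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a block stream with list markers; NOT an OGSA-DAI class.
class ListMarkerSketch {
    // Sentinels standing in for the special "list begin" and "list end" blocks.
    static final Object LIST_BEGIN = new Object();
    static final Object LIST_END = new Object();

    // Regroups a flat stream of blocks into one list of byte[] chunks per file.
    static List<List<byte[]>> group(List<Object> blocks) {
        List<List<byte[]>> groups = new ArrayList<>();
        List<byte[]> current = null;
        for (Object block : blocks) {
            if (block == LIST_BEGIN) {
                if (current != null) {
                    throw new IllegalStateException("nested lists are not handled in this sketch");
                }
                current = new ArrayList<>();
            } else if (block == LIST_END) {
                if (current == null) {
                    throw new IllegalStateException("list end without list begin");
                }
                groups.add(current);
                current = null;
            } else {
                if (current == null) {
                    throw new IllegalStateException("data block outside a list");
                }
                current.add((byte[]) block);
            }
        }
        if (current != null) {
            throw new IllegalStateException("unterminated list");
        }
        return groups;
    }
}
```

A real consumer would process each chunk as it arrives rather than collecting whole lists; the sketch collects them only to make the grouping visible.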

Page 26: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Passing data internally: OGSA-DAI Tuple

• A special type of data passed between activities
• A Tuple is a data representation similar to a row of relational data. Each element of a Tuple represents a column.
• Tuples are normally grouped in lists and they are preceded by a metadata block.

[Diagram: SqlQuery runs SELECT city, temp FROM weather; and outputs the tuples (Athens, 20), (Madrid, 22), (Rome, 25).]

Page 27: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

An interesting activity: Tee

• Some activities operate at the level of blocks and are not concerned with the type or values of the data they handle, e.g. TeeActivity:

[Diagram: TeeActivity with the number of outputs set to 2 takes the input [A,B,C,D] and emits [A,B,C,D] on each of its two outputs.]

Page 28: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Resource

• Data request execution resource (DRER)
• Data resources
• Data sources
• Data sinks
• Sessions
o A state container associated with a set of workflows
o One workflow can lodge state
o A subsequent workflow can retrieve it
• Requests
o One per workflow submitted to a DRER
o Access request status

Page 29: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Workflow

• A workflow can contain:
o Activities
  • Resource-based: SQLQuery
  • Non-Resource: Transformation and Delivery
o Resources
  • Targeted by Activities
o Other Workflows
  • Sub workflows
  • Other types of workflow

Page 30: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Workflow (cont.)

• OGSA-DAI can be used as a workflow processing system that is designed to stream data through a set of activities in a pipelined manner.
• In the Query->Transform->Deliver workflow, if the activities are well defined, all three will be processing different portions of the data stream concurrently.

Page 31: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI: Key Term Workflow (cont.)

• Pipeline workflow: a set of chained activities that are executed in parallel, with data flowing between the activities.
• Sequence workflow: all the sub-workflows added to this workflow are executed in sequence. For example, the 1st sub-workflow in a sequence creates a table and the 2nd bulk loads transformed data into this table.
• Parallel workflow: all the sub-workflows added to this workflow are executed in parallel.
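
The pipelined execution described for the Query->Transform->Deliver workflow can be approximated with plain Java threads and bounded queues. This is an illustrative sketch under stated assumptions, not how the OGSA-DAI engine is implemented: `PipelineSketch`, the end-of-stream marker and the upper-casing "transform" are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical three-stage pipeline; NOT the OGSA-DAI workflow engine.
class PipelineSketch {
    // End-of-stream marker, compared by identity so that no data row can collide with it.
    private static final String END = new String("<end-of-stream>");

    static List<String> run(List<String> rows) {
        // Small bounded queues: a stage blocks when the next stage falls behind.
        BlockingQueue<String> queryToTransform = new ArrayBlockingQueue<>(2);
        BlockingQueue<String> transformToDeliver = new ArrayBlockingQueue<>(2);
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());

        Thread query = new Thread(() -> {
            try {
                for (String row : rows) {
                    queryToTransform.put(row);
                }
                queryToTransform.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread transform = new Thread(() -> {
            try {
                for (String row = queryToTransform.take(); row != END; row = queryToTransform.take()) {
                    transformToDeliver.put(row.toUpperCase(Locale.ROOT));
                }
                transformToDeliver.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread deliver = new Thread(() -> {
            try {
                for (String row = transformToDeliver.take(); row != END; row = transformToDeliver.take()) {
                    delivered.add(row);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // All three stages run at the same time, each on a different part of the stream.
        query.start();
        transform.start();
        deliver.start();
        try {
            query.join();
            transform.join();
            deliver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return new ArrayList<>(delivered);
    }
}
```

Because each queue holds at most two rows, no stage ever buffers the whole result set, which is the memory benefit that streaming is meant to provide.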

Page 32: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Getting to the first practical: The OGSA-DAI client toolkit.

Page 33: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI client toolkit

• OGSA-DAI client toolkit
o Construct and submit requests in Java, not XML
  • Toolkit manages interaction with web services via SOAP over HTTP; it handles SOAP request construction and response parsing.
o Provides Java abstractions of
  • Services
  • OGSA-DAI resources and properties
  • Requests
  • Activities

Page 34: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

The client toolkit

• The workflow description is sent to the OGSA-DAI server as an XML document.

• Application developer does not need to worry about creating this document.

• The client toolkit provides ways of assembling activity workflows programmatically.

• We will see how to use the client toolkit during the hands-on session.

Page 35: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Service/resource model

[Diagram labels: Client; Data Request Execution Service; Data Request Execution Resource (MyDRER); three Data Resources (One, Two, Three), each with its data; Sessions; Requests (MyRequest123456); Request Management Service.]

Page 36: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Client Toolkit Activities

• One client activity per server activity
• Same input and output names
• Plus some convenience methods

For example:
• Retrieve results as a JDBC ResultSet from a TupleToWebRowSet activity.
• Retrieve update count as an Integer from a SQLUpdate activity

Page 37: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Step by Step Guide for Writing Clients

• Create activities
o There’s a corresponding client toolkit activity for each server-side activity

DeliverToFTP deliver = new DeliverToFTP();
ReadFromFile readFile = new ReadFromFile();

Page 38: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Connecting activities

• Set inputs for each activity (e.g. parameters)
• Every input parameter can either be literal input or streamed from another activity
o Literal inputs, e.g. for constant parameters:

deliver.addFilename("results1.txt");
deliver.addHost("[email protected]:21");

o Connect input to the output of another activity to stream data:

deliver.connectDataInput(readFile.getDataOutput());

Page 39: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Gaining access to the results

• If the output of an activity can be provided in a user-friendly type, then there are methods to access the results:
o Check whether there are more results to be retrieved

boolean hasNext = sqlUpdate.hasNextResult();

o Get the next result in a convenient type

int count = sqlUpdate.getNextResult();

Page 40: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Build and execute the Workflow Request

• Create a workflow and add activities to it
• A data service executes the workflow and returns a response (or an error!)
• The response may contain data (depending on the activities)
• Each client toolkit activity provides utility methods for retrieving its response data

Page 41: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

First hands-on session

Go to: http://homepages.nesc.ac.uk/~elias/issgc09/html/practical.html

Page 42: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extensibility points & components

Page 43: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extending OGSA-DAI: What

• OGSA-DAI
o A Framework
o Extensible
• Out of the box you get the basics
o Different applications have different needs
o New Sources of Data
o New Functionality

Page 44: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extending OGSA-DAI: Overview

[Diagram: OGSA-DAI architecture and its extensibility points. Labels recovered from the figure:]
• Presentation Layer: OMII, GT, Axis, UNICORE, WS-DAI, gLite, Embedded, ?
• OGSA-DAI Core: Activity Framework, Workflow Execution Engine, Data Resources, Sessions, Persistence and Configuration, Request, Data Source, Data Sink
• Activities: SQLQuery, XPathQuery, XSLTransform, DeliverToURL, MyOwnActivity
• Extensibility points: New Types of Data, New Functionality, New Message Frameworks

Page 45: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extending OGSA-DAI: Activities

• Activities do some unit of work
• Specific transformation
o Data Format: SWISS-PROT to format X
• Delivery
o Deliver to a target service
• Data analysis and Integration
o Combine data from different sources

Page 46: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extending OGSA-DAI: Resources

• New resources – why?
o New Products
o New Applications
o Specialised Access
• Required:
o DataResource
o DataResourceState
o ResourceAccessor

Page 47: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Extending OGSA-DAI: Remote Resource

• Accessing Resources on Remote OGSA-DAI
• Avoid replication of resources
• Security Issues
o Devolved to Local OGSA-DAI
o Security between OGSA-DAI Deployments

Page 48: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

SQL views

• Define a drPatient view
o SELECT patient.id, patient.name, age, sex, doctor.name AS drName FROM patient, doctor WHERE patient.DrID = doctor.ID;
• Client runs SELECT * FROM drPatient;
• Shorthand for complex query results
• Data access control e.g. users of drPatient
o Cannot access a patient’s ZIP
o Are unaware of the doctor or patient tables

patient table:

ID  Name   Age  Sex  ZIP        Dr ID
1   Ken    42   M    IL1478305  456
2   Josie  25   F    BN1 7QP    789

doctor table:

ID   Name      DN
123  Greene    US-Chicago-G
456  Ross      US-Chicago-R
789  Fairhead  UK-Holby-F

Page 49: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI SQL views

• OGSA-DAI SQL views data resource
o Represents a view across a database exposed by an OGSA-DAI relational resource
• SQLQuery activity
o Parses query
o Splices in view definition
o Submits transformed query to database
• Can define views for read-only databases
• Schema transformation
o Map a logical schema to a physical schema
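
The "splice in the view definition" step can be pictured with a deliberately naive Java sketch. It is an assumption-laden illustration: `ViewSpliceSketch` is a hypothetical class, and it uses a regular expression on the FROM clause where the SQLQuery activity is described as parsing the query, so it would not cope with joins, aliases or nested queries.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, regex-based illustration of view splicing; NOT the OGSA-DAI implementation.
class ViewSpliceSketch {
    // Replaces "FROM <viewName>" with "FROM (<viewDefinition>) AS <viewName>".
    static String splice(String query, String viewName, String viewDefinition) {
        Pattern fromView = Pattern.compile(
                "\\bFROM\\s+" + Pattern.quote(viewName) + "\\b",
                Pattern.CASE_INSENSITIVE);
        String replacement = "FROM (" + viewDefinition + ") AS " + viewName;
        // quoteReplacement keeps any '$' or '\' in the SQL from being read as group references.
        return fromView.matcher(query).replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

With this sketch, a client query against drPatient is rewritten into a query over the underlying tables, which is why such a view can be offered even when the database itself is read-only and cannot store a view definition.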

Page 50: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Distributed query processing

• OGSA-DQP
o Developed by Universities of Manchester and Newcastle
o Refactored for OGSA-DAI 3.0 by EPCC as part of the NextGrid project
o OGSA-DAI DQP package
• Multiple tables on multiple databases are exposed to clients as multiple tables in one “virtual database”
• Clients are unaware of the multiple databases
• Databases can be exposed
o EITHER within one OGSA-DAI server
o OR via multiple remote OGSA-DAI servers

Page 51: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI DQP

[Diagram: a client, an OGSA-DAI server (core + DQP coordinator), an OGSA-DAI server (DQP query evaluator) and two further OGSA-DAI servers.]

1: SELECT Archeo_Finds.ID, Archeo_Finds.Provenance, Annotations_Ratings.Confidence FROM Annotations_Ratings, Archeo_Finds WHERE Annotations_Ratings.Confidence > 0.99 AND Annotations_Ratings.ID = Archeo_Finds.ID;
2: Parse query and form query plan
3: Execute sub-queries
3a: SELECT Archeo_Finds.ID, Archeo_Finds.Provenance FROM Archeo_Finds;
3b: SELECT Annotations_Ratings.ID, Annotations_Ratings.Confidence FROM Annotations_Ratings WHERE Annotations_Ratings.Confidence > 0.99
4: Push results
5: Combine and post-process – do the JOIN
5: Results

Page 52: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI workflows – a de-facto standard

• OGSA-DAI workflows are a de-facto standard
o Of use to many projects as we’ll see
• For some applications workflows are too powerful
o Too expressive
o Infer semantics from names of activities available on server
  • Must interrogate the server
o Problems using OGSA-DAI services in workflow engines e.g. Taverna
o Not compatible with existing data analysis tools

Page 53: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Facades

• Define facades on top of OGSA-DAI
• Why?
o Provide interfaces with more tightly-defined semantics
o Comply with standards
o Exploit existing data analysis tools
• Continue to exploit the power of workflows under-the-hood
o “Canned workflows”
o Templates selected and populated, executed and parsed
o Map service operations to “template” OGSA-DAI workflows

Page 54: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Grid-enabling existing data-related products

[Diagram labels: Data analysis tool; OGSA-DAI mediator; OGSA-DAI.]

Page 55: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

OGSA-DAI in action

Page 56: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

VOTES – data with different schema distributed across multiple databases within a group of strategic partners

• Virtual Organisations for Trials and Epidemiological Studies (VOTES)
o http://labserv.nesc.gla.ac.uk/projects/votes/index.html
o UK Medical Research Council project
• Data access and integration in the clinical domain
o Relational databases – Microsoft SQL Server, Access, …
o Distributed database joins
  • Patient information
  • Clinical trials records
o Linking key is Scotland’s CHI number

Page 57: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

VOTES – cross-database join activity

• This is equivalent to running:

SELECT patients.chi, sex, DOB, diagnosis FROM patients, trialX WHERE patients.chi = trialX.chi;

• patients and trialX are in two different databases

[Diagram: an OGSA-DAI workflow over DB1 and DB2 that joins two ordered data streams.]
• SQLQuery(DB1): SELECT CHI, Sex, DOB FROM Patients ORDER BY CHI, producing (CHI, Sex, DOB)
• SQLQuery(DB2): SELECT CHI, Diagnosis FROM TrialX ORDER BY CHI, producing (CHI, Diagnosis)
• MergeJoin, producing (CHI, Sex, DOB, Diagnosis)
• Deliver
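
Both sub-queries sort by CHI so that the join can consume the two streams in a single pass. The following Java sketch shows that merge-join idea in isolation. It is not the VOTES or OGSA-DAI MergeJoin code: `MergeJoinSketch` is hypothetical, it assumes the key is the first column, that keys are unique in the left stream, and that both streams are sorted in the same order that `String.compareTo` uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical merge join over two key-ordered row streams; NOT the OGSA-DAI MergeJoin activity.
class MergeJoinSketch {
    // Each row is a String[] whose first element is the join key.
    // Assumes: both inputs sorted ascending by key; keys unique on the left; right may repeat keys.
    static List<String[]> join(Iterator<String[]> left, Iterator<String[]> right) {
        List<String[]> joined = new ArrayList<>();
        String[] l = next(left);
        String[] r = next(right);
        while (l != null && r != null) {
            int cmp = l[0].compareTo(r[0]);
            if (cmp < 0) {
                l = next(left);          // left key has no (more) partners on the right
            } else if (cmp > 0) {
                r = next(right);         // right key has no partner on the left
            } else {
                // Keys match: emit left columns followed by the right's non-key columns.
                String[] row = Arrays.copyOf(l, l.length + r.length - 1);
                System.arraycopy(r, 1, row, l.length, r.length - 1);
                joined.add(row);
                r = next(right);         // keep the left row in case the right key repeats
            }
        }
        return joined;
    }

    private static String[] next(Iterator<String[]> rows) {
        return rows.hasNext() ? rows.next() : null;
    }
}
```

Only one row from each side is held at a time, so the join streams; the price is that both inputs must arrive sorted on the key, which is what the ORDER BY CHI clauses provide.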

Page 58: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Public Health Grid – data with different schema distributed across multiple databases within a group of strategic partners

• US Public Health Grid
o US Centers for Disease Control
o University of Pittsburgh
o Tarrant County Public Health Department
o Dallas County Public Health Department
• Real-time Outbreak and Disease Surveillance
o Health query system
o Look for incidences of some disease on the rise over an area
o Historical and live data
• Health centres maintain their own databases
o Distributed databases
o Different products and schemas
  • e.g. PatientID, Id, PatientIdentifier, PatientNumber
o Security and privacy are important

Page 59: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Public Health Grid – workflows, DQP and views

[Diagram: an OGSA-DAI workflow runs SQLQuery(DB6) and returns e.g. (15112, 3), (15144, 1).]

SELECT zip, count(*) AS total
FROM Cases
WHERE Reason = 'Flu'
GROUP BY zip
ORDER BY zip

• DB6 is a view provided through OGSA-DQP:
Cases: SELECT * FROM DB1.Cases UNION DB2.Cases UNION DB4.Cases
• Other labels in the figure: DB1, DB2, DB3, DB4, DB5, View, OGSA-DAI (three servers)

Page 60: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

SEE-GEO – working with private and public data

• SEcurE access to GEOspatial services
o http://edina.ac.uk/projects/seesaw/seegeo/index.html
o EDINA, MIMAS, NeSC, NCeSS
o UK JISC project
• Geographical information systems
• Virtual integration of and access control to
o Census data – geo-data access service
o Borders data – web feature service
o Data hosted by other organisations and exposed as services

Page 61: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

SEE-GEO – geo-linking service portal

[Diagram: the GLS Portal drives an OGSA-DAI workflow (Get, Get, Join, Transform, Deliver) over the MIMAS Census service and UK BORDERS, with an Image Creation Service and a map server (Maps).]

1: GLSQuery submitted via portal e.g. “Leeds population distribution by census output area”
2: Workflow is populated with query parameters and run
3: Image is placed on a map server
4: URL of image is returned to portal – avoids costly SOAP/HTTP transfer of image
5: Portal gets image using URL

Page 62: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Why OGSA-DAI?

Page 63: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Workflows

• A workflow can represent a complex data management scenario, involving:

o Data access
o Transformation
o Filtering
o Updating
o Numerous distributed, heterogeneous databases

Page 64: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Workflows and performance

• OGSA-DAI is one more layer between clients and data

• Therefore, OGSA-DAI is not as fast as a direct connection to a database

o OGSA-DAI uses JDBC so will never be as fast as a direct JDBC connection

• But this is not what OGSA-DAI is designed to do

Page 65: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Workflows and performance

• Having a server execute workflows yields
o Thinner clients with less memory and CPU requirements
o Minimised client-server communication overheads
• Activities process data on the server
o Minimises data movement
o As opposed to BPEL or Taverna or web service-based workflow engines which pass data to and fro via web services
• Data streaming
o Activities work on different parts of the data stream in parallel
o Reduces memory footprint on server
o Reduces execution time

Page 66: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Why another layer can be good

• Data providers retain control of their data
• A place to hide database heterogeneities
o Yields thinner clients
• A place to enforce additional security
o Hide the actual location of the data
o Filter the data according to the rights of clients
o Manage access to federations, databases, tables, documents, files, rows, lines

• A place to define views on read-only databases

Page 67: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Developing applications

• OGSA-DAI is highly extensible
o Data resources, activities, security, presentation layers
• An enabling framework
o Save development time
o Focus on application-specific features
o Get standard functionalities out-of-the-box
  • Queries, updates, transformations, deliveries

Page 68: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Portability

• OGSA-DAI is 100% Java
o Runs under Windows, UNIX, Linux
• OGSA-DAI uses web services
o Clients can be written in any language and on any platform that supports web services

Page 69: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Second and third hands-on sessions

Go to: http://homepages.nesc.ac.uk/~elias/issgc09/html/practical.html#ScenarioTwoDataIntegration

Page 70: Session 43 :: Accessing data using a common interface: OGSA-DAI as an example

Further information

• WWW site : http://www.ogsadai.org.uk
• Info : [email protected]
• Users e-mail list : [email protected]

