Introduction to OGSA-DAI

Post on 13-Jan-2016


http://www.ogsadai.org.uk


The OGSA-DAI Team

info@ogsadai.org.uk


The OGSA-DAI Project

A generic framework for integrating data access and computation
– Uniform interface to relational, XML, and flat-file data resources

Using the grid to take specific classes of computation nearer to the data

Kit of parts for building tailored access and integration applications

Investigations to inform DAIS-WG

One reference implementation for DAIS

Releases publicly available NOW


Project Partners

Powered by ….

Funded by the Grid Core Programme


Project Membership

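[Organisation chart. Roles and groups shown: Principal Investigators, Project Manager, Programme Management Board Chair, Technical Review Board Chair, Research Team, EPCC Team, IBM Development Team, IBM Dissemination Team. First names shown: Charaka, Mike, Ally, Amy, Mario, Malcolm, Kostas, Norman, Paul, Neil, Andy, Simon, Brian, Dave, Patrick. The mapping of names to roles is not recoverable from the transcript.]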


Project Status

Current release 4.0
– Globus Toolkit 3.2 compliant
– Platform and language independent
  • Java 1.4
  • Document model

Work concentrated on data access
– Wraps data resources without hiding underlying data model
– Provides base for higher-level services
  • Distributed Query Processing (DQP)
  • Data federation services


Supported Data Resources

Relational    XML        Other
MySQL         Xindice    Files
DB2           eXist      ?
Oracle
PostgreSQL
SQLServer


Web Service Architecture

[Diagram: Service Registry, Service Consumer and Service Provider, linked by Publish, Discover and Bind.]


OGSA-DAI Service Architecture

[Diagram: the same Publish, Discover and Bind arrangement, with the DAISGR as registry, the GDSF and GDS as service provider, and the Service Consumer.]


OGSA-DAI Services

OGSA-DAI uses three main service types
– DAISGR (registry) for discovery
– GDSF (factory) to represent a data resource
– GDS (data service) to access a data resource

This will change

[Diagram: DAISGR locates GDSF; GDSF creates GDS; GDSF represents the Data Resource; GDS accesses the Data Resource.]


GDSF and GDS

Grid Data Service Factory (GDSF)
– Represents a data resource
– Persistent service
  • Currently static (no dynamic GDSFs)
    – Cannot instantiate new services to represent other/new databases
– Exposes capabilities and metadata
– May register with a DAISGR

Grid Data Service (GDS)
– Created by a GDSF
– Generally transient service
– Required to access data resource
– Holds the client session


DAISGR

DAI Service Group Registry (DAISGR)
– Persistent service
– Based on OGSI ServiceGroups
– GDSFs may register with DAISGR
– Clients access DAISGR to discover
  • Resources
  • Services (may need specific capabilities)
    – Support a given portType or activity
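
Discovery by capability can be pictured with a small in-memory model. This is only a sketch: `RegistryEntry`, `ToyRegistry`, the handle strings and the activity names are hypothetical and are not taken from the OGSA-DAI or OGSI APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RegistryEntry:
    """One registered factory: what it represents and what it can do."""
    handle: str       # hypothetical stand-in for a service handle
    resource: str     # name of the data resource represented
    activities: frozenset = field(default_factory=frozenset)


class ToyRegistry:
    """Hypothetical DAISGR stand-in: factories register, clients search."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def find(self, resource=None, activity=None):
        """Return entries matching a resource name and/or a required activity."""
        return [
            entry for entry in self._entries
            if (resource is None or entry.resource == resource)
            and (activity is None or activity in entry.activities)
        ]


registry = ToyRegistry()
registry.register(RegistryEntry("handle-a", "Frogs",
                                frozenset({"sqlQuery", "sqlUpdate"})))
registry.register(RegistryEntry("handle-b", "Docs",
                                frozenset({"xPathQuery"})))

# A client can search by resource, by capability, or by both.
frog_factories = registry.find(resource="Frogs")
xpath_factories = registry.find(activity="xPathQuery")
```

A client that needs a specific capability filters on it, which is why the metadata each factory publishes has to be adequate.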


[Diagram: Analyst, Registry (DAISGR), Factory (GDSF) and the data Location, connected by registerService and findServiceData calls.]

Data resource publication through registry

Data location hidden by factory

Data resource metadata available through Service Data Elements


Interaction Model: Start up

[Diagram: two OGSI containers hosting the DAISGR and the GDSF.]

1. Start OGSI containers with persistent services.
2. Here the GDSF represents the Frog database.


Interaction Model: Registration

[Diagram: two OGSI containers hosting the DAISGR and the GDSF; the DAISGR now holds the entry "Frogs: GSH".]

3. GDSF registers with DAISGR.


Interaction Model: Discovery

[Diagram: the client ("Mmmmm… Frogs?") sends "FindService: Frogs" to the DAISGR, which holds "Frogs: GSH", and receives "GSH: GDSF".]

4. Client wants to know about frogs. It can:
(i) Query the GDSF directly if known, or
(ii) Identify a suitable GDSF through the DAISGR.


Interaction Model: Service Creation

[Diagram: the client sends CreateService to the GDSF, a GDS is created, and the client receives "GSH: GDS".]

5. Having identified a suitable GDSF, the client asks for a GDS to be created.


Interaction Model: Perform

[Diagram: the client sends a Perform Document to the GDS and receives a Response Document.]

6. Client interacts with GDS by sending Perform documents.
7. GDS responds with a Response document.
8. Client may terminate GDS when finished or let it die naturally.
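
The eight steps of the interaction model can be walked through with a toy, in-process model. This is a sketch only: the class and method names (`ToyRegistry`, `ToyFactory`, `ToyDataService`, `perform`, and so on) and the dictionary-shaped request are hypothetical stand-ins, not the OGSA-DAI API, and an in-memory SQLite database plays the part of the Frog database.

```python
import sqlite3


class ToyDataService:
    """Hypothetical GDS stand-in: transient, holds the client session."""

    def __init__(self, connection):
        self._connection = connection
        self.alive = True

    def perform(self, perform_document):
        """Run the query named in the request and return a response document."""
        if not self.alive:
            raise RuntimeError("service has been terminated")
        rows = self._connection.execute(perform_document["sqlQuery"]).fetchall()
        return {"status": "COMPLETE", "rows": rows}

    def terminate(self):
        self._connection.close()
        self.alive = False


class ToyFactory:
    """Hypothetical GDSF stand-in: persistent, represents one data resource."""

    def __init__(self, resource_name, setup_sql):
        self.resource_name = resource_name
        self._setup_sql = setup_sql

    def create_service(self):
        # In this toy, each data service builds its own in-memory copy of the data.
        connection = sqlite3.connect(":memory:")
        connection.executescript(self._setup_sql)
        return ToyDataService(connection)


class ToyRegistry:
    """Hypothetical DAISGR stand-in: maps resource names to factories."""

    def __init__(self):
        self._factories = {}

    def register_service(self, factory):
        self._factories[factory.resource_name] = factory

    def find_service(self, resource_name):
        return self._factories[resource_name]


# Steps 1-2: start the persistent services; the factory represents "Frogs".
registry = ToyRegistry()
factory = ToyFactory(
    "Frogs",
    "CREATE TABLE frogs (name TEXT); INSERT INTO frogs VALUES ('tree frog');",
)
# Step 3: the factory registers with the registry.
registry.register_service(factory)
# Step 4: the client discovers a suitable factory through the registry.
found = registry.find_service("Frogs")
# Step 5: the client asks the factory to create a data service.
service = found.create_service()
# Steps 6-7: the client sends a request document and gets a response document.
response = service.perform({"sqlQuery": "SELECT name FROM frogs"})
# Step 8: the client terminates the data service when finished.
service.terminate()
```

The client never opens a database connection itself; it only holds references to services, which is the point of the access use case.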


Interaction Model: Summary

Only described an access use case
– Client not concerned with connection mechanism
– Similar framework could accommodate service-service interactions

Discovery aspect is important
– Probably requires a human
– Needs adequate definition of metadata
  • Definitions of ontologies and vocabularies - not something that OGSA-DAI is doing …


More Complex Behaviour

[Diagram: a client, two containers each hosting a GDS with GDT, and their data resources.]

Deliver data back to the client.

Deliver data to a third party.

Deliver data to another GDS.

And there's a lot more that you can do …
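
The three delivery options can be sketched with a toy model in which the request says where the result should go. The names here (`Sink`, `ToyDataService`, `accept`, `deliver_to`) are hypothetical and only illustrate the routing idea; they are not OGSA-DAI activities.

```python
class Sink:
    """Hypothetical third party that collects whatever is delivered to it."""

    def __init__(self):
        self.received = []

    def accept(self, data):
        self.received.append(data)


class ToyDataService:
    """Hypothetical GDS stand-in that can return or forward its results."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.received = []  # data delivered to this service by another one

    def accept(self, data):
        self.received.append(data)

    def perform(self, predicate, deliver_to=None):
        result = [row for row in self._rows if predicate(row)]
        if deliver_to is None:
            # Deliver data back to the client in the response.
            return {"status": "COMPLETE", "data": result}
        # Deliver data elsewhere; the client only gets a status back.
        deliver_to.accept(result)
        return {"status": "COMPLETE", "data": None}


def is_frog(name):
    return "frog" in name


service = ToyDataService(["tree frog", "toad", "bullfrog"])

to_client = service.perform(is_frog)               # back to the client
third_party = Sink()
service.perform(is_frog, deliver_to=third_party)   # to a third party
other_service = ToyDataService([])
service.perform(is_frog, deliver_to=other_service)  # to another data service
```

In the second and third cases the data never passes through the client, which matters when results are large.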


Usage Patterns

[Diagram: three usage patterns (Retrieve, Update/Insert, Pipeline) drawn as Call, Response and Data Flow arrows between actors.]

Actors: A - Analyst, C - Consumer, G - GDS, P - Producer (each marked as an OGSI process or a non-OGSI process)

Messages: Q - Query, D - Delivery, S - Status, R - Result, U - Update, I - Data id


Projects Using OGSA-DAI

OGSA-DAI (http://www.ogsadai.org.uk)
AstroGrid (http://www.astrogrid.org/)
BioSimGrid (http://www.biosimgrid.org/)
BioGrid (http://www.biogrid.jp/)
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
FirstDig (http://www.epcc.ed.ac.uk/~firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.php#genegrid)
GEON (http://www.geongrid.org/)
IU RGRBench (http://www.cs.indiana.edu/~plale/projects/RGR/OGSA-DAI.html)
myGrid (http://www.mygrid.org.uk/)
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-80=80)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
INWA (http://www.epcc.ed.ac.uk/)


Project classification

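[Diagram: projects using OGSA-DAI, grouped into Biological Sciences, Physical Sciences, Computer Sciences and Commercial Applications. The grouping of individual projects is not recoverable from the transcript.]

Projects shown: FirstDig, INWA, Bridges, AstroGrid, BioSimGrid, BioGrid, eDiaMoND, myGrid, ODD-Genes, N2Grid, GEON, MCS, IU RGRBench, OGSA-WebDB, GeneGrid, GridMiner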


Points to Note

Feedback from users largely positive
– Good suggestions
– Fair criticisms
– How OGSA-DAI is being used
– Where it succeeds and where it fails
– Helping us to capture requirements

Hope to allow user contributions
– Plan to establish a policy/framework for this

Engage more with User Community
– Meetings scheduled for this year
  • OGSA-DAI mini-workshop at AHM 2004
  • OGSA-DAI tutorials at various meetings/locations


e-Digital MammOgraphy National Database
– Mammogram - X-ray of the breast

Built prototype of a national database of mammographic images
– In support of the UK Breast screening programme

Employed Grid technologies to facilitate process

Thanks to the eDiaMoND project and the Digital Database for Screening Mammography for this image.


Breast screening in the UK began in 1988
– Women aged 50-64 screened every 3 years
– Women aged 50-70 from 2004
– 1 view/breast → 2 views by 2003

UK has
– Over 90 breast screening units throughout the UK
– Each one deals with about 45000 women on average p.a.

Each centre sees 5000-20000 images/year

In 2001-02 → 2002-03
– Screened: 1.4M → 1.5M
– Recalled for assessment: 77911 → 79441
– Cancers detected: 10003 → 10467
– Lives per year saved: 300 → 1250 (by 2010)

Distributed team of doctors perform the analysis
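
The screening figures above can be turned into rates with a few lines of arithmetic. This is a minimal sketch: the dictionary layout and the `rates` function are illustrative, and because the screened totals are rounded (1.4M, 1.5M), the results are approximate.

```python
# Figures as quoted on the slide; screened totals are rounded.
screening = {
    "2001-02": {"screened": 1_400_000, "recalled": 77_911, "cancers": 10_003},
    "2002-03": {"screened": 1_500_000, "recalled": 79_441, "cancers": 10_467},
}


def rates(year):
    """Derive simple rates for one screening year."""
    figures = screening[year]
    return {
        # Fraction of screened women recalled for assessment.
        "recall_rate": figures["recalled"] / figures["screened"],
        # Cancers detected per 1000 women screened.
        "cancers_per_1000_screened": 1000 * figures["cancers"] / figures["screened"],
        # Fraction of recalls in which a cancer was detected.
        "cancers_per_recall": figures["cancers"] / figures["recalled"],
    }


for year in screening:
    print(year, rates(year))
```

With these rounded totals, roughly 5.6% of women screened were recalled in 2001-02 and roughly 5.3% in 2002-03, with about 7 cancers detected per 1000 women screened in each year.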


[Diagram: eDiaMoND architecture across four sites labelled UCL, KCL, UED and CHU. Components shown: DB2 Content Manager (Database, Files), DB2 Federation, OGSA-DAI, Core Services, Training Services, Core API, Training API, Core & Training API, DataLoad, TrainingApp, Training Application.]


eDiaMoND Findings:
– OGSA-DAI provides a flexible framework
– Dynamically configure the system through discovery
– Activities can operate with different levels of granularity
– Federation can be introduced at various levels
– Good documentation on how to extend the framework
  • Extended Activities to access IBM DB2 Content Manager
– Changes between versions broke some things
  • Low level XML issues


FirstDIG

Data mining with the First Transport Group, UK
– Example: “When buses are more than 10 minutes late there is an 82% chance that revenue drops by at least 10%”
– “The results of this exercise will revolutionise the way we do things in the bus industry.” Darren Unwin, Divisional Manager, First South Yorkshire.

[Diagram: a Data Mining Application and an OGSA-DAI Client Application in front of four OGSA-DAI services.]


INWA

Innovation Node: Western Australia
– Informing Business & Regional Policy: Grid-enabled fusion of global data and local knowledge

Project
– Run from Nov 2003 - Aug 2004
– Involved 10 partners (6 UK + 4 Australia)

Aim
– Data mine commercially sensitive data
– Security an absolute MUST
– Employ Grid technologies
– Need access to data and computational resources

Demonstrator using:
– OGSA-DAI
  • Incorporate data resources
– Sun DCG's TOG (Transfer-queue Over Globus)
  • Handle job submission to analyse micro array data


[Diagram: INWA demonstrator spanning Curtin, Australia and EPCC, UK. Each site shows a Data Browser (user@australia, user@edinburgh), OGSA-DAI services in front of Bank and Telco data, and TOG with Grid Engine. Data sets shown: Telco data, Bank data, Australian property, UK Property.]


INWA: Lessons Learned

Performing Data Integration:
– TimeZone date problems

Security issues:
– Bugs in
  • JavaCoG in GT3
  • OGSA-DAI could not switch security for Grid data transfers
  • TOG had no security option
– All of these have been fixed

Middleware not mature enough for commercial deployment


Why OGSA-DAI?

Why use OGSA-DAI over JDBC?
– Can embed additional functionality at the service end
  • Transformations, compressions
  • Third party delivery
  • The extensible activity framework
– Avoiding unnecessary data movement
– Common interface to heterogeneous data resources
  • Relational, XML databases, and files
– Usefulness of the Registry for service discovery
  • Dynamic service binding process
  • Provision of good meta-data is necessary
– Language independence at the client end
  • Do not need to use Java
– Platform independence
  • Do not have to worry about connection technology, drivers, etc
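
Two of these points, a common interface to heterogeneous resources and extra functionality at the service end, can be illustrated with a toy sketch. Everything named here (`RelationalResource`, `FlatFileResource`, `serve`, `client_read`) is hypothetical, and JSON plus zlib are stand-ins chosen for brevity; none of this is the OGSA-DAI API or wire format.

```python
import csv
import io
import json
import sqlite3
import zlib


class RelationalResource:
    """Toy relational resource backed by an in-memory SQLite table."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE frogs (name TEXT, legs INTEGER)")
        self._db.execute("INSERT INTO frogs VALUES ('tree frog', 4)")

    def rows(self):
        cursor = self._db.execute("SELECT name, legs FROM frogs")
        columns = [column[0] for column in cursor.description]
        return [dict(zip(columns, row)) for row in cursor]


class FlatFileResource:
    """Toy flat-file resource: CSV text with a header line."""

    def __init__(self, text):
        self._text = text

    def rows(self):
        return [dict(row) for row in csv.DictReader(io.StringIO(self._text))]


def serve(resource, compress=False):
    """Service-side step: serialise the rows and optionally compress them."""
    payload = json.dumps(resource.rows()).encode("utf-8")
    return zlib.compress(payload) if compress else payload


def client_read(payload, compressed=False):
    """Client-side step: undo the compression, if any, and parse the rows."""
    raw = zlib.decompress(payload) if compressed else payload
    return json.loads(raw)


relational = client_read(serve(RelationalResource(), compress=True),
                         compressed=True)
flat_file = client_read(serve(FlatFileResource("name,legs\ncommon frog,4\n")))
```

The client code is the same for both resources, and the compression happens next to the data rather than after it has been moved. Note that the flat-file values arrive as strings, since CSV carries no type information.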