Page 1: Architectural Solutions in Data Integration - Jarrar...Keywords: Data Integration, Application-driven Integration, Data-driven Integration, Web Services, RPC, Publish & Subscribe,

Jarrar © 2019 1

Architectural Solutions in Data Integration

Mustafa Jarrar: Lecture Notes on Architectural Solutions in Data Integration. Birzeit University, 2018

Mustafa Jarrar
Birzeit University

Version 5


Online Courses: http://www.jarrar.info/courses

Watch this lecture and download the slides

Thanks to Anton Deik for helping me prepare this lecture


Architectural Solutions in Data Integration

Part 1: Application-driven Integration Architectures

Part 2: Information Integration Architectures

Part 3: What Integration Criteria to Use

Keywords: Data Integration, Application-driven Integration, Data-driven Integration, Web Services, RPC, Publish & Subscribe, Consolidation, Data Warehouse, Service Oriented Architecture, Virtual Data Integration, Query complexity, heterogeneity

Mustafa Jarrar: Lecture Notes on Architectural Solutions in Data Integration. Birzeit University, 2018


Different Solutions

Two families of solutions for the integration issue:

– Application-driven Integration
  • Various types of middleware (e.g. Web Services, Remote Procedure Call (RPC), Publish & Subscribe) that achieve reconciliation through application-to-middleware communication

– Data-driven Integration
  • Various types of data reconciliation and integration
    – Consolidation
    – Data Fusion
    – Data Integration


Application-driven Integration

1- Service Oriented Architecture Scenario

[Figure: several nodes, each pairing an adapter server (AS) with a security server (SS), exchange data messages MSG-1 … MSG-N over an enterprise service bus. Legend: SS = Security Server, AS = Adapter Server, MSG = Data Message.]


Application-driven Integration

2- Publish-Subscribe Architecture Scenario

[Figure: Applications 1 … n, each with its own source (Source 1 … Source n), are connected through a middleware. Numbered arrows (1-7) trace the update of an object O: one application publishes, the others subscribe.]

• An application updates an object O via the middleware, which then publishes the update; the other applications that subscribe to such updates also update their own sources.

• Typical application-driven integration architecture for integration of updates.

Based on Carlo Batini [13]
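
The publish-subscribe flow described on this slide can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch: the `Middleware` and `Application` classes, the topic name `"client"` and the sample data are all assumptions made for the example, not part of the lecture or of any real middleware product.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, dict], None]


class Middleware:
    """Toy in-memory broker (hypothetical, not a real middleware API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, key: str, value: dict) -> None:
        # Deliver the update to every application subscribed to the topic.
        for handler in self._subscribers[topic]:
            handler(key, value)


class Application:
    """An application that owns one data source (here: a plain dict)."""

    def __init__(self, name: str, middleware: Middleware) -> None:
        self.name = name
        self.source: Dict[str, dict] = {}
        self.middleware = middleware

    def subscribe_to(self, topic: str) -> None:
        self.middleware.subscribe(topic, self._on_update)

    def _on_update(self, key: str, value: dict) -> None:
        # A received update is applied to this application's own source.
        self.source[key] = value

    def update(self, topic: str, key: str, value: dict) -> None:
        # The update goes through the middleware, which publishes it.
        self.middleware.publish(topic, key, value)


mw = Middleware()
apps = [Application(f"app{i}", mw) for i in (1, 2, 3)]
for app in apps:
    app.subscribe_to("client")

# Application 1 updates object "O"; every subscriber's source follows.
apps[0].update("client", "O", {"city": "Ramallah"})
print([app.source for app in apps])
```

In this sketch the publishing application is itself a subscriber, so its own source is updated through the same path as the others; a real deployment would also need delivery guarantees and conflict handling, which are left out here.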


Part 2: Information Integration Architectures


Information Integration Architectures

3- Consolidation Scenario

Merge all data sources into one new schema, and drop the old ones.

[Figure: Sources 1 … n are merged into a unique DB. New architecture: once for all.]

Based on Carlo Batini [13]
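
As a rough illustration of consolidation, the sketch below merges two sources into one new schema and then drops the old tables, using Python's built-in `sqlite3` module. The table names, column names and rows are made up for the example; in practice the sources would be separate databases and the schema mapping would be far less trivial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two legacy sources with different local schemas (hypothetical names and data).
conn.executescript("""
CREATE TABLE source1_clients (id INTEGER, full_name TEXT);
CREATE TABLE source2_customers (cust_no INTEGER, name TEXT);
INSERT INTO source1_clients VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO source2_customers VALUES (10, 'Carol');
""")

# Consolidation: create the new unified schema, copy everything, drop the old.
conn.executescript("""
CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT, origin TEXT);
INSERT INTO client (name, origin)
    SELECT full_name, 'source1' FROM source1_clients ORDER BY id;
INSERT INTO client (name, origin)
    SELECT name, 'source2' FROM source2_customers ORDER BY cust_no;
DROP TABLE source1_clients;
DROP TABLE source2_customers;
""")

rows = conn.execute("SELECT name, origin FROM client ORDER BY client_id").fetchall()
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
print(rows)
print(tables)
```

After the script runs, only the unified `client` table remains, which is the "once for all" character of this scenario: the old sources no longer exist as independent systems.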


Information Integration Architectures

4- Data Warehouse Scenario

[Figure: Sources 1 … n are loaded through data warehouse middleware into a new database (unique DB). New architecture: periodically updated.]

Based on Carlo Batini [13]
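
A minimal sketch of the "periodically updated" idea follows, again with `sqlite3`. The `load_snapshot` function, the `sales_fact` table, the source names and the figures are assumptions chosen for illustration; a real warehouse load would also include extraction, cleaning and transformation steps that are omitted here.

```python
import sqlite3
from datetime import date
from typing import List, Tuple


def load_snapshot(warehouse: sqlite3.Connection, source_name: str,
                  rows: List[Tuple[str, float]], load_date: date) -> None:
    """Append the current content of one source to the warehouse.

    Earlier loads are never overwritten, so the history is preserved.
    """
    warehouse.executemany(
        "INSERT INTO sales_fact (source, product, amount, load_date) "
        "VALUES (?, ?, ?, ?)",
        [(source_name, product, amount, load_date.isoformat())
         for product, amount in rows],
    )
    warehouse.commit()


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (source TEXT, product TEXT, amount REAL, load_date TEXT)"
)

# First periodic load: the sources keep running and are only read.
load_snapshot(warehouse, "retail", [("pen", 10.0), ("book", 25.0)], date(2020, 1, 1))
load_snapshot(warehouse, "online", [("pen", 4.0)], date(2020, 1, 1))

# Next period: new data is added next to the old data.
load_snapshot(warehouse, "retail", [("pen", 12.0)], date(2020, 2, 1))

totals = warehouse.execute(
    "SELECT load_date, SUM(amount) FROM sales_fact "
    "GROUP BY load_date ORDER BY load_date"
).fetchall()
print(totals)
```

Unlike consolidation, the sources stay in place; the warehouse is a new database that is refreshed on a schedule, so queries over it see data as of the last load rather than the current state of the sources.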


Information Integration Architectures

5- Virtual Data Integration Scenario

[Figure: a mediator exposes a global schema over Sources 1 … n, each of which keeps its own local schema. New architecture: no new database!]

Based on Carlo Batini [13]
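
The mediator idea can be sketched as follows. Everything here is hypothetical: the two sources, their local field names, the wrapper functions and the global schema (`id`, `name`, `city`) are invented for the example. The point is only that the mediator stores no data and translates each local schema into the global one at query time.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]

# Two autonomous sources with different local schemas (made-up data).
source1: List[Row] = [{"cid": 1, "fullname": "Alice", "town": "Ramallah"}]
source2: List[Row] = [{"customer_no": 7, "name": "Bob", "city": "Birzeit"}]


def wrap_source1() -> Iterator[Row]:
    """Map Source 1's local schema onto the global schema."""
    for r in source1:
        yield {"id": f"s1-{r['cid']}", "name": r["fullname"], "city": r["town"]}


def wrap_source2() -> Iterator[Row]:
    """Map Source 2's local schema onto the global schema."""
    for r in source2:
        yield {"id": f"s2-{r['customer_no']}", "name": r["name"], "city": r["city"]}


class Mediator:
    """Answers queries over the global schema without storing any data."""

    def __init__(self, wrappers: List[Callable[[], Iterator[Row]]]) -> None:
        self.wrappers = wrappers

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Each query is evaluated against the live sources.
        return [row for wrapper in self.wrappers for row in wrapper() if predicate(row)]


mediator = Mediator([wrap_source1, wrap_source2])
before = mediator.query(lambda c: c["city"] == "Birzeit")

# A change in a source is visible in the very next query: there is no copy to refresh.
source1.append({"cid": 2, "fullname": "Dana", "town": "Birzeit"})
after = mediator.query(lambda c: c["city"] == "Birzeit")
print(before)
print(after)
```

Because nothing is copied, the sources stay fully autonomous and queries always see current data; the cost is that every query pays for the translation and for access to all sources.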


Part 3: What Integration Criteria to Use


The integration problem…

[Figure: Sources 1 … n (labelled Registry of clients 1, Registry of clients 2, Retail sales, Online sales, Other) are to be combined into a new architecture. Which kind of integration? How to decide?]

Based on Carlo Batini [13]


What Integration Criteria to Use

1. Autonomy, the degree of independence between the different database administrators in their design choices;

2. Relevance of historical data, and consequent need to periodically store new data without deleting the old ones;

3. Query complexity, in terms of amount of data and tables visited and number of operators on them, and consequent time complexity in query execution;

4. Relevance of currency in queries, the need for queries to extract current data;

5. Economic value of integration, the relevance of having integrated information in input for business operational and decisional processes in order to produce effective outputs;

Based on Carlo Batini [13]


What Integration Criteria to Use

6. Volatility of sources, frequency of adding or deleting sources, and frequency of change of source schemas;

7. Relevance of queries w.r.t. transactions, relative importance and frequency of queries with respect to changes in data;

8. Management complexity, the effort to be spent in management activities related to databases and hardware/software infrastructures, due to the corresponding complexity of the organizations using the databases;

9. Costs of heterogeneity, hidden and explicit costs related to business processes that are due to making use of heterogeneous data.

Based on Carlo Batini [13]


References

[1] Mustafa Jarrar, Anton Deik: The Graph Signature: A Scalable Query Optimization Index for RDF Graph Databases Using Bisimulation and Trace Equivalence Summarization. International Journal on Semantic Web and Information Systems, 11(2), 36-65. April-June 2015.

[2] Mustafa Jarrar, Anton Deik, Bilal Faraj: Ontology-Based Data And Process Governance Framework - The Case Of E-Government Interoperability In Palestine. In pre-proceedings of the IFIP International Symposium on Data-Driven Process Discovery and Analysis (SIMPDA’11). Pages (83-98). 2011.

[3] Mustafa Jarrar and Marios D. Dikaiakos: A Query Formulation Language for the Data Web. The IEEE Transactions on Knowledge and Data Engineering. IEEE Computer Society. Pages (783-798). Volume 24, Number 4, April 2012.

[4] Paolo Ceravolo, Chengfei Liu, Mustafa Jarrar, Kai-Uwe Sattler: Special Issue on Querying the Data Web - Novel techniques for querying structured data on the web. The World Wide Web Journal. Volume (14), Issue (5-6). Springer. August 2011. ISSN: 1573-1413.

[5] Anton Deik, Bilal Faraj, Ala Hawash, Mustafa Jarrar: Towards Query Optimization for the Data Web - Two Disk-Based algorithms: Trace Equivalence and Bisimilarity. Proceedings of the 3rd Palestinian International Conference on Computer and Information Technology (PICCIT 2010). 2010.

[6] Mustafa Jarrar, Marios D. Dikaiakos: Querying the Data Web: the MashQL Approach. IEEE Internet Computing. Volume 14, No. 3. Pages (58-67). IEEE Computer Society, ISSN 1089-7801. May 2010.

[7] Mustafa Jarrar and Marios D. Dikaiakos: A Data Mashup Language for the Data Web. Proceedings of LDOW, WWW'09. ACM. ISSN 1613-0073. (2009).

[8] Mustafa Jarrar and Marios D. Dikaiakos: MashQL: a query-by-diagram topping SPARQL - Towards Semantic Data Mashups. Proceedings of ONISW'08, part of the ACM CiKM conference. ACM. Pages (89-96). ISBN 9781605582559. (2008).

[9] Mustafa Jarrar: Towards methodological principles for ontology engineering. PhD Thesis. Vrije Universiteit Brussel. (May 2005).

[10] Mustafa Jarrar, Luk Vervenne, Diana Maynard: HR-Semantics Roadmap - The Semantic challenges and opportunities in the Human Resources domain. Technical Report. The Ontology Outreach Advisory, Belgium. (OOA-HR/2007-08-20/v025). August 2007.

[11] Lyndon Nixon, Malgorzata Mochol, Mustafa Jarrar, Stamatia Dasiopoulou, Vasileios Papastathis, and Yiannis Kompatsiaris: Prototypical business use cases. Deliverable D1.1.2 (WP1.1), The Knowledge Web Network of Excellence (NoE) IST-2004-507482, Luxemburg. January 2005.

[12] Peter Spyns, Daniel Oberle, Raphael Volz, Jijuan Zheng, Mustafa Jarrar, York Sure, Rudi Studer, and Robert Meersman: OntoWeb - a Semantic Web Community Portal. Proceedings of the 4th International Conference on Practical Aspects of Knowledge Management (PAKM 2002). Pages (189-200). LNCS 2569, Springer. ISBN: 3540003142. December 2002.

[13] Carlo Batini: Course on Data Integration. BZU IT Summer School 2011.

[14] Stefano Spaccapietra: Information Integration. Presentation at the IFIP Academy. Porto Alegre. 2005.

[15] Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center. Menlo Park, USA. 2009.

