Lecture 1 ddbms

Page 1: Lecture 1 ddbms

Lecture December 21, 2011

Distributed Database Management System

By

Mangesh R. Wanjari

Asst. Professor, Department of CSE

Shri Ramdeobaba College of Engineering and Management, Nagpur

Page 2: Lecture 1 ddbms

Distributed Database Systems, Wednesday, December 21, 2011

Evolution of DDBMS

Distributed database management systems (DDBMS)
- Interconnected computer systems
- Data and processing functions reside on multiple sites

1970s: Centralized DBMS

1980s: Social and technical changes
- Ad hoc capability required
- Decentralized management structure common

1990s: New forces
- Computational capacity of personal computers
- Internet and the World Wide Web used for data access and distribution
- Data analysis through data mining and data warehousing

Page 3: Lecture 1 ddbms

Overview

• What and why?
• The Distributed Database Management Systems
• The Reference Architecture for Distributed Databases
• Data Fragmentation
• Distributed Transparency
• Distributed Database Design

Page 4: Lecture 1 ddbms

Definition

• A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.

• A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

• Distributed database system (DDBS) = DDB + DDBMS

Page 5: Lecture 1 ddbms

Features of Distributed Versus Centralized Databases

• Centralized Control
• Data Independence
• Reduction in Redundancy
• Complex Physical Structures and efficient access
• Integrity, Recovery, and Concurrency control
• Privacy and Security

Page 6: Lecture 1 ddbms

Why Distributed Databases

• Organizational and economic reasons
• Interconnection of existing databases
• Incremental growth
• Reduced communication overhead
• Performance considerations
• Reliability and availability


Page 8: Lecture 1 ddbms

Introduction

• The traditional database approach keeps all data at a central site and accesses it mostly through a client-server model.

• In a distributed database system, data are distributed over geographically separate sites.

• For example, consider a bank with four branches at different sites.

• There are two types of transaction: local transactions and global transactions.

• A global transaction has to access data at more than one site, so it needs attention to network communication, speed, efficient access, integrity, recovery, concurrency control, privacy and security.

Page 9: Lecture 1 ddbms


Centralized Database Management System

Page 10: Lecture 1 ddbms


Distributed Processing Environment

Page 11: Lecture 1 ddbms


Distributed Database Environment

Page 12: Lecture 1 ddbms

Traditional distributed processing architecture

[Figure: clients on LANs at four sites (Mumbai, Nagpur, Delhi, Bangalore) connected through a wide area network; the diagram shows a single DBMS.]

Page 13: Lecture 1 ddbms

Distributed Database Architecture

[Figure: four sites (Bangalore, Nagpur, Delhi, Mumbai) connected through a wide area network; each site has its own clients and its own DBMS.]

Page 14: Lecture 1 ddbms

Distributed Database Management System

[Figures: Components of a Distributed DBMS; Possible access methods.]


Page 16: Lecture 1 ddbms

Reference Architecture for Distributed DBMS

The reference model has two main parts:

1. Site-independent schemas
2. Site-dependent schemas

Page 17: Lecture 1 ddbms


FRAGMENTS AND PHYSICAL IMAGES FOR A GLOBAL RELATION

Page 18: Lecture 1 ddbms

What is so fascinating about this architecture?

The three most important features that motivate this architecture are:

• Separation of data fragmentation and allocation.
• The control of redundancy.
• The independence from local DBMS.


Page 20: Lecture 1 ddbms

Types of Data Fragmentation

• Horizontal Fragmentation
• Vertical Fragmentation
• Hybrid/Mixed Fragmentation

There are some rules that must be followed when defining fragments:

• Completeness condition
• Reconstruction condition
• Disjointness condition
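
The three conditions can be checked mechanically for a row-wise (horizontal) fragmentation. The following is a minimal Python sketch, assuming the relation and its fragments are given as lists of tuples without duplicate rows; the function name `check_fragmentation` and the sample rows are hypothetical and serve only as an illustration.

```python
def check_fragmentation(relation, fragments):
    """Check the three fragmentation rules for row-wise (horizontal) fragments.

    Assumes `relation` and each fragment are lists of hashable tuples
    and that the relation itself contains no duplicate rows.
    """
    stored = [row for fragment in fragments for row in fragment]
    # Completeness: every row of the global relation is stored in some fragment.
    complete = all(row in stored for row in relation)
    # Reconstruction: the union of the fragments gives back exactly the relation.
    reconstructible = set(stored) == set(relation)
    # Disjointness: no row is stored in more than one fragment.
    disjoint = len(stored) == len(set(stored))
    return {"complete": complete, "reconstructible": reconstructible, "disjoint": disjoint}


# Hypothetical sample data: (number, name, city)
relation = [(1, "Asha", "NGP"), (2, "Ravi", "MUM"), (3, "Meera", "NGP")]
good = [[(1, "Asha", "NGP"), (3, "Meera", "NGP")], [(2, "Ravi", "MUM")]]
overlapping = [
    [(1, "Asha", "NGP"), (3, "Meera", "NGP")],
    [(2, "Ravi", "MUM"), (3, "Meera", "NGP")],
]

print(check_fragmentation(relation, good))
print(check_fragmentation(relation, overlapping))
```

The second call reports a disjointness violation because one row is stored in both fragments.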

Page 21: Lecture 1 ddbms

Horizontal Fragmentation

Let a global relation be

SUPPLIER (SNUM, NAME, CITY)

SUPPLIER contains the supplier number, the supplier name and the city where the supplier is located. If every supplier is located either in Nagpur ("NGP") or in Mumbai ("MUM"), the horizontal fragmentation can be defined as follows, where SL denotes selection:

SUPPLIER1 = SL_{CITY = "NGP"} SUPPLIER
SUPPLIER2 = SL_{CITY = "MUM"} SUPPLIER

The SUPPLIER global relation can always be reconstructed through the union operation (UN):

SUPPLIER = SUPPLIER1 UN SUPPLIER2

The qualifications of the two fragments are q1: CITY = "NGP" and q2: CITY = "MUM".
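
The selection and union steps can be tried out on a small in-memory relation. This is a sketch only: the rows are made-up sample data, relations are modelled as lists of dictionaries, and the helper `select` is a hypothetical stand-in for the SL operator.

```python
# Hypothetical sample rows for SUPPLIER (SNUM, NAME, CITY).
SUPPLIER = [
    {"SNUM": 1, "NAME": "Asha", "CITY": "NGP"},
    {"SNUM": 2, "NAME": "Ravi", "CITY": "MUM"},
    {"SNUM": 3, "NAME": "Meera", "CITY": "NGP"},
]


def select(relation, predicate):
    """Selection (SL): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Horizontal fragments defined by the qualifications q1 and q2.
SUPPLIER1 = select(SUPPLIER, lambda row: row["CITY"] == "NGP")
SUPPLIER2 = select(SUPPLIER, lambda row: row["CITY"] == "MUM")

# Reconstruction by union (UN); the fragments are disjoint, so list
# concatenation is enough here.
reconstructed = sorted(SUPPLIER1 + SUPPLIER2, key=lambda row: row["SNUM"])
print(reconstructed == SUPPLIER)
```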

Page 22: Lecture 1 ddbms

Horizontal Fragmentation (contd.)

[Figure: a relation with attributes A1, A2, ..., An and tuples T1, ..., Tn divided row-wise into two fragments, one stored at Site 1 and one at Site 2.]

Derived Horizontal Fragmentation

SUPPLY (SNUM, PNUM, DEPTNUM, QUAN)

SUPPLY1 = SUPPLY SJ_{SNUM = SNUM} SUPPLIER1
SUPPLY2 = SUPPLY SJ_{SNUM = SNUM} SUPPLIER2

Here SJ denotes the semi-join: each SUPPLY fragment keeps the tuples whose SNUM appears in the corresponding SUPPLIER fragment.
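
A derived fragmentation can be sketched with a small semi-join helper. The rows below are invented sample data and `semi_join` is a hypothetical helper name, so this is an illustration under those assumptions rather than a reference implementation.

```python
# Hypothetical fragments of SUPPLIER, split by city.
SUPPLIER1 = [{"SNUM": 1, "NAME": "Asha", "CITY": "NGP"}]
SUPPLIER2 = [{"SNUM": 2, "NAME": "Ravi", "CITY": "MUM"}]

# Hypothetical rows for SUPPLY (SNUM, PNUM, DEPTNUM, QUAN).
SUPPLY = [
    {"SNUM": 1, "PNUM": 10, "DEPTNUM": 5, "QUAN": 100},
    {"SNUM": 2, "PNUM": 20, "DEPTNUM": 5, "QUAN": 40},
    {"SNUM": 1, "PNUM": 30, "DEPTNUM": 7, "QUAN": 15},
]


def semi_join(left, right, attr):
    """Semi-join (SJ): rows of `left` whose `attr` value occurs in `right`."""
    keys = {row[attr] for row in right}
    return [row for row in left if row[attr] in keys]


# Each SUPPLY fragment follows the fragment of SUPPLIER it refers to.
SUPPLY1 = semi_join(SUPPLY, SUPPLIER1, "SNUM")
SUPPLY2 = semi_join(SUPPLY, SUPPLIER2, "SNUM")

print([row["PNUM"] for row in SUPPLY1])
print([row["PNUM"] for row in SUPPLY2])
```

Placing SUPPLY1 at the same site as SUPPLIER1 lets a join between the two be evaluated locally.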

Page 23: Lecture 1 ddbms

Vertical Fragmentation

[Figure: original relation R with attributes A1, A2, A3, A4 and tuples t1, ..., tn divided column-wise into fragments RS1 (at SITE1) and RS2 (at SITE2); each fragment also carries a TID column.]

TID (tuple ID): a hidden attribute kept in every fragment to ensure an accurate and simple join reconstruction.

How to reconstruct: R = RS1 JN RS2 ... JN RSn

Join condition: RS1.TID = RS2.TID

Page 24: Lecture 1 ddbms

Vertical Fragmentation

EMPLOYEE (EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

A vertical fragmentation of this relation can be defined as follows, where PJ denotes projection:

EMPLOYEE1 = PJ_{EMPNUM, NAME, MGRNUM, DEPTNUM} EMPLOYEE
EMPLOYEE2 = PJ_{EMPNUM, SAL, TAX} EMPLOYEE

The fragmentation could, for instance, reflect an organization in which salaries and taxes are managed separately. The relation EMPLOYEE can be reconstructed with a join (JN) on the common key:

EMPLOYEE = EMPLOYEE1 JN_{EMPNUM = EMPNUM} EMPLOYEE2
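
The projection and join for the EMPLOYEE example can be sketched as follows. The rows are invented sample data, and `project` and `join` are hypothetical helper names standing in for the PJ and JN operators; the sketch assumes EMPNUM is unique in each fragment.

```python
# Hypothetical rows for EMPLOYEE (EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM).
EMPLOYEE = [
    {"EMPNUM": 1, "NAME": "Asha", "SAL": 50000, "TAX": 5000, "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "Ravi", "SAL": 42000, "TAX": 4000, "MGRNUM": 9, "DEPTNUM": 15},
]


def project(relation, attrs):
    """Projection (PJ): keep only the listed attributes of every row."""
    return [{attr: row[attr] for attr in attrs} for row in relation]


def join(left, right, key):
    """Join (JN) on a key attribute that is unique in `right`."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Both fragments keep the key EMPNUM so that the relation can be rebuilt.
EMPLOYEE1 = project(EMPLOYEE, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMPLOYEE2 = project(EMPLOYEE, ["EMPNUM", "SAL", "TAX"])

rebuilt = join(EMPLOYEE1, EMPLOYEE2, "EMPNUM")
print(rebuilt == EMPLOYEE)
```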

Page 25: Lecture 1 ddbms

Mixed Fragmentation

[Figure: relation R with attributes A1, A2, A3, A4, A5 divided both by region (USA, Europe) and by attribute group (A1 A2 A3 and A4 A5, labelled "Salary Attributes" and "Benefit Attributes"), giving the fragments Rs1, Rs2, Rs3 and Rs4.]

Page 26: Lecture 1 ddbms

Mixed Fragmentation

EMPLOYEE (EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

The following mixed fragmentation is obtained by applying the vertical fragmentation of the previous example, followed by a horizontal fragmentation on DEPTNUM:

EMPLOYEE1 = SL_{DEPTNUM <= 10} PJ_{EMPNUM, NAME, MGRNUM, DEPTNUM} EMPLOYEE
EMPLOYEE2 = SL_{10 < DEPTNUM <= 20} PJ_{EMPNUM, NAME, MGRNUM, DEPTNUM} EMPLOYEE
EMPLOYEE3 = SL_{DEPTNUM > 20} PJ_{EMPNUM, NAME, MGRNUM, DEPTNUM} EMPLOYEE
EMPLOYEE4 = PJ_{EMPNUM, NAME, SAL, TAX} EMPLOYEE

The reconstruction of relation EMPLOYEE is defined by the following expression:

EMPLOYEE = UN (EMPLOYEE1, EMPLOYEE2, EMPLOYEE3) JN_{EMPNUM = EMPNUM} PJ_{EMPNUM, SAL, TAX} EMPLOYEE4
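
A mixed fragmentation of this kind can be sketched by combining projection and selection. The rows are invented sample data and the helpers `project`, `select` and `join` are hypothetical names; the third fragment is taken here with the qualification DEPTNUM > 20 so that the three horizontal fragments do not overlap.

```python
# Hypothetical rows for EMPLOYEE (EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM).
EMPLOYEE = [
    {"EMPNUM": 1, "NAME": "Asha", "SAL": 50000, "TAX": 5000, "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "Ravi", "SAL": 42000, "TAX": 4000, "MGRNUM": 9, "DEPTNUM": 15},
    {"EMPNUM": 3, "NAME": "Meera", "SAL": 61000, "TAX": 7000, "MGRNUM": 8, "DEPTNUM": 25},
]


def project(relation, attrs):
    """Projection (PJ): keep only the listed attributes of every row."""
    return [{attr: row[attr] for attr in attrs} for row in relation]


def select(relation, predicate):
    """Selection (SL): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def join(left, right, key):
    """Join (JN) on a key attribute that is unique in `right`."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Vertical step first, then a horizontal split of the first vertical fragment.
admin = project(EMPLOYEE, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMPLOYEE1 = select(admin, lambda row: row["DEPTNUM"] <= 10)
EMPLOYEE2 = select(admin, lambda row: 10 < row["DEPTNUM"] <= 20)
EMPLOYEE3 = select(admin, lambda row: row["DEPTNUM"] > 20)
EMPLOYEE4 = project(EMPLOYEE, ["EMPNUM", "NAME", "SAL", "TAX"])

# Reconstruction: union of the horizontal fragments, joined with the
# salary attributes of EMPLOYEE4 on EMPNUM.
union = EMPLOYEE1 + EMPLOYEE2 + EMPLOYEE3
rebuilt = join(union, project(EMPLOYEE4, ["EMPNUM", "SAL", "TAX"]), "EMPNUM")
rebuilt.sort(key=lambda row: row["EMPNUM"])
print(rebuilt == EMPLOYEE)
```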


Page 28: Lecture 1 ddbms

Transparencies as seen by simple application

Select NAME into $NAME from SUPPLIER where SNUM=$SNUM

• Fragmentation transparency
• Location transparency
• Local mapping transparency

Page 29: Lecture 1 ddbms


Transparencies as seen by simple application

No transparency
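
The difference between the transparency levels is what the application has to name in order to read a supplier's NAME. The sketch below simulates this in Python under stated assumptions: the site names, fragment placement, sample rows and all function names are hypothetical, and plain dictionaries stand in for the local databases.

```python
# Hypothetical local databases: each site stores one fragment as {SNUM: NAME}.
SITES = {
    "SITE1": {"SUPPLIER1": {1: "Asha", 3: "Meera"}},
    "SITE2": {"SUPPLIER2": {2: "Ravi"}},
}
FRAGMENTS_OF = {"SUPPLIER": ["SUPPLIER1", "SUPPLIER2"]}  # fragmentation schema
SITE_OF = {"SUPPLIER1": "SITE1", "SUPPLIER2": "SITE2"}   # allocation schema


def name_at_site(site, fragment, snum):
    """Local mapping transparency: the application names fragment and site."""
    return SITES[site][fragment].get(snum)


def name_in_fragment(fragment, snum):
    """Location transparency: the application names the fragment only."""
    return name_at_site(SITE_OF[fragment], fragment, snum)


def name_in_relation(relation, snum):
    """Fragmentation transparency: the application names the global relation."""
    for fragment in FRAGMENTS_OF[relation]:
        name = name_in_fragment(fragment, snum)
        if name is not None:
            return name
    return None


print(name_in_relation("SUPPLIER", 2))
print(name_in_fragment("SUPPLIER1", 3))
print(name_at_site("SITE2", "SUPPLIER2", 2))
```

With no transparency, the application would additionally have to deal with the interface of each local DBMS, which the dictionaries here hide.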

Page 30: Lecture 1 ddbms


Topics left for you from the syllabus

• Distributed database access primitives

• Integrity constraints in Distributed databases


Page 32: Lecture 1 ddbms

Distributed Database Design

Any database design has to address the following issues:

1. Designing the conceptual schema
2. Designing the physical storage

Distribution of data adds two more:

3. Designing how to fragment data
4. Designing how to allocate fragments to sites

Page 33: Lecture 1 ddbms

Distributed Database Design

Objectives of the design of data distribution

• Processing locality
• Availability and reliability of distributed data
• Workload distribution
• Storage cost and availability

Page 34: Lecture 1 ddbms

Distributed Database Design

Two approaches to design

1. Top-down approach
• Start by designing the global schema, then design the fragmentation of the database, and then allocate the fragments to the sites, creating the physical images.
• Suitable for systems that are developed from scratch.

2. Bottom-up approach
• Select a common database model for describing the global schema of the database.
• Translate each local schema into the common data model.
• Integrate the local schemata into a common global schema.

Page 35: Lecture 1 ddbms

The Design of Database Fragmentation

Horizontal Fragmentation (Primary)

Let P = {p1, p2, …, pn} be a set of simple predicates. For P to define the fragmentation correctly and efficiently, P must be complete and minimal.

1. P is complete if and only if any two tuples belonging to the same fragment are referenced with the same probability by any application.

2. P is minimal if all its predicates are relevant.
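
Completeness can be illustrated with a deliberately simplified check. The sketch assumes that each application is described by a yes/no predicate saying which tuples it references, so "same probability" is reduced to "either all tuples of a fragment or none of them". That reduction, the function name `is_complete` and the sample data are assumptions made for illustration only.

```python
def is_complete(fragments, applications):
    """Simplified completeness test for a horizontal fragmentation.

    Each application is a function row -> bool telling whether it references
    the row. The fragmentation passes if every application treats all rows of
    a fragment alike (references all of them or none of them).
    """
    for fragment in fragments:
        for references in applications:
            hits = [references(row) for row in fragment]
            if any(hits) and not all(hits):
                return False
    return True


# Hypothetical SUPPLIER rows, fragmented by the predicates on CITY.
nagpur = [{"SNUM": 1, "CITY": "NGP"}, {"SNUM": 3, "CITY": "NGP"}]
mumbai = [{"SNUM": 2, "CITY": "MUM"}]
fragments = [nagpur, mumbai]


def reads_nagpur(row):
    return row["CITY"] == "NGP"


def reads_low_numbers(row):
    return row["SNUM"] <= 1


print(is_complete(fragments, [reads_nagpur]))
print(is_complete(fragments, [reads_nagpur, reads_low_numbers]))
```

The second application distinguishes between tuples inside the Nagpur fragment, so the predicates on CITY alone are no longer complete for that workload.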

Page 36: Lecture 1 ddbms

The Design of Database Fragmentation

Horizontal Fragmentation (Derived)

• A distributed join is a join between horizontally fragmented relations; it is represented by a join graph.

• Join graphs
  • Total
  • Reduced
    • Simple
    • Partitioned

Derived fragments: Ri = R SJ_F Si

Page 37: Lecture 1 ddbms

The Design of Database Fragmentation

Vertical Fragmentation

1. Split approach
2. Grouping approach

Page 38: Lecture 1 ddbms


The Design of Database Fragmentation

Mixed Fragmentation

1. Applying Vertical Fragmentation to Horizontal fragments

2. Applying Horizontal Fragmentation to Vertical fragments

Page 39: Lecture 1 ddbms

The Allocation of Fragments

General criteria for fragment allocation

1. Redundant
2. Non-redundant

Redundant (replicated) allocation is more complex because:

• The degree of replication of each fragment becomes a variable of the problem.
• Modeling read applications is complicated by the fact that the applications can now select among several alternative sites for accessing fragments.

Page 40: Lecture 1 ddbms

The Allocation of Fragments

For determining the redundant allocation of fragments, either of the following methods can be used:

1. All beneficial sites: Determine the set of all sites where the benefit of allocating one copy of the fragment is higher than the cost, and allocate a copy of the fragment to each site in this set.

2. Additional replication: First determine the solution of the non-replicated problem, then progressively introduce replicated copies, starting from the most beneficial; the process terminates when no additional replication is beneficial.

Page 41: Lecture 1 ddbms

Measure of costs and benefits of fragment allocation

Some Definitions

• $i$ is the fragment index
• $j$ is the site index
• $k$ is the application index
• $f_{kj}$ is the frequency of application $k$ at site $j$
• $r_{ki}$ is the number of retrieval references of application $k$ to fragment $i$
• $u_{ki}$ is the number of update references of application $k$ to fragment $i$
• $n_{ki} = r_{ki} + u_{ki}$ is the total number of references of application $k$ to fragment $i$

Page 42: Lecture 1 ddbms

Measure of costs and benefits of fragment allocation

Horizontal fragmentation:

1. Using the 'best-fit' approach for a non-replicated allocation, we place $R_i$ at the site where the number of references to $R_i$ is maximum. The number of local references to $R_i$ at site $j$ is

$$B_{ij} = \sum_k f_{kj}\, n_{ki}$$

$R_i$ is allocated at the site $j^*$ such that $B_{ij^*}$ is maximum.

2. Using the 'all beneficial sites' method for replicated allocation, we place $R_i$ at all sites $j$ where the cost of the retrieval references of applications is larger than the cost of the update references to $R_i$ from applications at any other site. $B_{ij}$ is evaluated as the difference:

$$B_{ij} = \sum_k f_{kj}\, r_{ki} - C \sum_k \sum_{j' \neq j} f_{kj'}\, u_{ki}$$

$C$ is a constant that measures the ratio between the cost of an update access and the cost of a retrieval access. $R_i$ is allocated at every site $j$ where $B_{ij}$ is positive.
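
Both measures are simple sums and can be evaluated directly. In the sketch below the frequencies and reference counts are invented numbers, the function names are hypothetical, the tables are indexed as `f[k][j]`, `r[k][i]` and `u[k][i]`, and the number of references of application k to fragment i is taken as the total of its retrieval and update references.

```python
def best_fit(i, f, r, u):
    """Best-fit: benefit per site is the number of local references to fragment i."""
    n_apps, n_sites = len(f), len(f[0])
    benefit = [
        sum(f[k][j] * (r[k][i] + u[k][i]) for k in range(n_apps))
        for j in range(n_sites)
    ]
    best_site = max(range(n_sites), key=lambda j: benefit[j])
    return best_site, benefit


def all_beneficial_sites(i, f, r, u, C):
    """All beneficial sites: local retrievals minus C times the remote updates."""
    n_apps, n_sites = len(f), len(f[0])
    benefit = []
    for j in range(n_sites):
        local_reads = sum(f[k][j] * r[k][i] for k in range(n_apps))
        remote_updates = sum(
            f[k][other] * u[k][i]
            for k in range(n_apps)
            for other in range(n_sites)
            if other != j
        )
        benefit.append(local_reads - C * remote_updates)
    chosen = [j for j in range(n_sites) if benefit[j] > 0]
    return chosen, benefit


# Hypothetical workload: 2 applications, 3 sites, 1 fragment.
f = [[10, 0, 2], [0, 5, 1]]   # f[k][j]: frequency of application k at site j
r = [[4], [2]]                # r[k][i]: retrieval references of k to fragment i
u = [[1], [3]]                # u[k][i]: update references of k to fragment i

site, local_refs = best_fit(0, f, r, u)
sites, benefits = all_beneficial_sites(0, f, r, u, C=1)
print(site, local_refs)
print(sites, benefits)
```

With these numbers only one site has a positive benefit, so both methods pick the same single site; a workload with more retrievals and fewer updates would make further copies worthwhile.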

Page 43: Lecture 1 ddbms

Measure of costs and benefits of fragment allocation

3. Using the 'additional replication' method for replicated allocation, we can measure the benefit of placing a new copy of $R_i$ in terms of increased reliability and availability of the system.

Let $d_i$ denote the degree of redundancy of $R_i$, and let $F_i$ denote the benefit of having $R_i$ fully replicated at each site. The following function was introduced to measure this benefit:

$$\beta(d_i) = (1 - 2^{1-d_i})\, F_i$$

Note that $\beta(1) = 0$, $\beta(2) = F_i/2$, $\beta(3) = 3F_i/4$, and so on. We evaluate the benefit of introducing a new copy of $R_i$ at site $j$ by modifying the formula of case 2 as follows:

$$B_{ij} = \sum_k f_{kj}\, r_{ki} - C \sum_k \sum_{j' \neq j} f_{kj'}\, u_{ki} + \beta(d_i)$$
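
The additional replication method can be sketched as a greedy loop around the benefit function. Everything numeric below is invented sample data, the function names are hypothetical, and the sketch makes one explicit assumption that the slides do not settle: the degree of redundancy used in β is the degree reached after adding the candidate copy.

```python
def beta(d, F):
    """Benefit of redundancy degree d, given the full-replication benefit F."""
    return (1 - 2 ** (1 - d)) * F


def additional_replication(i, f, r, u, C, F, start_site):
    """Greedily add copies of fragment i while a new copy is still beneficial."""
    n_apps, n_sites = len(f), len(f[0])

    def base_benefit(j):
        # Local retrievals minus C times the updates issued at other sites.
        local_reads = sum(f[k][j] * r[k][i] for k in range(n_apps))
        remote_updates = sum(
            f[k][other] * u[k][i]
            for k in range(n_apps)
            for other in range(n_sites)
            if other != j
        )
        return local_reads - C * remote_updates

    copies = [start_site]
    while len(copies) < n_sites:
        d = len(copies) + 1  # assumption: degree after adding the new copy
        candidates = {
            j: base_benefit(j) + beta(d, F)
            for j in range(n_sites)
            if j not in copies
        }
        best = max(candidates, key=candidates.get)
        if candidates[best] <= 0:
            break  # no additional replication is beneficial
        copies.append(best)
    return copies


# Hypothetical workload: 2 applications, 3 sites, 1 fragment.
f = [[10, 0, 2], [0, 5, 1]]   # f[k][j]: frequency of application k at site j
r = [[4], [2]]                # r[k][i]: retrieval references of k to fragment i
u = [[1], [3]]                # u[k][i]: update references of k to fragment i

print(beta(1, 8), beta(2, 8), beta(3, 8))
print(additional_replication(0, f, r, u, C=1, F=20, start_site=0))
```

The starting site would normally come from the non-replicated (best-fit) solution; it is passed in as a parameter here to keep the sketch short.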

Page 44: Lecture 1 ddbms

References

1. Stefano Ceri, Giuseppe Pelagatti, "Distributed Databases: Principles and Systems", McGraw-Hill Book Company, 1984.

2. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, "Database System Concepts", Third Edition, The McGraw-Hill Companies, Inc., 1997.

3. Peter Rob, Carlos Coronel, "Database Systems: Design, Implementation and Management", Course Technology, 2000.

4. M. T. Özsu, P. Valduriez, "Principles of Distributed Database Systems", 3rd edition, Springer, 2011.

Page 45: Lecture 1 ddbms


Motivation is what gets you started and Habit is what keeps you going…

Thanks a lot for patient listening!!

Questions?

You can reach me at mangeshwanjari[at]gmail.com

