Distributed Database Management System (DDBS)

Date post: 15-Jan-2016
Upload: juana
Transcript
Page 1: Distributed Database Management System (DDBS)

Distributed Database Management System (DDBS)

Motivation:
- Data is used at multiple distributed sites (e.g. branch offices).
- Communication between sites is costly and potentially unreliable.

Solution:
- Allow sites to store/maintain the data they use most often or specialize in.
- Share with other sites/HQs if combinations of data are necessary.

Page 2: Distributed Database Management System (DDBS)

Network Topology

[Figure: five example topologies connecting sites A through F: fully connected network, partially connected network, tree structured network, star network, and ring network. They are compared by cost/reliability and number of hops.]

Page 3: Distributed Database Management System (DDBS)

Tradeoffs between

1. keeping data in centralized headquarters:
• simpler maintenance
• simpler consistency enforcement
• possibly more efficient if many updates, aggregate computations

2. or distributed across branch offices:
• lower communication cost
• reliability
• parallelism can be implemented locally

Page 4: Distributed Database Management System (DDBS)

Advantages of DDBS (heterogeneous)

• Interconnectivity of pre-existing DBs
• Expandability (don't need to replace the whole system to grow)
• Cost (many small engines on PCs are cheaper than mainframes); issue: communication costs vs. hardware computation costs
• Performance (place data near where it is used)
• Availability and reliability

Page 5: Distributed Database Management System (DDBS)

Complicating factors

• Maintaining data consistency (in face of replication and sharing)

• Distributed directory management (who controls mapping of data to sites)

• Security

• Heterogeneous databases (different database architectures)

Page 6: Distributed Database Management System (DDBS)

Distributed Database Design Issues

Options for storing a relation R across multiple sites:

• Replication (maintain copies/replicas of R on multiple sites)

• Fragmentation (relation stored in fragments/pieces on multiple sites)

• Combination of both

Page 7: Distributed Database Management System (DDBS)

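[Figure: REPLICATION vs. FRAGMENTATION across site1, site2, site3, site4. Replication: copies of R1 and copies of R2 are kept at several sites. Fragmentation: R1 (tuples 1 through 9) is split into three fragments, each holding 1/3 of R1 (horizontal); R2 is split into two halves holding attributes A/B and C/D (vertical).]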

Page 8: Distributed Database Management System (DDBS)

Replication

• Issues: (whole-database replication vs. no replication, i.e. non-redundant allocation)

- what to replicate? (all relations or only frequently used shared data)

- where to replicate? (a function of communication costs, usage needs, resources)

- which relations to replicate?

- "primary copy" of a relation (simplifies consistency enforcement, but where is it located?)

Page 9: Distributed Database Management System (DDBS)

Replication (cont)

• Advantages:

- Improved availability (multiple sources for a relation if a site is down)

- Increased parallelism (sites can process primarily read-only operations in parallel, minimizing data transfer)

(well suited for read-only or majority-read data access)

Page 10: Distributed Database Management System (DDBS)

Replication (cont)

• Disadvantages:

o problems/overhead for writes/updates
o costs of consistency enforcement
- updates must be propagated to all sites (communication costs)
- costs of synchronization/locking for consistency enforcement on update are greater than in single-source models
o complicates concurrency control and recovery
o replication is inefficient in databases with frequent updates
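The read/write tradeoff of replication can be illustrated with a toy model. This is only a sketch, assuming a write-all/read-one scheme (every update goes to every copy, a read is served by any single available copy); the slides do not prescribe a particular protocol, and the class, method, and site names are hypothetical.

```python
class ReplicatedRelation:
    """Toy model: one relation copied to several sites (write-all, read-one)."""

    def __init__(self, sites):
        # Each site holds its own full copy of the relation (key -> value).
        self.copies = {site: {} for site in sites}
        self.down = set()        # sites currently unavailable
        self.messages_sent = 0   # crude stand-in for communication cost

    def write(self, key, value):
        # An update must reach every replica, so its cost grows with the copy count.
        if self.down:
            raise RuntimeError("write-all cannot complete while a replica is down")
        for copy in self.copies.values():
            copy[key] = value
            self.messages_sent += 1

    def read(self, key):
        # Any available copy can answer, so one site failure does not block reads.
        for site, copy in self.copies.items():
            if site not in self.down:
                self.messages_sent += 1
                return copy[key]
        raise RuntimeError("no replica available")


r = ReplicatedRelation(["site1", "site2", "site3"])
r.write("emp42", {"salary": 50000})  # one message per replica
r.down.add("site1")                  # simulate a site failure
value = r.read("emp42")              # still answered by another site
```

In this model a read costs one message regardless of the number of copies, while a write costs one message per copy and fails outright when any copy is unreachable, which is why frequent updates make replication expensive.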

Page 11: Distributed Database Management System (DDBS)

FRAGMENTATION

• Vertical
• Horizontal
• Mixed

Issues:

- completeness: every tuple/attribute is in some fragment

- reconstruction: an easy way of reconstructing the full relation

- transparency

Page 12: Distributed Database Management System (DDBS)

HORIZONTAL FRAGMENTATION

- Fragments contain subsets of complete tuples (all attributes at all sites)

[Figure: the original relation, with attributes A1, A2, ..., An and tuples T1, ..., Tn, is split by rows. Tuples T1 through T60 are stored at Site 1 and tuples T61 through Tn at Site 2; each fragment keeps all attributes A1, A2, ..., An.]

How to reconstruct:

$R = R_{s1} \cup R_{s2} \cup \dots \cup R_{sn}$

Page 13: Distributed Database Management System (DDBS)

Horizontal Fragmentation

• Example usefulness:

- Each branch office maintains the complete attribute set of its employees (salary, benefits, address/phone, departments, projects, etc.)

- The site of a fragment is easily determined by a key attribute value, e.g. Branch_office
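Horizontal fragmentation by a key attribute can be sketched in a few lines. This is a minimal illustration, assuming the relation is a list of dictionaries and that the full relation is rebuilt by combining the rows of all fragments; the function names and sample rows are hypothetical.

```python
def fragment_horizontally(relation, key):
    """Split rows into fragments, one per distinct value of `key`."""
    fragments = {}
    for row in relation:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def reconstruct(fragments):
    """Combine the rows of all fragments to rebuild the full relation."""
    rows = []
    for fragment in fragments.values():
        rows.extend(fragment)
    return rows


employee = [
    {"ssn": 1, "name": "Ann", "branch_office": "Boston"},
    {"ssn": 2, "name": "Bob", "branch_office": "Denver"},
    {"ssn": 3, "name": "Cy", "branch_office": "Boston"},
]

# Each branch office keeps complete tuples for its own employees.
fragments = fragment_horizontally(employee, "branch_office")
rebuilt = reconstruct(fragments)
```

Because every row lands in exactly one fragment, the completeness and reconstruction requirements hold by construction, and the fragment for a given row is found directly from its `branch_office` value.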

Page 14: Distributed Database Management System (DDBS)

VERTICAL FRAGMENTATION

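[Figure: the original relation R, with attributes A1, A2, A3, A4 and tuples t1, t2, ..., tn, is split by columns into fragments RS1 (at SITE1) and RS2 (at SITE2). Each fragment holds a subset of the attributes for every tuple t1, ..., tn, plus a TID column with values 1, 2, ..., n.]

How to reconstruct:

$R = R_{S1} \bowtie R_{S2} \bowtie \dots \bowtie R_{Sn}$

TID (tuple ID): a hidden attribute that ensures accurate and simple join reconstruction

Join condition: RS1.TID = RS2.TID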
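The role of the hidden TID can be shown with a small sketch. It assumes the relation is a list of dictionaries, that TIDs are assigned as 1, 2, ..., n in row order, and that reconstruction is an equality join on TID; the function names and sample data are hypothetical.

```python
def fragment_vertically(relation, attribute_groups):
    """Split rows by columns; every fragment also carries the hidden TID."""
    fragments = []
    for attrs in attribute_groups:
        fragment = [
            {"TID": tid, **{a: row[a] for a in attrs}}
            for tid, row in enumerate(relation, start=1)
        ]
        fragments.append(fragment)
    return fragments


def reconstruct(fragments):
    """Join the fragments on TID, then drop the TID from the result."""
    by_tid = {}
    for fragment in fragments:
        for row in fragment:
            by_tid.setdefault(row["TID"], {}).update(row)
    result = []
    for tid in sorted(by_tid):
        row = dict(by_tid[tid])
        del row["TID"]
        result.append(row)
    return result


r = [
    {"A1": "x1", "A2": "y1", "A3": "z1", "A4": "w1"},
    {"A1": "x2", "A2": "y2", "A3": "z2", "A4": "w2"},
]

rs1, rs2 = fragment_vertically(r, [["A1", "A2"], ["A3", "A4"]])
rebuilt = reconstruct([rs1, rs2])
```

Without the TID the two fragments would share no attribute, so there would be no way to tell which `A3, A4` values belong to which `A1, A2` values.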

Page 15: Distributed Database Management System (DDBS)

VERTICAL FRAGMENTATION

Example usefulness:

- Salary Office
- Benefits Office
- Directory (Name | address | phone | fax)
- Dependents Management Office

each control their own appropriate attributes for all corporate branch offices.

VERTICAL: attribute-centered management (keep all instances of an attribute in one place)

HORIZONTAL: tuple/individual-centered management (keep all values of a tuple in one place)

Page 16: Distributed Database Management System (DDBS)

MIXED FRAGMENTATION

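[Figure: relation R with attributes A1, A2, A3, A4, A5 is split both by rows (USA, Europe) and by columns (A1, A2, A3 and A4, A5; labeled salary attributes and benefit attributes), giving four fragments Rs1, Rs2, Rs3, Rs4.]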

Page 17: Distributed Database Management System (DDBS)

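REPLICATION and FRAGMENTATION

The partition of attributes/tuples need not be disjoint.

[Figure: a relation with attributes A1, A2, A3, A4, A5 is split into fragments (A1, A2, A3, A4) and (A2, A3, A4, A5). The overlap A2, A3, A4 is a replication of attributes.]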

Page 18: Distributed Database Management System (DDBS)

TRANSPARENCY

Fragmentation Transparency

-User doesn’t need to know mapping between relations and fragmented subrelations

Replication Transparency

-User doesn’t need to know about existence or location of other copies (treat as if single copy of DB)

Location and Naming Transparency

-User shouldn't need to know about the location and full names of data on the server

[Figure: the query $\pi_{Salary}(\sigma_{ssn=so}(Employee))$ is sent to the name server, which determines the proper site, fragment, and replica for this data access and returns the unique name Site27.Employee.Fragment3.Replica7.]
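A name server of this kind can be pictured as a catalog lookup. The sketch below is an assumption-laden toy: the catalog layout, the `up` flag, the `resolve` function, and the Site12 entry are all hypothetical; only the unique name Site27.Employee.Fragment3.Replica7 comes from the slide.

```python
# Hypothetical catalog: logical relation name -> known physical copies.
CATALOG = {
    "Employee": [
        {"site": "Site12", "fragment": "Fragment3", "replica": "Replica2", "up": False},
        {"site": "Site27", "fragment": "Fragment3", "replica": "Replica7", "up": True},
    ]
}


def resolve(relation):
    """Map a logical relation name to the unique name of an available copy."""
    for entry in CATALOG[relation]:
        if entry["up"]:
            return f'{entry["site"]}.{relation}.{entry["fragment"]}.{entry["replica"]}'
    raise LookupError(f"no available copy of {relation}")


# The user writes only "Employee"; the name server supplies the rest.
unique_name = resolve("Employee")
```

The user's query mentions only the logical name, so sites, fragments, and replicas can be moved or added by updating the catalog without changing any query.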

Page 19: Distributed Database Management System (DDBS)

QUERY PROCESSING IN DDBS

Issue 1: Parallel processing across fragments (horizontal fragmentation)

Employee = Emp1 U Emp2 (2 fragments: Emp1 at Site 1, Emp2 at Site 2)

$\pi_{LName}(\sigma_{salary>40,000}(Employee)) = \pi_{LName}(\sigma_{salary>40,000}(Emp1)) \cup \pi_{LName}(\sigma_{salary>40,000}(Emp2))$

Execute in parallel on the fragments and union the results together.
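The idea of running the same selection and projection on each fragment and then taking the union can be sketched as follows. This is only an illustration: the two "sites" are simulated by threads in one process, and the function name and sample rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_query(fragment):
    """Run at one site: select salary > 40,000, then project LName."""
    return {row["LName"] for row in fragment if row["salary"] > 40_000}


# Two horizontal fragments of Employee, one per site.
emp1 = [{"LName": "Smith", "salary": 55_000}, {"LName": "Wong", "salary": 30_000}]
emp2 = [{"LName": "Zelaya", "salary": 43_000}, {"LName": "Jabbar", "salary": 25_000}]

# Each fragment is processed independently; only the small results are combined.
with ThreadPoolExecutor() as pool:
    partial_results = list(pool.map(local_query, [emp1, emp2]))

result = set().union(*partial_results)
```

The union of the per-fragment answers equals the answer over the whole relation because selection and projection both distribute over union.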

Page 20: Distributed Database Management System (DDBS)

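Joins: symmetric and associative

$(A \bowtie B) \bowtie C = A \bowtie (B \bowtie C)$

[Figure: sizes 50K, 1K, and 3K are shown for Site1, Site2, and Site3, with 0.5K intermediate results for the two join orders.]

Parallel processing: $(\sigma_{xx}(A)) \bowtie (B \bowtie C)$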

Page 21: Distributed Database Management System (DDBS)

QUERY PROCESSING IN DDBS

Join Strategies

$R = \pi_{Fnames, Cnames, Dnames}(Employee \bowtie_{Mgrssn = ssn} Department)$

Site 1: Employee, 10,000 records, 1,000,000 bytes
Site 2: Department, 100 records, 3,000 bytes
Site 3: result site; result is 100 records, 2,000 bytes

Goal: minimize the total communication cost of data transfer.

Strategies:

1) Ship both relations to the result site and join there: 1,003,000 bytes transferred
2) Ship Employee to 2, join at 2, results to 3: 1,002,000 bytes transferred
3) Ship Department to 1, join at 1, results to 3: 5,000 bytes transferred
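The byte counts for the three strategies follow from simple addition, which the sketch below reproduces. It assumes, as the strategies imply, that Employee is stored at site 1, Department at site 2, and that the result is needed at site 3; the variable names are hypothetical.

```python
EMPLOYEE_BYTES = 1_000_000   # Employee relation at site 1
DEPARTMENT_BYTES = 3_000     # Department relation at site 2
RESULT_BYTES = 2_000         # join result, needed at site 3

# Total bytes sent over the network under each strategy.
strategies = {
    "1) ship both relations to site 3": EMPLOYEE_BYTES + DEPARTMENT_BYTES,
    "2) ship Employee to site 2, result to site 3": EMPLOYEE_BYTES + RESULT_BYTES,
    "3) ship Department to site 1, result to site 3": DEPARTMENT_BYTES + RESULT_BYTES,
}

best = min(strategies, key=strategies.get)

for name, cost in strategies.items():
    print(f"{name}: {cost:,} bytes")
print("cheapest:", best)
```

Strategy 3 is cheapest because it moves only the small relation and the small result, leaving the 1,000,000-byte Employee relation where it is.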

Page 22: Distributed Database Management System (DDBS)

RECOVERY IN DDBS

- transaction managers / coordinators

- log managers

Problems:

- failure of site

- failure of link

- loss of messages

It is difficult to know which has occurred.

If the server is down, elect a new server. What about network partitioning?

[Figure: original server, server's link, newly elected server.]

