
Schema Refinement

Date post: 31-Jan-2016
Upload: niel
Transcript
Page 1: Schema Refinement

CS542 1

Schema Refinement

Chapter 19 (part 1)

Functional Dependencies

Page 2: Schema Refinement

The Evils of Redundancy

Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update anomalies

Main refinement technique: decomposition. Example: replacing ABCD with, say, AB and BCD.

Functional dependency constraints are used to identify schemas with such problems and to suggest refinements.

Page 3: Schema Refinement

Insert Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       prof1
s2       Greg   p2       prof2

Question: How do we insert a professor who has no students?

Insert Anomaly: We are not able to insert some “valid” values.

Page 4: Schema Refinement

Delete Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       MM
s2       Greg   p2       ER

Question: Can we delete a student who is the only student of a professor?

Delete Anomaly: We are not able to perform a delete without losing some “valid” information.

Page 5: Schema Refinement

Update Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       MM
s2       Greg   p1       MM

Question: How do we update the name of a professor?

Update Anomaly: To update a value, we have to update multiple rows. Update anomalies are due to redundancy.

Page 6: Schema Refinement

Functional Dependencies (FDs)

A functional dependency X → Y holds over relation R if, for every allowable instance r of R:

$$\forall\, t_1 \in r,\ t_2 \in r:\quad \pi_X(t_1) = \pi_X(t_2) \ \text{ implies } \ \pi_Y(t_1) = \pi_Y(t_2)$$

Given two tuples in r, if the X values agree, then the Y values must also agree.
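The definition can be tested against a single instance. The sketch below is a minimal illustration, assuming a relation instance is represented as a list of dicts; the helper name `fd_holds` is hypothetical. Note that one instance can only refute an FD: an FD is a statement about every allowable instance, so passing this check on one instance does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but disagree on rhs."""
    seen = {}  # maps each X-value to the first Y-value seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, else returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Instance taken from the Update Anomaly slide.
student = [
    {"sNumber": "s1", "sName": "Dave", "pNumber": "p1", "pName": "MM"},
    {"sNumber": "s2", "sName": "Greg", "pNumber": "p1", "pName": "MM"},
]

print(fd_holds(student, ["pNumber"], ["pName"]))  # True
print(fd_holds(student, ["pName"], ["sName"]))    # False
```

The second call fails because both rows share pName = MM but differ in sName.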

Page 7: Schema Refinement

FD Example

Student

sNumber  sName  address
1        Dave   144FL
2        Greg   320FL

Suppose we have FD sName → address

• for any two rows in the Student relation with the same value for sName, the value for address must be the same

• i.e., there is a function from sName to address

Page 8: Schema Refinement

Keys + Functional Dependencies

Assume K is a candidate key for R

What does this imply about the FD between K and R?

It means that K → R!

Does K → R require K to be minimal?

No. Any superkey of R also functionally determines all attributes of R.

Page 9: Schema Refinement

Example: Constraints on Entity Set

Consider the relation obtained from Hourly_Emps:

Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)

Notation: We denote the relation schema by its attributes: SNLRWH. This is really the set of attributes {S,N,L,R,W,H}.

Some FDs on Hourly_Emps:

• ssn is the key: S → SNLRWH
• rating determines hrly_wages: R → W

Page 10: Schema Refinement

Problems Caused by FD

Problems due to the example FD:

rating determines hrly_wages: R → W

Page 11: Schema Refinement

Example Problems due to R → W:

Update anomaly: Can we change W in just the 1st tuple of SNLRWH?

Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating?

Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!

Hourly_Emps (rating (R) determines hrly_wages (W))

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Page 12: Schema Refinement

Same Example Problems due to R → W!

Hourly_Emps

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Solution: 2 smaller tables instead of one big one!

Hourly_Emps2

S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages

R  W
8  10
5  7
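The decomposition on this slide can be reproduced in a few lines. The following is a sketch only, assuming rows are plain tuples in SNLRWH order; the function names `decompose` and `join` are hypothetical. It splits Hourly_Emps into Hourly_Emps2 and Wages, then joins them back on R to show that no information was lost.

```python
# Rows in (S, N, L, R, W, H) order, copied from the slide.
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def decompose(rows):
    """Project SNLRWH onto SNLRH (Hourly_Emps2) and RW (Wages)."""
    emps2 = [(s, n, l, r, h) for (s, n, l, r, w, h) in rows]
    # A set removes the repeated (rating, wage) pairs: this is the redundancy.
    wages = sorted({(r, w) for (_, _, _, r, w, _) in rows}, reverse=True)
    return emps2, wages


def join(emps2, wages):
    """Rebuild SNLRWH by looking up each rating's wage (valid because R -> W)."""
    wage_of = dict(wages)
    return [(s, n, l, r, wage_of[r], h) for (s, n, l, r, h) in emps2]


emps2, wages = decompose(hourly_emps)
print(wages)                               # [(8, 10), (5, 7)]
print(join(emps2, wages) == hourly_emps)   # True
```

Each wage now appears once, in Wages, so changing the wage for a rating touches a single row, and the wage for rating 5 survives even if all employees with that rating are deleted from Hourly_Emps2.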

Page 13: Schema Refinement

Same Example Problems due to R → W!


Will 2 smaller tables be better than one big one?

• Update anomaly: Can we change W in just the 1st tuple of SNLRWH?
• Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating?
• Deletion anomaly: If we delete all employees with rating 5, do we lose the information about the wage for rating 5?

Page 14: Schema Refinement

Reasoning About FDs

Given some FDs, we can usually infer additional FDs:

ssn → did, did → lot implies ssn → lot

Page 15: Schema Refinement

Properties of FDs

Let A, B, C, Z be sets of attributes.

Armstrong’s Axioms:

• Reflexive (also trivial FD): if A ⊇ B, then A → B
• Transitive: if A → B and B → C, then A → C
• Augmentation: if A → B, then AZ → BZ

These are sound and complete inference rules for FDs!

Additional rules (that follow from AA):

• Union: if A → B and A → C, then A → BC
• Decomposition: if A → BC, then A → B and A → C
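As an illustration of how an additional rule follows from the axioms, here is one possible derivation of the Union rule from A → B and A → C. It uses only augmentation and transitivity, together with the fact that attribute sets are sets, so AA = A.

```latex
\begin{align*}
A \to B \;&\Rightarrow\; A \to AB
  && \text{(augmentation with } Z = A \text{, using } AA = A\text{)} \\
A \to C \;&\Rightarrow\; AB \to BC
  && \text{(augmentation with } Z = B\text{)} \\
A \to AB,\ AB \to BC \;&\Rightarrow\; A \to BC
  && \text{(transitivity)}
\end{align*}
```

Decomposition goes the other way: from A → BC, reflexivity gives BC → B and BC → C, and transitivity then yields A → B and A → C.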

Page 16: Schema Refinement

Closure of FDs

An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.

F+ = closure of F is the set of all FDs that are implied by F.

Computing the closure of a set of FDs can be expensive. The size of the closure is exponential in the number of attributes!
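A quick count shows why the closure is exponential. Even when F is empty, the closure contains every trivial FD X → Y with Y ⊆ X. Counting these over n attributes (and allowing empty sets, which is an assumption of this sketch) gives:

```latex
\[
\sum_{k=0}^{n} \binom{n}{k}\, 2^{k} \;=\; (1 + 2)^{n} \;=\; 3^{n}
\]
```

Here k is the size of X, there are \binom{n}{k} choices of X, and each has 2^k subsets Y. For the six attributes of SNLRWH this is already 3^6 = 729 trivial FDs, before any FD in F contributes more.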

Page 17: Schema Refinement

Reasoning About FDs (Contd.)

Computing the full closure F+ of a set of FDs is too expensive.

Typically, we just need to know if a given FD X → Y is in the closure of a set of FDs F.

Algorithm for an efficient check:

Compute the attribute closure of X (denoted X+) wrt F:

• Set of all attributes A such that X → A is in F+
• There is a linear time algorithm to compute this.

Check if Y is in X+. If yes, then X → Y is in F+.

Page 18: Schema Refinement

Algorithm for Attribute Closure

Computing the closure of a set of attributes {A1, A2, …, An}, denoted {A1, A2, …, An}+

1. Let X = {A1, A2, …, An}
2. If there exists an FD B1, B2, …, Bm → C such that every Bi ∈ X, then X = X ∪ C
3. Repeat step 2 until no more attributes can be added.
4. Output X, which is now the closure {A1, A2, …, An}+
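The four steps above can be sketched in Python as follows. This is a simple fixpoint loop, not the linear-time algorithm mentioned earlier; the function name `attribute_closure` and the representation of an FD as a (left side, right side) pair of sets are assumptions made for the illustration.

```python
def attribute_closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    closure = set(attrs)              # step 1: start from the given attributes
    changed = True
    while changed:                    # step 3: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            # step 2: the FD applies if its whole left side is already in the closure
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure                    # step 4: output the closure


# FDs from the "Reasoning About FDs" slide: ssn -> did, did -> lot.
fds = [({"ssn"}, {"did"}), ({"did"}, {"lot"})]
print(sorted(attribute_closure({"ssn"}, fds)))  # ['did', 'lot', 'ssn']
print(sorted(attribute_closure({"did"}, fds)))  # ['did', 'lot']
```

Since lot is in the closure of {ssn}, the FD ssn → lot is implied by these two FDs, which matches the inference shown on that slide.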

Page 19: Schema Refinement

Another Example : Inferring FDs

Consider R (A, B, C, D, E) with FDs F = { A → B, B → C, CD → E }. Does A → E hold?

Rephrase as: Is A → E in the closure F+? Equivalently, is E in {A}+?

Let us compute {A}+:

{A}+ = {A, B, C}

CD → E never applies, because D is not in {A}+.

Conclude: E is not in {A}+, therefore A → E is not implied by F.

Page 20: Schema Refinement

Recap: So Far.

Functional Dependencies: Relationships across attributes of relations.

Redundancy: Arises due to certain relationships (FDs) holding.

So far: Reasoning with FDs.

Approach: Establish certain “normal forms” with respect to dependencies.

