Normalization 02 - York University Fall 2009/LectureNotes/Normalizat… · 4 Closure of a set of...

1

Normalization 02

CSE3421 notes

2

Closure of a set of attributes

Given a set of FDs F, find the set of attributes functionally determined by a given set of attributes X. This is called the closure of X, and denoted as X+.

F, X → ? (or, X+ = ?)

Example: F = { A → B, B → C }

Closure of A: A+ = ABC

3

Example

F = { C → A,

BC → D,

ACD → B,

D → EG,

AB → C }

(AB)+ = ?

AB → ABC (C from AB → C)

→ ABCD (A from C → A is redundant; D from BC → D)

→ ABCDEG (B from ACD → B is redundant; E and G from D → EG)

Therefore, (AB)+ = ABCDEG.
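The closure computation in this example can be sketched in Python as a fixed-point loop: keep adding the RHS of every FD whose LHS is already covered, until nothing changes. This is only an illustrative sketch; the function name `closure` and the encoding of an FD as a pair of strings of single-letter attributes are assumptions, not part of the notes.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds` as a set of attributes.

    `fds` is an iterable of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("BC", "D") stands for BC -> D.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already determined, the RHS is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The FDs of the example above.
F = [("C", "A"), ("BC", "D"), ("ACD", "B"), ("D", "EG"), ("AB", "C")]
print("".join(sorted(closure("AB", F))))  # ABCDEG
```

The loop stops because each pass either adds at least one attribute or ends the iteration, and there are only finitely many attributes.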

4

Closure of a set of FDs

• Given a set of FDs F, find all FDs that can be produced by F. This is called the closure of F, and denoted as F+.

• F+ can be found by applying the Armstrong Axioms.

5

Armstrong Axioms

• (A1) Reflexivity (this produces all trivial FDs)

X ⊇ Y ⇒ X → Y

• (A2) Augmentation

X → Y ⇒ XZ → YZ, ∀ Z

• (A3) Transitivity

X → Y, Y → Z ⇒ X → Z

6

Additional rules

• Union

X → Y, X → Z ⇒ X → YZ

• Decomposition

X → YZ ⇒ X → Y and X → Z

By applying (A1), (A2), (A3) repeatedly, we get F+.

7

Example

• Let R: (A, B, C) and F = { A → B, B → C }. F+ = ?

• Apply Armstrong Axioms

• (A1) (reflexivity) produces all trivial FDs, i.e., those whose left-hand-side (LHS) is a superset of the right-hand-side (RHS). For example, AB → A, AB → B, etc.

8

• Non-trivial FDs:

– Apply (A3), transitivity:

• A → B with B → C generate A → C.

• (A3) cannot be further applied.

– Apply (A2), augmentation, in all possible permutations.

• For A → B: remaining attribute is C. Therefore, AC → BC is in F+.

• For B → C: add A and get BA → CA. Therefore, AB → AC is in F+.

• For A → C: add B and get AB → BC. Therefore, AB → BC is in F+.

9

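So, F+ = { A → B, B → C, all trivial FDs,

A → C (from (A3)),

AC → BC, AB → AC, AB → BC (from (A2)) }

Note: this lists the FDs derived in the steps above. Applying the rules further yields more non-trivial members of F+, e.g., A → BC (union of A → B and A → C) and AB → C (decomposition of AB → AC).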
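For a small schema, F+ can also be enumerated mechanically. The sketch below does not apply the axioms one by one; it relies on the standard equivalence that X → Y is in F+ exactly when Y is a subset of X+. The helper names (`closure`, `nonempty_subsets`, `fd_closure`) and the string encoding of attribute sets are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a sorted string."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield "".join(combo)


def fd_closure(schema, fds):
    """All FDs X -> Y with X, Y non-empty subsets of `schema` and Y within X+."""
    return {(x, y)
            for x in nonempty_subsets(schema)
            for y in nonempty_subsets(closure(x, fds))}


F = [("A", "B"), ("B", "C")]
f_plus = fd_closure("ABC", F)

# Print only the non-trivial FDs (RHS not contained in LHS).
for x, y in sorted(f_plus):
    if not set(y) <= set(x):
        print(f"{x} -> {y}")
```

The enumeration is exponential in the number of attributes, so it is only practical for toy schemas like R: (A, B, C).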

10

Where are we …

X+: Closure of set of attributes

F+: closure of set of FDs

MC: minimal cover of F

Lossless join property

Algorithm to compute 3NF

Preservation of dependencies

11

Lossless join

• If a relation R is decomposed into R1, R2, … and the natural join R1 join R2 join R3 … is exactly equal to R, then the decomposition is said to have the lossless join property.

12

Example

R
A   B   C
a1  b1  c1
a2  b2  c2

R1:(A,B)
A   B
a1  b1
a2  b2

R2:(B,C)
B   C
b1  c1
b2  c2

R1 join R2
A   B   C
a1  b1  c1
a2  b2  c2

Exactly equal to R, so the decomposition has the lossless join property.

13

Example

R
A   B   C
a1  b1  c1
a2  b1  c2

R1:(A,B)
A   B
a1  b1
a2  b1

R2:(B,C)
B   C
b1  c1
b1  c2

R1 join R2
A   B   C
a1  b1  c1
a1  b1  c2
a2  b1  c1
a2  b1  c2

Not the same as R, so the decomposition does not have the lossless join property (it has a lossy join).
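The two examples above can be replayed in Python by projecting a relation instance and joining the projections back. This is a minimal sketch: relations are encoded as lists of dicts, and the function names (`project`, `natural_join`, `is_lossless_on`) are hypothetical. It checks a single instance only, whereas the lossless join property is a statement about every legal instance of R.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto `attrs`, dropping duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two relations given as sets of (attr, value) tuples."""
    joined = set()
    for lt in left:
        for rt in right:
            ld, rd = dict(lt), dict(rt)
            common = ld.keys() & rd.keys()
            # Tuples combine only if they agree on all shared attributes.
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


def is_lossless_on(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly `rows`."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


first = [{"A": "a1", "B": "b1", "C": "c1"},
         {"A": "a2", "B": "b2", "C": "c2"}]
second = [{"A": "a1", "B": "b1", "C": "c1"},
          {"A": "a2", "B": "b1", "C": "c2"}]

print(is_lossless_on(first, "AB", "BC"))   # True
print(is_lossless_on(second, "AB", "BC"))  # False
```

In the second instance the shared value b1 pairs every A value with every C value, which is where the two spurious tuples come from.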

14

Proposition

A decomposition ρ = (R1, R2) has a lossless join with respect to F if and only if either

(R1 ∩ R2 → R1 − R2) ∈ F+

or

(R1 ∩ R2 → R2 − R1) ∈ F+

15

Example

Assume R1 = {A, B}, R2 = {B, C} is a decomposition of R. Then,

R1 ∩ R2 = {B}

R1 − R2 = {A}

R2 − R1 = {C}

For the decomposition to be lossless, either B → A or B → C should be in F+. But to determine whether this is the case, we need F first (which we don’t have in this example) … stay tuned.
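Once F is known, the proposition can be tested without computing all of F+, because R1 ∩ R2 → R1 − R2 is in F+ exactly when R1 − R2 is contained in (R1 ∩ R2)+. The sketch below assumes this closure-based shortcut; the function names and the sample sets of FDs are hypothetical, since the example above does not give F.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """Test the two-relation lossless join condition of the proposition."""
    r1, r2 = set(r1), set(r2)
    reach = closure(r1 & r2, fds)  # (R1 ∩ R2)+
    return (r1 - r2) <= reach or (r2 - r1) <= reach


# Hypothetical FD sets for R1 = {A, B}, R2 = {B, C}.
print(lossless_binary("AB", "BC", [("B", "C")]))  # True: B -> C holds
print(lossless_binary("AB", "BC", [("A", "B")]))  # False: neither B -> A nor B -> C
```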

16

Proposition

If X → Y is in F of R and X ∩ Y = ∅, then the decomposition of R into R1: R − Y; R2: XY, is lossless.

17

Proof

R1 ∩ R2 = (R − Y) ∩ XY

= (XYW − Y) ∩ XY (for some W disjoint from X and Y, so that R = XYW)

= XW ∩ XY

= X

⇒ R1 ∩ R2 = X

Now need to show that

X → R1 (i.e., X → R − Y), or

X → R2 (i.e., X → XY)

18

Proof …/

Since

X → Y (assumption), and

X → X (trivial),

we have that X → XY, which is R2.

Therefore, X → R2. (done)

19


Definition: A decomposition ρ = (R1, R2, …, RN) preserves a set of FDs F, if

F+ = ( π_R1(F+) ∪ π_R2(F+) ∪ … ∪ π_RN(F+) )+

where π_Ri(F+) is the set of all dependencies from F+ that are comprised of attributes in Ri.

20

Note: for simplicity, we may denote π_Ri(F+) as Fi+.

21

Example

R:(A, B, C)

F = { A → C,

B → C,

A → B }

Assume R1 = (A, B) and R2 = (B, C).

Does the decomposition of R into R1, R2 preserve dependencies?

22

Example …/

Check F1 ∪ F2 first (and if that does not succeed, then try F1+ ∪ F2+).

F1 ∪ F2 = { A → B, B → C }

Need only show that the remaining dependency of F, A → C, can be generated by F1 ∪ F2.

Since A → B and B → C, we have that A → C (done).

Therefore the decomposition preserves dependencies.

23

Example 2

R:(A, B, C)

F = { A → B,

B → C,

C → A }

ρ = (R1, R2), where R1 = (A, B), R2 = (B, C)

Therefore, F1 = { A → B } and F2 = { B → C }

Still need C → A. (Note: A → B and B → C imply A → C, but not C → A.)

24

Example 2 …

Have to find F1+ and F2+.

F1+, F2+: have to calculate F+ and then project on the appropriate attributes.

To calculate F+, apply the Armstrong axioms on F.

25

Example 2 …

F:

A → B (1)

B → C (2)

C → A (3)

(1), (2) ⇒ A → C, in F+.

(2), (3) ⇒ B → A, in F+ (and also in F1+)

(3), (1) ⇒ C → B, in F+ (and also in F2+)

26

Example 2 …/ So far …

F1+: A → B (i), B → A (ii)

F2+: B → C (iii), C → B (iv)

Still need C → A. Notice, (iv), (ii) ⇒ C → A, i.e., C → A is in (F1+ ∪ F2+)+ ⇒ done. Therefore, the decomposition preserves dependencies.
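The check carried out by hand in both examples can be sketched in Python. The sketch projects F+ onto each Ri by computing X+ for every subset X of Ri (so it is only suitable for small schemas), then verifies that every FD of F follows from the union of the projections. The function names and the string encoding of attribute sets are assumptions for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(ri, fds):
    """Non-trivial FDs of F+ over `ri`, written as X -> (X+ ∩ Ri) − X."""
    projected = []
    attrs = sorted(ri)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            rhs = (closure(combo, fds) & set(ri)) - set(combo)
            if rhs:
                projected.append(("".join(combo), "".join(sorted(rhs))))
    return projected


def preserves(decomposition, fds):
    """True if every FD of `fds` follows from the union of the projections."""
    union = [fd for ri in decomposition for fd in project_fds(ri, fds)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)


example1 = [("A", "C"), ("B", "C"), ("A", "B")]
example2 = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserves(["AB", "BC"], example1))  # True
print(preserves(["AB", "BC"], example2))  # True
```

For Example 2, `project_fds` returns A → B and B → A for R1 and B → C and C → B for R2, matching (i) to (iv) above.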

27

Where are we …

X+: Closure of set of attributes

F+: closure of set of FDs

MC: minimal cover of F

Lossless join property

Algorithm to compute 3NF


28

Minimal cover for a set of FDs

• A minimal cover G for a set of FDs F, is an equivalent set of FDs that is minimal in the following sense:

1. Every dependency in G is as small as possible (i.e., each LHS of each FD has as few attributes as possible, and the RHS has only one attribute).

2. Every FD in G is required for the closure G+ to be equal to the closure F+.

29

How to calculate G?

• Given F, how do we obtain the minimal cover G?

1. Put F in standard form (i.e., all FDs in F have RHS with one attribute).

2. Eliminate extraneous attributes from LHS.

3. Eliminate redundant FDs.

30

Important !!!

1. The above steps should be performed in order (1)-(2)-(3), or else the result may not be a minimal cover.

2. The order in which we process the FDs may result in different minimal covers (… which is ok). i.e., the minimal cover is not unique for a set F.

31

Example

F:

A → B (1)

ABCD → E (2)

EF → G (3)

EF → H (4)

ACDF → EG (5)

Calculate the minimal cover of F.

32

Step 1: Put F in standard form

• FDs (1)…(4) are already in standard form.

• For FD (5):

– ACDF → E (5.1)

– ACDF → G (5.2)

33

Step 2: eliminate extraneous attributes from LHS (minimize LHSs)

(1) A → B: nothing to eliminate.

(2) ABCD → E.

• If delete A, will have BCD → E.

• Is this LHS good enough?

• It is, if either

– BCD → E, or

– BCD → W, such that W contains ABCD (the original LHS).

Note, (BCD)+ = BCD. Therefore, cannot delete A.

34

Can we delete B? If so, then will have ACD → E.

Is this LHS good enough? It is, if either

– ACD → E, or

– ACD → W that contains ABCD.

Note, ACD → ACD → (from (1)) ABCD. Therefore, ACD → W = ABCD, and thus B can be eliminated!

So, ABCD → E becomes ACD → E.

35

Can I delete any more? i.e., delete C or D?

If delete C, then ACD → E becomes AD → E. Test:

AD → AD → (from (1)) ABD, which contains neither E nor ACD. Therefore, cannot delete C.

Similarly, cannot delete D.

Since we finished scanning the entire LHS of this FD, step 2 is finished for this FD, and the resulting FD is

ACD → E (2.1) – replaces (2) of original F.

36

Repeat the above process for FDs (3), (4), (5.1), (5.2).

For (3): EF → G.

1. Can I delete E? If so, will have F → G. Not possible (check: F+ = F).

2. Can I delete F? If so, will have E → G. Not possible (check: E+ = E).

Therefore, there is no change in (3).

37

For (4): EF → H. Again, cannot delete anything from LHS.

For (5.1) [ ACDF → E ].

Delete A?

⇒ CDF → E? Or,

⇒ CDF → W that contains ACDF?

(CDF)+ = CDF, which contains neither E nor ACDF.

⇒ Cannot delete A.

38

For (5.1) [ ACDF → E ] …

Delete C?

⇒ ADF → E, or

⇒ ADF → W that contains ACDF.

ADF → ADF → (from (1)) ABDF, which does not contain E or ACDF.

⇒ Cannot delete C.

39

For (5.1) [ ACDF → E ] …

Delete D?

⇒ ACF → E, or

⇒ ACF → W that contains ACDF.

ACF → ACF → (from (1)) ABCF, which does not contain E or ACDF.

⇒ Cannot delete D.

40

For (5.1) [ ACDF → E ] …

Delete F?

⇒ ACD → E, or

⇒ ACD → W that contains ACDF.

ACD → ACD → (from (1)) ABCD → (from (2)) ABCDE, which contains E!!

Therefore, F can be deleted from the LHS of (5.1), and (5.1) becomes

ACD → E (5.1.1)

Finished step 2 of (5.1)!! On to (5.2) …

41

ACDF → G (5.2)

• Repeat the above process and find that nothing can be eliminated from the LHS of (5.2).

• So step 2 of the minimal cover computation is finished (we minimized all LHSs of all FDs).

42

The resulting FDs from step 2, are:

• A → B (1) ---------- (1)

• ACD → E (2.1) ------- (2)

• EF → G (3) --------- (3)

• EF → H (4) --------- (4)

• ACD → E (5.1.1) ----- same as (2)!!

• ACDF → G (5.2) ------- (5)
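The whole procedure (standard form, LHS minimization, elimination of redundant FDs) can be sketched in Python and run on the example F. This is an illustrative sketch, not the notes' own algorithm text: the function names are assumptions, extraneous attributes are tested with the equivalent condition "the RHS attribute is in the closure of the reduced LHS", and step 3 is implemented as "drop an FD if its RHS still follows from the remaining FDs", which goes one step beyond the worked example above.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is a (lhs, rhs) pair of attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: standard form, one attribute on every RHS.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]

    # Step 2: remove extraneous attributes from each LHS.
    for i, (lhs, a) in enumerate(g):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            # `attr` is extraneous if the reduced LHS still determines `a`.
            if reduced and a in closure(reduced, g):
                lhs = reduced
                g[i] = (lhs, a)

    # Step 3: remove duplicates, then FDs implied by the remaining ones.
    g = list(dict.fromkeys(g))
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest

    return {("".join(sorted(lhs)), a) for lhs, a in g}


F = [("A", "B"), ("ABCD", "E"), ("EF", "G"), ("EF", "H"), ("ACDF", "EG")]
for lhs, rhs in sorted(minimal_cover(F)):
    print(f"{lhs} -> {rhs}")
```

As noted earlier, the result depends on the order in which FDs and attributes are processed, so a different scan order may produce a different but equally valid minimal cover.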

43

End of Normalization 02
