Lilac Safadi Normalization
Database Design
Steps in building a database for an application:
Real-world domain
Conceptual model
DBMS data model
Create schema (DDL)
Modify data (DML)
How to produce a good relation schema?
1. Start with a set of relations
2. Define the functional dependencies for the relation to specify the PK
3. Transform relations to normal form
Data Redundancy
STAFFBRANCH
StaffNo  FName  LName  position    Salary  BrnNo  City      Address
SL21     John   White  Manager     30000   B005   London    22 Deer Rd
SG37     Ann    Beech  Assistant   12000   B003   Glasgow   163 Main St
SG14     David  Ford   Supervisor  18000   B003   Glasgow   163 Main St
SA9      Mary   Howe   Assistant   9000    B007   Aberdeen  16 Argyll St
SG5      Susan  Brand  Manager     24000   B003   Glasgow   163 Main St
SL41     Julie  Lee    Assistant   9000    B005   London    22 Deer Rd
Relations that have redundant data may have update anomalies (insert, modify, delete)
STAFF
StaffNo  FName  LName  position    Salary  BrnNo
SL21     John   White  Manager     30000   B005
SG37     Ann    Beech  Assistant   12000   B003
SG14     David  Ford   Supervisor  18000   B003
SA9      Mary   Howe   Assistant   9000    B007
SG5      Susan  Brand  Manager     24000   B003
SL41     Julie  Lee    Assistant   9000    B005
BRANCH
BrnNo  City      Address
B005   London    22 Deer Rd
B003   Glasgow   163 Main St
B007   Aberdeen  16 Argyll St
Relation Decomposition
The normalization process involves decomposing a relation into smaller relations
Decomposition must be reversible
Functional dependencies guarantee that a decomposition is reversible
During normalization, two important properties are associated with decomposition:
Lossless-join
Dependency preservation
STAFF
StaffNo  FName  LName  position    Salary  City
SL21     John   White  Manager     30000   London
SG37     Ann    Beech  Assistant   12000   Glasgow
SG14     David  Ford   Supervisor  18000   Glasgow
SA9      Mary   Howe   Assistant   9000    London
SG5      Susan  Brand  Manager     24000   Glasgow
SL41     Julie  Lee    Assistant   9000    London
BRANCH
BrnNo  City     Address
B005   London   22 Deer Rd
B003   Glasgow  163 Main St
B007   London   16 Argyll St
Data Redundancy
Joining STAFF and BRANCH on City does not reproduce the original relation: every London
staff member is paired with every London branch, which creates spurious tuples.
STAFFBRANCH (STAFF joined with BRANCH on City)
StaffNo  FName  LName  position    Salary  BrnNo  City     Address
SL21     John   White  Manager     30000   B005   London   22 Deer Rd
SL21     John   White  Manager     30000   B007   London   16 Argyll St
SG37     Ann    Beech  Assistant   12000   B003   Glasgow  163 Main St
SG14     David  Ford   Supervisor  18000   B003   Glasgow  163 Main St
SA9      Mary   Howe   Assistant   9000    B005   London   22 Deer Rd
SA9      Mary   Howe   Assistant   9000    B007   London   16 Argyll St
SG5      Susan  Brand  Manager     24000   B003   Glasgow  163 Main St
SL41     Julie  Lee    Assistant   9000    B005   London   22 Deer Rd
SL41     Julie  Lee    Assistant   9000    B007   London   16 Argyll St
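The difference between a lossy and a lossless decomposition can be checked mechanically. The following is a minimal sketch, assuming relations are held as lists of Python dicts and using only the StaffNo, BrnNo and City columns, with branch B007 placed in London as in the STAFF/BRANCH variant above. The helper names (project, natural_join, as_set) are hypothetical and chosen for illustration only.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(distinct)]


def natural_join(r, s):
    """Natural join of two non-empty relations given as lists of dicts."""
    common = set(r[0]) & set(s[0])
    return [{**t, **u} for t in r for u in s
            if all(t[a] == u[a] for a in common)]


def as_set(rows):
    """Order-independent view of a relation, used for comparisons."""
    return {tuple(sorted(row.items())) for row in rows}


# Assumed sample data: (StaffNo, BrnNo, City), with B007 located in London.
staffbranch = [
    dict(StaffNo=s, BrnNo=b, City=c)
    for s, b, c in [
        ("SL21", "B005", "London"),
        ("SG37", "B003", "Glasgow"),
        ("SG14", "B003", "Glasgow"),
        ("SA9", "B007", "London"),
        ("SG5", "B003", "Glasgow"),
        ("SL41", "B005", "London"),
    ]
]

# The two projections share only City: the join invents tuples.
lossy = natural_join(project(staffbranch, ["StaffNo", "City"]),
                     project(staffbranch, ["BrnNo", "City"]))

# The two projections share BrnNo, and BrnNo determines City: the join is exact.
lossless = natural_join(project(staffbranch, ["StaffNo", "BrnNo"]),
                        project(staffbranch, ["BrnNo", "City"]))

print(len(staffbranch), len(lossy), len(lossless))  # 6 9 6
```

The lossy join contains all six original tuples plus three spurious ones, so the original relation cannot be recovered from it.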
Functional Dependencies
A functional dependency describes the relationship between attributes in a relation.
If A and B are attributes of relation R,
B is functionally dependent on A, denoted by A → B, if each value of A is associated
with exactly one value of B. A value of B may be associated with several values of A.
In A → B, A is the determinant and B is the dependent.
• A functional dependency must hold between the attributes at all times, not only for the
data stored at one moment (an all-time functional dependency)
Functional Dependencies
A → B: whenever two tuples t & u agree on all attributes of A, they must also agree on attribute B
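The tuple-agreement definition translates directly into a check over sample data. This is a minimal sketch, assuming rows are Python dicts; fd_holds is a hypothetical helper name. A check on one instance can only refute a dependency: passing it does not prove the dependency holds for all time.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# A few rows taken from the STAFFBRANCH sample data.
staff = [
    {"StaffNo": "SL21", "position": "Manager"},
    {"StaffNo": "SG37", "position": "Assistant"},
    {"StaffNo": "SG5", "position": "Manager"},
]

print(fd_holds(staff, ["StaffNo"], ["position"]))  # True
print(fd_holds(staff, ["position"], ["StaffNo"]))  # False
```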
Functional Dependencies
Example
StaffNo → position: position is functionally dependent on StaffNo
For example, SL21 → Manager. This is a 1:1 or M:1 relationship between attributes in a
relation.
position → StaffNo does not hold: StaffNo is NOT functionally dependent on position
For example, Manager → SL21 and SG5. This is a 1:M relationship between attributes in a
relation.
Trivial Functional Dependencies
A → B is trivial if B ⊆ A
StaffNo, SName → SName
StaffNo, SName → StaffNo
We are not interested in trivial functional dependencies because they provide no genuine
integrity constraints on the values held by these attributes
StaffBranch Example
Functional dependencies on the StaffBranch relation:
StaffNo → FName, LName, position, salary, BranchNo, Address, city
BranchNo → Address, city
Address, city → BranchNo
BranchNo, position → salary
Address, city, position → salary
Determinants:
StaffNo, BranchNo, (Address, city), (BranchNo, position), and (Address, city, position)
Identifying the PK
The purpose of functional dependencies is to specify the set of integrity constraints that must
hold on a relation
The determinant attribute(s) are a candidate key (CK) of the relation when:
• There is a 1:1 relationship between determinant & dependent
• No proper subset of the determinant attribute(s) is a determinant (nontrivial)
If (A, B) → C, then NOT A → B, and NOT B → A
• All attributes that are not part of the CK are functionally dependent on the
key: CK → all attributes of R
• The dependencies hold for all time
The PK is the candidate key with the minimal set of functional dependencies
Closure
Closure X+: the set of all functional dependencies that are implied by (can be inferred from) a given set of functional dependencies X
Example: X = {A → B, B → C}
If tuples t & u agree on A, then they must agree on B, so surely they will also agree on C
Therefore X+ contains A → C
Closure Example
S: BranchNo → (Address, city)
S+: BranchNo → Address, BranchNo → city (implied by S)
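Listing the full closure of a set of dependencies is rarely practical. A common alternative, not shown on the slides, is to compute the closure of a set of attributes and use it to test whether a particular dependency is implied. The sketch below assumes each dependency is a (determinant, dependents) pair of attribute names; attribute_closure and implies are hypothetical helper names.

```python
def attribute_closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


S = [(["BranchNo"], ["Address", "city"])]
print(implies(S, ["BranchNo"], ["Address"]))  # True
print(implies(S, ["Address"], ["BranchNo"]))  # False

X = [(["A"], ["B"]), (["B"], ["C"])]
print(implies(X, ["A"], ["C"]))  # True, by transitivity
```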
Inference Rules for Functional Dependencies
Armstrong’s axioms (inference rules): a set of inference rules that specifies how functional dependencies can be inferred from given ones
Inference rules:
Reflexivity: If B ⊆ A, then A → B
Augmentation: If A → B, then A,C → B,C
Transitivity: If A → B and B → C, then A → C
Self-determination: A → A
Decomposition: If A → B,C, then A → B and A → C
Union: If A → B and A → C, then A → B,C
Composition: If A → B and C → D, then A,C → B,D
Minimal Sets of Functional Dependencies
• The complete set of functional dependencies for a relation can be very large
• We need to reduce the set to a manageable size: a smaller set from which every other FD can
be recovered by applying the inference rules repeatedly until they stop producing new FDs
Assume S1 & S2 are sets of dependencies
If every FD in S1 can be inferred from S2 (S1 ⊆ S2+), then S2 is a cover for S1, or S1 is covered by S2
If S2 is a cover for S1
& S1 is a cover for S2,
then S1 is equivalent to S2
Minimal Sets of Functional Dependencies
A set of functional dependencies X is minimal if it satisfies the following:
• Every dependency in X has a single attribute for its right-hand side
• We cannot replace any dependency A → B in X with C → B, where C ⊂ A, & still have
a set of dependencies equivalent to X
• We cannot remove any dependency from X and still have a set of dependencies that is
equivalent to X
Minimal Sets of Functional Dependencies
To reduce a set of functional dependencies X to a minimal set:
1. For each dependency Y → {A1, A2, ..., An} in X, create Y → A1, Y → A2, ..., Y → An
2. If A, B → C is equivalent to B → C, then replace A, B → C with B → C
3. If X − {A → B} is equivalent to X, then remove A → B
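The three steps above can be sketched in code. This is an illustrative implementation under stated assumptions, not a definitive one: dependencies are (determinant, dependents) pairs, equivalence is tested with attribute closure (a standard technique the slides do not name), and closure and minimal_cover are hypothetical helper names. A set of dependencies can have more than one minimal cover; the result depends on the order in which dependencies are examined.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (frozenset lhs, single attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, attr in fds:
            if lhs <= result and attr not in result:
                result.add(attr)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute on every right-hand side.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    g = list(dict.fromkeys(g))  # drop duplicates, keep order

    # Step 2: drop determinant attributes that are not needed.
    for i, (lhs, attr) in enumerate(g):
        for a in sorted(lhs):
            reduced = g[i][0] - {a}
            if reduced and attr in closure(reduced, g):
                g[i] = (reduced, attr)
    g = list(dict.fromkeys(g))

    # Step 3: drop dependencies implied by the remaining ones.
    for fd in list(g):
        rest = [x for x in g if x != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g


example = [(["A", "B"], ["C"]), (["B"], ["C"]), (["A"], ["B"]), (["A"], ["C"])]
for lhs, attr in minimal_cover(example):
    print(sorted(lhs), "->", attr)
# ['B'] -> C
# ['A'] -> B
```

In the example, A, B → C collapses to B → C in step 2, and A → C is removed in step 3 because it follows from A → B and B → C.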
The Purpose of Normalization
Normalization is a bottom-up approach to database design that begins by examining
the relationships between attributes. It is performed as a series of tests on a relation
to determine whether it satisfies or violates the requirements of a given normal form.
Purpose:
Guarantees no redundancy due to FDs
Guarantees no update anomalies
Normal Forms:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
The Process of Normalization
Normalization is a technique for analyzing relations based on their CK & FD
1NF → 2NF → 3NF → BCNF → 4NF → 5NF
A higher normal form is stronger in format and less vulnerable to update anomalies
First Normal Form (1NF)
Unnormalized form (UNF): A relation that contains one or more repeating groups
First normal form (1NF): A relation in which the intersection of each row and
column contains one & only one value
CLIENT_PROPERTY (unnormalized relation)
ClientNo  Name           PropertyNo
CR76      John Key       PG4, PG16
CR56      Aline Stewart  PG4, PG36, PG16
UNF → 1NF: Approach 1
Expand the key so that there is a separate tuple in the original relation for each
value of the repeating attribute(s). The primary key becomes the combination of the
original primary key and the repeating attribute
CLIENT_PROPERTY (1NF relation)
ClientNo  Name           PropertyNo
CR76      John Key       PG4
CR76      John Key       PG16
CR56      Aline Stewart  PG4
CR56      Aline Stewart  PG36
CR56      Aline Stewart  PG16
Disadvantage: introduces redundancy in the relation
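Approach 1 amounts to emitting one tuple per value of the repeating attribute. The following is a minimal sketch, assuming the unnormalized relation is represented as Python dicts whose PropertyNo holds a list; flatten is a hypothetical helper name.

```python
# Unnormalized CLIENT_PROPERTY: PropertyNo is a repeating group.
unf = [
    {"ClientNo": "CR76", "Name": "John Key",
     "PropertyNo": ["PG4", "PG16"]},
    {"ClientNo": "CR56", "Name": "Aline Stewart",
     "PropertyNo": ["PG4", "PG36", "PG16"]},
]


def flatten(rows, repeating):
    """Create one row per value of the repeating attribute."""
    return [{**row, repeating: value}
            for row in rows
            for value in row[repeating]]


first_nf = flatten(unf, "PropertyNo")
for row in first_nf:
    print(row["ClientNo"], row["Name"], row["PropertyNo"])

# (ClientNo, PropertyNo) now identifies each row uniquely.
keys = {(row["ClientNo"], row["PropertyNo"]) for row in first_nf}
print(len(first_nf), len(keys))  # 5 5
```

The client name is repeated on every row, which is the redundancy named as the disadvantage of this approach.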
UNF → 1NF: Approach 2
If the maximum number of values is known for the attribute, replace the repeating
attribute (PropertyNo) with a number of atomic attributes (PropertyNo1,
PropertyNo2, PropertyNo3)
CLIENT_PROPERTY (1NF relation)
ClientNo  Name           PropertyNo1  PropertyNo2  PropertyNo3
CR76      John Key       PG4          PG16         NULL
CR56      Aline Stewart  PG4          PG36         PG16
Disadvantage: introduces NULL values in the relation
UNF → 1NF: Approach 3
Remove the attribute that violates 1NF and place it in a separate relation along
with the primary key
CLIENT (1NF relation)
ClientNo  Name
CR76      John Key
CR56      Aline Stewart
PROPERTY (1NF relation)
ClientNo  PropertyNo
CR76      PG4
CR76      PG16
CR56      PG4
CR56      PG36
CR56      PG16
Full Functional Dependency
If A and B are attributes of a relation:
B is fully functionally dependent on A if B is functionally dependent on A, but not
on any proper subset of A
B is partially functionally dependent on A if some attributes can be removed from A
& the dependency still holds
StaffNo, SName → BranchNo (partial dependency)
ClientNo, PropertyNo → RentDate (full dependency)
Second Normal Form (2NF)
Second normal form (2NF): A 1NF relation in which every non-prime attribute is fully
(nontrivially) functionally dependent on the PK
Applies to relations with composite primary keys & partial dependencies
1NF relation:
CLIENT_RENTAL(ClientNo, PropertyNo, cName, pAddress, RentStart, RentFinish, Rent, OwnerNo, OName)
1NF → 2NF
1. Start with 1NF relation
2. Find the FDs of a relation
3. Test the FDs whose determinant attribute is part of the PK
1NF → 2NF
CLIENT_RENTAL(ClientNo, PropertyNo, cName, pAddress, RentStart, RentFinish, Rent, OwnerNo, OName)
PK: (ClientNo, PropertyNo)
ClientNo, PropertyNo → RentStart, RentFinish (full dependency)
ClientNo → cName (partial dependency)
PropertyNo → pAddress, Rent, OwnerNo, OName (partial dependency)
ClientNo, RentStart → PropertyNo, pAddress, RentFinish, Rent, OwnerNo, OName
PropertyNo, RentStart → ClientNo, cName, RentFinish
1NF → 2NF
4. Remove partial dependencies by placing the functionally dependent attributes in
a new relation along with a copy of their determinants
CLIENT(ClientNo, cName) (2NF relation)
RENTAL(ClientNo, PropertyNo, RentStart, RentFinish) (2NF relation)
PROPERTY_OWNER(PropertyNo, pAddress, Rent, OwnerNo, OName) (2NF relation)
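The 1NF → 2NF step can be sketched as a small routine. This is a simplified illustration that only looks at the dependencies it is given explicitly (it does not infer new ones), and the function names partial_dependencies and to_2nf are hypothetical.

```python
def partial_dependencies(primary_key, fds):
    """Listed FDs whose determinant is a proper subset of the PK."""
    pk = set(primary_key)
    found = []
    for lhs, rhs in fds:
        dependents = [a for a in rhs if a not in pk]
        if set(lhs) < pk and dependents:
            found.append((list(lhs), dependents))
    return found


def to_2nf(attributes, primary_key, fds):
    """One new relation per partial dependency; the rest stays with the PK."""
    relations = []
    moved = set()
    for lhs, dependents in partial_dependencies(primary_key, fds):
        relations.append(lhs + dependents)
        moved.update(dependents)
    remaining = [a for a in attributes if a not in moved]
    return [remaining] + relations


client_rental = ["ClientNo", "PropertyNo", "cName", "pAddress",
                 "RentStart", "RentFinish", "Rent", "OwnerNo", "OName"]
pk = ["ClientNo", "PropertyNo"]
fds = [
    (["ClientNo", "PropertyNo"], ["RentStart", "RentFinish"]),
    (["ClientNo"], ["cName"]),
    (["PropertyNo"], ["pAddress", "Rent", "OwnerNo", "OName"]),
]

for relation in to_2nf(client_rental, pk, fds):
    print(relation)
# ['ClientNo', 'PropertyNo', 'RentStart', 'RentFinish']
# ['ClientNo', 'cName']
# ['PropertyNo', 'pAddress', 'Rent', 'OwnerNo', 'OName']
```

The three attribute lists correspond to RENTAL, CLIENT and PROPERTY_OWNER.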
Transitive Dependency
A, B, C are attributes of a relation, such that
if A → B and B → C, then C is transitively dependent on A via B,
provided A is NOT functionally dependent on B or C (nontrivial FD)
Example
StaffNo → BranchNo, BranchNo → Address
Therefore StaffNo → Address (transitively, via BranchNo)
Third Normal Form (3NF)
Third normal form (3NF): A 2NF relation in which NO non-prime attribute is
transitively dependent on the PK
CLIENT(ClientNo, cName) (3NF relation)
RENTAL(ClientNo, PropertyNo, RentStart, RentFinish) (3NF relation)
PROPERTY_OWNER(PropertyNo, pAddress, Rent, OwnerNo, OName) (2NF relation)
2NF → 3NF
1. Identify the PK in the 2NF relation
2. Identify FDs in this relation
3. If transitive dependencies exist, place transitively dependent attributes in a new
relation along with a copy of their determinants
OWNER(OwnerNo, OName) (3NF relation)
PROPERTY_FOR_RENT(PropertyNo, pAddress, Rent, OwnerNo) (3NF relation)
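A 3NF test can be sketched with attribute closure: a dependency is a problem when its determinant is not a superkey and it determines a non-prime attribute. In the sketch below the dependency OwnerNo → OName is an assumption inferred from the OWNER relation above (it is not listed explicitly on the slides), and closure and third_nf_violations are hypothetical helper names.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(attributes, candidate_keys, fds):
    """FDs whose determinant is not a superkey and whose dependents are non-prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(attributes)
        offending = set(rhs) - set(lhs) - prime
        if not is_superkey and offending:
            violations.append((tuple(lhs), tuple(sorted(offending))))
    return violations


property_owner = ["PropertyNo", "pAddress", "Rent", "OwnerNo", "OName"]
fds = [
    (["PropertyNo"], ["pAddress", "Rent", "OwnerNo", "OName"]),
    (["OwnerNo"], ["OName"]),  # assumed, see the note above
]

print(third_nf_violations(property_owner, [["PropertyNo"]], fds))
# [(('OwnerNo',), ('OName',))]
```

Moving OName into a relation keyed by OwnerNo removes the violation, which matches the OWNER and PROPERTY_FOR_RENT relations.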
Review of Decompositions
1NF: CLIENT_RENTAL
2NF: CLIENT, RENTAL, PROPERTY_OWNER
3NF: CLIENT, RENTAL, OWNER, PROPERTY_FOR_RENT (PROPERTY_OWNER is decomposed into OWNER and PROPERTY_FOR_RENT)
General Definition of 2NF & 3NF
Second normal form (2NF): A 1NF relation in which every non-primary-key
attribute is fully functionally dependent on the CK
Third normal form (3NF): A 2NF relation in which NO non-primary-key attribute in
a nontrivial FD is transitively dependent on the CK
Boyce-Codd Normal Form (BCNF)
Boyce-Codd normal form (BCNF): A 3NF relation in which every determinant in a
nontrivial FD is a CK
Difference between 3NF & BCNF for an FD A → B:
• 3NF allows A NOT to be a CK, provided B is part of a CK
• BCNF insists that A is a CK
A potential violation of BCNF may occur in a relation that:
• Contains two (or more) composite CKs
• Has CKs that overlap (at least one attribute in common)
Boyce-Codd Normal Form (BCNF)
[Dependency diagram over attributes A, B, C, D: a relation that is in 3NF but not BCNF]
3NF → BCNF
CLIENT_INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo, RoomNo)
ClientNo, Int_Date → Int_Time, StaffNo, RoomNo
StaffNo, Int_Date, Int_Time → ClientNo
RoomNo, Int_Date, Int_Time → StaffNo, ClientNo
StaffNo, Int_Date → RoomNo
1. Examine FDs for a relation
2. If a determinant is NOT a CK, decompose the relation into 2 relations
3NF → BCNF
3. Remove non-CK dependencies by placing the functionally dependent attributes
in a new relation
STAFF_ROOM(StaffNo, Int_Date, RoomNo) (BCNF relation)
INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo) (BCNF relation)
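The BCNF test for CLIENT_INTERVIEW can be checked mechanically: every determinant of a nontrivial dependency must determine all attributes of the relation. This is a minimal sketch using the four dependencies listed for CLIENT_INTERVIEW; closure and bcnf_violations are hypothetical helper names.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Nontrivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and not closure(lhs, fds) >= set(attributes)]


client_interview = ["ClientNo", "Int_Date", "Int_Time", "StaffNo", "RoomNo"]
fds = [
    (["ClientNo", "Int_Date"], ["Int_Time", "StaffNo", "RoomNo"]),
    (["StaffNo", "Int_Date", "Int_Time"], ["ClientNo"]),
    (["RoomNo", "Int_Date", "Int_Time"], ["StaffNo", "ClientNo"]),
    (["StaffNo", "Int_Date"], ["RoomNo"]),
]

print(bcnf_violations(client_interview, fds))
# [(['StaffNo', 'Int_Date'], ['RoomNo'])]
```

Only StaffNo, Int_Date → RoomNo is reported, so RoomNo moves to a new relation keyed by (StaffNo, Int_Date).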
Review Example
STAFF_PROPERTY_INSPECTION (unnormalized relation)
Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
                            22-Apr-01  09:00  Good order        SG14     David  M53HDR
                            1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
                            24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR
UNF → 1NF
STAFF_PROPERTY_INSPECTION (1NF)
Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
PG4   Lawrence St, Glasgow  22-Apr-01  09:00  Good order        SG14     David  M53HDR
PG4   Lawrence St, Glasgow  1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
PG16  5 Novar Dr., Glasgow  24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR
1NF → 2NF
STAFF_PROPERTY_INSPECTION(Pno, pAddress, iDate, iTime, comments, StaffNo, sName, CarReg)
Pno, iDate → iTime, comments, StaffNo, sName, CarReg
Pno → pAddress (partial dependency)
StaffNo → sName
iDate, StaffNo → CarReg
iDate, iTime, CarReg → Pno, pAddress, comments, StaffNo, sName
iDate, iTime, StaffNo → Pno, pAddress, comments
1NF → 2NF
PROPERTY(Pno, pAddress) (2NF)
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, sName, CarReg) (2NF)
Pno, iDate → iTime, comments, StaffNo, sName, CarReg
StaffNo → sName (transitive dependency)
iDate, StaffNo → CarReg
iDate, iTime, CarReg → Pno, comments, StaffNo, sName
iDate, iTime, StaffNo → Pno, comments
2NF → 3NF
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg) (3NF)
STAFF(StaffNo, sName) (3NF)
Resulting 3NF relations:
PROPERTY(Pno, pAddress)
STAFF(StaffNo, sName)
PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo, CarReg)
3NF → BCNF
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg) (3NF)
Pno, iDate → iTime, comments, StaffNo, CarReg
StaffNo, iDate → CarReg
CarReg, iDate, iTime → Pno, comments, StaffNo
StaffNo, iDate, iTime → Pno, comments
The determinant of StaffNo, iDate → CarReg is not a CK, so the relation is decomposed:
STAFF_CAR(StaffNo, iDate, CarReg)
PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo)
Multi-Valued Dependency (MVD)
Represents a dependency between attributes A, B, C in a relation, such that for
each value of A, there is a set of values for B and a set of values for C.
However, the sets of values for B & C are independent of each other.
Denoted by: A →→ B, A →→ C
Example
BranchNo →→ SName, BranchNo →→ OName
BRANCH_STAFF_OWNER
BranchNo  SName  OName
B003      Ann    Carol
B003      David  Carol
B003      Ann    Tina
B003      David  Tina
Fourth Normal Form (4NF)
Fourth normal form (4NF): A BCNF relation with NO nontrivial MVD
BRANCH_STAFF_OWNER (BCNF relation)
BranchNo  SName  OName
B003      Ann    Carol
B003      David  Carol
B003      Ann    Tina
B003      David  Tina
BCNF → 4NF
1. Start with a BCNF relation
2. Examine FDs for a relation
3. If a nontrivial MVD exists, remove the MVD by placing the attributes in a new
relation along with a copy of their determinant
BRANCH_STAFF (4NF)
BranchNo  SName
B003      Ann
B003      David
BRANCH_OWNER (4NF)
BranchNo  OName
B003      Carol
B003      Tina
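Whether an MVD holds in a sample relation, and whether the 4NF decomposition joins back without loss, can both be checked on the example data. This is a minimal sketch, assuming rows are Python dicts; mvd_holds is a hypothetical helper name, and as with FDs, a check on one instance can refute an MVD but cannot prove it holds for all time.

```python
from itertools import product


def mvd_holds(rows, a, b):
    """Check A ->> B on an instance: within each A-group, the B values and
    the values of the remaining attributes must form a full cross product."""
    rest = [k for k in rows[0] if k not in a and k not in b]
    groups = {}
    for row in rows:
        key = tuple(row[x] for x in a)
        pair = (tuple(row[x] for x in b), tuple(row[x] for x in rest))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        b_values = {p[0] for p in pairs}
        rest_values = {p[1] for p in pairs}
        if pairs != set(product(b_values, rest_values)):
            return False
    return True


rows = [
    {"BranchNo": "B003", "SName": "Ann", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "David", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "Ann", "OName": "Tina"},
    {"BranchNo": "B003", "SName": "David", "OName": "Tina"},
]

print(mvd_holds(rows, ["BranchNo"], ["SName"]))      # True
print(mvd_holds(rows[:3], ["BranchNo"], ["SName"]))  # False: one combination is missing

# Decompose into BRANCH_STAFF and BRANCH_OWNER, then join on BranchNo.
branch_staff = {(r["BranchNo"], r["SName"]) for r in rows}
branch_owner = {(r["BranchNo"], r["OName"]) for r in rows}
rejoined = {(b, s, o) for b, s in branch_staff
            for b2, o in branch_owner if b == b2}
original = {(r["BranchNo"], r["SName"], r["OName"]) for r in rows}
print(rejoined == original)  # True
```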
Lossless-Join Dependency
A property of decomposition, which ensures that no spurious tuples are generated
when relations are reunited through a natural join operation
Objectives:
Preserve all the data in the original relation
Avoid creating additional spurious tuples
Join Dependency
A relation R satisfies a join dependency on A, B, ..., Z, where A, B, ..., Z are subsets of the attributes of R, if
every legal value of R is equal to the join of its projections on A, B, ..., Z
Fifth Normal Form (5NF)
Fifth normal form (5NF): A relation with no join dependency
PROPERTY_ITEM_SUPPLIER
PropertyNo  Description  SupplierNo
PG4         Bed          S1
PG4         Chair        S2
PG16        Bed          S2
4NF → 5NF
PROPERTY_ITEM
PropertyNo  Description
PG4         Bed
PG4         Chair
PG16        Bed
ITEM_SUPPLIER
Description  SupplierNo
Bed          S1
Chair        S2
Bed          S2
PROPERTY_SUPPLIER
PropertyNo  SupplierNo
PG4         S1
PG4         S2
PG16        S2
Original PROPERTY_ITEM_SUPPLIER
PropertyNo  Description  SupplierNo
PG4         Bed          S1
PG4         Bed          S2
PG4         Chair        S2
PG16        Bed          S2
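The effect of the join dependency can be reproduced on the sample data. The sketch below starts from the three rows (PG4, Bed, S1), (PG4, Chair, S2) and (PG16, Bed, S2), builds the three binary projections, and joins them back. Relations are modelled as Python sets of tuples, and the variable names are chosen for illustration only.

```python
# Three-row instance of PROPERTY_ITEM_SUPPLIER.
rows = {
    ("PG4", "Bed", "S1"),
    ("PG4", "Chair", "S2"),
    ("PG16", "Bed", "S2"),
}

# The three binary projections.
property_item = {(p, d) for p, d, s in rows}
item_supplier = {(d, s) for p, d, s in rows}
property_supplier = {(p, s) for p, d, s in rows}

# Joining only two projections can produce tuples the third one rules out.
two_way = {(p, d, s) for p, d in property_item
           for d2, s in item_supplier if d == d2}

# Joining all three projections.
three_way = {t for t in two_way if (t[0], t[2]) in property_supplier}

print(len(two_way))      # 5
print(sorted(three_way - rows))  # [('PG4', 'Bed', 'S2')]
```

The three-way join yields the four-row relation shown above: the extra tuple (PG4, Bed, S2) is exactly the one the join dependency requires to be present.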