Date posted: 23-Dec-2015
Uploaded by: lesley-knight
ZEIT2301- Database Design
Normalisation
School of Engineering and Information Technology
UNSW@ADFA
Dr Kathryn Merrick
Bldg 16, Rm 212 (Thursdays and Fridays only)
Topic 08: Database Normalisation
• Designing a ‘good’ relational database
• Normal forms
  • First normal form
  • Second normal form
  • Third normal form
A Motivating Example
Suppose we want to develop a database of bike statistics for a program that permits users to find out whether or not each bike can stoppie.
A file for such data might be…
Bike description   Wheelbase   Centre of mass height   Coefficient of friction   Can stoppie?
Harley             1.588       0.724                   0.9                       false
Harley one pax     1.588       0.775                   0.9                       false
Honda              1.458       0.831                   0.9                       true
Honda one pax      1.458       0.881                   0.9                       true
Problems with this file:
• The “Bike description” column contains two types of data (bike type and number of passengers)
• The “Wheelbase” column contains duplicate (redundant) data
• The data in the “Coefficient of friction” column is not even bike dependent
Relations
Relations are data tables with rows and columns
Attributes
Attributes are named columns of relations
Our bike example has five attributes:
Bike description
Wheelbase
Centre of mass height
Coefficient of friction
Can stoppie
Domains
The domain of an attribute is the set of allowable values for that attribute
Attribute domains in our bike example:
Attribute                 Domain
Bike description          Alphanumeric: size 60
Wheelbase                 Numeric: range [0-5]
Centre of mass height     Numeric: range [0-5]
Coefficient of friction   Numeric: range [0-1]
Can stoppie               Boolean: one of [true, false]
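The domains above can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: the names `DOMAINS`, `is_number` and `violations` are illustrative, and it assumes that "size 60" means at most 60 characters.

```python
# Illustrative domain checks for the bike relation.
# Assumption: "Alphanumeric: size 60" means a string of at most 60 characters.

def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)

DOMAINS = {
    "bike_description": lambda v: isinstance(v, str) and len(v) <= 60,
    "wheelbase": lambda v: is_number(v) and 0 <= v <= 5,
    "centre_of_mass_height": lambda v: is_number(v) and 0 <= v <= 5,
    "coefficient_of_friction": lambda v: is_number(v) and 0 <= v <= 1,
    "can_stoppie": lambda v: isinstance(v, bool),
}

def violations(record):
    """Return the attributes whose values fall outside their domain."""
    return [name for name, check in DOMAINS.items() if not check(record[name])]

good = {
    "bike_description": "Harley one pax",
    "wheelbase": 1.588,
    "centre_of_mass_height": 0.775,
    "coefficient_of_friction": 0.9,
    "can_stoppie": False,
}
bad = dict(good, coefficient_of_friction=1.4)  # outside [0-1]

print(violations(good))  # []
print(violations(bad))   # ['coefficient_of_friction']
```

In a real database these rules would normally be expressed as column types and constraints rather than application code.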
Records (Tuples)
A record or tuple is one row of data in a relation
Our bike example has four records
Bike description   Wheelbase   Centre of mass height   Coefficient of friction   Can stoppie
Harley             1.588       0.724                   0.9                       false
Harley one pax     1.588       0.775                   0.9                       false
Honda              1.458       0.831                   0.9                       true
Honda one pax      1.458       0.881                   0.9                       true
Relational Databases
A relational database is a collection of normalised relations
Normalisation is a technique for producing a set of tables that conform to desirable redundancy and integrity constraints
There are three common normal forms:
• First normal form
• Second normal form
• Third normal form
First Normal Form (1NF)
A table is in 1NF if:
The intersection of every column and record contains only one value
and
It has a primary key attribute that uniquely identifies every record
Keys (Revision)
Superkey: a column or set of columns that uniquely identifies a record in a relation
Candidate key: a minimal superkey, i.e. a superkey from which no column can be removed without losing uniqueness
Primary key: the candidate key selected for identification purposes
Foreign key: a column or set of columns in one table that matches a candidate key of another table
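The superkey and candidate key definitions can be tried out on sample rows. This is a hedged sketch: the function names `is_superkey` and `is_candidate_key` are illustrative, and the rows are a three-column excerpt of the bike data. A check like this can only show that a set of columns is not a key for the given rows; choosing a key remains a design decision about all possible rows.

```python
from itertools import combinations

# Excerpt of the bike data with the description already split into name and riders.
rows = [
    {"bike_name": "Harley", "riders": 1, "wheelbase": 1.588},
    {"bike_name": "Harley", "riders": 2, "wheelbase": 1.588},
    {"bike_name": "Honda", "riders": 1, "wheelbase": 1.458},
    {"bike_name": "Honda", "riders": 2, "wheelbase": 1.458},
]

def is_superkey(rows, columns):
    """True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, columns):
    """True if columns form a superkey and no proper subset is also a superkey."""
    if not is_superkey(rows, columns):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(columns))
        for subset in combinations(columns, size)
    )

print(is_superkey(rows, ["bike_name"]))                             # False
print(is_superkey(rows, ["bike_name", "riders"]))                   # True
print(is_candidate_key(rows, ["bike_name", "riders", "wheelbase"]))  # False
print(is_candidate_key(rows, ["bike_name", "riders"]))              # True
```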
Decomposition to 1NF
Remove multi-valued attributes
Add a primary key
The bike example in 1NF
After removing the multi-valued attribute:
Bike name    Number of riders    Wheelbase   Centre of mass height   Coefficient of friction   Can stoppie
Harley       1                   1.588       0.724                   0.9                       false
Harley       2                   1.588       0.775                   0.9                       false
Honda        1                   1.458       0.831                   0.9                       true
Honda        2                   1.458       0.881                   0.9                       true

After adding a primary key (key attributes marked *):
Bike name*   Number of riders*   Wheelbase   Centre of mass height   Coefficient of friction   Can stoppie
Harley       1                   1.588       0.724                   0.9                       false
Harley       2                   1.588       0.775                   0.9                       false
Honda        1                   1.458       0.831                   0.9                       true
Honda        2                   1.458       0.881                   0.9                       true
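The two decomposition steps can be sketched in Python. This is an illustration only: `split_description` is a hypothetical helper, and it assumes that a description ending in "one pax" means the rider plus one passenger (two riders), as the 1NF table suggests.

```python
# Rows of the original file: description, wheelbase, centre of mass height,
# coefficient of friction, can stoppie.
original = [
    ("Harley", 1.588, 0.724, 0.9, False),
    ("Harley one pax", 1.588, 0.775, 0.9, False),
    ("Honda", 1.458, 0.831, 0.9, True),
    ("Honda one pax", 1.458, 0.881, 0.9, True),
]

def split_description(description):
    """Split a description such as 'Harley one pax' into (bike name, number of riders)."""
    suffix = " one pax"
    if description.endswith(suffix):
        return description[: -len(suffix)], 2  # rider plus one passenger
    return description, 1  # rider only

# Step 1: remove the multi-valued attribute by splitting it into two columns.
first_nf = [(*split_description(desc), *rest) for desc, *rest in original]

# Step 2: (bike name, number of riders) must identify every record uniquely.
keys = {(name, riders) for name, riders, *_ in first_nf}
assert len(keys) == len(first_nf)

print(first_nf[1])  # ('Harley', 2, 1.588, 0.775, 0.9, False)
```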
Second Normal Form (2NF)
A table is in 2NF if:
It is in 1NF and
The values in each non-primary-key column depend on the values in all primary key columns (i.e. not on just a subset of the primary key columns)
Decomposition for 2NF
Remove non-primary-key attributes that are not fully functionally dependent on the primary key
Place them in a new relation with the part of the primary key on which they are functionally dependent (i.e. their determinant)
Consider replacing compound primary keys with non-compound keys
Full Functional Dependency
Examine the non-key attributes in “creditRecord”: address, employer, interestRate, limit
From the FDs given, the attributes address, employer and interestRate are not dependent on the whole primary key
The attribute limit is fully functionally dependent on the primary key
creditRecord(customer, creditCard, address, employer, limit, interestRate)
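Full functional dependency can be tested against sample data. The sketch below is hedged on several points: it assumes the primary key of creditRecord is (customer, creditCard), the rows are invented purely for illustration, and the helper names `holds` and `fully_dependent` are hypothetical. Sample rows can refute a dependency but cannot prove that it holds in general.

```python
from itertools import combinations

# Invented sample rows; the key (customer, creditCard) is an assumption.
credit_record = [
    {"customer": "Ann", "creditCard": "Gold", "address": "1 High St",
     "employer": "Acme", "limit": 5000, "interestRate": 0.19},
    {"customer": "Ann", "creditCard": "Blue", "address": "1 High St",
     "employer": "Acme", "limit": 2000, "interestRate": 0.21},
    {"customer": "Bob", "creditCard": "Gold", "address": "2 Low Rd",
     "employer": "Bolt", "limit": 3000, "interestRate": 0.19},
    {"customer": "Bob", "creditCard": "Blue", "address": "2 Low Rd",
     "employer": "Bolt", "limit": 1000, "interestRate": 0.21},
]

def holds(rows, lhs, rhs):
    """True if equal values in the lhs columns always imply equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

def fully_dependent(rows, key, attribute):
    """True if attribute depends on the whole key and on no proper subset of it."""
    if not holds(rows, key, [attribute]):
        return False
    return not any(
        holds(rows, list(subset), [attribute])
        for size in range(1, len(key))
        for subset in combinations(key, size)
    )

key = ["customer", "creditCard"]
for attribute in ["address", "employer", "interestRate", "limit"]:
    print(attribute, fully_dependent(credit_record, key, attribute))
# address False
# employer False
# interestRate False
# limit True
```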
The bike example in 2NF
Bike name*   Wheelbase
Harley       1.588
Honda        1.458

Scenario ID*   Bike name   Number of riders   Centre of mass height   Coefficient of friction   Can stoppie
1              Harley      1                  0.724                   0.9                       false
2              Harley      2                  0.775                   0.9                       false
3              Honda       1                  0.831                   0.9                       true
4              Honda       2                  0.881                   0.9                       true
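The decomposition into these two tables can be sketched as a projection followed by a join that recovers the original rows. This is an illustrative sketch; the variable names `bike`, `scenario` and `rejoined` are assumptions, not names from the lecture.

```python
# 1NF rows: bike name, riders, wheelbase, centre of mass height, friction, can stoppie.
rows_1nf = [
    ("Harley", 1, 1.588, 0.724, 0.9, False),
    ("Harley", 2, 1.588, 0.775, 0.9, False),
    ("Honda", 1, 1.458, 0.831, 0.9, True),
    ("Honda", 2, 1.458, 0.881, 0.9, True),
]

# bike(bike name*, wheelbase): wheelbase depends only on the bike name,
# so each value is stored once per bike.
bike = {}
for name, riders, wheelbase, height, friction, stoppie in rows_1nf:
    if bike.setdefault(name, wheelbase) != wheelbase:
        raise ValueError(f"conflicting wheelbase values for {name}")

# scenario(scenario ID*, ...): the compound key is replaced by a single-column key.
scenario = [
    (scenario_id, name, riders, height, friction, stoppie)
    for scenario_id, (name, riders, wheelbase, height, friction, stoppie)
    in enumerate(rows_1nf, start=1)
]

# Joining the two tables on bike name recovers the original rows.
rejoined = [
    (name, riders, bike[name], height, friction, stoppie)
    for _, name, riders, height, friction, stoppie in scenario
]

print(bike)                  # {'Harley': 1.588, 'Honda': 1.458}
print(rejoined == rows_1nf)  # True
```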
Third Normal Form (3NF)
A table is in 3NF if:
It is in 1NF
and
It is in 2NF
and
The values in each non-primary-key column depend on values in only the primary key columns
Transitive Functional Dependency
Consider a relation with attributes A, B, and C.
If B is functionally dependent on A (A → B), and C is functionally dependent on B (B → C), then C is transitively dependent on A.
A → B, B → C
If any non-key attribute is transitively dependent on the primary key, the relation is not in 3NF.
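One way to find transitive dependencies is to compute the closure of a set of attributes, i.e. everything that can be derived from it by repeatedly applying the functional dependencies. The sketch below uses the standard attribute-closure procedure; the function name `closure` and the representation of dependencies as pairs of sets are choices made for this illustration.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# A determines B, and B determines C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C']
print(sorted(closure({"B"}, fds)))  # ['B', 'C']
```

C appears in the closure of A only because B does, which is exactly the transitive dependency described above.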
The bike example in 3NF
Bike name*   Number of riders*   Centre of mass height
Harley       1                   0.724
Harley       2                   0.775
Honda        1                   0.831
Honda        2                   0.881

Road conditions*   Coefficient of friction
Icy                0.1
Wet                0.5
Dry                0.9

Scenario ID*   Bike name   Number of riders   Road conditions   Can stoppie
1              Harley      1                  Dry               false
2              Harley      2                  Dry               false
3              Honda       1                  Dry               true
4              Honda       2                  Dry               true

Bike name*   Wheelbase
Harley       1.588
Honda        1.458
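The four 3NF tables can be sketched as a schema using Python's built-in `sqlite3` module. The table and column names (`bike`, `bike_load`, `road`, `scenario` and so on) are assumptions, since the slides do not name the tables, and the boolean is stored as 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE bike (
    bike_name TEXT PRIMARY KEY,
    wheelbase REAL NOT NULL
);
CREATE TABLE bike_load (
    bike_name  TEXT NOT NULL REFERENCES bike(bike_name),
    riders     INTEGER NOT NULL,
    com_height REAL NOT NULL,
    PRIMARY KEY (bike_name, riders)
);
CREATE TABLE road (
    conditions TEXT PRIMARY KEY,
    friction   REAL NOT NULL
);
CREATE TABLE scenario (
    scenario_id INTEGER PRIMARY KEY,
    bike_name   TEXT NOT NULL,
    riders      INTEGER NOT NULL,
    conditions  TEXT NOT NULL REFERENCES road(conditions),
    can_stoppie INTEGER NOT NULL,
    FOREIGN KEY (bike_name, riders) REFERENCES bike_load(bike_name, riders)
);
""")

conn.executemany("INSERT INTO bike VALUES (?, ?)",
                 [("Harley", 1.588), ("Honda", 1.458)])
conn.executemany("INSERT INTO bike_load VALUES (?, ?, ?)",
                 [("Harley", 1, 0.724), ("Harley", 2, 0.775),
                  ("Honda", 1, 0.831), ("Honda", 2, 0.881)])
conn.executemany("INSERT INTO road VALUES (?, ?)",
                 [("Icy", 0.1), ("Wet", 0.5), ("Dry", 0.9)])
conn.executemany("INSERT INTO scenario VALUES (?, ?, ?, ?, ?)",
                 [(1, "Harley", 1, "Dry", 0), (2, "Harley", 2, "Dry", 0),
                  (3, "Honda", 1, "Dry", 1), (4, "Honda", 2, "Dry", 1)])

# Joining the tables rebuilds the wide rows of the original file.
rows = conn.execute("""
    SELECT s.scenario_id, s.bike_name, s.riders, b.wheelbase,
           l.com_height, r.friction, s.can_stoppie
    FROM scenario AS s
    JOIN bike AS b ON b.bike_name = s.bike_name
    JOIN bike_load AS l ON l.bike_name = s.bike_name AND l.riders = s.riders
    JOIN road AS r ON r.conditions = s.conditions
    ORDER BY s.scenario_id
""").fetchall()

for row in rows:
    print(row)
```

With foreign keys enabled, a scenario that refers to an unknown bike or road condition is rejected, which is one of the integrity benefits normalisation is meant to provide.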
“Codd’s Law of Normalization”
Thou shalt depend upon the key (1NF), the whole key (2NF), and nothing but the key (3NF)!
Normalization Exercise: 1NF?
memberNo   name     homeCity   hobby                          sport     sportHQ
833        Chang    Sydney     coin collecting                Netball   Lyneham
834        Jones    Canberra   drag racing, video games       AFL       Essendon
927        Wicken   Perth      video games                    AFL       Essendon
968        Aparti   Darwin     coin collecting, drag racing   Rugby     Randwick
972        Nixon    Perth      tiddlywinks                    Netball   Lyneham
member(memberNo, name, homeCity, hobby, sport, sportHQ)
Session 1, 2009
Normalization Exercise: 1NF
memberNo   name     homeCity   sport     sportHQ
833        Chang    Sydney     Netball   Lyneham
834        Jones    Canberra   AFL       Essendon
927        Wicken   Perth      AFL       Essendon
968        Aparti   Darwin     Rugby     Randwick
972        Nixon    Perth      Netball   Lyneham
member(memberNo, name, homeCity, sport, sportHQ)
memberHobby(memberNo, hobby)

memberNo   hobby
833        coin collecting
834        drag racing
834        video games
927        video games
968        coin collecting
968        drag racing
972        tiddlywinks
Both tables are in 1NF: all attributes are single valued and depend on the PK
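The step from the unnormalised member table to these two 1NF tables can be sketched as follows. The variable names are illustrative, and the sketch assumes that multiple hobbies are separated by commas, as in the exercise data.

```python
# Unnormalised rows: memberNo, name, homeCity, hobby (possibly several), sport, sportHQ.
unnormalised = [
    (833, "Chang", "Sydney", "coin collecting", "Netball", "Lyneham"),
    (834, "Jones", "Canberra", "drag racing, video games", "AFL", "Essendon"),
    (927, "Wicken", "Perth", "video games", "AFL", "Essendon"),
    (968, "Aparti", "Darwin", "coin collecting, drag racing", "Rugby", "Randwick"),
    (972, "Nixon", "Perth", "tiddlywinks", "Netball", "Lyneham"),
]

# member(memberNo, name, homeCity, sport, sportHQ): drop the multi-valued column.
member = [
    (member_no, name, city, sport, hq)
    for member_no, name, city, _hobbies, sport, hq in unnormalised
]

# memberHobby(memberNo, hobby): one row per member and hobby.
member_hobby = [
    (member_no, hobby.strip())
    for member_no, _name, _city, hobbies, _sport, _hq in unnormalised
    for hobby in hobbies.split(",")
]

for row in member_hobby:
    print(row)
```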
Normalization Exercise: 2NF?
memberNo   name     homeCity   sport     sportHQ
833        Chang    Sydney     Netball   Lyneham
834        Jones    Canberra   AFL       Essendon
927        Wicken   Perth      AFL       Essendon
968        Aparti   Darwin     Rugby     Randwick
972        Nixon    Perth      Netball   Lyneham
member(memberNo, name, homeCity, sport, sportHQ)
FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ
Do all non-key attributes depend on the whole PK?
Normalization Exercise: 2NF
member(memberNo, name, homeCity, sport, sportHQ)
FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ
Do all non-key attributes depend on the whole PK?
• PK is not composite.
• All attributes depend on the PK.
• Table is in 2NF
Normalization Exercise: 3NF?
member(memberNo, name, homeCity, sport, sportHQ)
FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ
Are there any transitive dependencies between non-key attributes?
Normalization Exercise: 3NF
memberNo   name     homeCity   sport
833        Chang    Sydney     Netball
834        Jones    Canberra   AFL
927        Wicken   Perth      AFL
968        Aparti   Darwin     Rugby
972        Nixon    Perth      Netball
member(memberNo, name, homeCity, sport)
sport(sport, sportHQ)
FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ
Decompose based on transitive dependencies. NB: maintain a link between tables

sport     sportHQ
Netball   Lyneham
AFL       Essendon
Rugby     Randwick
Normalization Exercise: 3NF
sport(sport, sportHQ)
FDs:
memberNo → name, homeCity, sport, sportHQ
sport → sportHQ
sport table:
• 1NF (all attributes single valued and dependent on PK)
• 2NF (attribute depends on the whole key)
• 3NF (no transitive dependencies between non-key attributes)

sport     sportHQ
Netball   Lyneham
AFL       Essendon
Rugby     Randwick
Normalization Exercise: 2NF and 3NF
memberHobby(memberNo, hobby)
memberNo   hobby
833        coin collecting
834        drag racing
834        video games
927        video games
968        coin collecting
968        drag racing
972        tiddlywinks
memberHobby table:
• 1NF: All attributes are single-valued
• 2NF: Table is all key. No non-key attributes so table is in 2NF
• 3NF: No non-key attributes so no transitive dependencies between them!
Summary
After today’s lecture you should be able to:
Design a normalised relational database in:
• First normal form
• Second normal form
• Third normal form