Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Normalisation - Introduction
Outline
motivation:
• database design – validation
• redundancy / update anomalies
basis: functional dependencies (FDs)
• definitions
• examples
• concepts and terminology
• semantic assumptions
• (more) advanced theoretical issues (in brief)
normal form: illustration
• definition
• example
Database Design
relational model
• how do we know whether a relational model is good or not?
• how do we know whether a relation is well designed or not?
normal forms
• a (semi-)formal way of validating a relational model, from the point of view of reducing the redundancy of data
Redundancy
Student-Modules
S_id  | S_name      | P_tutor | Module               | Val   | Res
50021 | G. Zolla    | MFU     | Database Systems     | 1cu   | A
50021 | G. Zolla    | MFU     | Software Engineering | 1cu   | null
50021 | G. Zolla    | MFU     | Advanced Programming | 1/2cu | B
50012 | D. Petrescu | MFU     | Computing            | 1cu   | null
50012 | D. Petrescu | MFU     | Mathematics          | 1/2cu | A
41002 | T.A. Flo    | MFU     | HCI                  | 1/2cu | null
50033 | D. Wise     | MH      | Database Systems     | 1cu   | null
50033 | D. Wise     | MH      | Algorithms           | 1cu   | null
a relation contains redundant data if it stores the same information more than once
a relational model may have redundancy and, at the same time, have no redundant relations
• how? give an example
redundant data may cause update anomalies and may lead to inconsistencies
normalisation deals with redundant data at the level of individual relations
7
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Update anomalies - insertion
insert the fact that 50012 takes “Networks - Introduction”
• the name of the student and the name of the personal tutor have to be entered as well
• this is prone to errors → inconsistent data
the structure of the relation does not prevent such errors from happening
can you identify other kinds of update anomalies on this relation?
S_id  | S_name      | P_tutor | Module           | Val   | Res
50021 | G. Zolla    | MFU     | Database Systems | 1cu   | A
50012 | D. Petrescu | MFU     | Computing        | 1cu   | null
50012 | D. Petrescu | MFU     | Mathematics      | 1/2cu | A
50012 | D. Pwtrwscu | MFFU    | Networks         | 1/2cu | null
50033 | D. Wise     | MH      | Algorithms       | 1cu   | null
8
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Update anomalies - deletion
delete the fact that 41002 takes “HCI” in the original table
• relevant information is deleted as well: everything about “T.A. Flo” and about “HCI”
the structure of the relation does not prevent this loss of information
S_id  | S_name      | P_tutor | Module               | Val   | Res
50021 | G. Zolla    | MFU     | Database Systems     | 1cu   | A
50021 | G. Zolla    | MFU     | Software Engineering | 1cu   | null
50021 | G. Zolla    | MFU     | Advanced Programming | 1/2cu | B
50012 | D. Petrescu | MFU     | Computing            | 1cu   | null
50012 | D. Petrescu | MFU     | Mathematics          | 1/2cu | A
41002 | T.A. Flo    | MFU     | HCI                  | 1/2cu | null
50033 | D. Wise     | MH      | Database Systems     | 1cu   | null
50033 | D. Wise     | MH      | Algorithms           | 1cu   | null
9
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Update anomalies - modification
it is possible to modify an attribute and bring the relation into an inconsistent state
• e.g. the value of “Database Systems” may be changed (by mistake) to “1/2cu” in just some rows
• such situations must be avoided
the structure of the relation does not prevent such errors from happening
S_id  | S_name   | P_tutor | Module               | Val   | Res
50021 | G. Zolla | MFU     | Database Systems     | 1/2cu | A
50021 | G. Zolla | MFU     | Software Engineering | 1cu   | null
41002 | T.A. Flo | MFU     | HCI                  | 1/2cu | null
50033 | D. Wise  | MH      | Database Systems     | 1cu   | null
50033 | D. Wise  | MH      | Algorithms           | 1cu   | null
10
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Update anomalies
update anomalies
• may lead to inconsistent data
• are caused by redundancy
normal forms
• are a “measure” of the amount of redundancy in a relation
• are defined on the basis of a simpler concept: functional dependencies
normalisation
• a way of transforming relations to eliminate redundancies
• no data should be lost/changed through normalisation
11
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Functional dependency (FD)
R - relation, X and Y - subsets of attributes of R
X → Y iff in every possible legal value of R, each X-value has a single Y-value associated with it
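The definition can be tested against one particular value of a relation. The sketch below is only an illustration: the function name `fd_holds` and the dictionary representation of rows are assumptions, not part of the lecture. Note that a single table can only refute an FD; it cannot prove that the FD holds in every legal value of R, which is a matter of semantics.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for each X-value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The Student-Modules table from the Redundancy slide (null stored as None).
COLUMNS = ("S_id", "S_name", "P_tutor", "Module", "Val", "Res")
DATA = [
    (50021, "G. Zolla", "MFU", "Database Systems", "1cu", "A"),
    (50021, "G. Zolla", "MFU", "Software Engineering", "1cu", None),
    (50021, "G. Zolla", "MFU", "Advanced Programming", "1/2cu", "B"),
    (50012, "D. Petrescu", "MFU", "Computing", "1cu", None),
    (50012, "D. Petrescu", "MFU", "Mathematics", "1/2cu", "A"),
    (41002, "T.A. Flo", "MFU", "HCI", "1/2cu", None),
    (50033, "D. Wise", "MH", "Database Systems", "1cu", None),
    (50033, "D. Wise", "MH", "Algorithms", "1cu", None),
]
rows = [dict(zip(COLUMNS, r)) for r in DATA]

print(fd_holds(rows, ["S_id"], ["S_name"]))   # not contradicted by this table
print(fd_holds(rows, ["S_id"], ["Module"]))   # contradicted: a student takes several modules

# A modification anomaly: change the value of "Database Systems" in one row only.
broken = [dict(r) for r in rows]
broken[0]["Val"] = "1/2cu"
print(fd_holds(broken, ["Module"], ["Val"]))  # the table now violates Module -> Val
```

Checking the FDs that are meant to hold is one way of detecting the inconsistent states described on the update anomaly slides.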
12
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Examples
S_id → S_name
S_id → P_tutor
S_id → S_id
(S_id, S_name) → P_tutor
(S_id, S_name, P_tutor) → P_tutor
Module → Val
(S_id, Module) → Res
(S_id, S_name, P_tutor, Module, Val) → Res
(S_id, S_name, P_tutor, Module, Val, Res)
13
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Concepts
FD is a semantic concept
• you must understand the meaning of the attributes
determinant / dependent
trivial / non-trivial
left-irreducible: can the determinant be reduced?
• yes: (S_id, S_name) → P_tutor (S_name can be dropped, since S_id → P_tutor)
• no: (S_id, Module) → Res
closure
irreducible set
14
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Semantic assumptions
FDs are “deduced” from the semantic assumptions (that define the application)
(patient, symptom, doctor, practice, diagnosis)
a patient is seen by only one doctor
• patient → doctor
a patient, for a given symptom, is seen by only one doctor
• (patient, symptom) → doctor
a doctor gives only one diagnosis for a symptom of one patient
• (patient, symptom, doctor) → diagnosis
15
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Operations with FDs
inference rules
• augmentation: if A → B then AC → BC
• transitivity: if A → B and B → C then A → C
• decomposition: if A → BC then A → B and A → C
• union: if A → B and A → C then A → BC
• composition: if A → B and C → D then AC → BD
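The rules in this list are not independent of one another. As a worked example (not taken from the slides), the union rule can be derived from augmentation and transitivity alone, treating AB as the set union of A and B, so that AA = A:

```latex
\begin{align*}
A \to B &\;\Rightarrow\; A \to AB
  && \text{(augmentation with } A \text{, since } AA = A\text{)} \\
A \to C &\;\Rightarrow\; AB \to BC
  && \text{(augmentation with } B\text{)} \\
A \to AB,\ AB \to BC &\;\Rightarrow\; A \to BC
  && \text{(transitivity)}
\end{align*}
```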
16
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Functional diagram
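[Figure: functional dependency diagram over the attributes S_id, S_name, P_tutor, City, Module and Res; the arrows are not recoverable from the extracted text]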
17
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
FDs and Keys
define a candidate key (CK) in terms of FDs
how is a FD expressed in a relation?
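One common way to answer the first question: a set of attributes K is a candidate key if K determines every attribute of the relation and no attribute can be removed from K without losing that property. The sketch below searches for such sets by brute force. The function names and the list of FDs (taken from the Examples slide) are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return every irreducible attribute set that determines all attributes."""
    keys = []
    # Try smaller sets first, so any superset of a key found earlier is skipped.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is reducible
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


ATTRIBUTES = {"S_id", "S_name", "P_tutor", "Module", "Val", "Res"}
FDS = [
    (("S_id",), ("S_name",)),
    (("S_id",), ("P_tutor",)),
    (("Module",), ("Val",)),
    (("S_id", "Module"), ("Res",)),
]

print(candidate_keys(ATTRIBUTES, FDS))
```

Under these FDs the only candidate key of Student-Modules is (S_id, Module). The search is exponential in the number of attributes, which is acceptable for relations of lecture size.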
18
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Closure
all FDs that can be derived from a given set S
• notation: S+
Armstrong’s inference rules
• for a partial set, refer to slide “Operations with FDs”
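Listing S+ explicitly is rarely practical, because it grows very quickly. A standard shortcut, sketched below, is to compute the closure of a set of attributes X under S and then test whether X → Y is in S+ by checking that Y is contained in that closure. The function names and the set S (from the Examples slide) are illustrative assumptions.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs belongs to S+, where S is the set `fds`."""
    return set(rhs) <= attribute_closure(lhs, fds)


def is_left_irreducible(fds, lhs, rhs):
    """True if lhs -> rhs is in S+ and no attribute can be dropped from lhs."""
    lhs = set(lhs)
    if not implies(fds, lhs, rhs):
        return False
    return not any(implies(fds, lhs - {a}, rhs) for a in lhs)


S = [
    (("S_id",), ("S_name",)),
    (("S_id",), ("P_tutor",)),
    (("Module",), ("Val",)),
    (("S_id", "Module"), ("Res",)),
]

# (S_id, Module) -> (Val, P_tutor) is not listed in S but can be derived from it.
print(implies(S, ("S_id", "Module"), ("Val", "P_tutor")))
# The two examples from the Concepts slide.
print(is_left_irreducible(S, ("S_id", "S_name"), ("P_tutor",)))
print(is_left_irreducible(S, ("S_id", "Module"), ("Res",)))
```

Dropping one attribute at a time is enough for the irreducibility test: if any smaller determinant worked, augmentation would make some determinant with exactly one attribute removed work as well.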
19
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Irreducible set
S1 covers S2 iff S2+ ⊆ S1+
S is irreducible iff
• the right-hand side of every FD is non-composite
• all FDs in S are left-irreducible
• no FD can be discarded from S without changing S+
a database that enforces S enforces, in fact, S+
the irreducible set of S is S’ iff
• S’ is irreducible
• S’+ = S+
it is more efficient to work with the irreducible set
20
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
1NF – First Normal Form
not based on FDs
a relation is in 1NF if and only if all the domains of its attributes contain only scalar values
the relational model can only contain relations in 1NF
21
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
2NF – Second Normal Form
a relation (with just one CK) is in 2NF if and only if it is in 1NF and there is no FD from a proper subset of the attributes of the key to a non-key attribute
22
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
2NF – Examples
not 2NF
• (S_id, S_name, S_add, M_id, M_name, M_type, M_val, Result)
• why?
2NF
• (S_id, S_name, S_add)
• (M_id, M_name, M_type, M_val)
• (S_id, M_id, Result)
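The “why?” can be explored with a small check for partial dependencies. The slide does not list the FDs of this relation, so the ones below are assumptions suggested by the decomposition itself: student attributes depend on S_id, module attributes on M_id, and Result on the whole key (S_id, M_id). The function name is also hypothetical, and the check looks only at the FDs it is given, not at everything derivable from them.

```python
def partial_dependencies(key, fds):
    """List the given FDs that go from a proper subset of `key` to non-key attributes."""
    key = set(key)
    violations = []
    for lhs, rhs in fds:
        dependents = set(rhs) - key
        # `<` on sets tests for a proper subset.
        if set(lhs) < key and dependents:
            violations.append((tuple(lhs), tuple(sorted(dependents))))
    return violations


KEY = ("S_id", "M_id")
ASSUMED_FDS = [
    (("S_id",), ("S_name", "S_add")),
    (("M_id",), ("M_name", "M_type", "M_val")),
    (("S_id", "M_id"), ("Result",)),
]

# The single wide relation: two FDs depend on only part of the key.
for lhs, rhs in partial_dependencies(KEY, ASSUMED_FDS):
    print(lhs, "->", rhs)

# The relation (S_id, M_id, Result) from the decomposition has no such FD.
print(partial_dependencies(KEY, [ASSUMED_FDS[2]]))
```

The other two relations of the decomposition have single-attribute keys (S_id and M_id respectively), so no non-empty proper subset of the key exists to depend on.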
23
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
3NF – Third Normal Form
a relation (with just one CK) is in 3NF if and only if it is in 2NF and there is no FD between non-key attributes
24
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
3NF - Examples
not 3NF
• (M_id, M_name, M_type, M_val)
• why?
3NF
• (M_id, M_name, M_type)
• (M_type, M_val)
25
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Normalisation
the process of transforming a relation with redundancies into an “equivalent” set of relations that have fewer redundancies
• equivalent – non-loss decomposition
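A decomposition is non-loss when the natural join of the projections gives back exactly the original relation. The sketch below tries this on the module relation from the 3NF example. The rows are hypothetical sample data (the M_id and M_type values are invented for illustration), and the helper names are assumptions.

```python
def project(rows, attrs):
    """Projection onto `attrs`, with duplicate rows eliminated."""
    return {tuple(sorted((a, row[a]) for a in attrs)) for row in rows}


def natural_join(left, right):
    """Natural join of two relations given as sets of (attribute, value) tuples."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            # With no common attributes this degenerates to a Cartesian product.
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


# Hypothetical rows for (M_id, M_name, M_type, M_val).
modules = [
    {"M_id": "M1", "M_name": "Database Systems", "M_type": "full", "M_val": "1cu"},
    {"M_id": "M2", "M_name": "HCI", "M_type": "half", "M_val": "1/2cu"},
    {"M_id": "M3", "M_name": "Algorithms", "M_type": "full", "M_val": "1cu"},
]
original = project(modules, ["M_id", "M_name", "M_type", "M_val"])

# The decomposition from the 3NF example: joining on M_type restores the relation.
good = natural_join(
    project(modules, ["M_id", "M_name", "M_type"]),
    project(modules, ["M_type", "M_val"]),
)
print(good == original)

# A decomposition that drops the link between the two parts loses information.
bad = natural_join(
    project(modules, ["M_id", "M_name"]),
    project(modules, ["M_type", "M_val"]),
)
print(bad == original, len(bad))
```

In the second decomposition the join produces spurious rows, so the original pairing of modules with their types can no longer be recovered.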
26
Term 2, 2004, Lecture 2, Normalisation - Introduction Marian Ursu, Department of Computing, Goldsmiths College
Conclusion
redundancy → update anomalies
normal forms – solution
functional dependencies
normal forms – simple definitions and examples