CS530 Database Architecture Models and Design
Prof. Ian Horrocks, Dr. Robert Stevens
Wednesday - Practical Tables
CS530 - Ian Horrocks and Robert Stevens, 27/09/2004

In this Section…
Topics Covered
– Functional Dependencies
– Normalisation
– SQL Data Definition and Manipulation
– SQL Query
Examples Classes

Informal guidelines
Semantics of the attributes
– easy to explain relation
– doesn’t mix concepts
Reducing the redundant values in tuples
Choosing attribute domains that are atomic
Reducing the null values in tuples
Disallowing spurious tuples

Functional Dependencies

Functional Dependency
An attribute A is functionally dependent on a set of attributes X if and only if
– the value of A is determined solely by the values of X
– the values of X uniquely determine a value of A
X → A
Example: child → mother
– The value of child implies the value of mother
– The value of mother does NOT imply the value of child, so mother → child does not hold
– child is the determinant
– mother is the dependent (determined) attribute
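The definition above can be tested against a concrete set of tuples. The following is a minimal sketch, assuming rows are held as Python dictionaries; the function name `fd_holds` and the sample data are invented for illustration. Note that an instance can only refute a dependency: agreement in one sample does not prove the dependency holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative data (invented for this sketch): two children share a mother.
family = [
    {"child": "ann", "mother": "sue"},
    {"child": "bob", "mother": "sue"},
    {"child": "cat", "mother": "eve"},
]

print(fd_holds(family, ["child"], ["mother"]))  # child -> mother: True
print(fd_holds(family, ["mother"], ["child"]))  # mother -> child: False
```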

Our case study example
[Figure: EER diagram of the case study. Entity types: STUDENT, SCHOOL, YEAR, STAFF, COURSE. Relationship types: ENROL, REG, YEARREG, TUTOR, YEARTUTOR, APPRAISAL (appraiser, appraisee), TEACH. Attributes shown: studno, name (given, family), hons, slot, labmark, exammark, courseno, subject, equip, name, roomno, year, faculty.]
STUDENT(studno, givenname, familyname, hons, tutor, slot, year)
  studno → studno, givenname, familyname, hons, tutor, slot, year
ENROL(studno, courseno, labmark, exammark)
  studno, courseno → labmark, exammark
COURSE(courseno, subject, equip)
  courseno → courseno, subject, equip
STAFF(lecturer, roomno, appraiser)
  lecturer → lecturer, roomno, appraiser
  roomno → lecturer, appraiser, roomno
YEAR(year, yeartutor)
  year → year, yeartutor
  yeartutor → year, yeartutor
SCHOOL(hons, faculty)
  hons → hons, faculty
TEACH(courseno, lecturer)
  courseno, lecturer → courseno, lecturer

More Examples of Functional Dependency
[Figure: functional dependency diagrams over the attributes part_number, part_description, quantity_in_stock and studno, courseno, labmark, name, tutor, roomno, subject.]

Use functional dependencies to … check that a relation is legal or good, e.g. keys
K is a superkey of relation R if K → R
– i.e. whenever t1[K] = t2[K] then t1[R] = t2[R]
– K functionally determines all attributes in a tuple in R
STUDENT(studno, name, hons, tutor, slot, year)
  studno → studno, name, hons, tutor, slot, year
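The superkey test K → R is usually carried out by computing the attribute closure of K. The sketch below assumes dependencies are written as (determinant, dependents) pairs of Python sets; the names `closure` and `is_superkey` are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole determinant is present, its dependents follow.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    """K is a superkey of R when K -> R, i.e. the closure of K covers R."""
    return closure(attrs, fds) >= set(relation)


STUDENT = {"studno", "name", "hons", "tutor", "slot", "year"}
F = [({"studno"}, {"name", "hons", "tutor", "slot", "year"})]

print(is_superkey({"studno"}, STUDENT, F))          # True
print(is_superkey({"studno", "name"}, STUDENT, F))  # True (not minimal)
print(is_superkey({"name"}, STUDENT, F))            # False
```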

Use functional dependencies to … check that a relation is legal or good, e.g. remove redundancy
Partial Dependency
  (studno, courseno, subject)
  studno, courseno → subject
Transitive Dependency
  (studno, yeartutor)
  studno → year
  year → yeartutor
  so, studno → yeartutor
Base functional dependencies: F
Set of logically implied functional dependencies: the closure F+

Normalisation (in Brief)

Normalisation Overview
Stops information repeating over tables
Uses Functional Dependency
Uses a number of ‘forms’
– (1NF, 2NF, 3NF, BCNF, 4NF, 5NF, DK/NF)
We shall go to 3rd
Look at Background for more.
After you’ve built 10 DBs you’ll just ‘know’
– it’ll become more craft than engineering.

UN-Normalised Data
To make it 1NF
– Remove Repeating Groups

1st NF (the Key)
To make it 2NF
– Remove Part-Key Dependencies
– Every non-primary-key attribute is fully functionally dependent on the primary key.

2nd NF (the Whole Key)
[Tables shown: Track Table, CD Table]
To make it 3NF
– No Transitive dependencies
– e.g. A → B and B → C, therefore A → C
– AH06 → Sony Music and Sony Music → UK

3rd NF (and nothing but the key)
[Tables shown: Track Table, CD Table, Company Table]

Boyce-Codd Normal Form
A relation scheme R is in BCNF if, for all functional dependencies that hold on R of the form X → Y where R ⊇ X and R ⊇ Y, at least one of the following holds
– X → Y is trivial
– X is a superkey (it contains a candidate key) for the scheme R, i.e. X → R
Every attribute must depend on the key, the whole key and nothing but the key
Other Normal Forms:
– 1NF, 2NF and 3NF ... use the primary key only
– BCNF ... generalised for candidate keys
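The BCNF condition can be checked mechanically for a listed set of dependencies. This is a sketch only: it inspects the dependencies it is given rather than every member of F+, and the helper names are invented for illustration. The STUDENT relation with the tutor and roomno dependencies serves as the example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the listed dependencies that are neither trivial nor keyed."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= relation
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


STUDENT = {"studno", "name", "tutor", "roomno"}
F = [
    ({"studno"}, {"name", "tutor"}),
    ({"tutor"}, {"roomno"}),
    ({"roomno"}, {"tutor"}),
]

# tutor -> roomno and roomno -> tutor violate BCNF: neither side is a superkey.
print(bcnf_violations(STUDENT, F))
# TUTOR(tutor, roomno) on its own has no violations.
print(bcnf_violations({"tutor", "roomno"}, F[1:]))
```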

Round Up
If column N is functionally dependent on another column M, then every value of M must uniquely define the value of N: M → N
Student(id, name, staffID, time)
  Student(jbr, Joe Brown, har, 12-13)
  Student(spl, Sam Plant, gou, 14-15)
  Student(spl, Sam Plant, har, 12-13)
Candidate dependencies: id → name, id → staffID, id → time
– name is functionally dependent on id
– staffID is NOT functionally dependent on id (spl appears with both gou and har)
– time is NOT functionally dependent on id (spl appears with both 14-15 and 12-13)
Meets(spl, har, 12-13) – meeting once
Meets(spl, har, 12-13) – meeting many

Background - Next
SQL - Slide 76

Further Notes on Normalisation

Normalisation
Given a relation R with a set of functional dependencies F, and a key K
We must identify independent attributes
1. the key identifies all the attributes but…
2. ... if an attribute only depends on part of the key, then it is independent of the rest of it.
   – The attribute is partially dependent on the key
3. ... if an attribute only depends on the key transitively, then it really depends directly on another attribute and is independent of the key.
   – The attribute is transitively dependent on the key

Use functional dependencies to … check constraints on the set of legal relations
F
  studno → name, tutor
  tutor → roomno
  roomno → tutor
  courseno → subject
  studno, courseno → labmark
F+ (examples of implied dependencies)
  studno, courseno → name   (partial)
  studno → roomno           (transitive)
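Whether a dependency belongs to F+ can be decided without enumerating F+: X → Y is implied exactly when Y lies inside the closure of X. A short sketch using the set F listed above (the function names are illustrative):

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implied(lhs, rhs, fds):
    """True when lhs -> rhs is a member of the closure F+ of fds."""
    return set(rhs) <= closure(lhs, fds)


F = [
    ({"studno"}, {"name", "tutor"}),
    ({"tutor"}, {"roomno"}),
    ({"roomno"}, {"tutor"}),
    ({"courseno"}, {"subject"}),
    ({"studno", "courseno"}, {"labmark"}),
]

print(implied({"studno", "courseno"}, {"name"}, F))  # True (partial)
print(implied({"studno"}, {"roomno"}, F))            # True (transitive)
print(implied({"tutor"}, {"name"}, F))               # False
```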

Consequences of redundancy
Wasted space
Potential performance cost
Potential inconsistency
Inability to represent data

Use functional dependencies to … check the EER model mapping correctness
[Figure: EER diagram. Reader (readerid, name) and Book (bookid, title) are linked by the m:n relationship ReturnHistory; the attributes fine and date are also shown.]
readerid → readerid
readerid → name
bookid → bookid
bookid → title
Many:many relationships that could be weak entity types because they have hidden partial keys.
ReturnHistory(readerid, bookid, date, fine)
  readerid, bookid → date ?
  readerid, bookid → fine ?

Using Functional Dependencies to ... check EER mappings
Attributes on wrong entities
[Figure: EER diagram with entity types STUDENT, COURSE and STAFF, relationship types ENROL (m:n) and TEACH (n:m), and attributes studno, name, labmark, courseno, subject, roomno, staffname, salary.]
STUDENT(studno, name, labmark)
  studno → name
  studno → labmark ?
COURSE(courseno, subject, roomno)
  courseno → subject
  courseno → roomno ?
STAFF(staffname, salary)
  staffname → salary
  where is staffname → roomno ?

Using Functional Dependencies to ... check EER mappings
Wrong cardinalities on a relationship type
[Figure: EER diagram with STUDENT (studno, name) and COURSE (courseno, subject) linked by ENROL with cardinalities n and 1.]
STUDENT(studno, name)
  studno → name
COURSE(courseno, subject, studno)
  courseno → subject
  courseno → studno ?

Using Functional Dependencies to ... check EER mappings
Missing 1:many relationship type and entity type, or missing multi-valued attribute
[Figure: entity type COURSE with attributes courseno, subject, lecturer, roomno.]
COURSE(courseno, subject, lecturer, roomno)
  courseno → subject
  courseno → lecturer ?
  courseno → roomno
  lecturer → roomno

Functional Dependencies are hidden in EER Model
[Figure: EER diagram with entity types STUDENT, STAFF and COURSE, relationship types ENROL (n:m) and TUTOR (1:m), and attributes studno, name, slot, labmark, courseno, subject, name, roomno.]

Using the EER Model and Functional Dependencies
1. Draw EER model
2. Map EER schema to relational schema
3. For every relation
   – List the functional dependencies
   – What determines every attribute?
   – Check that every relation is in BCNF: does the key really solely and uniquely identify each attribute?
   – If it’s not in BCNF then why? Fix the problem: normalise and/or trace back to EER model
4. Are there any functional dependencies missing?
5. Optimise the relational schema

Database design
Extended Entity Relationship
– Top Down
– Conceptual/Abstract View
Functional Dependencies
– Bottom Up
– Implementation View
– The Determinancy Approach
– Synthesise relations
1. List all attributes
2. Consider the relationships between them
   – those which determine the values of others are entities
   – those whose values are determined by other items are attributes.

Use functional dependencies to… Synthesise relations
studno → studno; studno → givenname; studno → familyname; studno → hons; studno → tutor; studno → slot; studno → year
  STUDENT(studno, givenname, familyname, hons, tutor, slot, year)
studno, courseno → labmark; studno, courseno → exammark
  ENROL(studno, courseno, labmark, exammark)
courseno → courseno; courseno → subject; courseno → equip
  COURSE(courseno, subject, equip)
lecturer → lecturer; lecturer → roomno; lecturer → appraiser; roomno → lecturer; roomno → roomno; roomno → appraiser
  STAFF(lecturer, roomno, appraiser)
year → year; year → yeartutor; yeartutor → year; yeartutor → yeartutor
  YEAR(year, yeartutor)
hons → hons; hons → faculty
  SCHOOL(hons, faculty)

er.…
courseno, lecturer → courseno, lecturer
  TEACH(courseno, lecturer)
courseno, lecturer → num_of_lectures
  TEACH(courseno, lecturer, num_of_lectures)

Complementary Approaches
Disadvantages of EER Top Down
1. Not all entity types are represented by nouns or noun-phrases
   – association entity types
2. Not all nouns and noun-phrases correspond to entities
   – single attribute entities
Disadvantages of determinancy bottom-up
1. Long-winded
2. Hides overall picture of data model

The Steps of Normalisation
Take one dependency at a time
Treat each relation separately and independently
Iterative process

Use functional dependencies to…
Systematically create legal relations
Derive relations which avoid anomalies in
– Insertion
– Deletion
– Modification
– Accessing
Ensure single valued-ness of facts represented in attributes in keyed relations
Ensure the removal of redundancy in a relation
NORMALISE relations

Normalisation
Given
– a universal relation that is unnormalised
– a set of functional dependencies on the attributes in the relation
produce a set of relations where each relation is normalised for the functional dependencies on the attributes in the relation
Three approaches:
1. Relational synthesis
2. Step-wise normalisation
3. Using BCNF decomposition

The Process of Normalisation
Usually four steps giving rise to
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
At each step we consider relationships between the functional dependencies of a relation’s attributes
Normalisation is a:
– framework
– series of tests
UNNORMALISED ENTITY
  step 1: remove repeating groups
1st NORMAL FORM
  step 2: remove partial dependencies
2nd NORMAL FORM
  step 3: remove transitive dependencies
3rd NORMAL FORM / Boyce-Codd Normal Form
  step 4: remove multi-valued dependencies
4th NORMAL FORM

First Normal Form
Attributes form Repeating Groups
– When a group of attributes has multiple values then we say there is a repeating group of attributes in the relation
A relation is in 1NF if there are no repeating groups of attribute types
Any un-normalised relation is transformed to 1NF
– Remove all repeating attribute groups
– Repeating attribute groups become new relations in their own right
– The key of the original relation must be an attribute (but not necessarily a key) of the derived relation.

First Normal Form : Repeating Groups
STUDENT_DETAILS(studno, name, tutor, roomno, {courseno, labmark, subject})
  studno → name, tutor
  tutor → roomno, roomno → tutor
  courseno → subject
  studno, courseno → labmark
is split into
STUDENT(studno, name, tutor, roomno)
  studno → name, tutor
  tutor → roomno, roomno → tutor
ENROL(studno, courseno, subject, labmark)
  courseno → subject
  studno, courseno → labmark

Benefits from First Normal Form
Any ‘hidden’ relations (entities) are identified
Process results in separation of different objects
BUT anomalies may still exist
ENROL(studno, courseno, subject, labmark)
– subject appears on every enrolment occurrence.
– This may result in anomalies when updating or deleting tuples
– The problem in the example is that subject is functionally dependent only on courseno, which is only part of the key

Second Normal Form
A relation is in 2NF if it is in 1NF and each non-identifying attribute depends upon the whole key (identifier)
Any relation in 1NF is transformed to 2NF
– Identify functional dependencies
– Re-write relations so that each non-identifying attribute is functionally dependent on the whole of the key
– Decompose ENROL into two relations
ENROL(studno, courseno, subject, labmark)
  courseno → subject
  studno, courseno → labmark
is decomposed into
ENROL’(studno, courseno, labmark)
  studno, courseno → labmark
COURSE(courseno, subject)
  courseno → subject

Second Normal Form
STUDENT(studno, name, tutor, roomno)
  studno → name, tutor
  tutor → roomno
  roomno → tutor
ENROL’(studno, courseno, labmark)
  studno, courseno → labmark
COURSE(courseno, subject)
  courseno → subject

Third Normal Form
A relation is in 3NF if it is in 2NF and all non-identifying attributes are independent
Any relation in 2NF is transformed to 3NF
– Determine functional dependencies between non-identifying attributes
– Decompose relation into new relations
STUDENT(studno, name, tutor, roomno)
  studno → name, tutor
  tutor → roomno
  roomno → tutor
is decomposed into
STUDENT(studno, name, tutor)
  studno → name, tutor
TUTOR(tutor, roomno)
  tutor → roomno
  roomno → tutor

Student Relational Schema in 3NF
STUDENT(studno, name, tutor)
  studno → name, tutor
TUTOR(tutor, roomno)
  tutor → roomno
  roomno → tutor
ENROL(studno, courseno, labmark)
  studno, courseno → labmark
COURSE(courseno, subject)
  courseno → subject

Decomposition: Lossless or Non-additive Join
R is a relational scheme, F is a set of functional dependencies on R. R1 and R2 form a decomposition of R.
The decomposition of R is non-additive if at least one of the following functional dependencies is in F+
– R1 ∩ R2 → R1
– R1 ∩ R2 → R2
In general, a decomposition {R1, ..., Rm} of R is non-additive if for every state r of R that satisfies F
  ⋈ (π<R1>(r), ..., π<Rm>(r)) = r
where ⋈ is the natural join
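The two-relation test can be sketched directly from the closure of the shared attributes. The example uses the ENROL relation with courseno → subject and studno, courseno → labmark; the second, deliberately poor split is invented here to show a failing case.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is in F+."""
    shared = closure(r1 & r2, fds)
    return r1 <= shared or r2 <= shared


F = [
    ({"courseno"}, {"subject"}),
    ({"studno", "courseno"}, {"labmark"}),
]

enrol2 = {"studno", "courseno", "labmark"}
course = {"courseno", "subject"}
print(lossless_binary(enrol2, course, F))  # True: courseno -> COURSE

# A hypothetical split that shares only subject, which determines nothing.
print(lossless_binary({"studno", "subject"},
                      {"courseno", "subject", "labmark"}, F))  # False
```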

Decomposition: Lossless or Non-additive Join
ENROL(studno, courseno, subject, labmark)
  courseno → subject
  studno, courseno → labmark
ENROL’(studno, courseno, labmark)
  studno, courseno → labmark
COURSE(courseno, subject)
  courseno → subject
ENROL’ ∩ COURSE = courseno
courseno → subject
(courseno, subject) = COURSE

Lossless or Non-additive Join
STUDENT1 ⋈ (tutor = tutor) TUTORS = STUDENT
STUDENT: studno → name; studno → tutor; tutor → roomno; roomno → tutor
STUDENT1: studno → name; studno → tutor
TUTORS: tutor → roomno; roomno → tutor

Spurious Tuples
Lossless or Non-additive Join
[Tables shown: TEACH, TEACH’, LECTURES]

Decomposition
Algorithm: Decomposition D, relation R
set D := { R };
while there is a relation schema Q in D that is not in BCNF do
begin
  choose a relation schema Q in D that is not in BCNF;
  find a functional dependency X → Y in Q that violates BCNF;
    – violation means that (X)+ fails to find all of Q, so X can’t be a key.
  replace Q in D by two schemas
    R1 ((Q - (Y)+) ∪ X)
      – leave a copy of X in the relation to be the foreign key for R2
    and
    R2 (X ∪ (Y)+)
      – new relation for the functional dependency and its closure; X will be the primary key
end;

Lossless or Non-additive Join
[Figure: relation (X, Y, Z) split into (X, Y) and (X, Z); X is retained as the foreign key.]

Decomposition: Dependency Preservation
When an update is made to a database, we should be able to check that the update satisfies all functional dependencies.
It is desirable to design relational database schemes that allow
– update validation without the computation of joins
– independent manipulation of relations.

Dependency Preservation
The union of dependencies that hold on the individual relations in decomposition D must be equivalent to F.
Given F on R, the projection πF(Ri), where Ri ⊆ R, is the set of dependencies X → Y in F+ such that the attributes in X ∪ Y are all contained in Ri
Decomposition D = {R1, R2, ..., Rm} of R is dependency preserving w.r.t. F if
  (πF(R1) ∪ ... ∪ πF(Rm))+ = F+
The restriction Fi of the functional dependencies to a relation Ri is the set of fds that involve only attributes of that relation
  F1 ∪ ... ∪ Fn ≠ F is possible, but we need (F1 ∪ ... ∪ Fn)+ = F+

Dependency Preservation
STUDENT(studno, name, tutor, roomno, appraiser)
  studno → name, tutor
  tutor → roomno, appraiser
  roomno → tutor, appraiser
STUDENT1(studno, name, tutor)
  studno → name, tutor
TUTOR(studno, roomno, appraiser)
  studno → roomno, appraiser
This is in Boyce-Codd Normal Form and is a lossless (non-additive) join decomposition but we have lost....
  tutor → roomno, appraiser
  roomno → tutor, appraiser

Dependency Preservation
STUDENT’ ⋈ TUTOR = STUDENT
STUDENT: studno → name; studno → tutor; tutor → roomno; tutor → appraiser; roomno → tutor; roomno → appraiser; studno → appraiser; studno → roomno
STUDENT’: studno → name; studno → tutor
TUTOR: studno → appraiser; studno → roomno
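Dependency preservation can be checked without materialising the projected dependency sets. The sketch below uses the standard textbook test: for each X → Y in F, grow X using only what each Ri can enforce locally, then see whether Y was reached. The function names are illustrative, and the second decomposition (splitting on tutor instead of studno) is added here for contrast.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(decomposition, fds):
    """True if every dependency in fds is enforceable within the relations."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                # Attributes of Ri determined by what we hold inside Ri.
                gained = closure(reached & ri, fds) & ri
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not rhs <= reached:
            return False
    return True


F = [
    ({"studno"}, {"name", "tutor"}),
    ({"tutor"}, {"roomno", "appraiser"}),
    ({"roomno"}, {"tutor", "appraiser"}),
]

lossy = [{"studno", "name", "tutor"}, {"studno", "roomno", "appraiser"}]
kept = [{"studno", "name", "tutor"}, {"tutor", "roomno", "appraiser"}]

print(preserves(lossy, F))  # False: tutor -> roomno, appraiser is lost
print(preserves(kept, F))   # True
```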

Designing a relational schema
Build a relational database
– without redundancy: normalisation
– without loss of information or gain of data: lossless join decomposition
– without losing dependency integrity: dependency preservation

Multi-valued Dependencies and Fourth Normal Form

Multi-valued Dependencies
a course has many lecturers
a course has many texts
lecturers and texts are independent
a lecturer teaches many courses
a text is used by many courses
lecturer and text are independent sets
– for each courseno there is an associated set of lecturers
– for each courseno there is an associated set of texts
– the sets are independent.

Multi-valued Dependencies
courseno →→ lecturer
courseno →→ text
This is in BCNF
– key is {courseno, lecturer, text}
– courseno, lecturer, text → courseno, lecturer, text
– trivial dependencies

Multi-valued Dependencies
Each TEXT is associated with all the LECTURERS that teach a COURSE
The attribute TEXT contains redundant values.
If TEXT were deleted from rows 1, 2 & 3 the values could be deduced from rows 4, 5 & 6

Multivalued Dependencies
courseno →→ lecturer
courseno →→ text
if (c,l,t) and (c,l’,t’) appear then (c,l,t’) and (c,l’,t) appear also
tuple (c,l,t) appears if c can be taught by l using text t
for each course all possible combinations of lecturer and text appear
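The "all possible combinations" condition can be checked on an instance by grouping tuples on X and comparing each group with the cross product of its Y and Z values. A sketch, with rows held as dictionaries and lecturer and text values invented for illustration:

```python
from collections import defaultdict


def mvd_holds(rows, x, y, z):
    """True if X ->> Y holds in rows, where z lists the remaining attributes."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in x)
        groups[key].add((tuple(row[a] for a in y), tuple(row[a] for a in z)))
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        z_values = {p[1] for p in pairs}
        # Every Y value must be paired with every Z value.
        if len(pairs) != len(y_values) * len(z_values):
            return False
    return True


# Invented sample: cs250 has two lecturers and two texts.
rows = [
    {"courseno": "cs250", "lecturer": "smith", "text": "textA"},
    {"courseno": "cs250", "lecturer": "smith", "text": "textB"},
    {"courseno": "cs250", "lecturer": "jones", "text": "textA"},
    {"courseno": "cs250", "lecturer": "jones", "text": "textB"},
]

print(mvd_holds(rows, ["courseno"], ["lecturer"], ["text"]))       # True
print(mvd_holds(rows[:3], ["courseno"], ["lecturer"], ["text"]))   # False
```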

Multi-Valued Dependencies
Whenever X →→ Y holds in R, so does X →→ (R - (XY)).
An MVD is trivial if Y ⊆ X or X ∪ Y = R, i.e. the two attribute sets form the whole relation
Non-trivial MV dependencies need at least 3 attributes.

Fourth Normal Form
A relation R is in 4NF if it is in 3NF and there are no multi-valued dependencies between its attribute types
A relation R is in 4NF iff whenever there exists a non-trivial multi-valued dependency X →→ Y in F+ for R, X is a superkey for R, i.e. all attributes are functionally dependent on X.
Any relation in 3NF is transformed to 4NF
– Detect any multi-valued dependencies
– Decompose relation

Fourth Normal Form
courseno →→ lecturer
courseno →→ text
trivial dependencies only

Lossless join decomposition into 4NF
Algorithm: Decomposition D, relation R
1. set D := { R };
2. while there is a relation schema Q in D that is not in 4NF do
   begin
     choose a relation schema Q in D that is not in 4NF;
     find a non-trivial MVD X →→ Y in Q that violates 4NF;
     replace Q in D by two schemas (Q - Y) and (X ∪ Y)
   end;
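One step of the 4NF split can be illustrated on an instance: project onto (X ∪ Y) and (Q - Y), then confirm that the natural join gives back the original tuples. This is a sketch with invented sample data; `project` and `natural_join` are small helper functions written for the illustration, not library calls.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple becomes a frozenset."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two projected relations on their shared attributes."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            shared = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in shared):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


# Invented sample in which courseno ->> lecturer and courseno ->> text hold.
rows = [
    {"courseno": "cs250", "lecturer": "smith", "text": "textA"},
    {"courseno": "cs250", "lecturer": "smith", "text": "textB"},
    {"courseno": "cs250", "lecturer": "jones", "text": "textA"},
    {"courseno": "cs250", "lecturer": "jones", "text": "textB"},
    {"courseno": "cs310", "lecturer": "woods", "text": "textC"},
]

whole = project(rows, ["courseno", "lecturer", "text"])
teaches = project(rows, ["courseno", "lecturer"])  # X ∪ Y
uses = project(rows, ["courseno", "text"])         # Q - Y

print(len(whole), len(teaches), len(uses))    # 5 3 3
print(natural_join(teaches, uses) == whole)   # True: no spurious tuples
```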

Fourth Normal Form: EER modelling
Leads to correctly normalised relational schema
[Figure: EER diagram. COURSE (courseno) is linked to STAFF (name) by the m:n relationship teaches, and to TEXT (texttitle) by the m:n relationship recommendation.]

Fourth Normal Form: EER modelling
Leads to relational schema that is not in 4NF
[Figure: EER diagram. COURSE (courseno), STAFF (name) and TEXT (texttitle) are linked by a single ternary relationship Course-Staff-Text with cardinalities m, n and p.]

Conclusions
Data Normalisation is a technique that ensures the basic properties of the relational model
– no duplicate tuples
– no nested relations
Data normalisation is sometimes used as the only technique for database design (the implementation view)
A more appropriate approach is to complement conceptual modelling with data normalisation

Lossless or Non-additive Join
Algorithm: Decomposition D, relation R
1. set D := {R};
2. while there is a relation schema Q in D that is not in BCNF do
   begin
     choose a relation schema Q in D that is not in BCNF;
     find a functional dependency X → Y in Q that violates BCNF;
     replace Q in D by two schemas
       R1 (Q - Y): leave a copy of X in the relation to be the foreign key for R2
       and R2 (X ∪ Y): new relation for the functional dependency and its closure; X will be the primary key
   end;

Example

Normalisation Example
BEER_DATABASE
Additional Notes:
– Warehouses are shared by breweries.
– Each beer is unique to the brewer.
– Each brewery is based in a city.

Minimal Sets of Functional Dependencies
A set of functional dependencies F is minimal if:
1. Every dependency in F has a single determined attribute A
2. We cannot remove any dependency from F and still have a set of dependencies equivalent to F
3. We cannot replace any dependency X → A in F with a dependency Y → A, where Y ⊂ X, and still have a set of dependencies that is equivalent to F
I.e. a canonical form with no redundancies
(beer, brewery, strength, city, region, warehouse, quantity)
  beer → brewery
  beer → strength
  brewery → city
  city → region
  beer, warehouse → quantity

Relational Synthesis Algorithm into 3NF:
(beer, brewery, strength, city, region, {warehouse, quantity})
set D := { R };   (P. 426, P. 431)
1. Find a minimal cover G for F
2. For each determinant X of a functional dependency that appears in G, create a relation schema {X ∪ A1 ∪ A2 ∪ … ∪ Am} in D, where X → A1, X → A2, …, X → Am are the only dependencies in G with X as the determinant;
3. Place any remaining (unplaced) attributes in a single relation to ensure the attribute preservation property, so we don’t lose anything.
4. If none of the relations contains a key of R, create one more relation that contains attributes that form a key for R.
beer → brewery; beer → strength      (beer, brewery, strength)
brewery → city                       (brewery, city)
city → region                        (city, region)
beer, warehouse → quantity           (beer, warehouse, quantity)
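Steps 2 to 4 of the synthesis can be sketched as follows. The sketch assumes step 1 has already been done, i.e. that the dependencies passed in are a minimal cover (the five beer dependencies are); the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_key(relation, fds):
    """Shrink the full attribute set to one (not necessarily unique) key."""
    key = set(relation)
    for attr in sorted(relation):
        if closure(key - {attr}, fds) >= set(relation):
            key.discard(attr)
    return key


def synthesise_3nf(relation, cover):
    """Steps 2-4 of relational synthesis; cover is assumed to be minimal."""
    grouped = {}
    for lhs, rhs in cover:                      # step 2: group by determinant
        grouped.setdefault(frozenset(lhs), set()).update(rhs)
    schemas = [set(lhs) | rhs for lhs, rhs in grouped.items()]
    placed = set().union(*schemas) if schemas else set()
    leftover = set(relation) - placed           # step 3: unplaced attributes
    if leftover:
        schemas.append(leftover)
    if not any(closure(s, cover) >= set(relation) for s in schemas):
        schemas.append(find_key(relation, cover))  # step 4: add a key relation
    return schemas


R = {"beer", "brewery", "strength", "city", "region", "warehouse", "quantity"}
G = [
    ({"beer"}, {"brewery"}),
    ({"beer"}, {"strength"}),
    ({"brewery"}, {"city"}),
    ({"city"}, {"region"}),
    ({"beer", "warehouse"}, {"quantity"}),
]

for schema in synthesise_3nf(R, G):
    print(sorted(schema))
```

Because (beer, warehouse, quantity) already contains a key of the universal relation, step 4 adds nothing in this example.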

Step-wise normalisation:
(beer, brewery, strength, city, region, {warehouse, quantity})
  beer → brewery, strength        partial dependency
  brewery → city                  transitive dependency
  city → region                   transitive dependency
  beer, warehouse → quantity      repeating group
1NF: remove repeating group
(beer, brewery, strength, city, region, {warehouse, quantity}) becomes
(beer, warehouse, quantity)
  beer, warehouse → quantity
(beer, brewery, strength, city, region)
  beer → brewery, strength
  brewery → city                  transitive dependency
  city → region                   transitive dependency

(beer, brewery, strength, city, region)
  beer → brewery, strength
  brewery → city                  transitive dependency
  city → region                   transitive dependency
2NF: no partial dependencies
3NF/BCNF: no transitive dependencies
Take the most indirect transitive dependencies first
(beer, brewery, strength, city, region) splits into
(city, region)
  city → region
(beer, brewery, strength, city)
  beer → brewery, strength
  brewery → city
which in turn splits into
(brewery, city)
  brewery → city
(beer, brewery, strength)
  beer → brewery, strength

Using BCNF decomposition algorithm:
(beer, brewery, strength, city, region, warehouse, quantity)
  beer → brewery, strength        partial dependency
  brewery → city                  transitive dependency
  city → region                   transitive dependency
  beer, warehouse → quantity
Directly to BCNF: take a violating dependency and form a relation from it.
First choose a direct transitive dependency and its closure
(beer, brewery, strength, city, region, warehouse, quantity) is split on brewery → city into
(brewery, city, region)
  brewery → city
  city → region                   transitive dependency
(beer, brewery, strength, warehouse, quantity)
  beer → brewery, strength        partial dependency
  beer, warehouse → quantity

Using BCNF decomposition algorithm:
(beer, brewery, strength, city, region, warehouse, quantity)
  beer → brewery, strength        partial dependency
  brewery → city                  transitive dependency
  city → region                   transitive dependency
  beer, warehouse → quantity
Take a violating dependency and form a relation from it.
First the partial dependency and its closure
(beer, brewery, strength, city, region, warehouse, quantity) is split on beer → brewery, strength into
(beer, brewery, strength, city, region)
  beer → brewery, strength
  brewery → city                  transitive dependency
  city → region                   transitive dependency
  normalise as before...
(beer, warehouse, quantity)
  beer, warehouse → quantity
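The decomposition can be automated for small schemas. The sketch below is a common variant rather than a transcription of the slide algorithm: it searches attribute subsets for a determinant X whose closure inside Q is neither X itself nor all of Q, then splits Q into (X with everything it determines) and (the rest plus X). The search is exponential in the number of attributes, which is acceptable only for examples of this size. All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(q, fds):
    """Return (X, X+ within Q) for some BCNF-violating X, or None."""
    for size in range(1, len(q)):
        for combo in combinations(sorted(q), size):
            x = set(combo)
            x_plus = closure(x, fds) & q
            if x_plus != x and x_plus != q:
                return x, x_plus
    return None


def bcnf_decompose(relation, fds):
    """Split relation repeatedly until no piece has a BCNF violation."""
    pending = [set(relation)]
    done = []
    while pending:
        q = pending.pop()
        found = find_violation(q, fds)
        if found is None:
            done.append(q)
        else:
            x, x_plus = found
            pending.append(x_plus)            # R2: X and what it determines
            pending.append((q - x_plus) | x)  # R1: the rest, X kept as foreign key
    return done


R = {"beer", "brewery", "strength", "city", "region", "warehouse", "quantity"}
F = [
    ({"beer"}, {"brewery", "strength"}),
    ({"brewery"}, {"city"}),
    ({"city"}, {"region"}),
    ({"beer", "warehouse"}, {"quantity"}),
]

for schema in bcnf_decompose(R, F):
    print(sorted(schema))
```

On the beer example this arrives at the same four relations as the step-wise route: (beer, warehouse, quantity), (beer, brewery, strength), (brewery, city) and (city, region).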

Keys and Indexes, Data Definition, Relational Manipulation and Data Control Using SQL

Keys
SuperKey
– a set of attributes whose values together uniquely identify a tuple in a relation
Candidate Key
– a superkey for which no proper subset is a superkey… a key that is minimal.
– Can be more than one for a relation
Primary Key
– a candidate key chosen to be the main key for the relation.
– One for each relation
Keys can be composite

e.g.: Staff(lecturer, roomno, appraiser)
SK = {lecturer, roomno, appraiser}, {lecturer, roomno}, {lecturer, appraiser}, {roomno, appraiser}, {lecturer} and {roomno}
CK = {lecturer} and {roomno}
PK = {lecturer}
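The superkeys and candidate keys of the Staff example can be enumerated by brute force from its dependencies (lecturer → roomno, appraiser and roomno → lecturer, appraiser). This is a sketch for small relations only, since it examines every attribute subset; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def superkeys(relation, fds):
    """Every attribute subset whose closure covers the relation."""
    attrs = sorted(relation)
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if closure(combo, fds) >= set(relation):
                found.append(frozenset(combo))
    return found


def candidate_keys(relation, fds):
    """Superkeys with no proper subset that is also a superkey."""
    sks = superkeys(relation, fds)
    return [k for k in sks if not any(other < k for other in sks)]


STAFF = {"lecturer", "roomno", "appraiser"}
F = [
    ({"lecturer"}, {"roomno", "appraiser"}),
    ({"roomno"}, {"lecturer", "appraiser"}),
]

print(len(superkeys(STAFF, F)))                       # 6
print([sorted(k) for k in candidate_keys(STAFF, F)])  # [['lecturer'], ['roomno']]
```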

Foreign Key
a (set of) attribute(s) in a relation that exactly matches a (primary) key in another relation
– the names of the attributes don’t have to be the same but must be of the same domain
– a foreign key in a relation A matching a primary key in a relation B represents a many:one relationship between A and B
Student(studno, name, tutor, year)
Staff(lecturer, roomno, appraiser)

Data Definition and Manipulation

Languages of DBMS
Data Definition Language (DDL)
– define the logical schema (relations, views etc.) and storage schema, stored in a Data Dictionary
Data Manipulation Language (DML)
– Manipulative: populate schema, update database
– Retrieval: querying content of a database
Data Control Language (DCL)
– permissions, access control etc...

Data Definition: Creating tables
create table accountants as
(select studno, name, tutor, year
from student
where hons = 'ca');
Can specify column names, default values and integrity constraints (except referential)
Datatypes and lengths derived from query
Not null constraints passed on from query tables
Defining a Relation
create table student
(studentno number(8) primary key,
givenname char(20),
surname char(20),
hons char(3) check (hons in ('cis','cs','ca','pc','cm','mcs')),
tutorid number(4),
yearno number(1) not null,
constraint year_fk
foreign key (yearno) references year(yearno),
constraint super_fk
foreign key (tutorid) references staff(staffid));

Data Definition: Create Table
create table enrol
(studno number(8),
courseno char(5),
primary key (studno, courseno),
cluster (studno),
labmark number(3) check (labmark between 0 and 100),
exammark number(3) check (exammark between 0 and 100),
constraint stud_fk foreign key (studno) references student,
constraint course_fk foreign key (courseno) references course);
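The effect of these constraints can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for the Oracle-style dialect used in these notes, so the column types are simplified, the cluster clause is dropped, and foreign key enforcement has to be switched on explicitly. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
create table student (studno integer primary key, name text);
create table course (courseno text primary key, subject text);
create table enrol (
    studno integer references student(studno),
    courseno text references course(courseno),
    labmark integer check (labmark between 0 and 100),
    exammark integer check (exammark between 0 and 100),
    primary key (studno, courseno));
insert into student values (1, 'jones');
insert into student values (2, 'brown');
insert into course values ('cs250', 'databases');
""")


def try_insert(row):
    """Attempt one insert into enrol and report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("insert into enrol values (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


print(try_insert((1, "cs250", 55, 60)))   # ok
print(try_insert((1, "cs250", 70, 70)))   # rejected: duplicate primary key
print(try_insert((2, "cs250", 150, 60)))  # rejected: labmark fails the check
print(try_insert((9, "cs250", 50, 50)))   # rejected: no student 9
```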

Data Definition: Altering Relations
alter table student
add (address char(20) default null);
alter table student
modify (name not null);
– this won’t work if there are any nulls in the name column

Data Manipulation: Insert Operator
insert (cs310, elec, sun) into course;
Course
General form: insert into table [(columns)] values (value-list) | query
insert into course (courseno, subject, equip)
values ('cs310', 'elec', 'sun');
insert into course
values ('cs310', 'elec', NULL);

Inserting Tuples into a Relation
insert into weak_students
(studno,name,courseno,exammark)
select s.studno,name,courseno,exammark
from enrol, student s
where exammark <= 40 and
enrol.studno = s.studno;
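An insert driven by a query can be exercised end to end with Python's sqlite3 module. This is a sketch assuming SQLite in place of the dialect used in these notes; the table contents are invented, and weak_students is created here only so the statement has somewhere to write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (studno integer primary key, name text);
create table enrol (studno integer, courseno text, exammark integer);
create table weak_students
    (studno integer, name text, courseno text, exammark integer);

insert into student values (1, 'jones'), (2, 'brown');
insert into enrol values (1, 'cs250', 35), (1, 'cs260', 70), (2, 'cs250', 40);

-- Populate weak_students from a query over enrol and student.
insert into weak_students (studno, name, courseno, exammark)
    select s.studno, name, courseno, exammark
    from enrol, student s
    where exammark <= 40 and enrol.studno = s.studno;
""")

weak = conn.execute(
    "select studno, name, courseno, exammark from weak_students order by studno"
).fetchall()
print(weak)  # [(1, 'jones', 'cs250', 35), (2, 'brown', 'cs250', 40)]
```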

Insertion Anomalies
An insert operation might violate the uniqueness and minimality properties of the primary key, or the referential integrity constraint
insert (cs250, databases, sun) into course
Insertion anomalies can be corrected by
– rejecting the insertion
– correcting the reason for rejecting the update

Data Manipulation: Update Operator
Modifies a tuple or tuples of a relation
Doesn’t violate constraints as long as the modified attributes are not primary keys or foreign keys
Update of a primary key corresponds to a deletion followed by an insertion
Update of a foreign key attribute is legal only if the new value corresponds to an existing tuple in the referenced relation or is null
General form: update table set column = expression [where search-condition]
update enrol
set labmark = labmark * 1.1
where courseno = 'cs250';
CS530 - Ian Horrocks and Robert Stevens 27/09/200490
Data Manipulation: Delete Operator
Deletes a tuple or a set of tuples from a relation Might violate the referential integrity constraint Anomalies can be overcome by
– rejecting the deletion– cascading the deletion (delete tuples that reference deleted
tuple)– modifying the referencing attribute values
delete from table [where search-condition]
delete from course where equip = ‘pc’;
delete from student where year = ‘3’ and(hons != ‘mi’ or hons <> ‘ si’ );
Delete Operator
delete from student
where studno in
(select s.studno
from enrol e, teach t, student s
where t.lecturer = 'woods'
and t.courseno = e.courseno
and e.studno = s.studno);
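Two of the ways of handling deletion anomalies (rejecting and cascading) can be sketched as follows. The foreign key declarations and sample tuples are assumptions: TEACH is declared with on delete cascade, while ENROL uses the default action, which rejects the deletion. The third option, modifying the referencing values, would correspond to on delete set null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite checks foreign keys only on request
conn.executescript("""
create table course (courseno text primary key, equip text);
create table teach (courseno text references course(courseno) on delete cascade,
                    lecturer text);
create table enrol (studno text, courseno text references course(courseno));
insert into course values ('cs250', 'sun'), ('cs150', 'pc');
insert into teach values ('cs150', 'woods');
insert into enrol values ('s1', 'cs250');
""")

# Cascading: deleting cs150 also deletes the TEACH tuple that references it.
conn.execute("delete from course where equip = 'pc'")
teach_left = conn.execute("select count(*) from teach").fetchone()[0]

# Rejecting: cs250 is still referenced from ENROL, so the delete is refused.
try:
    conn.execute("delete from course where courseno = 'cs250'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

courses_left = [row[0] for row in conn.execute("select courseno from course")]
print(teach_left, rejected, courses_left)
```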
Data Control: Data Sharing and Security
Permissions, access control etc...
create view myyear as select * from student where year in
(select year from student where name = user)
with check option
Data Control: Data Sharing and Security
grant privilege, privilege2… | all
on table | view
to userID | roleID
grant select on student to bloggsf;
Grant can be attached to any combination of select, insert, update, delete, alter
Restricting access to parts of a table can be effected by using the view and grant commands
Privileges can be withdrawn with the revoke command
Synonyms for Objects
select name from CAROLE.student;
create [public] synonym synonym_name for table | view;
create synonym student for CAROLE.student;
drop synonym mystudent;
The Role of the Data Dictionary
A set of tables and views to be used by the RDBMS as a reference guide to the data stored in the database files
Every user retrieves data from views stored in the Data Dictionary
The Data Dictionary stores:
– user names of those permitted to access the database
– names of tables, space definitions, views, indexes, clusters, synonyms etc
– rights and privileges that have been granted
Examples Class
Relational Query Languages
Query Operators
Relational Algebra
– tuple (unary): Selection, Projection
– set (binary): Union, Intersection, Difference
– tuple (binary): Join, Division
Additional Operators
– Outer Join, Outer Union
A Retrieval DML Must Express
Attributes required in a result
– target list
Criteria for selecting tuples for that result
– qualifier
The relations that take part in the query
– set generators
Independent of the instances in the database
Expressions are in terms of the database schema
Relational Algebra
SQL Retrieval Statement
SELECT [all|distinct]
{*|{table.*|expr [alias]|view.*} [,{table.*|expr [alias]}] ...}
FROM table [alias] [,table [alias]] ...
[WHERE condition]
[CONNECT BY condition [START WITH condition]]
[GROUP BY expr [,expr] ...] [HAVING condition]
[{UNION|UNION ALL|INTERSECT|MINUS} SELECT ...]
[ORDER BY {expr|position} [ASC|DESC] [,{expr|position} [ASC|DESC]] ...]
[FOR UPDATE OF column [,column] ... [NOWAIT]]
π Project Operator
selects a subset of the attributes of a relation
Result = π (attribute list)(relation name)
the attributes in the list are drawn from the specified relation; if the key attribute is in the list then card(result) = card(relation)
resulting relation has only the attributes in the list, in same order as they appear in the list
the degree(result) = number of attributes in the attribute list
no duplicates in the result
π Project Operator
πtutor(STUDENT)
π Project Operator: SELECT
select * from student;
select tutor from student;
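The algebraic projection has no duplicates in its result, whereas a plain SQL select keeps them unless distinct is requested. A minimal sketch of the difference, on an assumed three-student instance:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two students share the tutor 'bush'.
conn.executescript("""
create table student (studno text primary key, name text, tutor text);
insert into student values ('s1', 'jones', 'bush'),
                           ('s2', 'brown', 'kahn'),
                           ('s3', 'smith', 'bush');
""")

# Plain select: one output row per student, duplicates retained.
bag = conn.execute("select tutor from student order by tutor").fetchall()
# select distinct: matches the relational projection, duplicates removed.
relation = conn.execute(
    "select distinct tutor from student order by tutor"
).fetchall()
print(bag, relation)
```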
σ Select Operator
selects a subset of the tuples in a relation that satisfy a selection condition
Result = σ (selection condition)(relation name)
the selection condition is a boolean expression specified on the attributes of a specified relation
• θ stands for the usual comparison operators '<', '<>', '<=', '>', '>=', etc
• clauses can be arbitrarily connected with boolean operators AND, NOT, OR
the result is a relation that has the same attributes as the source relation
degree(result) = degree(relation); card(result) <= card(relation)
σ Select Operator
σname=‘bloggs’(STUDENT)
Retrieve the tutor who tutors Bloggs
πtutor(σname=‘bloggs’(STUDENT))
select tutor from student where name = 'bloggs';
SQL retrieval expressions
select studno, name from student
where hons != 'ca' and
(tutor = 'goble' or tutor = 'kahn');
select * from enrol
where labmark > 50;
select * from enrol
where labmark between 30 and 50;
select * from enrol
where labmark in (0, 100);
select * from enrol
where labmark is null;
select * from student
where name like 'b%';
select studno, courseno,
exammark+labmark total from enrol
where labmark is not NULL;
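The behaviour of between, in and the null tests can be checked with a short sketch. The three ENROL tuples are assumptions chosen so that each predicate selects exactly one student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table enrol (studno text, courseno text, labmark int, exammark int);
insert into enrol values ('s1', 'cs250', 30, 40),
                         ('s2', 'cs250', NULL, 55),
                         ('s3', 'cs250', 100, 20);
""")

def studnos(condition):
    """Return the studno values of the ENROL tuples satisfying the condition."""
    sql = f"select studno from enrol where {condition} order by studno"
    return [row[0] for row in conn.execute(sql)]

in_range = studnos("labmark between 30 and 50")  # bounds are inclusive
in_set = studnos("labmark in (0, 100)")          # membership in a list, not a range
missing = studnos("labmark is null")
equals_null = studnos("labmark = NULL")          # a comparison with NULL is never true

# Arithmetic on a NULL yields NULL, hence the 'is not NULL' guard on the slide.
totals = conn.execute(
    "select studno, exammark + labmark as total from enrol order by studno"
).fetchall()
print(in_range, in_set, missing, equals_null, totals)
```

Note that labmark = NULL selects nothing, which is why the is null form is needed.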
Cartesian Product Operator
Definition: The cartesian product of two relations R1(A1,A2,...,An) with cardinality i and R2(B1,B2,...,Bm) with cardinality j is a relation R3 with degree k=n+m, cardinality i*j and attributes (A1,A2,...,An,B1,B2,...,Bm)
The result, denoted by R1 × R2, is a relation that includes all the possible combinations of tuples from R1 and R2
Used in conjunction with other operations
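The degree and cardinality rules can be verified on a tiny assumed instance: a STUDENT relation of degree 2 with 3 tuples and a COURSE relation of degree 3 with 2 tuples. In SQL, listing two tables in the from clause with no condition gives the product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample relations.
conn.executescript("""
create table student (studno text, name text);
create table course (courseno text, subject text, equip text);
insert into student values ('s1', 'jones'), ('s2', 'brown'), ('s3', 'smith');
insert into course values ('cs250', 'databases', 'sun'), ('cs150', 'prog', 'pc');
""")

cur = conn.execute("select * from student, course")
product = cur.fetchall()

degree = len(cur.description)  # n + m = 2 + 3 columns
cardinality = len(product)     # i * j = 3 * 2 tuples
print(degree, cardinality)
```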
Cartesian Product Example
X Cartesian Product
θ Join Operator
Definition: The join of two relations R1(A1,A2,...,An) and R2(B1,B2,...,Bm) is a relation R3 with degree k=n+m and attributes (A1,A2,...,An,B1,B2,...,Bm) whose tuples satisfy the join condition
Result = R1 ⋈ (θ join condition) R2
• θ stands for the usual comparison operators '<', '<>', '<=', '>', '>=', etc
• comparing terms in the θ clauses can be arbitrarily connected with boolean operators AND, NOT, OR
The result is a concatenated set but only for those tuples where the condition is true.
It does not require union compatibility of R1 and R2
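A sketch of the join as "product plus condition", assuming a small STUDENT and ENROL instance. The first two queries are the same equi-join written with join ... on and with a where clause over the product; the third uses '<' in the join condition to show a θ-join that is not an equi-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.executescript("""
create table student (studno text, name text);
create table enrol (studno text, courseno text, exammark int);
insert into student values ('s1', 'jones'), ('s2', 'brown');
insert into enrol values ('s1', 'cs250', 65), ('s1', 'cs150', 40), ('s2', 'cs250', 72);
""")

# Equi-join written with explicit join syntax.
equi = conn.execute("""
select s.name, e.courseno from student s join enrol e on s.studno = e.studno
order by s.name, e.courseno
""").fetchall()

# The same join written as a selection over the cartesian product.
via_product = conn.execute("""
select s.name, e.courseno from student s, enrol e where s.studno = e.studno
order by s.name, e.courseno
""").fetchall()

# A theta-join using '<': pairs (a, b) where b beat a in the same course's exam.
beaten = conn.execute("""
select a.studno, b.studno from enrol a join enrol b
on a.courseno = b.courseno and a.exammark < b.exammark
""").fetchall()
print(equi, beaten)
```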
θ Join Operator
More joins
Natural Join Operator
Of all the types of θ-join, the equi-join is the only one that yields a result in which the compared columns are redundant to each other: possibly different names but same values
The natural join is an equi-join but one of the redundant columns (simple or composite) is omitted from the result
Relational join is the principal algebraic counterpart of queries that involve the existential quantifier ∃
Self Join: Joins on the same relation
π (lecturer, roomno, appraiser, approom) (staff ⋈ (appraiser = lecturer) staff)
select e.lecturer, e.roomno, m.lecturer appraiser, m.roomno approom
from staff e, staff m
where e.appraiser = m.lecturer;
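The self join can be run against an assumed STAFF instance; the names and room numbers below are invented sample data. The aliases e and m let the same relation play two roles, the appraisee and the appraiser.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical STAFF tuples; watson's appraiser is not a STAFF member here.
conn.executescript("""
create table staff (lecturer text primary key, roomno text, appraiser text);
insert into staff values ('kahn', 'IT206', 'watson'),
                         ('bush', '2.26', 'capon'),
                         ('capon', 'A14', 'watson'),
                         ('watson', 'IT212', 'barringer');
""")

pairs = conn.execute("""
select e.lecturer, e.roomno, m.lecturer appraiser, m.roomno approom
from staff e, staff m
where e.appraiser = m.lecturer
order by e.lecturer
""").fetchall()
print(pairs)
```

The watson tuple drops out of the result because its appraiser has no matching STAFF tuple, which is the usual behaviour of an inner join.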
Exercise
Get student's name, all their courses, subject of course, labmark for course, lecturer of course and lecturer's roomno for 'ca' students
University Schema
STUDENT(studno,name,hons,tutor,year)
ENROL(studno,courseno,labmark,exammark)
COURSE(courseno,subject,equip)
STAFF(lecturer,roomno,appraiser)
TEACH(courseno,lecturer)
YEAR(yearno,yeartutor)
Set Theoretic Operators
Union, Intersection and Difference
Operands need to be union compatible for the result to be a valid relation
Definition: Two relations R1(A1,A2,...,An) and R2(B1,B2,...,Bm) are union compatible iff: n = m and dom(Ai) = dom(Bi) for 1 ≤ i ≤ n
∪ Union Operator
Definition: The union of two relations R1(A1,A2,...,An) and R2(B1,B2,...,Bm) is a relation R3(C1,C2,...,Cn) such that dom(Ci) = dom(Ai) = dom(Bi) for 1 ≤ i ≤ n
The result R1 ∪ R2 is a relation that includes all tuples that are either in R1 or R2 or in both without duplicate tuples
The resulting relation might have the same attribute names as the first or the second relation
Retrieve all staff that lecture or tutor
Lecturers = π(lecturer)(TEACH)
Tutors = π(tutor)(STUDENT)
Lecturers ∪ Tutors
∩ Intersection Operator
Definition: The intersection of two relations R1(A1,A2,...,An) and R2(B1,B2,...,Bm) is a relation R3(C1,C2,...,Cn) such that dom(Ci) = dom(Ai) ∩ dom(Bi) for 1 ≤ i ≤ n
The result R1 ∩ R2 is a relation that includes only those tuples in R1 that also appear in R2
The resulting relation might have the same attribute names as the first or the second relation
Retrieve all staff that lecture and tutor
Lecturers = π(lecturer)(TEACH)
Tutors = π(tutor)(STUDENT)
Lecturers ∩ Tutors
− Difference Operator
Definition: The difference of two relations R1(A1,A2,...,An) and R2(B1,B2,...,Bm) is a relation R3(C1,C2,...,Cn) such that dom(Ci) = dom(Ai) for 1 ≤ i ≤ n
The result R1 − R2 is a relation that includes all tuples that are in R1 and not in R2
The resulting relation might have the same attribute names as the first or the second relation
Retrieve all staff that lecture but don't tutor
Lecturers = π(lecturer)(TEACH)
Tutors = π(tutor)(STUDENT)
Lecturers − Tutors
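The three Lecturers/Tutors queries can be sketched in SQL on an assumed TEACH and STUDENT instance. The syntax summary earlier lists MINUS for difference; SQLite, used here, spells the same operator except, so treat that keyword as a dialect assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: kahn and goble both lecture and tutor.
conn.executescript("""
create table teach (courseno text, lecturer text);
create table student (studno text, tutor text);
insert into teach values ('cs250', 'kahn'), ('cs150', 'woods'),
                         ('cs260', 'goble'), ('cs270', 'kahn');
insert into student values ('s1', 'kahn'), ('s2', 'bush'),
                           ('s3', 'goble'), ('s4', 'bush');
""")

def names(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

lecturers = "select lecturer from teach"
tutors = "select tutor from student"

either = names(f"{lecturers} union {tutors} order by 1")       # Lecturers ∪ Tutors
both = names(f"{lecturers} intersect {tutors} order by 1")     # Lecturers ∩ Tutors
only_lecture = names(f"{lecturers} except {tutors} order by 1")  # Lecturers − Tutors
print(either, both, only_lecture)
```

The two operands are union compatible: each is a single column of staff names. All three operators remove duplicate tuples, so kahn and bush appear once.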
Relational Algebra Recap
Relational Algebra
– tuple (unary)
Selection: Result = σ (selection condition)(relation name)
Projection: Result = π (attribute list)(relation name)
Example: π tutor (σ name='bloggs' (STUDENT))
– set (binary)
Union: Lecturers ∪ Tutors
Intersection: Lecturers ∩ Tutors
Difference: Lecturers − Tutors
– tuple (binary)
Join: Result = R1 ⋈ (θ join condition) R2
Division: (A / B), not supported as a primitive operator
Relational Algebra
– Additional Operators
Outer Join & Outer Union (+): pad with nulls (include all tuples, not just matches as with Natural / Equi-join)
– Aggregation Functions
<grouping attributes> ƒ <function list> (relation name)
How many courses is a student enrolled for?
studno ƒ COUNT courseno (ENROL)
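Both additional operators have direct SQL counterparts, sketched here on assumed sample data. The aggregation studno ƒ COUNT courseno (ENROL) becomes group by with count, and the outer join is written in the standard left outer join form rather than the (+) notation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: s3 is not enrolled on any course.
conn.executescript("""
create table student (studno text primary key, name text);
create table enrol (studno text, courseno text);
insert into student values ('s1', 'jones'), ('s2', 'brown'), ('s3', 'smith');
insert into enrol values ('s1', 'cs250'), ('s1', 'cs150'), ('s2', 'cs250');
""")

# Aggregation: studno is the grouping attribute, COUNT the function.
counts = conn.execute(
    "select studno, count(courseno) from enrol group by studno order by studno"
).fetchall()

# Outer join: every student appears; unmatched ones are padded with NULL.
padded = conn.execute("""
select s.studno, e.courseno
from student s left outer join enrol e on s.studno = e.studno
order by s.studno, e.courseno
""").fetchall()
print(counts, padded)
```

Student s3 is absent from the aggregation over ENROL but present, padded with a null, in the outer join.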
For Next Lecture
I expect you to have skim read the notes for the next lecture before it's delivered.
The sequence of: skim read; lecture delivery; SAQ will make revision a whole lot easier.