Lecture on database Normalisation

Date post: 26-Nov-2015
Normalisation to BCNF Database Systems Lecture 12 Natasha Alechina

  • In This Lecture
    More normalisation
      Brief review of relational algebra
      Lossless decomposition
      Boyce-Codd normal form (BCNF)
      Higher normal forms
      Denormalisation
    For more information
      Connolly and Begg chapter 14
      Ullman and Widom chapter 3.6

  • Normalisation so Far
    First normal form
      All data values are atomic
    Second normal form
      In 1NF plus no non-key attribute is partially dependent on a candidate key
    Third normal form
      In 2NF plus no non-key attribute depends transitively on a candidate key

  • Lossless decomposition
    To normalise a relation, we used projections
    If R(A,B,C) satisfies A → B then we can project it on A,B and A,C without losing information
    Lossless decomposition: R = π_{A,B}(R) ⋈ π_{A,C}(R)
      where π_{A,B}(R) is the projection of R on A,B and ⋈ is natural join
    Reminder of projection: [figure: relation R with columns A, B, C and its projection π_{A,B}(R) with columns A, B]

  • Relational algebra reminder: selection
    R:
      A  B  C  D
      1  x  c  c
      2  y  d  e
      3  z  a  a
      4  u  b  c
      5  w  c  d
    σ_{C=D}(R):
      A  B  C  D
      1  x  c  c
      3  z  a  a

  • Relational algebra reminder: product
    R1:
      A  B
      1  x
      2  y
    R2:
      A  C
      1  w
      2  v
      3  u
    R1 × R2:
      A  B  A  C
      1  x  1  w
      1  x  2  v
      1  x  3  u
      2  y  1  w
      2  y  2  v
      2  y  3  u

  • While I am on the subject
    SELECT A,B
    FROM R1, R2, R3
    WHERE (some property holds)

    translates into relational algebra

    π_{A,B} (σ_{some property} (R1 × R2 × R3))
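
The translation above can be sketched in Python as three small functions applied inside out: product first, then selection, then projection. The sample rows, the attribute names C and D, and the WHERE condition below are made up for illustration; the slide only says "some property holds".

```python
from itertools import product

# Hypothetical sample relations, each a list of dicts (one dict per row).
R1 = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}]
R2 = [{"C": "w"}, {"C": "v"}]
R3 = [{"D": 10}, {"D": 20}]


def cartesian(*relations):
    """Product: merge every combination of rows (attribute names assumed distinct)."""
    result = []
    for combo in product(*relations):
        merged = {}
        for row in combo:
            merged.update(row)
        result.append(merged)
    return result


def select(relation, predicate):
    """Selection (sigma): keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection (pi): keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# SELECT A, B FROM R1, R2, R3 WHERE A = 1 AND D = 20   (assumed condition)
query_result = project(
    select(cartesian(R1, R2, R3), lambda r: r["A"] == 1 and r["D"] == 20),
    ["A", "B"],
)
print(query_result)
```

One difference worth keeping in mind: projection in relational algebra removes duplicate rows, whereas SQL keeps them unless DISTINCT is written.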

  • Relational algebra: natural join
    R1 ⋈ R2 = π_{R1.A,B,C} σ_{R1.A = R2.A} (R1 × R2)
    R1:
      A  B
      1  x
      2  y
    R2:
      A  C
      1  w
      2  v
      3  u
    R1 ⋈ R2:
      A  B  C
      1  x  w
      2  y  v
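
As a minimal sketch, the three steps in the definition of natural join can be followed literally on the slide's R1 and R2, represented here as sets of tuples (the representation is an assumption, not part of the lecture):

```python
from itertools import product

# R1(A, B) and R2(A, C) from the slide, as sets of tuples.
R1 = {(1, "x"), (2, "y")}
R2 = {(1, "w"), (2, "v"), (3, "u")}

# Step 1: product, giving rows of the form (R1.A, B, R2.A, C).
pairs = {r1 + r2 for r1, r2 in product(R1, R2)}

# Step 2: selection R1.A = R2.A.
matching = {row for row in pairs if row[0] == row[2]}

# Step 3: projection onto (R1.A, B, C).
joined = {(a, b, c) for (a, b, _a2, c) in matching}

print(sorted(joined))  # [(1, 'x', 'w'), (2, 'y', 'v')]
```

The row (3, u) of R2 has no partner in R1, so it does not appear in the join.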

  • When is decomposition lossless: Module → Lecturer
    R:
      Module  Lecturer  Text
      DBS     nza       CB
      DBS     nza       UW
      RDB     nza       UW
      APS     rcb       B
    π_{Module,Lecturer}(R):
      Module  Lecturer
      DBS     nza
      RDB     nza
      APS     rcb
    π_{Module,Text}(R):
      Module  Text
      DBS     CB
      DBS     UW
      RDB     UW
      APS     B

  • When decomposition is not lossless: no FD
    S:
      First  Last   Age
      John   Smith  20
      John   Brown  30
      Mary   Smith  20
      Tom    Brown  10
    π_{First,Last}(S):
      First  Last
      John   Smith
      John   Brown
      Mary   Smith
      Tom    Brown
    π_{First,Age}(S):
      First  Age
      John   20
      John   30
      Mary   20
      Tom    10

  • When decomposition is not lossless: no FD
    π_{First,Last}(S) ⋈ π_{First,Age}(S):
      First  Last   Age
      John   Smith  20
      John   Brown  30
      Mary   Smith  20
      Tom    Brown  10
      John   Smith  30
      John   Brown  20
    The last two rows were not in S
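
The two examples can be checked mechanically. The sketch below projects a relation onto two attribute sets, rejoins the projections, and reports any tuples that were not in the original. Rejoining always returns at least the original tuples, so the decomposition of this particular instance is lossless exactly when nothing extra appears. The function names are illustrative, and the check is on one instance only, not a proof about the schema.

```python
def project(rows, attrs):
    """Projection: each row becomes a frozenset of (attribute, value) pairs."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Combine every pair of rows that agree on all shared attributes."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l, r = dict(l_row), dict(r_row)
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.add(frozenset({**l, **r}.items()))
    return joined


def spurious_tuples(rows, attrs1, attrs2):
    """Tuples produced by rejoining the two projections that were not in rows."""
    original = {frozenset(row.items()) for row in rows}
    rejoined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return rejoined - original


# R from the lossless slide (Module -> Lecturer holds).
R = [
    {"Module": "DBS", "Lecturer": "nza", "Text": "CB"},
    {"Module": "DBS", "Lecturer": "nza", "Text": "UW"},
    {"Module": "RDB", "Lecturer": "nza", "Text": "UW"},
    {"Module": "APS", "Lecturer": "rcb", "Text": "B"},
]

# S from the lossy slide (First determines neither Last nor Age).
S = [
    {"First": "John", "Last": "Smith", "Age": 20},
    {"First": "John", "Last": "Brown", "Age": 30},
    {"First": "Mary", "Last": "Smith", "Age": 20},
    {"First": "Tom", "Last": "Brown", "Age": 10},
]

extra_r = spurious_tuples(R, ["Module", "Lecturer"], ["Module", "Text"])
extra_s = spurious_tuples(S, ["First", "Last"], ["First", "Age"])
extra_s_rows = sorted((d["First"], d["Last"], d["Age"]) for d in map(dict, extra_s))

print(extra_r)        # set()
print(extra_s_rows)   # [('John', 'Brown', 20), ('John', 'Smith', 30)]
```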

  • Normalisation Example
    We have a table representing orders in an online store
    Each entry in the table represents an item on a particular order
    Columns
      Order
      Product
      Customer
      Address
      Quantity
      UnitPrice
    Primary key is {Order, Product}

  • Functional Dependencies
    Each order is for a single customer
      {Order} → {Customer}
    Each customer has a single address
      {Customer} → {Address}
    Each product has a single price
      {Product} → {UnitPrice}
    From FDs 1 and 2 and transitivity
      {Order} → {Address}
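
The transitivity step can be made concrete with an attribute closure: start from a set of attributes and keep adding the right-hand side of every FD whose left-hand side is already included. This is the standard closure procedure written as a short sketch; the function name and the encoding of FDs as pairs of sets are choices made for this example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The three FDs stated on the slide, as (left side, right side) pairs.
FDS = [
    ({"Order"}, {"Customer"}),
    ({"Customer"}, {"Address"}),
    ({"Product"}, {"UnitPrice"}),
]

order_closure = closure({"Order"}, FDS)
print(sorted(order_closure))  # ['Address', 'Customer', 'Order']
```

Address is in the closure of {Order}, which is the derived FD {Order} → {Address}.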

  • Normalisation to 2NF
    Second normal form means no partial dependencies on candidate keys
      {Order} → {Customer, Address}
      {Product} → {UnitPrice}

    To remove the first FD we project over
      {Order, Customer, Address} (R1)
    and
      {Order, Product, Quantity, UnitPrice} (R2)

  • Normalisation to 2NF
    R1 is now in 2NF, but there is still a partial FD in R2
      {Product} → {UnitPrice}
    To remove this we project over
      {Product, UnitPrice} (R3)
    and
      {Order, Product, Quantity} (R4)
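
A simplified way to spot the partial dependencies used in these two steps is to filter the FD list for those whose left side is a proper subset of the key and whose right side lies outside the key. The sketch assumes a single candidate key and an FD list that already includes the derived {Order} → {Address}; the function name is hypothetical.

```python
KEY = frozenset({"Order", "Product"})

# FDs from the slides, including the derived {Order} -> {Address}.
FDS = [
    ({"Order"}, {"Customer"}),
    ({"Customer"}, {"Address"}),
    ({"Product"}, {"UnitPrice"}),
    ({"Order"}, {"Address"}),
]


def partial_fds(key, fds):
    """FDs whose left side is a proper subset of the key and whose right side is non-key."""
    return [(lhs, rhs) for lhs, rhs in fds if set(lhs) < key and not set(rhs) & key]


violations = partial_fds(KEY, FDS)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

{Customer} → {Address} is not reported: its left side is not part of the key, so it is a transitive dependency and is dealt with at the 3NF stage instead.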

  • Normalisation to 3NF
    R has now been split into 3 relations: R1, R3, and R4
    R3 and R4 are in 3NF
    R1 has a transitive FD on its key

    To remove {Order} → {Customer} → {Address} we project R1 over
      {Order, Customer}
    and
      {Customer, Address}

  • Normalisation
    1NF:
      {Order, Product, Customer, Address, Quantity, UnitPrice}
    2NF:
      {Order, Customer, Address}, {Product, UnitPrice}, and {Order, Product, Quantity}
    3NF:
      {Product, UnitPrice}, {Order, Product, Quantity}, {Order, Customer}, and {Customer, Address}

  • The Stream Relation
    Consider a relation, Stream, which stores information about times for various streams of courses
      For example: labs for first years
    Each course has several streams
    Only one stream (of any course at all) takes place at any given time
    Each student taking a course is assigned to a single stream for it

  • The Stream Relation
    [table: rows of Stream with columns Student, Course, Time]
    Candidate keys: {Student, Course} and {Student, Time}

  • FDs in the Stream Relation
    Stream has the following non-trivial FDs
      {Student, Course} → {Time}
      {Time} → {Course}
    Since all attributes are key attributes, Stream is in 3NF

  • Anomalies in Stream
    INSERT anomalies
      You can't add an empty stream
    UPDATE anomalies
      Moving the 12:00 class to 9:00 means changing two rows
    DELETE anomalies
      Deleting Rebecca removes a stream

  • Boyce-Codd Normal Form
    A relation is in Boyce-Codd normal form (BCNF) if for every FD A → B either
      B is contained in A (the FD is trivial), or
      A contains a candidate key of the relation
    In other words: every determinant in a non-trivial dependency is a (super) key
    The same as 3NF except in 3NF we only worry about non-key Bs
    If there is only one candidate key then 3NF and BCNF are the same

  • Stream and BCNF
    Stream is not in BCNF as the FD {Time} → {Course} is non-trivial and {Time} does not contain a candidate key
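
The BCNF test can be run directly on Stream. The sketch below computes the candidate keys by brute force over attribute subsets and then lists every non-trivial FD whose left side is not a superkey. It checks only the FDs given, which is enough for this example; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (smallest first)."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            is_superkey = closure(candidate, fds) == attrs
            if is_superkey and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys


def bcnf_violations(attrs, fds):
    """Non-trivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attrs
    ]


STREAM = {"Student", "Course", "Time"}
FDS = [
    ({"Student", "Course"}, {"Time"}),
    ({"Time"}, {"Course"}),
]

keys = candidate_keys(STREAM, FDS)
violations = bcnf_violations(STREAM, FDS)
print([sorted(k) for k in keys])  # [['Course', 'Student'], ['Student', 'Time']]
print(violations)                 # [({'Time'}, {'Course'})]
```

The candidate keys match the ones given for Stream, and the only violating FD is {Time} → {Course}.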

  • Conversion to BCNF
    Stream has been put into BCNF but we have lost the FD
      {Student, Course} → {Time}
    [figure: decomposition of Stream(Student, Course, Time)]

  • Decomposition Properties
    Lossless: Data should not be lost or created when splitting relations up
    Dependency preservation: It is desirable that FDs are preserved when splitting relations up
    Normalisation to 3NF is always lossless and dependency preserving
    Normalisation to BCNF is lossless, but may not preserve all dependencies

  • Higher Normal Forms
    BCNF is as far as we can go with FDs
    Higher normal forms are based on other sorts of dependency
    Fourth normal form removes multi-valued dependencies
    Fifth normal form removes join dependencies
    [figure: nested sets, from outermost to innermost: 1NF Relations, 2NF Relations, 3NF Relations, BCNF Relations, 4NF Relations, 5NF Relations]

  • Denormalisation
    Normalisation
      Removes data redundancy
      Solves INSERT, UPDATE, and DELETE anomalies
      This makes it easier to maintain the information in the database in a consistent state
    However
      It leads to more tables in the database
      Often these need to be joined back together, which is expensive to do
      So sometimes (not often) it is worth denormalising

  • Denormalisation
    You might want to denormalise if
      Database speeds are unacceptable (not just a bit slow)
      There are going to be very few INSERTs, UPDATEs, or DELETEs
      There are going to be lots of SELECTs that involve the joining of tables
    Address(Number, Street, Postcode, City)
      Not normalised since {Postcode} → {City}
    Normalised:
      Address1(Number, Street, Postcode)
      Address2(Postcode, City)

  • Normalisation in Exams
    Given a relation with scheme {ID, Name, Address, Postcode, CardType, CardNumber}, the candidate key {ID}, and the following functional dependencies:
      {ID} → {Name, Address, Postcode, CardType, CardNumber}
      {Address} → {Postcode}
      {CardNumber} → {CardType}
    (i) Explain why this relation is in second normal form, but not in third normal form. (3 marks)

  • Normalisation in Exams
    (ii) Show how this relation can be converted to third normal form. You should show what functional dependencies are being removed, explain why they need to be removed, and give the relation(s) that result. (4 marks)
    (iii) Give an example of a relation that is in third normal form, but that is not in Boyce-Codd normal form, and explain why it is in third, but not Boyce-Codd, normal form. (4 marks)

  • Next Lecture
    Physical DB Issues
      RAID arrays for recovery and speed
      Indexes and query efficiency
    Query optimisation
      Query trees
    For more information
      Connolly and Begg chapter 21 and appendix C.5
      Ullman and Widom 5.2.8

