IT-3101 Database Management Systems

Date post: 15-Mar-2016
Description:
IT-3101 Database Management Systems, by Jesmin Akhter, Assistant Professor, IIT, Jahangirnagar University. Lecture 04: Relational Database Design, Normalization (Part 1).
Page 1: IT-3101 Database Management Systems

IT-3101 Database Management Systems

By Jesmin Akhter, Assistant Professor, IIT, Jahangirnagar University

Page 2: IT-3101 Database Management Systems

Lecture 04 Relational Database Design

Normalization-Part-1

Page 3: IT-3101 Database Management Systems

Outline
o Overview of Relational DBMS
o Normalization (1st lecture)

Page 4: IT-3101 Database Management Systems

Normalization

The aim of normalization is to eliminate various anomalies (or undesirable aspects) of a relation in order to obtain “better” relations.

The following four problems might exist in a relation scheme:
o Repetition anomaly
o Update anomaly
o Insertion anomaly
o Deletion anomaly

Page 5: IT-3101 Database Management Systems

Repetition Anomaly

The ENAME, TITLE, and SAL attribute values are repeated for each project that the employee is involved in.
o Waste of space
o Complicates updates
o Contrary to the spirit of databases

EMP
ENO  ENAME      TITLE        SAL    PNO  RESP        DUR
E1   J. Doe     Elect. Eng.  40000  P1   Manager     12
E2   M. Smith   Analyst      34000  P1   Analyst     24
E2   M. Smith   Analyst      34000  P2   Analyst     6
E3   A. Lee     Mech. Eng.   27000  P3   Consultant  10
E3   A. Lee     Mech. Eng.   27000  P4   Engineer    48
E4   J. Miller  Programmer   24000  P2   Programmer  18
E5   B. Casey   Syst. Anal.  34000  P2   Manager     24
E6   L. Chu     Elect. Eng.  40000  P4   Manager     48
E7   R. Davis   Mech. Eng.   27000  P3   Engineer    36
E8   J. Jones   Syst. Anal.  34000  P3   Manager     40

Page 6: IT-3101 Database Management Systems

Update Anomaly

If any attribute of an employee (say, SAL) is updated, multiple tuples have to be updated to reflect the change.

EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR): the same relation instance as on Page 5.

Page 7: IT-3101 Database Management Systems

Insertion Anomaly

It may not be possible to store information about a new project until an employee is assigned to it.

EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR): the same relation instance as on Page 5.

Page 8: IT-3101 Database Management Systems

Deletion Anomaly

If an engineer who is the only employee on a project leaves the company, his personal information cannot be deleted without also losing the information about that project.

May have to delete many tuples.

EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR): the same relation instance as on Page 5.
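The repetition, update, and deletion anomalies can be made concrete with a minimal Python sketch. The EMP rows are reconstructed from the slide's relation instance; the helper names (`copies_per_employee`, `update_salary`, `remove_employee`) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR), reconstructed from the slide.
EMP = [
    ("E1", "J. Doe", "Elect. Eng.", 40000, "P1", "Manager", 12),
    ("E2", "M. Smith", "Analyst", 34000, "P1", "Analyst", 24),
    ("E2", "M. Smith", "Analyst", 34000, "P2", "Analyst", 6),
    ("E3", "A. Lee", "Mech. Eng.", 27000, "P3", "Consultant", 10),
    ("E3", "A. Lee", "Mech. Eng.", 27000, "P4", "Engineer", 48),
    ("E4", "J. Miller", "Programmer", 24000, "P2", "Programmer", 18),
    ("E5", "B. Casey", "Syst. Anal.", 34000, "P2", "Manager", 24),
    ("E6", "L. Chu", "Elect. Eng.", 40000, "P4", "Manager", 48),
    ("E7", "R. Davis", "Mech. Eng.", 27000, "P3", "Engineer", 36),
    ("E8", "J. Jones", "Syst. Anal.", 34000, "P3", "Manager", 40),
]


def copies_per_employee(rows):
    """Repetition anomaly: count how often each employee's data is stored."""
    return Counter(row[0] for row in rows)


def update_salary(rows, eno, new_sal):
    """Update anomaly: every tuple of the employee has to be rewritten."""
    updated = [
        (r[0], r[1], r[2], new_sal, r[4], r[5], r[6]) if r[0] == eno else r
        for r in rows
    ]
    touched = sum(1 for r in rows if r[0] == eno)
    return updated, touched


def remove_employee(rows, eno):
    """Deletion anomaly: report projects that disappear with the employee."""
    remaining = [r for r in rows if r[0] != eno]
    lost_projects = {r[4] for r in rows} - {r[4] for r in remaining}
    return remaining, lost_projects
```

In this particular instance every project has at least two employees, so `remove_employee(EMP, "E1")` loses no project; applied to a relation holding only the first tuple, the same call would report that project P1 is lost.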

Page 9: IT-3101 Database Management Systems

What to do?

Take each relation individually and “improve” it in terms of the desired characteristics.

Normal forms
o Atomic values (1NF)
o Can be defined according to keys and dependencies.
o Functional dependencies (2NF, 3NF, BCNF)
o Multivalued dependencies (4NF)

Normalization
o Normalization is a process of concept separation which applies a top-down methodology for producing a schema by successive refinements and decompositions.
o Do not combine unrelated sets of facts in one table; each relation should contain an independent set of facts.
o Universal relation assumption

Page 10: IT-3101 Database Management Systems

Normalization Issues

How do we decompose a schema into a desirable normal form?

What criteria should the decomposed schemas follow in order to preserve the semantics of the original schema?
o Reconstructability: recover the original relation, with no spurious joins
o Lossless decomposition: no information loss
o Dependency preservation: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.

Page 11: IT-3101 Database Management Systems

A Combined Schema Without Repetition

Consider combining relations
sec_class(sec_id, building, room_number) and
section(course_id, sec_id, semester, year)
into one relation
section(course_id, sec_id, semester, year, building, room_number)

No repetition in this case.

Page 12: IT-3101 Database Management Systems

What About Smaller Schemas?

Suppose we had started with inst_dept. How would we know to split up (decompose) it into instructor and department?

Write a rule: “if there were a schema (dept_name, building, budget), then dept_name would be a candidate key”.

Denote as a functional dependency:
dept_name → building, budget

In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated. This indicates the need to decompose inst_dept.

Not all decompositions are good. Suppose we decompose employee(ID, name, street, city, salary) into
employee1(ID, name)
employee2(name, street, city, salary)

The next slide shows how we lose information: we cannot reconstruct the original employee relation, and so this is a lossy decomposition.

Page 13: IT-3101 Database Management Systems

A Lossy Decomposition
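A lossy decomposition of employee(ID, name, street, city, salary) into employee1(ID, name) and employee2(name, street, city, salary) can be sketched in Python as follows. The two employee rows are hypothetical sample data (two people who happen to share a name), and `project` and `natural_join` are illustrative helpers, not part of any library.

```python
def project(rows, attrs):
    """Projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Natural join on every attribute the two relations have in common."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


# Hypothetical sample data: two different employees with the same name.
employee = [
    {"ID": 57766, "name": "Kim", "street": "Main", "city": "Perryridge", "salary": 75000},
    {"ID": 98776, "name": "Kim", "street": "North", "city": "Hampton", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)

# Tuples in the join that were never in the original relation.
spurious = [row for row in rejoined if row not in employee]
```

Because `name` is the only shared attribute and it does not identify an employee, the join pairs each ID with both addresses. The original tuples are all still present, but they can no longer be told apart from the spurious ones, which is exactly the information loss.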

Page 14: IT-3101 Database Management Systems

Example of Lossless-Join Decomposition

Lossless-join decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)

r
A  B  C
α  1  A
β  2  B

Π A,B (r)
A  B
α  1
β  2

Π B,C (r)
B  C
1  A
2  B

Π A,B (r) ⋈ Π B,C (r)
A  B  C
α  1  A
β  2  B
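The lossless-join property for R = (A, B, C), R1 = (A, B), R2 = (B, C) can be checked on a concrete instance with a short Python sketch. The representation (a schema list plus a set of tuples), the helper names, and the values "alpha" and "beta" are assumptions made for illustration. The check tests a single instance only; it does not prove the property for every legal instance of the schema.

```python
def project(rel, attrs):
    """Project a (schema, rows) relation onto attrs; a set removes duplicates."""
    schema, rows = rel
    idx = [schema.index(a) for a in attrs]
    return (list(attrs), {tuple(row[i] for i in idx) for row in rows})


def natural_join(r1, r2):
    """Natural join of two (schema, rows) relations on their shared attributes."""
    s1, rows1 = r1
    s2, rows2 = r2
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    joined = set()
    for t1 in rows1:
        for t2 in rows2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return (s1 + extra, joined)


def is_lossless_on_instance(rel, attrs1, attrs2):
    """True if joining the two projections gives back exactly the instance.

    Assumes attrs1 and attrs2 together cover every attribute of the schema.
    """
    schema, rows = rel
    joined_schema, joined = natural_join(project(rel, attrs1), project(rel, attrs2))
    idx = [joined_schema.index(a) for a in schema]
    return {tuple(t[i] for i in idx) for t in joined} == set(rows)


r = (["A", "B", "C"], {("alpha", 1, "A"), ("beta", 2, "B")})
r_shared_b = (["A", "B", "C"], {("alpha", 1, "A"), ("beta", 1, "B")})
```

Here `is_lossless_on_instance(r, ["A", "B"], ["B", "C"])` is true, while the same call on `r_shared_b` is false: once both tuples share the same B value, the join produces two extra tuples.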

Page 15: IT-3101 Database Management Systems

Stages of Normalization

Unnormalized (UNF)
  ↓ Remove repeating groups
First normal form (1NF)
  ↓ Remove partial dependencies
Second normal form (2NF)
  ↓ Remove transitive dependencies
Third normal form (3NF)
  ↓ Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
  ↓ Remove multivalued dependencies
Fourth normal form (4NF)
  ↓ Remove remaining anomalies
Fifth normal form (5NF)

Page 16: IT-3101 Database Management Systems

Repeating Groups

A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.

Example: We have the following relation that contains staff and department details and a list of telephone contact numbers for each member of staff.

staffNo  job       dept  dname       city       contactNumber
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking    Null
OS45     Clerk     30    Operations  Barking    079311555

Repeating groups are not allowed in a relational design, since all attributes have to be ‘atomic’, i.e., there can only be one value per cell in a table!
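One common way to remove a repeating group is to move it into its own relation keyed by the original primary key. The sketch below applies that idea to the staff example; the slide does not prescribe a particular method, so the approach, the function name `to_1nf`, and the comma-separated encoding of the contact numbers are assumptions.

```python
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales",
     "city": "Stratford", "contactNumber": "018111777, 018111888, 079311122"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts",
     "city": "Barking", "contactNumber": "017111777"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts",
     "city": "Barking", "contactNumber": None},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations",
     "city": "Barking", "contactNumber": "079311555"},
]


def to_1nf(rows, key, multi_attr):
    """Split a comma-separated repeating group into one row per value.

    Returns the base relation without multi_attr and a second relation
    holding (key, multi_attr) pairs with exactly one value per cell.
    """
    base = [{k: v for k, v in row.items() if k != multi_attr} for row in rows]
    values = []
    for row in rows:
        if row[multi_attr] is None:
            continue  # no contact number recorded: nothing to store
        for value in row[multi_attr].split(","):
            values.append({key: row[key], multi_attr: value.strip()})
    return base, values


staff_1nf, staff_contact = to_1nf(staff, "staffNo", "contactNumber")
```

After the split, every cell in both relations holds a single value, and a staff member without a contact number simply has no row in `staff_contact`.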

Page 17: IT-3101 Database Management Systems

Repeating Groups

Multivalued attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified by (that is, not functionally dependent on, directly or indirectly) the value of the primary key (or a part of it).

STUDENT (with a repeating group)
Stud_ID  Name     Course_ID         Units
101      Lennon   MSI 250, MSI 415  3.00
125      Johnson  MSI 331           3.00

STUDENT (one value per cell)
Stud_ID  Name     Course_ID  Units
101      Lennon   MSI 250    3.00
101      Lennon   MSI 415    3.00
125      Johnson  MSI 331    3.00

Page 18: IT-3101 Database Management Systems

Functional Dependency

Formal definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.

Formal notation: A → B. This should be read as ‘A determines B’ or ‘B is functionally dependent on A’. A is called the determinant and B is called the object of the determinant.

Example:

staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Functional dependencies:
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
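The definition can be turned into a small check over a relation instance: A → B is violated as soon as two tuples agree on A but differ on B. The following is a minimal Python sketch using the staff table above; `fd_holds` is a hypothetical helper name. An instance can only refute a dependency; a dependency that happens to hold in the sample data is not thereby guaranteed by the schema.

```python
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this
        # determinant and returns it on every later occurrence.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True
```

For example, `fd_holds(staff, ["dept"], ["dname"])` is true, whereas `fd_holds(staff, ["job"], ["dept"])` is false because the two clerks work in different departments.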

Page 19: IT-3101 Database Management Systems

Functional Dependency

Composite determinants: If more than one attribute is necessary to determine another attribute in an entity, then such a determinant is termed a composite determinant.

Full functional dependency: Only of relevance with composite determinants. This is the situation when it is necessary to use all the attributes of the composite determinant to identify its object uniquely.

Example:

order#  line#  qty  price
A001    001    10   200
A002    001    20   400
A002    002    20   800
A004    001    15   300

Full functional dependencies:
(order#, line#) → qty
(order#, line#) → price

Page 20: IT-3101 Database Management Systems

Functional Dependency

Partial functional dependency: This is the situation that exists if only a subset of the attributes of the composite determinant is needed to identify its object uniquely.

student#  unit#  room   grade
9900100   A01    TH224  2
9900010   A01    TH224  14
9901011   A02    JS075  3
9900001   A01    TH224  16

Full functional dependencies:
(student#, unit#) → grade

Partial functional dependencies:
unit# → room

Repetition of data!
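Given a composite key and a list of declared dependencies, the partial ones are those whose determinant is a proper subset of the key. The sketch below finds them and then splits the relation so that each room is stored once per unit. The dependencies are declared by hand rather than inferred from the data, and the helper names are hypothetical.

```python
def partial_dependencies(key, fds):
    """Return the FDs whose determinant is a proper subset of the key."""
    key = frozenset(key)
    return [
        (frozenset(lhs), rhs)
        for lhs, rhs in fds
        if frozenset(lhs) < key and rhs not in key
    ]


def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    result = []
    for row in rows:
        reduced = tuple(row[a] for a in attrs)
        if reduced not in result:
            result.append(reduced)
    return result


enrolment = [
    {"student#": "9900100", "unit#": "A01", "room": "TH224", "grade": 2},
    {"student#": "9900010", "unit#": "A01", "room": "TH224", "grade": 14},
    {"student#": "9901011", "unit#": "A02", "room": "JS075", "grade": 3},
    {"student#": "9900001", "unit#": "A01", "room": "TH224", "grade": 16},
]

key = ("student#", "unit#")
# Each FD is (determinant attributes, single dependent attribute).
fds = [(("student#", "unit#"), "grade"), (("unit#",), "room")]

partial = partial_dependencies(key, fds)

# Decompose: the partially dependent attribute moves to its own relation.
unit_room = project(enrolment, ["unit#", "room"])
student_grade = project(enrolment, ["student#", "unit#", "grade"])
```

After the split, room TH224 appears once in `unit_room` instead of three times in the original table.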

Page 21: IT-3101 Database Management Systems

Partial Dependency

Partial dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.

CUSTOMER
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250

Cust_ID → Name (partial dependency)

Page 22: IT-3101 Database Management Systems

Transitive Dependency

Definition: A transitive dependency exists when there is an intermediate functional dependency.

Formal notation: If A → B and B → C, then it can be stated that the following transitive dependency exists: A → B → C.

Example:

staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Transitive dependencies:
staffNo → dept
dept → dname
staffNo → dept → dname

Repetition of data!
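The repetition caused by staffNo → dept → dname can be removed by storing dept → dname in a relation of its own. The following minimal Python sketch does this for the staff table and shows that the original rows can be rebuilt; the function names `split_transitive` and `rejoin` are hypothetical, and the sketch assumes the dependency really holds in the data.

```python
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]


def split_transitive(rows, determinant, dependent):
    """Move determinant -> dependent into a separate lookup relation."""
    lookup = {}
    for row in rows:
        # Assumes determinant -> dependent holds, so later rows with the
        # same determinant carry the same dependent value.
        lookup[row[determinant]] = row[dependent]
    slim = [{k: v for k, v in row.items() if k != dependent} for row in rows]
    return slim, lookup


def rejoin(slim, lookup, determinant, dependent):
    """Rebuild the original rows by looking the dependent value up again."""
    return [{**row, dependent: lookup[row[determinant]]} for row in slim]


staff_slim, department = split_transitive(staff, "dept", "dname")
```

`department` now stores "Accounts" once, and `rejoin(staff_slim, department, "dept", "dname")` returns the original four rows, so nothing was lost by the split.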

Page 23: IT-3101 Database Management Systems

Transitive Dependency

Transitive dependency: when a non-key attribute determines another non-key attribute.

EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg

Dept_ID → Dept_Name (transitive dependency)

Page 24: IT-3101 Database Management Systems

Normal Forms: Review

Unnormalized – There are multivalued attributes or repeating groups.

1NF – No multivalued attributes or repeating groups.
2NF – 1NF plus no partial dependencies.
3NF – 2NF plus no transitive dependencies.

Page 25: IT-3101 Database Management Systems

Example 1: Determine NF

BOOK(ISBN, Title, Publisher, Address)

ISBN → Title
ISBN → Publisher
Publisher → Address

All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Page 26: IT-3101 Database Management Systems

Example 1: Determine NF

BOOK(ISBN, Title, Publisher, Address)

ISBN → Title
ISBN → Publisher
Publisher → Address

The relation is at least in 1NF. There is no COMPOSITE primary key, therefore there can’t be partial dependencies.

Therefore, the relation is at least in 2NF.

Page 27: IT-3101 Database Management Systems

Example 1: Determine NF

BOOK(ISBN, Title, Publisher, Address)

ISBN → Title
ISBN → Publisher
Publisher → Address

Publisher is a non-key attribute, and it determines Address, another non-key attribute.

Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.

Page 28: IT-3101 Database Management Systems

Example 1: Determine NF

BOOK(ISBN, Title, Publisher, Address)

ISBN → Title
ISBN → Publisher
Publisher → Address

We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.

Page 29: IT-3101 Database Management Systems

Example 1: Determine NF

BOOK(ISBN, Title, Publisher, Address)

ISBN → Title
ISBN → Publisher
Publisher → Address

In your solution you will write the following justification:

1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher → Address), therefore not 3NF

Conclusion: The relation is in 2NF

Page 30: IT-3101 Database Management Systems

Example 2: Determine NF

ORDER(Order_No, Product_ID, Description)

Product_ID → Description

All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Page 31: IT-3101 Database Management Systems

Example 2: Determine NF

ORDER(Order_No, Product_ID, Description)

Product_ID → Description

The relation is at least in 1NF. There is a COMPOSITE primary key (PK) (Order_No, Product_ID), therefore there can be partial dependencies. Product_ID, which is a part of the PK, determines Description; hence, there is a partial dependency. Therefore, the relation is not in 2NF. There is no point in checking for transitive dependencies!

Page 32: IT-3101 Database Management Systems

Example 2: Determine NF

ORDER(Order_No, Product_ID, Description)

Product_ID → Description

We know that the relation is at least in 1NF, and it is not in 2NF.

Therefore, we conclude that the relation is in 1NF.

Page 33: IT-3101 Database Management Systems

Example 2: Determine NF

ORDER(Order_No, Product_ID, Description)

Product_ID → Description

In your solution you will write the following justification:

1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID → Description), therefore not in 2NF

Conclusion: The relation is in 1NF
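The justification used in Example 1 and Example 2 can be written as a short decision procedure. The Python sketch below follows the simplified rules of the review slide (partial dependency: part of a composite primary key determines a non-key attribute; transitive dependency: a non-key attribute determines another non-key attribute). It assumes the primary key is the only candidate key, which textbook definitions do not require, and the function name `normal_form` is hypothetical.

```python
def normal_form(primary_key, fds, has_multivalued=False):
    """Classify a relation as 'UNF', '1NF', '2NF' or '3NF'.

    fds is a list of (determinant attributes, dependent attribute) pairs.
    '3NF' here means "at least 3NF" under the simplified slide rules.
    """
    key = frozenset(primary_key)
    if has_multivalued:
        return "UNF"
    # Partial dependency: a proper subset of the key determines a
    # non-key attribute.
    for lhs, rhs in fds:
        if frozenset(lhs) < key and rhs not in key:
            return "1NF"
    # Transitive dependency: non-key attributes determine another
    # non-key attribute.
    for lhs, rhs in fds:
        if not (frozenset(lhs) & key) and rhs not in key:
            return "2NF"
    return "3NF"


# Example 1: BOOK(ISBN, Title, Publisher, Address)
book_fds = [
    (("ISBN",), "Title"),
    (("ISBN",), "Publisher"),
    (("Publisher",), "Address"),
]

# Example 2: ORDER(Order_No, Product_ID, Description)
order_fds = [(("Product_ID",), "Description")]
```

With these inputs, `normal_form(("ISBN",), book_fds)` gives "2NF" and `normal_form(("Order_No", "Product_ID"), order_fds)` gives "1NF", matching the conclusions of the two examples.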

Page 34: IT-3101 Database Management Systems

Thank You

