Date post: | 05-Jan-2016 |
Category: |
Documents |
Upload: | carmel-morris |
Database System Concepts, 5th Ed.
Chapter 7: Relational Database Design
Pany, ccsu7.2Database System Concepts - 5th Edition
Chapter 7: Relational Database Design
Features of Good Relational Design
Atomic Domains and First Normal Form
Decomposition Using Functional Dependencies
Functional Dependency Theory
Algorithms for Functional Dependencies
Database-Design Process
The Banking Schema
branch = (branch_name, branch_city, assets)
customer = (customer_id, customer_name, customer_street, customer_city)
loan = (loan_number, amount)
account = (account_number, balance)
employee = (employee_id, employee_name, telephone_number, start_date)
dependent_name = (employee_id, dname)
account_branch = (account_number, branch_name)
loan_branch = (loan_number, branch_name)
The Banking Schema
borrower = (customer_id, loan_number)
depositor = (customer_id, account_number)
cust_banker = (customer_id, employee_id, type)
works_for = (worker_employee_id, manager_employee_id)
payment = (loan_number, payment_number, payment_date, payment_amount)
savings_account = (account_number, interest_rate)
checking_account = (account_number, overdraft_amount)
Combine Schemas?
Suppose we combine borrower and loan to get
bor_loan = (customer_id, loan_number, amount )
Result is possible repetition of information (L-100 in example below)
A Combined Schema Without Repetition
Consider combining loan_branch and loan
loan_amt_br = (loan_number, amount, branch_name)
No repetition (as suggested by example below)
What About Smaller Schemas?
Suppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan?
Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key”
Denote as a functional dependency:
loan_number → amount
In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.
Not all decompositions are good. Suppose we decompose employee into
employee1 = (employee_id, employee_name)
employee2 = (employee_name, telephone_number, start_date)
The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.
A Lossy Decomposition
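The loss of information can be reproduced on a tiny instance. The sketch below is illustrative only: the two employee rows are invented sample data, and `project` and `natural_join` are hypothetical helper names, not part of the slides. It projects employee onto employee1 and employee2 and joins the projections back on employee_name.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples of (attribute, value) pairs."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


# Invented sample data: two different employees share the same name.
employee = [
    {"employee_id": 1, "employee_name": "Kim",
     "telephone_number": "555-0101", "start_date": "1984-03-29"},
    {"employee_id": 2, "employee_name": "Kim",
     "telephone_number": "555-0202", "start_date": "1981-01-16"},
]

employee1 = project(employee, ["employee_id", "employee_name"])
employee2 = project(employee, ["employee_name", "telephone_number", "start_date"])

original = {tuple(sorted(row.items())) for row in employee}
rejoined = natural_join(employee1, employee2)

print(len(original), len(rejoined))  # 2 4
print(original < rejoined)           # True: the join adds spurious tuples
```

The join returns more tuples than the original relation, so it is no longer possible to tell which telephone number belongs to which employee_id.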
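Formal Definition of the Relational Model
1. Five-tuple definition of the relational model: R<U, D, DOM, F>
R: the relation name; U: the set of attributes; D: the domains;
DOM: the mapping between attributes and domains;
F: the data dependencies (the relationships among attributes)
2. Three-tuple definition of the relational model: R<U, F>
A relation r over U is called a relation of the relation schema R<U, F> if and only if r satisfies F.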
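Data Dependencies
1. Definition
A data dependency is a relationship among data that shows up as the equality or inequality of attribute values within a relation; it is an abstraction of the connections between attributes in the real world.
2. Kinds of data dependency
Functional dependencies
Multivalued dependencies
Join dependencies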
Functional Dependencies
Definition: Let R(U) be a relation schema over the attribute set U, and let X and Y be subsets of U. If, for every possible relation r of R(U), r contains no two tuples that agree on X but differ on Y, then we say "X functionally determines Y" or "Y is functionally dependent on X", written X → Y. X is called the determinant of this functional dependency.
Constraints on the set of legal relations.
Require that the value for a certain set of attributes determines uniquely the value for another set of attributes.
A functional dependency is a generalization of the notion of a key.
student_id  name          age  gender  hometown
98601       Wang Xiaoyan  20   F       Beijing
98602       Li Bo         23   M       Shanghai
98603       Chen Zhijian  21   M       Changsha
98604       Zhang Bing    20   M       Shanghai
student_id → name, student_id → age, student_id → gender, student_id → hometown
Functional Dependencies (Cont.)
Let R be a relation schema, with α ⊆ R and β ⊆ R
The functional dependency α → β
holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
Example: Consider r(A, B) with the following instance of r.
A  B
1  4
1  5
3  7
On this instance, A → B does NOT hold, but B → A does hold.
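Whether a dependency is satisfied by one particular instance can be checked mechanically. The following is a minimal sketch, assuming a relation is represented as a list of dicts; the function name `holds` is hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the FD lhs -> rhs."""
    seen = {}  # maps each lhs value combination to the rhs values seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The instance of r(A, B) from the example.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]

print(holds(r, ["A"], ["B"]))  # False: A = 1 appears with B = 4 and B = 5
print(holds(r, ["B"], ["A"]))  # True
```

A check like this only tests one instance; it cannot show that a dependency holds on the schema.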
Functional Dependencies (Cont.)
K is a superkey for relation schema R if and only if K → R
K is a candidate key for R if and only if
K → R, and
for no α ⊂ K, α → R
Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:
bor_loan = (customer_id, loan_number, amount).
We expect this functional dependency to hold:
loan_number → amount
but would not expect the following to hold:
amount → customer_name
Use of Functional Dependencies
We use functional dependencies to:
test relations to see if they are legal under a given set of functional dependencies.
If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.
specify constraints on the set of legal relations
We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F.
Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.
For example, a specific instance of loan may, by chance, satisfy amount → customer_name.
Functional Dependencies (Cont.)
A functional dependency is trivial if it is satisfied by all instances of a relation
Example:
customer_name, loan_number → customer_name
customer_name → customer_name
In general, α → β is trivial if β ⊆ α
Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
For example: If A → B and B → C, then we can infer that A → C
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
F+ is a superset of F.
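Kinds of Functional Dependencies
Full dependency
In R(U), if X → Y and, for every proper subset X' of X, X' ↛ Y, then Y is fully functionally dependent on X, written X -f→ Y.
Partial dependency
If X → Y but Y is not fully dependent on X, then Y is partially functionally dependent on X, written X -p→ Y.
Transitive dependency
In R(U), if X → Y and Y → Z, with Y ⊄ X, Z ⊄ Y, and Y ↛ X, then Z is transitively dependent on X, written X -transitive→ Z.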
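Normalization
Purpose: by studying equivalence between relations, find methods that guide us in defining the logical structure of a database so that it performs well (low redundancy, good data integrity, convenient operations).
The SCT relation:
S-NO  C-NO  GRADE  TNAME  TAGE  OFFICE
S1    C1    90     ZHOU   40    OF2
S1    C2    90     LIU    35    OF2
S1    C3    85     LIU    35    OF2
S1    C4    87     WANG   47    OF3
S2    C1    90     ZHOU   40    OF2
S3    C1    75     ZHOU   40    OF2
S3    C2    70     LIU    35    OF2
S3    C4    56     WANG   47    OF3
S4    C1    90     ZHOU   40    OF2
Definition of relation normalization: the process of replacing relations that have a more complex structure with relations that have a simpler structure is generally called normalization of relations.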
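Normalization (Normal Forms)
A normal form denotes the set of relation schemas that conform to a certain level.
That R is in the x-th normal form is written R ∈ xNF.
The concept of normal forms was introduced by Codd, who proposed 1NF, 2NF, and 3NF in 1971-1972. In 1974 Codd and Boyce jointly proposed BCNF, in 1976 Fagin proposed 4NF, and 5NF was proposed later.
The normal forms are related as follows:
5NF ⊂ 4NF ⊂ BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF
A relation schema in a lower normal form can be converted, through schema decomposition, into a set of relation schemas in a higher normal form; this process is called normalization.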
Goals of Normalization
Let R be a relation scheme with a set F of functional dependencies.
Decide whether a relation scheme R is in “good” form.
In the case that a relation scheme R is not in “good” form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that
each relation scheme is in good form
the decomposition is a lossless-join decomposition
Preferably, the decomposition should be dependency preserving.
7.2 First Normal Form
Domain is atomic if its elements are considered to be indivisible units
Examples of non-atomic domains:
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes of R are atomic
Non-atomic values complicate storage and encourage redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and set of owners stored with each account
We assume all relations are in first normal form (and revisit this in Chapter 9)
First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the domain are used.
Example: Strings would normally be considered indivisible
Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.
Doing so is a bad idea: leads to encoding of information in application program rather than in the database.
2NF
If R ∈ 1NF and every non-prime attribute is fully functionally dependent on the key, then R ∈ 2NF.
Example:
R(S_NO, C_NO, GRADE, TNAME, TAGE, OFFICE);
F = { (S_NO, C_NO) → GRADE, C_NO → TNAME, TNAME → TAGE, TNAME → OFFICE };
Is R ∈ 2NF?
Decomposition:
SC(S_NO, C_NO, GRADE)
CTO(C_NO, TNAME, TAGE, OFFICE)
F_SC = { (S_NO, C_NO) → GRADE }
F_CTO = { C_NO → TNAME, TNAME → TAGE, TNAME → OFFICE };
3NF
If R ∈ 2NF and no non-prime attribute of R is transitively dependent on the key, then R ∈ 3NF.
Example:
SC(S_NO, C_NO, GRADE)
CTO(C_NO, TNAME, TAGE, OFFICE)
F_SC = { (S_NO, C_NO) → GRADE }
F_CTO = { C_NO → TNAME, TNAME → TAGE, TNAME → OFFICE }
Is SC ∈ 3NF? Is CTO ∈ 3NF?
Decomposition:
SC(S_NO, C_NO, GRADE)      F_SC = { (S_NO, C_NO) → GRADE }
CT(C_NO, TNAME)            F_CT = { C_NO → TNAME }
TO(TNAME, TAGE, OFFICE)    F_TO = { TNAME → TAGE, TNAME → OFFICE }
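BCNF
Let R ∈ 1NF. If X contains a key whenever X → Y and Y ⊄ X, then R ∈ BCNF.
A relation schema R that satisfies BCNF has these properties:
every non-prime attribute is fully dependent on every candidate key;
every prime attribute is fully dependent on every candidate key that does not contain it;
no attribute is fully dependent on any set of attributes that is not a candidate key.
BCNF eliminates partial and transitive dependencies of prime attributes on candidate keys.
S-NO  NAME  C-NO  GRADE
S1    WANG  C1    90
S1    WANG  C2    90
S1    WANG  C3    85
S2    LI    C1    90
S3    CHEN  C1    75
S3    CHEN  C2    70
Analysis: this relation has two candidate keys, (S-NO, C-NO) and (NAME, C-NO),
while S-NO → NAME.
Therefore R ∉ BCNF.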
Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form
α → β
where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (i.e., β ⊆ α)
α is a superkey for R
Example: schema not in BCNF:
bor_loan = (customer_id, loan_number, amount)
because loan_number → amount holds on bor_loan but loan_number is not a superkey
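The BCNF condition can be tested with attribute closure, a notion this chapter introduces in a later slide. The sketch below is a simplification under stated assumptions: it inspects only the dependencies listed in F rather than all of F+, which is enough when testing the original schema R but not when testing a relation inside a decomposition. The names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the FDs in fds that are neither trivial nor have a superkey lhs."""
    schema = set(schema)
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) >= schema
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


bor_loan = ["customer_id", "loan_number", "amount"]
F = [(["loan_number"], ["amount"])]

print(bcnf_violations(bor_loan, F))  # [(['loan_number'], ['amount'])]
```

An empty result means no listed dependency violates BCNF on the given schema.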
BCNF and Dependency Preservation
Constraints, including functional dependencies, are costly to check in practice unless they pertain to only one relation
If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.
Because it is not always possible to achieve both BCNF and dependency preservation, we consider a weaker normal form, known as third normal form.
Functional-Dependency Theory
We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies.
We then develop algorithms to generate lossless decompositions into 3NF (the 3NF decomposition algorithm)
We then develop algorithms to test if a decomposition is dependency-preserving (the dependency-preservation test)
Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
For example: If A → B and B → C, then we can infer that A → C
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
FD Axioms and Inference Rules
Axioms
F1 (reflexivity): if Y ⊆ X, then X → Y; in particular, X → X.
F2 (augmentation): if X → Y, then XZ → YZ, and also XZ → Y.
F3 (transitivity): if X → Y and Y → Z, then X → Z.
Inference rules
F5 (pseudotransitivity): if X → Y and YW → Z, then XW → Z.
F6 (union): if X → Y and X → Z, then X → YZ.
F7 (decomposition): if X → YZ, then X → Y and X → Z.
Example
R = (A, B, C, G, H, I)
F = { A → B, A → C, CG → H, CG → I, B → H }
some members of F+
A → H
by transitivity from A → B and B → H
AG → I
by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I
CG → HI
by augmenting CG → I to infer CG → CGI,
and augmenting CG → H to infer CGI → HI,
and then transitivity
Procedure for Computing F+ (optional topic)
To compute the closure of a set of functional dependencies F:
F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
NOTE: We shall see an alternative procedure for this task later
Closure of Attribute Sets (key topic)
Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F
Algorithm to compute α+, the closure of α under F
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = { A → B, A → C, CG → H, CG → I, B → H }
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
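The attribute closure algorithm translates almost line for line into code. This is a minimal sketch, assuming attributes are single characters so that a string such as "CG" stands for the set {C, G}; the function name `closure` is hypothetical.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)            # result := alpha
    changed = True
    while changed:                 # while (changes to result)
        changed = False
        for lhs, rhs in fds:       # for each beta -> gamma in F
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)  # result := result U gamma
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(closure("AG", F))))  # ABCGHI
```

The result matches step 4 of the worked example: (AG)+ contains every attribute of R.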
Example of Attribute Set Closure
Is AG a candidate key?
1. Is AG a super key?
   1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any subset of AG a superkey?
   1. Does A → R? == Is (A)+ ⊇ R
   2. Does G → R? == Is (G)+ ⊇ R
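The two questions above can be answered in code. This sketch again assumes single-character attribute names, and the helper names are hypothetical. It relies on one observation: if any proper subset of a set is a superkey, then so is some subset obtained by dropping exactly one attribute, so only those subsets need to be tested.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of schema."""
    return closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey with no superkey as a proper subset."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(is_superkey(attrs - {a}, schema, fds) for a in attrs)


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(is_candidate_key("AG", R, F))   # True
print(is_candidate_key("ACG", R, F))  # False: the proper subset AG is a superkey
```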
Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
Testing for superkey:
To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
Testing functional dependencies
To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
That is, we compute α+ by using attribute closure, and then check if it contains β.
Is a simple and cheap test, and very useful
Computing closure of F
For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
Canonical Cover (minimal cover)
Sets of functional dependencies may have redundant dependencies that can be inferred from the others
For example: A → C is redundant in: {A → B, B → C, A → C}
Parts of a functional dependency may be redundant
E.g.: on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}
E.g.: on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}
Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
F logically implies all dependencies in Fc, and
Fc logically implies all dependencies in F, and
No functional dependency in Fc contains an extraneous attribute, and
Each left side of functional dependency in Fc is unique.
To compute a canonical cover for F:
repeat
    Use the union rule to replace any dependencies in F of the form α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
    If an extraneous attribute is found, delete it from α → β
until F does not change
Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
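The loop above can be sketched with attribute closure. The slides do not spell out the test for an extraneous attribute, so this sketch assumes the usual one: an attribute A in α is extraneous if β ⊆ (α − A)+ under F, and an attribute A in β is extraneous if A ∈ α+ under F with α → β replaced by α → (β − A). Attributes are assumed to be single characters, and all function names are hypothetical. Keeping the dependencies in a dict keyed by left-hand side applies the union rule automatically.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def remove_one_extraneous(cover):
    """Delete one extraneous attribute from cover; return True if one was found."""
    for lhs in list(cover):
        rhs = cover[lhs]
        current = list(cover.items())
        for a in sorted(lhs):                      # extraneous on the left?
            if rhs <= closure(lhs - {a}, current):
                del cover[lhs]
                cover.setdefault(lhs - {a}, set()).update(rhs)  # union rule
                return True
        for a in sorted(rhs):                      # extraneous on the right?
            reduced = [(l, r) for l, r in current if l != lhs]
            reduced.append((lhs, rhs - {a}))
            if a in closure(lhs, reduced):
                rhs.discard(a)
                if not rhs:                        # nothing left to determine
                    del cover[lhs]
                return True
    return False


def canonical_cover(fds):
    """Return a canonical cover as a dict mapping left sides to right sides."""
    cover = {}
    for lhs, rhs in fds:
        cover.setdefault(frozenset(lhs), set()).update(rhs)     # union rule
    while remove_one_extraneous(cover):
        pass
    return {"".join(sorted(l)): "".join(sorted(r)) for l, r in cover.items()}


print(canonical_cover([("A", "B"), ("B", "C"), ("AC", "D")]))
# {'A': 'BD', 'B': 'C'}
```

On the LHS example from the previous slide, C is found extraneous in AC → D, and the union rule then merges A → B and A → D into A → BD. A canonical cover is not unique in general, so a different scan order may return a different but equivalent result.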
Lossless-join Decomposition (test for decomposition into simpler schemas)
For the case of R = (R1, R2), we require that for all possible relations r on schema R
r = ∏R1(r) ⋈ ∏R2(r)
A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
Example
R = (A, B, C)
F = {A → B, B → C}
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
Not dependency preserving (cannot check B → C without computing R1 ⋈ R2)
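The lossless-join test for a two-way decomposition reduces to one attribute closure: compute (R1 ∩ R2)+ and check whether it covers R1 or R2. A minimal sketch, assuming single-character attribute names and hypothetical function names:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is implied by fds."""
    reachable = closure(set(r1) & set(r2), fds)
    return reachable >= set(r1) or reachable >= set(r2)


F = [("A", "B"), ("B", "C")]

print(is_lossless("AB", "BC", F))  # True:  B+ = BC covers R2
print(is_lossless("AB", "AC", F))  # True:  A+ = ABC covers both
print(is_lossless("AC", "BC", F))  # False: C+ = C covers neither
```

The third call uses a decomposition that is not on the slide; it is included only to show a failing case.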
Dependency Preservation
Let Fi be the set of dependencies in F+ that include only attributes in Ri.
A decomposition is dependency preserving, if
(F1 ∪ F2 ∪ … ∪ Fn)+ = F+
If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive.
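Computing F+ directly is exponential, so the sketch below does not implement the definition literally. Instead it uses a closure-based test that avoids materializing F+: for each α → β in F, it repeatedly grows a result set using only attributes visible inside a single Ri, then checks whether β was reached. Treat the substitution as an assumption of this example. Attribute names are assumed to be single characters and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True if every FD in fds can be checked without joining the Ri."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # Attributes derivable from what we have, seen through Ri only.
                gained = (closure(result & ri, fds) & ri) - result
                if gained:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


F = [("A", "B"), ("B", "C")]

print(preserves_dependencies(["AB", "BC"], F))  # True
print(preserves_dependencies(["AB", "AC"], F))  # False: B -> C is lost
```

Both results agree with the example decompositions of R = (A, B, C).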
Overall Database Design Process
We have assumed schema R is given
R could have been generated when converting E-R diagram to a set of tables.
R could have been a single relation containing all attributes that are of interest (called universal relation).
Normalization breaks R into smaller relations.
R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.
ER Model and Normalization
When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization.
However, in a real (imperfect) design, there can be functional dependencies from non-key attributes of an entity to other attributes of the entity
Example: an employee entity with attributes department_number and department_address, and a functional dependency department_number → department_address
Good design would have made department an entity
Functional dependencies from non-key attributes of a relationship set possible, but rare --- most relationships are binary
Denormalization for Performance
May want to use non-normalized schema for performance
For example, displaying customer_name along with account_number and balance requires join of account with depositor
Alternative 1: Use denormalized relation containing attributes of account as well as depositor with all above attributes
faster lookup
extra space and extra execution time for updates
extra coding work for programmer and possibility of error in extra code
Alternative 2: use a materialized view defined as account ⋈ depositor
Benefits and drawbacks same as above, except no extra coding work for programmer and avoids possible errors
End of Chapter
Proof of Correctness of 3NF Decomposition Algorithm
Correctness of 3NF Decomposition Algorithm
3NF decomposition algorithm is dependency preserving (since there is a relation for every FD in Fc)
Decomposition is lossless
A candidate key (C) is in one of the relations Ri in decomposition
Closure of candidate key under Fc must contain all attributes in R.
Follow the steps of attribute closure algorithm to show there is only one tuple in the join result for each tuple in Ri
Correctness of 3NF Decomposition Algorithm (Cont’d.)
Claim: if a relation Ri is in the decomposition generated by the above algorithm, then Ri satisfies 3NF.
Let Ri be generated from the dependency α → β
Let γ → B be any non-trivial functional dependency on Ri. (We need only consider FDs whose right-hand side is a single attribute.)
Now, B can be in either β or α but not in both. Consider each case separately.
Correctness of 3NF Decomposition (Cont’d.)
Case 1: If B is in β:
If γ is a superkey, the 2nd condition of 3NF is satisfied
Otherwise α must contain some attribute not in γ
Since γ → B is in F+ it must be derivable from Fc, by using attribute closure on γ.
Attribute closure cannot have used α → β. If it had been used, α must be contained in the attribute closure of γ, which is not possible, since we assumed γ is not a superkey.
Now, using α → (β − {B}) and γ → B, we can derive α → B
(since γ ⊆ αβ, and B ∉ γ since γ → B is non-trivial)
Then, B is extraneous in the right-hand side of α → β; which is not possible since α → β is in Fc.
Thus, if B is in β then γ must be a superkey, and the second condition of 3NF must be satisfied.
Correctness of 3NF Decomposition (Cont’d.)
Case 2: B is in α.
Since α is a candidate key, the third alternative in the definition of 3NF is trivially satisfied.
In fact, we cannot show that γ is a superkey.
This shows exactly why the third alternative is present in the definition of 3NF.
Q.E.D.
Figure 7.5: Sample Relation r
Figure 7.6
Figure 7.7
Figure 7.15: An Example of Redundancy in a BCNF Relation
Figure 7.16: An Illegal R2 Relation
Figure 7.18: Relation of Practice Exercise 7.2