1
Database Systems( 資料庫系統 )
September 24, 2014Lecture #2
2
Course Administration• http://www.csie.ntu.edu.tw/~winston/courses/dms/• HW #1 will be on the course homepage now
– It is due on Oct 8.• Next week reading:
– R&G Chapters 3 & 4.1~4.2
3
TA Updates• TAs
– 謝朋儒 , Room 505, Monday 1:10~2:10 pm, email: [email protected]
– 張人尹 , Room 506, Tuesday 1:10~2:10 pm, email: [email protected]
4
Possible Layered Architecture to DMS
Query Optimizationand Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
These layersmust considerconcurrencycontrol andcrash recovery
ApplicationsQueries (SQL)
SELECT S.nameFROM Students SWHERE S.sid = 123456
5
Chapter 2Introduction to Database
Design
6
Scenario
• Say if you are hired by iBeer Retailer as a computer consultant.
• iBeer wants you to design its database system. • How to design it?
7
Database Design
• Step 1: Requirements Analysis– What application (e.g., queries, updates, ..) needs from the
database?– What data to store in the database?– What operations are most frequent and subject to performance
req.
• Step 2: Conceptual Database Design– Data to be stored and the constraints– Come up with the design: Entity-Relation (ER) model – Sketch the design using pictures called entity-relationship
diagrams.
• Step 3: Logical Database Design– Implement the design: relational data model– Easy to map ER diagrams into the relational data model (CH 3).
8
Requirement Analysis• Requirement analysis:
– The Beer retailer wants to keep track of
• Beers on shelves • Beer manufacturers: [name & address]
• Conceptual database design– ER diagram
• Logical database design:– Relational model
Beer names
台灣啤酒
青島啤酒
台灣生啤酒
Beers ManfsManfBy
name name addr
Manufacturer’s names
Manufacturer’s addresses
台灣菸酒公賣局
台北市南昌路一段 4 號
青島啤酒廠 ??
Beer names Manufacturer's names
台灣啤酒 台灣菸酒公賣局 台灣生啤酒 台灣菸酒公賣局
青島啤酒 青島啤酒廠
9
ER Model: Entity
• Proposed by Peter Chen (BS NTU EE ‘68) in 1976. • Entity: A real-world object distinguishable from other
objects (e.g., Joe). • An entity is described by a set of attributes.
– Each attribute has a domain of possible values (.e.g., 20-char. strings)
• Entity Set: a collection of similar entities (rectangle)• Each entity in an entity set is uniquely identified by a
key attribute.
Employees
ssnname
(Joe, Alice, ..)
(123: integer)
(‘Joe’: string)
10
ER Model: Relationship
• Relationship: Association among two or more entities– Joe works in finance department.– A relationship must be uniquely identified by the participating entities,
without reference to the descriptive attributes. For example the pair <ssn, did>
• A relationship can have descriptive attributes.– Joe has worked in finance department since 5/2001.
• Relationship Set: Collection of similar relationships.
dname
budgetdid
sincename
Works_In DepartmentsEmployees
ssn
(5/2001)
(finance dept)(Joe)
11
ER Model: Relationship (An Instance)
dname
budgetdid
sincename
Works_In DepartmentsEmployees
ssn
Joe
Alice
Mary
Peter
Finance
AccountingResearch
Legal
3/3/932/2/923/1/922/1/921/1/92Many-to-Many
12
Ternary Relationship
dname
budgetdid
sincename
Works_In DepartmentsEmployees
ssn
capacityLocationsaddress
(Joe) (finance dept)
(Taipei)
13
Roles in Relationship
Reports_To
name
Employees
supervisor
ssn
subordinate
(Roles)
14
Key Constraints• Describe at most once (entity) relationship
– Manages relationship: each department has at most one manager (okay to have none).
– One department can appear at most once in Manages relationship set, also called one-to-many relation.
dname
budgetdidsincename
ssn
Employees DepartmentsManages
Joe
Alice
Mary
Peter
Finance
AccountingResearch
Legal
3/3/932/2/923/1/92
15
More Key Constraints
1-to-1 1-to-Many Many-to-Many
Women Give Birth Babies
Married WomenMen Befriends WomenMen
16
Participation Constraints
• Describe all (entity) participation relationship– Must every department have a manager?
• If yes, this is a participation constraint– All Departments entities must participate in the Manages
relationship set (total participation).
lotname dname
budgetdid
sincename dname
budgetdid
since
Manages
since
DepartmentsEmployees
ssn
Works_In
17
Weak Entities• A weak entity can be identified uniquely only by
considering the key of another (owner) entity.– Pname = partial key (of the weak entity set, i.e.,
“Dependents”)– Owner entity set and weak entity set must participate in a
one-to-many relationship set (one owner, many weak entities).– Weak entity set must have total participation in this
identifying relationship set.
name
agepname
DependentsEmployees
ssn
Policy
cost
(Alicia) (2)
(Hao)
18
ISA (`is a’) Hierarchies
• As in C++ and OO languages, attributes are inherited from superclass.
• A ISA B, every A entity is also considered to be a B entity. • Why using ISA?
• Add descriptive attributes specific (make sense) to a subclass.• Identify entities that make sense to a relationship (policy).
subclass entities
superclass entity
Contract_Emps
namessn
Employees
hourly_wagesISA
Hourly_Emps
contractid
hours_worked
19
ISA (`is a’) Constraints
• Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
Contract_Emps
namessn
Employees
hourly_wagesISA
Hourly_Emps
contractid
hours_worked
20
Aggregation
• Create relationship set from relationship sets.
• Aggregation: relationship set turns into an entity set– So that they can
participate in (other) relationships.
budgetdidpid
started_on
pbudgetdname
until
DepartmentsProjects Sponsors
Employees
Monitors
namessn
since
21
Design Guideline
1. Avoid redundancy.2. Don’t use an entity set when an
attribute will do.3. Limit the use of weak entity sets.
22
Avoiding Redundancy
• Redundancy occurs when we say the same thing in two different ways.
• Redundancy is bad– wastes space – encourages inconsistency.
• The two instances of the same fact may become inconsistent if we change one and forget to change the other instance.
23
Redundancy Example
Beers ManfsManfBy
name
This design states the manufacturer of a beer twice: as an attribute and as a related entity.
name
manf
addr
24
Fix Redundancy
Beers ManfsManfBy
name
This design gives the address of each manufacturer exactly once.
name addr
25
Example: Bad
Beers
name
This design repeats the manufacturer’s address once for each beer.Why is it bad?
•Manf updates its address.•Loses the address if there are temporarily no beers for a manufacturer.
manf manfAddr
26
Exercise 2.2 (R-G Book)A university database contains information about
professors (identified by social security number) and courses (identified by courseid). Professors teach courses; each of the following situations concerns the Teaches relationship set. For each situation, draw an ER diagram that describes it.
• Professors can teach the same course in several semesters, and each offering must be recorded.
27
• Professors can teach the same course in several semesters, and only the most recent such offering needs to be recorded.
• Every professor must teach some courses
28
• Every professor teaches exactly one course (no more, no less)
• Every professor teaches exactly one course (no more, no less), and every course must be taught by some professor
29
Exercise 2.3 (R-G Book)
• Professors have an SSN, a name, an age, a rank, and a research specialty.• Projects have a project number, a sponsor name (e.g., NSF), a starting
date, an ending date, and a budget.
30
• Graduate students have an SSN, a name, an age, and a degree program• Each project is managed by exactly one professor (known as PI)• Each project is worked in by one or more professors (known as Co-PIs)• Each project is worked on by one or more graduate students (known as RAs)
31
• When graduate students work on a project, a professor must supervise their work on the project. Graduate students can work on multiple projects, in which case they will have a potentially different supervisor for each one
• Departments have a department number, a department name, and a main office.• Department has a professor (known as Chairman) who runs the department.
32
• Professors work in one or more departments, and for each department that they work in, a time percentage is associated with their job
• Graduate students have one major department in which they are working on their degree.
• Each graduate student must have another, more senior graduate student as an advisor.
33
Summary
• ER model is popular for conceptual design– Sketch the design of a database informally using pictures
• Basic constructs in ER model: – entities, relationships, and attributes (of entities and
relationships).
• Some additional constructs: – weak entities, ISA hierarchies, and aggregation.
• Several kinds of integrity constraints:– key constraints, participation constraints, and
overlap/covering constraints for ISA hierarchies.
• Design guideline in ER model
34
Entity Sets Versus Attributes
• Modeling a concept with a new entity set should satisfy at least one of the following conditions:
– It is more than the name of something; it has at least one nonkey attribute.
or– It is the “many” in a many-one or many-
many relationship.
35
Example: Okay
Beers ManfsManfBy
name
•Manfs deserves to be an entity set because of the nonkey attribute addr.•Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.
name addr
36
Example: Beers Entity not Needed
Beers ManfsManfBy
name
•Beers can be an attribute rather than an entity.
name addr
37
Example: Okay
Beers
name
There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.
manf
38
Example: Bad
Beers ManfsManfBy
name
Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.
name
39
Don’t Overuse Weak Entity Sets
• Beginning database designers often doubt that anything could be a key by itself.– They make all entity sets weak, supported by all
other entity sets to which they are linked.
• In reality, we usually create unique ID’s for entity sets.– Examples include social-security numbers,
automobile VIN’s etc.
40
When Do We Need Weak Entity Sets?
• The usual reason is that there is no global authority capable of creating unique ID’s.
• Example: it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world.