+ All Categories
Home > Documents > CS405G: Introduction to Database Systems

CS405G: Introduction to Database Systems

Date post: 31-Jan-2016
Category:
Upload: garron
View: 27 times
Download: 3 times
Share this document with a friend
Description:
CS405G: Introduction to Database Systems. Final Review. Database Design. E-R model. E-R model Entities Attributes Relationships. Database Design. 8/21/2014. 8/21/2014. 5. 5. From E-R Diagram to Relations. Relations Schemas Converting E-R diagram to relations Keys Super keys - PowerPoint PPT Presentation
23
CS405G: Introduction to Database Systems Final Review
Transcript
Page 1: CS405G: Introduction to Database Systems

CS405G: Introduction to Database Systems

Final Review

Page 2: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 2

Database Design

Page 3: CS405G: Introduction to Database Systems

E-R model

• E-R model• Entities

• Attributes

• Relationships

Page 4: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 4

Page 5: CS405G: Introduction to Database Systems

04/22/23 5

Database Design

504/22/23

Page 6: CS405G: Introduction to Database Systems

From E-R Diagram to Relations

• Relations• Schemas• Converting E-R diagram to relations• Keys

• Super keys• Candidate keys• Primary keys

• Relational integrity constraints

Page 7: CS405G: Introduction to Database Systems

Key Constraints

• Superkey: – (Uniqueness constraints) A set of attributes where

no two distinct tuples can have the same values– Every relation has at least one superkey:

• The set of all attributes.

• Key: A minimal superkey– Uniqueness constraint (superkey)– Minimum Constraint

• No attribute can be removed and still satisfy the uniqueness constraints.

04/22/23 7Jinze Liu @ University of Kentucky

Page 8: CS405G: Introduction to Database Systems

04/22/23 8

Relational Integrity Constraints

• Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints:

1. Domain constraints1. The value of a attribute must come from its domain

2. Key constraints3. Entity integrity constraints4. Referential integrity constraints

804/22/23

Page 9: CS405G: Introduction to Database Systems

Database Normalization

• Functional Dependency• Functional Closure• Keys

– Redefined– Based on functional dependency

• DB Norm Form– 1st, 2nd, 3rd, BCNF

Page 10: CS405G: Introduction to Database Systems

04/22/23 Luke Huan Univ. of Kansas 10

Database Query

Page 11: CS405G: Introduction to Database Systems

Relational Algebra and SQL

• Relational algebra• SQL query

SFW Group by …, Having Subqueries

Relationship between R.A. and SQL

Page 12: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 12

Relational algebra

• Core set of operators:– Selection, projection, cross product, union, difference, and

renaming• Additional, derived operators:

– Join, natural join, intersection, etc.• Compose operators to make complex queries

RelOp

RelOp

A language for querying relational databases based on operators:

Page 13: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 13

Summary of core operators• Selection:• Projection:• Cross product:• Union:• Difference: • Renaming:

– Does not really add “processing” power

σp R

πL R

R X S

R SR - Sρ S(A1, A2, …) R

Page 14: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 14

Summary of derived operators

• Join: • Natural join:• Intersection:

R p S

R S

R S

Page 15: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 15

Classification of relational operators

• Selection: σp R

• Projection: πL R

• Cross product: R X S• Join: R p S

• Natural join: R S• Union: R U S• Difference: R - S• Intersection: R ∩ S

MonotoneMonotoneMonotoneMonotoneMonotoneMonotoneMonotone w.r.t. R; non-monotone w.r.t S

Monotone

Page 16: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 16

Update Operations on Relations

• Update operations– INSERT a tuple.– DELETE a tuple.– MODIFY a tuple.

• Constraints should not be violated in updates

Page 17: CS405G: Introduction to Database Systems

04/22/23 Luke Huan Univ. of Kansas 17

Basic queries: SFW statement

• SELECT A1, A2, …, An

FROM R1, R2, …, Rm

WHERE condition;• Also called an SPJ (select-project-join) query• (almost) Equivalent to relational algebra query

π A1, A2, …, An (σ condition (R1 X R2 X … X Rm))

Page 18: CS405G: Introduction to Database Systems

04/22/23 Luke Huan Univ. of Kansas 18

Semantics of SFW

• SELECT E1, E2, …, En

FROM R1, R2, …, Rm

WHERE condition;• For each t1 in R1:

For each t2 in R2: … … For each tm in Rm: If condition is true over t1, t2, …, tm: Compute and output E1, E2, …, En as a row

• t1, t2, …, tm are often called tuple variables

• Not 100% correct, we will see

Page 19: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 19

Operational semantics of GROUP BY

SELECT … FROM … WHERE … GROUP BY …;

• Compute FROM • Compute WHERE • Compute GROUP BY: group rows according to

the values of GROUP BY columns• Compute SELECT for each group

– For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group

Number of groups = number of rows in the final output

Page 20: CS405G: Introduction to Database Systems

04/22/23 Jinze Liu @ University of Kentucky 20

Database Design

Page 21: CS405G: Introduction to Database Systems

21

physical data organization• Storage hierarchy (DC vs. Pluto) ! count I/O’s• Disk geometry: three components of access cost;

random vs. sequential I/O• Data layout

– Record layout (handling variable-length fields, NULL’s)– Block layout (NSM, PAX) ! inter-/intra-record locality

• Access paths– Primary versus secondary indexes– Tree-based indexes: ISAM, B+-tree! Again, reintroduce redundancy to improve performance! Fundamental trade-off: query versus update cost

Page 22: CS405G: Introduction to Database Systems

Performance Issues on Indexes

• Indexes– ISAM– B+ Tree

• Metrics – Storage– IO-costs

• Operations– Single value query & range query– Insertion and deletion

22

Page 23: CS405G: Introduction to Database Systems

Query Processing Implementation

• Typical Query Processings– Selection– Join– Set operations.

• Typical approaches– Sequential scans in unsorted database– Sorted database– What are the tradeoffs.

23


Recommended