Date post: 03-Nov-2014
Category:
Upload: tess98
Page 1: (.ppt)

INFS614, GMU 1

The Relational Model

Lecture 3

INFS614, Fall 2008

Page 2: (.ppt)

Relational Model

Relational Model = Structure + Operations
– Structure: Relations (or Tables)
– Operations: Relational Algebra, SQL

Most widely implemented model.
– Vendors: IBM DB2, Microsoft SQL Server, Oracle, etc.

Our design + implementation approach:
Step 1: ER design (ERD)
Step 2: Translate to Relational (Relational Schema)
Step 3: Querying over the relational model

Page 3: (.ppt)

Relational Database: Definitions

Relational database: a set of relations.

Relation: made up of 2 parts:
– Instance: a table, with rows and columns. #Rows = cardinality, #fields = degree / arity.
– Schema: specifies name of relation, plus name and type of each column.
  E.g., Students(sid: string, name: string, login: string, age: integer, gpa: real).

We can think of a relation as a set of rows or tuples (i.e., all rows are distinct).

Page 4: (.ppt)

Example: Instance of Students Relation

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Cardinality = 3, degree = 5, all rows distinct.

Do all columns in a relation instance have to be distinct?

The order in which the rows are listed is not important.
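The cardinality and degree of this instance can be checked mechanically. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS (the course does not prescribe one); the variable names are illustrative only.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
rows = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", rows)

cursor = conn.execute("SELECT * FROM Students")
degree = len(cursor.description)   # number of fields (columns)
instance = set(cursor.fetchall())  # a relation instance is a set of tuples
cardinality = len(instance)        # number of distinct rows
```

Storing the rows in a Python set mirrors the definition: duplicates would collapse, and the listing order carries no meaning.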

Page 5: (.ppt)

Another Example: Employees Relation

Employees schema:
Employees(ssn: integer, name: string, rank: char, salary: float)

An instance of Employees:

Page 6: (.ppt)

Example: Employees Relation (Contd.)

An instance of Employees is a set of tuples (or rows):

{ <633909767, Richard Boon, A, 75689.09>,
  <674627883, Adolfo Laurenti, B, 67890.00>,
  <193838904, Will Smith, C, 50000.00>,
  … }

Page 7: (.ppt)

Relational Database: Definitions

Instance: a set of tuples of the relation.

A tuple: <a1:d1, …, an:dn>, where aj is an attribute name and dj is the value of attribute aj; dj either belongs to Domain(aj) or is NULL.

An instance of Employees =
{ <ssn:633909767, name:Richard Boon, rank:A, salary:75689.09>,
  <ssn:674627883, name:Adolfo Laurenti, rank:B, salary:67890.00>,
  <ssn:193838904, name:Will Smith, rank:C, salary:50000.00>,
  … }

Page 8: (.ppt)

Relational Database: Definitions

Relational database: a set of relations.

Relational database schema: the collection of schemas for the relations in the database.

Page 9: (.ppt)

Example: A Company Database Schema

A First Schema:

Employees(ssn: integer, name: string, rank: integer, salary: float)
Projects(pid: integer, pname: string, budget: float)
Location(address: string, capacity: integer)
Departments(did: integer, dname: string, budget: float)
Manages(ssn: integer, did: integer, since: date)
Reports_To(ssnSubordinate: integer, ssnSupervisor: integer)
Works_for(ssn: integer, pid: integer, hours: float)
Works_in(ssn: integer, did: integer, address: string)

Page 10: (.ppt)

Relational Query Languages

A major strength of the relational model: supports simple, powerful querying of data.

Queries can be written intuitively, and the DBMS is responsible for efficient evaluation.
– The key: precise semantics for relational queries.
– Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

Page 11: (.ppt)

The SQL Query Language

Developed by IBM (System R) in the 1970s.

Need for a standard since it is used by many vendors.

Standards:
– SQL-86
– SQL-89 (minor revision)
– SQL-92 (major revision, current standard)
– SQL-99 (major extensions)

Page 12: (.ppt)

Creating Relations in SQL

  CREATE TABLE Students
    (sid CHAR(20), name CHAR(20), login CHAR(10),
     age INTEGER, gpa REAL)

Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.

As another example, the Enrolled table holds information about courses that students take.

  CREATE TABLE Enrolled
    (sid CHAR(20), cid CHAR(20), grade CHAR(2))

Page 13: (.ppt)

Adding and Deleting Tuples

We can insert a single tuple using:

  INSERT INTO Students (sid, name, login, age, gpa)
  VALUES (53688, 'Smith', 'smith@ee', 18, 3.2)

Can delete all tuples satisfying some condition (e.g., name = Smith):

  DELETE FROM Students S
  WHERE S.name = 'Smith'

Powerful variants of these commands are available; more later!
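The insert and delete statements can be tried end to end. This is a minimal sketch, assuming Python's built-in sqlite3 module as the DBMS; the second student row and the variable names are added for illustration, and the table alias is dropped from DELETE to keep the statement portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)

# Insert single tuples, naming the target columns explicitly.
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)"
)
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)"
)

# Delete every tuple that satisfies the condition name = 'Smith'.
deleted = conn.execute("DELETE FROM Students WHERE name = 'Smith'").rowcount
remaining = conn.execute("SELECT sid, name FROM Students").fetchall()
```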

Page 14: (.ppt)

Querying Relational Data

To find all 18 year old students, we can write:

  SELECT *
  FROM Students S
  WHERE S.age = 18

Instance of Students:

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Page 15: (.ppt)

Querying Relational Data (Contd.)

The result of

  SELECT *
  FROM Students S
  WHERE S.age = 18

is:

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2

To find just names and logins, replace the first line:

  SELECT S.name, S.login
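Both forms of the query can be run against the three-row Students instance. A minimal sketch, assuming Python's built-in sqlite3 module; the ORDER BY clause is an addition that makes the row order deterministic, since a relation itself has no order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

# All fields of the 18-year-old students.
all_fields = conn.execute(
    "SELECT * FROM Students S WHERE S.age = 18 ORDER BY S.sid"
).fetchall()

# Only names and logins: same query, different SELECT list.
names_logins = conn.execute(
    "SELECT S.name, S.login FROM Students S WHERE S.age = 18 ORDER BY S.sid"
).fetchall()
```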

Page 16: (.ppt)

Updating Tuples

Can update tuples using:

  UPDATE Students S
  SET S.age = S.age + 1, S.gpa = S.gpa - 1
  WHERE S.sid = 53688

Page 17: (.ppt)

Querying Multiple Relations

What does the following query compute?

  SELECT S.name, E.cid
  FROM Students S, Enrolled E
  WHERE S.sid = E.sid AND E.grade = 'A'

Page 18: (.ppt)

Querying Multiple Relations

Instance of Students:

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Instance of Enrolled:

sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B

Given these instances, we get:

S.name  E.cid
Smith   Topology112
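The two-relation query can be reproduced on the instances shown. A minimal sketch, assuming Python's built-in sqlite3 module; the string literal 'A' uses single quotes, as standard SQL requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),
        ("53831", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

# Pair each student with their enrollments, keeping only grade-A rows.
result = conn.execute(
    "SELECT S.name, E.cid "
    "FROM Students S, Enrolled E "
    "WHERE S.sid = E.sid AND E.grade = 'A'"
).fetchall()
```

The query returns the names of students who earned an A, together with the course in which they earned it.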

Page 19: (.ppt)

Destroying and Altering Relations

  DROP TABLE Students

Destroys the relation Students. The schema information and the tuples are deleted.

  ALTER TABLE Students
    ADD COLUMN firstYear INTEGER

The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.
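The effect of both statements can be observed directly. A minimal sketch, assuming Python's built-in sqlite3 module; `PRAGMA table_info` and the `sqlite_master` catalog table are SQLite-specific ways to inspect the schema and are not part of standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
    ],
)

# Add a field; existing tuples are extended with NULL in it.
conn.execute("ALTER TABLE Students ADD COLUMN firstYear INTEGER")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]
new_values = [row[0] for row in conn.execute("SELECT firstYear FROM Students")]

# Drop the relation: schema information and tuples are both gone.
conn.execute("DROP TABLE Students")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```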

Page 20: (.ppt)

Integrity Constraints (ICs)

IC: condition that must be true for any instance of the database; e.g., domain constraints.
– ICs are specified when schema is defined.
– ICs are checked when relations are modified.

A legal instance of a relation is one that satisfies all specified ICs.
– DBMS should not allow illegal instances.

If the DBMS checks ICs, stored data is more faithful to real-world meaning.
– Avoids data entry errors, too!

Page 21: (.ppt)

Primary Key Constraints

A set of fields is a (candidate) key for a relation if:
1. No two distinct tuples can have the same values in all key fields, and
2. This is not true for any proper subset of the key.
– Part 2 false? A superkey.
– If there is more than one candidate key for a relation, one of the keys is chosen (by the DBA) to be the primary key.

E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.
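The two conditions can be tested against a concrete instance, with one caveat: an instance can show that a set of fields is not a key, but it cannot prove that it is one. The sketch below is plain Python; the helper names `is_unique_in` and `unique_proper_subsets` are hypothetical, introduced only for this illustration.

```python
from itertools import combinations

FIELDS = ("sid", "name", "login", "age", "gpa")
INSTANCE = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]

def is_unique_in(instance, attrs):
    """True if no two tuples of this instance agree on every attribute in attrs."""
    positions = [FIELDS.index(a) for a in attrs]
    projected = [tuple(row[p] for p in positions) for row in instance]
    return len(set(projected)) == len(projected)

def unique_proper_subsets(instance, attrs):
    """Non-empty proper subsets of attrs that are also unique in this instance."""
    return [
        subset
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
        if is_unique_in(instance, subset)
    ]

name_possible = is_unique_in(INSTANCE, ("name",))               # two Smiths
sid_possible = is_unique_in(INSTANCE, ("sid",))
smaller = unique_proper_subsets(INSTANCE, ("sid", "gpa"))
```

Here name is ruled out as a key, and {sid, gpa} is not minimal because {sid} alone is already unique. Note that gpa also happens to be unique in these three rows even though it is clearly not a key, which is why key constraints must be declared rather than inferred.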

Page 22: (.ppt)

Primary and Candidate Keys in SQL

Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.

"For a given student and course, there is a single grade."

  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid))

vs. "Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade."

  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid),
     UNIQUE (cid, grade))

Used carelessly, an IC can prevent the storage of database instances that arise in practice!
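The difference between the two declarations shows up as soon as realistic rows are inserted. A minimal sketch, assuming Python's built-in sqlite3 module; the helper `accepted_rows` and the sample rows are hypothetical.

```python
import sqlite3

PER_COURSE_GRADE = (
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid))"
)
OVERLY_STRICT = (
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid), UNIQUE (cid, grade))"
)

ROWS = [
    ("53666", "Carnatic101", "C"),
    ("53666", "Reggae203", "B"),    # same student, second course
    ("53650", "Carnatic101", "C"),  # same course and grade, other student
]

def accepted_rows(ddl, rows):
    """Create Enrolled from ddl, try each insert, and return the rows the DBMS kept."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    kept = []
    for row in rows:
        try:
            conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", row)
            kept.append(row)
        except sqlite3.IntegrityError:
            pass  # the key or uniqueness constraint rejected this row
    return kept

kept_first = accepted_rows(PER_COURSE_GRADE, ROWS)
kept_second = accepted_rows(OVERLY_STRICT, ROWS)
```

Under the first schema all three rows are legal. Under the second, only the first row survives, even though the other two describe perfectly ordinary enrollments.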

Page 23: (.ppt)

Foreign Keys, Referential Integrity

In addition to Students we have a second relation:
Enrolled(sid: string, cid: string, grade: string)

Only students listed in the Students relation should be allowed to enroll for courses.

Enrolled:

sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students:

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Page 24: (.ppt)

Foreign Keys, Referential Integrity

Foreign key: Set of fields in one relation that is used to "refer" to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a "logical pointer".

E.g., sid is a foreign key referring to Students:
– Enrolled(sid: string, cid: string, grade: string)
– If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.
– Can you name a data model w/o referential integrity? Links in HTML!

Page 25: (.ppt)

Foreign Keys, Referential Integrity

Another example:
– Only employees in the Employees relation should be allowed to be managers: ssn is a foreign key with respect to Employees.
– Only projects in the Projects relation should be allowed to be managed: pid is a foreign key with respect to Projects.

Managers (ssn references Employees, pid references Projects):

ssn        pid  hours
534559257  1    2
123456789  1    56
231896598  53   8
193838902  18   36
354681756  18   46

Projects:

pid  pname  pbudget
1    XA011  5000000.00
53   Y      7560000.00
18   X      250000.00

Page 26: (.ppt)

Foreign Keys in SQL

Only students listed in the Students relation should be allowed to enroll for courses.

  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid),
     FOREIGN KEY (sid) REFERENCES Students)

Enrolled:

sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students:

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Page 27: (.ppt)

Foreign Keys, Referential Integrity

A foreign key must correspond to the primary key of the referenced relation.

A foreign key states a referential IC between two relations: a tuple in one relation that refers to another must refer to an existing tuple in that relation.

Referential integrity constraints are used to maintain consistency among tuples of two related relations.

Page 28: (.ppt)

Enforcing Referential Integrity

Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students.

What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)

What should be done if a Students tuple is deleted?
– Also delete all Enrolled tuples that refer to it.
– Disallow deletion of a Students tuple that is referred to.
– Set sid in Enrolled tuples that refer to it to a default sid.
– (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting "unknown" or "inapplicable".)

Similarly if primary key value of a Students tuple is updated.

Page 29: (.ppt)

Referential Integrity in SQL/92

SQL/92 supports all 4 options on deletes and updates.
– Default is NO ACTION (delete/update is rejected)
– CASCADE (also delete all tuples that refer to deleted tuple)
– SET NULL / SET DEFAULT (sets foreign key value of referencing tuple)

  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid),
     FOREIGN KEY (sid) REFERENCES Students
       ON DELETE CASCADE
       ON UPDATE NO ACTION)
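The behavior of these options can be observed on the Students and Enrolled instances. A minimal sketch, assuming Python's built-in sqlite3 module; SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, the Students table is reduced to two columns for brevity, and the helper `is_rejected` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
conn.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid), "
    "FOREIGN KEY (sid) REFERENCES Students "
    "ON DELETE CASCADE ON UPDATE NO ACTION)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("53666", "Jones"), ("53650", "Smith")],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53666", "Carnatic101", "C"),
        ("53666", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

def is_rejected(sql):
    """Run sql and report whether the DBMS refused it on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Insert referring to a non-existent student: rejected.
dangling_insert_rejected = is_rejected(
    "INSERT INTO Enrolled VALUES ('99999', 'History105', 'A')"
)
# ON UPDATE NO ACTION: changing a referenced key is rejected.
key_update_rejected = is_rejected(
    "UPDATE Students SET sid = '11111' WHERE sid = '53650'"
)
# ON DELETE CASCADE: deleting Jones also deletes Jones's enrollments.
conn.execute("DELETE FROM Students WHERE sid = '53666'")
remaining = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()
```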

Page 30: (.ppt)

Where do ICs Come From?

ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.

We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance.
– An IC is a statement about all possible instances!
– From the example, we know name is not a key, but the assertion that sid is a key is given to us.

Key and foreign key ICs are the most common; more general ICs are supported too.

Page 31: (.ppt)

Logical DB Design: ER to Relational

Entity sets to tables.

[ER diagram: entity set Employees with attributes ssn, name, lot]

  CREATE TABLE Employees
    (ssn CHAR(11),
     name CHAR(20),
     lot INTEGER,
     PRIMARY KEY (ssn))

Page 32: (.ppt)

Relationship Sets to Tables

In translating a relationship set to a relation, attributes of the relation must include:
– Keys for each participating entity set (as foreign keys). This set of attributes forms a superkey for the relation.
– All descriptive attributes.

  CREATE TABLE Works_In
    (ssn CHAR(11),
     did INTEGER,
     since DATE,
     PRIMARY KEY (ssn, did),
     FOREIGN KEY (ssn) REFERENCES Employees,
     FOREIGN KEY (did) REFERENCES Departments)

Page 33: (.ppt)

Translating Ternary Relationship Set

[ER diagram: ternary relationship set Works_In0 among Employees (ssn, name, salary, rank), Departments (did, dname, dbudget), and Locations (address, capacity)]

Works_In0(ssn: integer, did: integer, address: string)

  CREATE TABLE Works_In0
    (ssn CHAR(11),
     did INTEGER,
     address CHAR(60),
     PRIMARY KEY (ssn, did, address),
     FOREIGN KEY (ssn) REFERENCES Employees,
     FOREIGN KEY (did) REFERENCES Departments,
     FOREIGN KEY (address) REFERENCES Locations)

Page 34: (.ppt)

Review: Key Constraints

Each dept has at most one manager, according to the key constraint on Manages.

Translation to relational model?

[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by the relationship set Manages (since); legend: 1-to-1, 1-to-Many, Many-to-1, Many-to-Many]

Page 35: (.ppt)

Translating ER Diagrams with Key Constraints

Map relationship to a table:
– Note that did is the key now!
– Separate tables for Employees and Departments.

  CREATE TABLE Manages
    (ssn CHAR(11),
     did INTEGER,
     since DATE,
     PRIMARY KEY (did),
     FOREIGN KEY (ssn) REFERENCES Employees,
     FOREIGN KEY (did) REFERENCES Departments)

Since each department has a unique manager, we could instead combine Manages and Departments.

  CREATE TABLE Dept_Mgr
    (did INTEGER,
     dname CHAR(20),
     budget REAL,
     ssn CHAR(11),
     since DATE,
     PRIMARY KEY (did),
     FOREIGN KEY (ssn) REFERENCES Employees)
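Making did the primary key of Manages is what enforces "at most one manager per department". A minimal sketch, assuming Python's built-in sqlite3 module; the foreign keys are omitted to keep the example short, and the ssn values, department numbers, and dates are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Manages "
    "(ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (did))"
)
conn.execute("INSERT INTO Manages VALUES ('123-22-3666', 51, '2008-01-01')")

# A second manager for department 51 violates the key constraint.
try:
    conn.execute("INSERT INTO Manages VALUES ('231-31-5368', 51, '2008-06-01')")
    second_manager_accepted = True
except sqlite3.IntegrityError:
    second_manager_accepted = False

# The same employee may still manage another department.
conn.execute("INSERT INTO Manages VALUES ('123-22-3666', 56, '2008-03-01')")
managed = conn.execute(
    "SELECT did FROM Manages WHERE ssn = '123-22-3666' ORDER BY did"
).fetchall()
```

Had the key been (ssn, did), as for a many-to-many relationship set, the second insert would have succeeded.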

Page 36: (.ppt)

Review: Participation Constraints

Does every department have a manager?
– If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial).
– Every did value in Departments table must appear in a row of the Manages table (with a non-null ssn value!)

[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by the relationship sets Manages (since) and Works_In (since)]

Page 37: (.ppt)

Participation Constraints in SQL

We can capture participation constraints involving one entity set in a binary relationship, but little else (without resorting to CHECK constraints).

  CREATE TABLE Dept_Mgr
    (did INTEGER,
     dname CHAR(20),
     budget REAL,
     ssn CHAR(11) NOT NULL,
     since DATE,
     PRIMARY KEY (did),
     FOREIGN KEY (ssn) REFERENCES Employees
       ON DELETE NO ACTION)
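The combination of NOT NULL and ON DELETE NO ACTION can be exercised as follows. A minimal sketch, assuming Python's built-in sqlite3 module with foreign key enforcement switched on; the Employees table is cut down to two columns, and all data values and the helper `is_rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.execute("CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20))")
conn.execute(
    "CREATE TABLE Dept_Mgr (did INTEGER, dname CHAR(20), budget REAL, "
    "ssn CHAR(11) NOT NULL, since DATE, "
    "PRIMARY KEY (did), "
    "FOREIGN KEY (ssn) REFERENCES Employees ON DELETE NO ACTION)"
)
conn.execute("INSERT INTO Employees VALUES ('123-22-3666', 'Attishoo')")
conn.execute(
    "INSERT INTO Dept_Mgr VALUES (51, 'Toys', 100000.0, '123-22-3666', '2008-01-01')"
)

def is_rejected(sql):
    """Run sql and report whether the DBMS refused it on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A department without a manager violates NOT NULL (total participation).
no_manager_rejected = is_rejected(
    "INSERT INTO Dept_Mgr VALUES (56, 'Books', 50000.0, NULL, NULL)"
)
# An employee who still manages a department cannot be deleted.
manager_delete_rejected = is_rejected(
    "DELETE FROM Employees WHERE ssn = '123-22-3666'"
)
```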

Page 38: (.ppt)

Participation Constraints

Is it possible to express this participation constraint using only key and foreign key constraints?

[ER diagram: Employees (ssn, name, salary, rank) and Projects (pid, pname) connected by the relationship set Works_For (hours)]

Works_for(ssn: integer, pid: integer, hours: float)

  CREATE TABLE Works_for
    (ssn INTEGER,
     pid INTEGER,
     hours FLOAT,
     PRIMARY KEY (ssn, pid),
     FOREIGN KEY (ssn) REFERENCES Employees,
     FOREIGN KEY (pid) REFERENCES Projects)

NO

Page 39: (.ppt)

Review: Weak Entities

A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.
– Owner entity set and weak entity set must participate in a one-to-many relationship set (1 owner, many weak entities).
– Weak entity set must have total participation in this identifying relationship set.

[ER diagram: Employees (ssn, name, lot) connected through the identifying relationship set Policy (cost) to the weak entity set Dependents (pname, age)]

Page 40: (.ppt)

Translating Weak Entity Sets

Weak entity set and identifying relationship set are translated into a single table.
– When the owner entity is deleted, all owned weak entities must also be deleted.

  CREATE TABLE Dep_Policy
    (pname CHAR(20),
     age INTEGER,
     cost REAL,
     ssn CHAR(11),
     PRIMARY KEY (pname, ssn),
     FOREIGN KEY (ssn) REFERENCES Employees
       ON DELETE CASCADE)

Page 41: (.ppt)

Review: ISA Hierarchies

[ER diagram: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]

As in C++, or other PLs, attributes are inherited. If we declare A ISA B, every A entity is also considered to be a B entity.

Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)

Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)

Page 42: (.ppt)

Translating ISA Hierarchies to Relations

General approach:
– 3 relations: Employees, Hourly_Emps and Contract_Emps.
– Hourly_Emps: Every employee is recorded in Employees. For hourly emps, extra info is recorded in Hourly_Emps (hourly_wages, hours_worked, ssn); the Hourly_Emps tuple must be deleted if the referenced Employees tuple is deleted.
– Queries involving all employees are easy; those involving just Hourly_Emps require a join to get some attributes.

Alternative: Just Hourly_Emps and Contract_Emps.
– Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked.
– Each employee must be in one of these two subclasses.

Page 43: (.ppt)

Translating ISA Hierarchies to Relations

General approach:

  CREATE TABLE Hourly_Emps
    (hourly_wages REAL,
     hours_worked REAL,
     ssn CHAR(11),
     PRIMARY KEY (ssn),
     FOREIGN KEY (ssn) REFERENCES Employees
       ON DELETE CASCADE)

Similarly for the Contract_Emps table.
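Under the general approach, inherited attributes such as name live in Employees, so a query about hourly employees needs a join. A minimal sketch, assuming Python's built-in sqlite3 module with foreign keys enabled; the employee rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.execute(
    "CREATE TABLE Employees "
    "(ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))"
)
conn.execute(
    "CREATE TABLE Hourly_Emps (hourly_wages REAL, hours_worked REAL, "
    "ssn CHAR(11), PRIMARY KEY (ssn), "
    "FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("123-22-3666", "Attishoo", 48), ("231-31-5368", "Smiley", 22)],
)
conn.execute("INSERT INTO Hourly_Emps VALUES (10.0, 40.0, '123-22-3666')")

# name comes from Employees, wage data from Hourly_Emps: a join is needed.
hourly = conn.execute(
    "SELECT E.name, H.hourly_wages, H.hours_worked "
    "FROM Employees E, Hourly_Emps H "
    "WHERE E.ssn = H.ssn"
).fetchall()

# Deleting the Employees tuple removes the subclass tuple as well.
conn.execute("DELETE FROM Employees WHERE ssn = '123-22-3666'")
hourly_after_delete = conn.execute("SELECT * FROM Hourly_Emps").fetchall()
```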

Page 44: (.ppt)

Translating Aggregations

[ER diagram: Employees (ssn, name, salary, rank) connected by Monitors (until) to the aggregation of the relationship set Sponsors (since) between Departments (did, dname, dbudget) and Projects (pid, pname)]

Page 45: (.ppt)

Translating Aggregations

Sponsors(did, pid, since)
Monitors(ssn, did, pid, until)

  CREATE TABLE Sponsors
    (did INTEGER,
     pid INTEGER,
     since DATE,
     PRIMARY KEY (did, pid),
     FOREIGN KEY (did) REFERENCES Departments,
     FOREIGN KEY (pid) REFERENCES Projects)

  CREATE TABLE Monitors
    (ssn INTEGER,
     did INTEGER,
     pid INTEGER,
     until DATE,
     PRIMARY KEY (ssn, did, pid),
     FOREIGN KEY (did, pid) REFERENCES Sponsors,
     FOREIGN KEY (ssn) REFERENCES Employees)

Page 46: (.ppt)

Translating Aggregations

If:
– every sponsored project has a monitor, and
– the attribute "since" is not required for Sponsors,

then every possible instance of the Sponsors relationship is obtained by looking at the set of pairs <pid, did> in the relation Monitors.

Therefore, we can omit the Sponsors relation.
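When both conditions hold, the Sponsors pairs are simply a projection of Monitors. A minimal sketch, assuming Python's built-in sqlite3 module; the Monitors rows are invented, and the foreign keys are left out because Sponsors no longer exists to be referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Monitors (ssn INTEGER, did INTEGER, pid INTEGER, "
    "until DATE, PRIMARY KEY (ssn, did, pid))"
)
conn.executemany(
    "INSERT INTO Monitors VALUES (?, ?, ?, ?)",
    [
        (534559257, 51, 1, "2009-01-01"),
        (123456789, 51, 1, "2009-06-01"),  # second monitor, same sponsorship
        (231896598, 56, 53, "2009-03-01"),
    ],
)

# Projecting Monitors onto (did, pid) recovers the sponsorships.
sponsors = conn.execute(
    "SELECT DISTINCT did, pid FROM Monitors ORDER BY did, pid"
).fetchall()
```

DISTINCT matters here: a sponsorship with several monitors must still appear only once in the derived relation.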

Page 47: (.ppt)

Review: Binary vs. Ternary Relationships

If each policy is owned by just 1 employee:
– Key constraint on Policies would mean policy can only cover 1 dependent!

What are the additional constraints in the 2nd diagram?

[ER diagram, "Bad design": ternary relationship set Covers among Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age)]

[ER diagram, "Better design": Purchaser relates Employees (ssn, name, lot) to Policies (policyid, cost); Beneficiary relates Policies to Dependents (pname, age)]

Page 48: (.ppt)

Binary vs. Ternary Relationships (Contd.)

The key constraints allow us to combine Purchaser with Policies and Beneficiary with Dependents.

Participation constraints lead to NOT NULL constraints.

  CREATE TABLE Policies
    (policyid INTEGER,
     cost REAL,
     ssn CHAR(11) NOT NULL,
     PRIMARY KEY (policyid),
     FOREIGN KEY (ssn) REFERENCES Employees
       ON DELETE CASCADE)

  CREATE TABLE Dependents
    (pname CHAR(20),
     age INTEGER,
     policyid INTEGER,
     PRIMARY KEY (pname, policyid),
     FOREIGN KEY (policyid) REFERENCES Policies
       ON DELETE CASCADE)

Page 49: (.ppt)

Relational Model: Summary

A tabular representation of data.

Simple and intuitive, currently the most widely used.

Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations.
– Two important ICs: primary and foreign keys
– In addition, we always have domain constraints.

Powerful and natural query languages exist.

Rules to translate ER to relational model.

