Relational Database Theory
• The Relational Theory
– Ways of working with data
• Types of “Models”
–File database model
–Hierarchical database model
–Network database model
–Relational database model
– Meaning of database model
• The way data is organized & stored
• The way data is manipulated
• Relational Model of Data
– Published in 1970 by Dr. Edgar (Ted) Codd – IBM
• “A Relational Model of Data for Large Shared Data Banks”
– Purpose
• Achieve program/data structure independence
• Treat data in a disciplined way
– Apply rigor of mathematics
– Uses Set Theory – sets of related data
• Improve programmer productivity
• The Relational Model
– Relational uses familiar concepts
• The data is perceived as organized in tables
– Relational also incorporates the rigor of mathematics
• Rows of the table are treated as elements in a set
• Manipulation of rows is based on set operations (Venn diagrams)
– User works with a set of rows at a time
• Relational also impacts Data Design
– Files were often constructed to support an application
– Tables are designed to describe one thing or Entity in the database
• Example of a Relation:
– ANIMAL – Entity (Relation)

ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
• Definition of a Relation
– Data is organized & stored in structures called relations
– A relation is a table that adheres to certain rules
• A relation can be called a table
– A relation is a table containing all the data about some entity
• An entity is a thing or object that is important in this application area
• Data items in the table are related
• Relational Data Structure

ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200

(Figure labels: Relation = the whole table; Attributes = the columns; Tuples = the rows; Primary Key = ANAME; Domains = Name, Species, Weight.)
• Relational Data Structure Definitions
– Relation
• The Table
– Tuple
• A Row
– Attribute
• A Column
– Primary Key
• A unique identifier for the table
– Domain
• A pool of legal values from which an attribute value is selected
– Related to meaning
– Has a Data Type
– Degree
• The number of attributes
– Cardinality
• The number of tuples
• Relational Table Rules
– A Relation is a table that adheres to the following rules:
• There are No Duplicate Tuples in the table
–The tuples in the table are treated as a mathematical set
– By definition, a set is a collection of unique elements
• There must be a primary key (unique identifier) for each tuple
• There is no order to the tuples (top to bottom)
• There is no order to the attributes (left to right)
– By convention, the primary key attribute is usually the first one on the left side of the table
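The table rules and the Degree/Cardinality definitions can be tried out with a small sketch. It assumes the ANIMAL relation is held as a Python set of named tuples; that representation is an illustration only, not how an RDBMS stores data.

```python
from collections import namedtuple

# Hypothetical stand-in for the ANIMAL relation. Attribute names follow the
# slides; storing the relation as a Python set is an assumption.
Animal = namedtuple("Animal", ["ANAME", "AFAMILY", "WEIGHT"])

animal = {
    Animal("Candice", "Camel", 1800),
    Animal("Zona", "Zebra", 900),
    Animal("Sam", "Snake", 5),
    Animal("Elmer", "Elephant", 5000),
    Animal("Leonard", "Lion", 1200),
    Animal("Sam", "Snake", 5),  # duplicate tuple: the set keeps only one copy
}

degree = len(Animal._fields)  # number of attributes
cardinality = len(animal)     # number of tuples

# ANAME can act as the primary key only if its values are unique.
aname_is_unique = len({row.ANAME for row in animal}) == cardinality

print(degree, cardinality, aname_is_unique)
```

Because a set has no ordering and no duplicates, the "no duplicate tuples" and "no order to the tuples" rules hold automatically in this representation.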
• Attributes
– Each attribute has a datatype
• Examples: Integer, character, date, user-defined
– The data value of an attribute can be null
– Each attribute value is atomic
• There is One & Only One data value in each cell of the table
• There are no Lists or Arrays
• One fact per field, one field per fact
– Can be called a Field (MS Access)
• Relational Data Structure: Design
– Each relation contains data about only one entity
• Each row corresponds to one unique occurrence of the entity
– A relation does not contain arrays, lists or repeating groups
• No multi-valued attributes
– Tables are designed according to Rules of Normalization
• Each data item in the table is determined
–By the Primary Key
–By the Whole Primary Key
–Only by the Primary Key
– Normalization avoids well-known update problems
• Optimizes design to minimize redundancy & storage requirements
• Example: Table with repeating group – Animal

ANAME     AFAMILY    WEIGHT   FOOD
Candice   Camel      1800     Hay
                              Buns
Zona      Zebra      900      Brush
Sam       Snake      5        Mice
                              People
Elmer     Elephant   5000     Leaves
Leonard   Lion       1200     People
                              Meat
• Example: Table with no repeating group

Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200

Animal-Food
ANAME     FOOD
Candice   Hay
Candice   Buns
Zona      Brush
Sam       Mice
Sam       People
Elmer     Leaves
Leonard   People
Leonard   Meat
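Removing the repeating group can be sketched in a few lines. The sketch assumes the unnormalized rows carry FOOD as a Python list; the function name `remove_repeating_group` is made up for illustration.

```python
# Unnormalized rows as in the repeating-group example: FOOD holds a list.
unnormalized = [
    ("Candice", "Camel", 1800, ["Hay", "Buns"]),
    ("Zona", "Zebra", 900, ["Brush"]),
    ("Sam", "Snake", 5, ["Mice", "People"]),
    ("Elmer", "Elephant", 5000, ["Leaves"]),
    ("Leonard", "Lion", 1200, ["People", "Meat"]),
]

def remove_repeating_group(rows):
    """Split the rows into an Animal relation and an Animal-Food relation."""
    animal = set()
    animal_food = set()
    for aname, afamily, weight, foods in rows:
        animal.add((aname, afamily, weight))  # one tuple per animal
        for food in foods:
            animal_food.add((aname, food))    # one tuple per (animal, food) fact
    return animal, animal_food

animal, animal_food = remove_repeating_group(unnormalized)
print(len(animal), len(animal_food))
```

Every cell in both results holds exactly one value, and ANAME is repeated in Animal-Food so that each food row can be traced back to its animal.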
• A Database Models the Real World
– A Database represents Reality
– The database is a collection of relations
• A relation represents an entity type
• Each tuple represents one occurrence of that entity type
• Each occurrence of an entity is unique
– A database contains information about
• Entities
• Relationships between entities
• Rules about the entities’ data & the relationships
• Relational Databases Support Relationships
– Relational databases support relationships between entities
• Relationship is established by a Foreign Key
• Repeat the Primary Key of one table in the related table(s)
• Example: The Zoo has an “Adopt-an-Animal” program
– A zoo member can adopt an animal

Zoo-Member (ANAME is the Foreign Key)
MID   MNAME          MADDR            ***   ANAME
171   N. Harrison    1400 Blush Rd          Zona
144   J. Montagano   1108 5th Ave           Leonard
194   J. Spence      1244 Lark Ln           Candice
303   E. Wingate     5222 Gains Dr          Candice
101   H. Yarchun     177 Beach Rd
270   K. Steeg       140 Crystal Dr         Zona
291   S. Ackerman    1172 Park Dr           Sam
301   K. Snyder      196 279th Ave

Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
• Example: Another Relationship

Animal-Food (Composite Primary Key: ANAME + FOOD; Foreign Key: ANAME)
ANAME     FOOD
Candice   Hay
Candice   Buns
Zona      Brush
Sam       Mice
Sam       People
Elmer     Leaves
Leonard   People
Leonard   Meat

Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
• Relational Integrity Rules
– Entity Integrity
• No part of the Primary Key (PK) may be Null
– Referential Integrity
• The value of a Foreign Key (FK) must either
– Be Null or
– Be one of the values of the PK in the related table
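The two integrity rules can be written as simple checks. This is a minimal sketch over in-memory rows, with `None` standing in for Null; the function names are hypothetical, and a real RDBMS enforces these rules through declared constraints.

```python
# PK values of the Animal table (ANAME) and a few Zoo-Member rows
# laid out as (MID, MNAME, ANAME). None stands for Null.
animal_keys = {"Candice", "Zona", "Sam", "Elmer", "Leonard"}
zoo_member = [
    (171, "N. Harrison", "Zona"),
    (144, "J. Montagano", "Leonard"),
    (101, "H. Yarchun", None),  # has not adopted an animal: FK is Null
]

def entity_integrity_ok(rows, pk_indexes):
    """Entity integrity: no part of the primary key may be Null."""
    return all(row[i] is not None for row in rows for i in pk_indexes)

def referential_integrity_ok(rows, fk_index, parent_keys):
    """Referential integrity: each FK is Null or a PK value of the related table."""
    return all(
        row[fk_index] is None or row[fk_index] in parent_keys
        for row in rows
    )

print(entity_integrity_ok(zoo_member, (0,)))
print(referential_integrity_ok(zoo_member, 2, animal_keys))
# A member pointing at an animal that does not exist breaks the rule.
print(referential_integrity_ok([(999, "X. Example", "Rex")], 2, animal_keys))
```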
• Keys, Keys, and More Keys
– Characteristics of a Primary Key (PK)
• Unique
• Mandatory
• Unchanging
• Under the control of IT organization
– Names or Types of Keys
• Candidate Key
–A minimal set of attributes that can be used as the unique identifier for a table
• Primary Key
–One of the candidate keys
• Alternate Key
–A candidate key that is not the primary key
• Foreign Key
– An attribute (or set of attributes) in one table that repeats the primary key of a related table
– Indicates relationships
• Composite Key
– A key composed of more than one attribute
• Search Key
– One or more attributes on which a retrieval is based
» Indexes
• Characteristics of Relationships
– Referential integrity applies to the relationship between entities
• Also known as an existence constraint or an enterprise rule
• For every relationship, referential integrity must be defined
• Relationships have Cardinality
– One-To-One
– One-To-Many
– Many-To-Many
• Relationships have Optionality
– Each entity’s participation is either
• Mandatory or
• Optional
• Cardinality reflects Business Rules
– One-To-One Relationship
• One animal is cared for by one zoo worker
• One zoo worker cares for one animal
– One-To-Many Relationship
• One animal is cared for by many zoo workers
• One zoo worker cares for only one animal
– Many-To-Many Relationship
• One animal is cared for by many zoo workers
• One zoo worker cares for many animals
• Mandatory Relationship
– The Foreign Key Cannot be Null
– Every purchase order must have a supplier
– In the example below the FK, SNO, cannot be Null
• Example:

PORDER
ONO    SNO    ODATE      ***
7001   1234   03/09/02
7002   2079   03/10/02
7003   2079   03/12/02
***

SUPPLIER
SNO    SNAME             SADDR
1234   Farm & Feed       7000 Booth Rd
2079   The Grain House   2001 Larkin Dr
***
• Example: FK can be Null

ANIMAL
ANID   ANAME     AFAMILY    WEIGHT
0001   Candice   Camel      1800
0002   Zona      Zebra      900
0003   Sam       Snake      5
0004   Elmer     Elephant   5000
0005   Leonard   Lion       1200

ZOO-MEMBER (ANID is the Foreign Key)
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave
• What happens when a Tuple is deleted?
– For every relationship, there are three possible delete options
• Cascades
– Delete the target tuple and
– Delete the related tuples
• Restricted
– Delete restricted to cases for which there are no related tuples
• Nullifies
– Delete the target tuple and
– Set the FK to null in the related tuples
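The three delete options map onto the `ON DELETE CASCADE`, `ON DELETE RESTRICT` and `ON DELETE SET NULL` clauses in SQL. The sketch below uses Python's built-in `sqlite3` module as one possible RDBMS. The table name `zoo_member` (an underscore replaces the hyphen) and the helper names are assumptions, and only one member row is loaded.

```python
import sqlite3

def build_zoo(delete_rule):
    """Create ANIMAL and ZOO_MEMBER with the given ON DELETE rule."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    con.execute("CREATE TABLE animal (anid TEXT PRIMARY KEY, aname TEXT)")
    con.execute(
        "CREATE TABLE zoo_member ("
        " mid INTEGER PRIMARY KEY, mname TEXT,"
        f" anid TEXT REFERENCES animal(anid) ON DELETE {delete_rule})"
    )
    con.execute("INSERT INTO animal VALUES ('0002', 'Zona')")
    con.execute("INSERT INTO zoo_member VALUES (171, 'N. Harrison', '0002')")
    return con

def delete_zona(delete_rule):
    """Delete the target tuple and report what is left in ZOO_MEMBER."""
    con = build_zoo(delete_rule)
    try:
        con.execute("DELETE FROM animal WHERE anid = '0002'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"  # Restricted: a related tuple still exists
    members = con.execute("SELECT mid, anid FROM zoo_member").fetchall()
    con.close()
    return outcome, members

for rule in ("CASCADE", "RESTRICT", "SET NULL"):
    print(rule, delete_zona(rule))
```

With CASCADE the member row disappears along with the animal. With RESTRICT the delete is refused while a member still references the animal. With SET NULL the member row stays and its ANID becomes Null.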
• Relational Algebra Operations
– Select
– Project
– Join
– Union
– Intersect
– Difference
• Our Zoo Database Tables

ANIMAL
ANID   ANAME     AFAMILY    WEIGHT
0001   Candice   Camel      1800
0002   Zona      Zebra      900
0003   Sam       Snake      5
0004   Elmer     Elephant   5000
0005   Leonard   Lion       1200

ZOO-MEMBER
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave

ANIMAL-FOOD
ANID   FOOD
0001   Hay
0001   Buns
0002   Brush
0003   Mice
0003   People
0004   Leaves
0005   People
0005   Meat
• Relational Algebra: SELECT
– Extracts specified tuples from a relation (or get rows from a table)
• Example: SELECT (display) from the ANIMAL-FOOD table the rows where FOOD = People

ANIMAL-FOOD
ANID   FOOD
0001   Hay
0001   Buns
0002   Brush
0003   Mice
0003   People
0004   Leaves
0005   People
0005   Meat

RESULTS
ANID   FOOD
0003   People
0005   People
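The SELECT operation amounts to filtering a set of tuples with a condition. A minimal sketch, assuming ANIMAL-FOOD is held as a Python set of (ANID, FOOD) tuples; the function name `select` is illustrative.

```python
# ANIMAL-FOOD as a set of (ANID, FOOD) tuples.
animal_food = {
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
    ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
    ("0005", "People"), ("0005", "Meat"),
}

def select(relation, predicate):
    """Return the tuples of the relation that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

results = select(animal_food, lambda row: row[1] == "People")
print(sorted(results))
```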
• Relational Algebra: PROJECT
– Extracts specified attributes(columns) from a relation (or get columns from a table)
• Example: PROJECT from the ZOO-MEMBER table the columns (MID, MNAME)

ZOO-MEMBER
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave

RESULTS
MID   MNAME
171   N. Harrison
144   J. Montagano
194   J. Spence
303   E. Wingate
101   H. Yarchun
270   K. Steeg
291   S. Ackerman
301   K. Snyder
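PROJECT keeps chosen columns and, because the result is again a set, drops any rows that become identical. A minimal sketch, assuming tuples plus a separate list of attribute names; only the first five members are loaded and the `***` columns are left out.

```python
# Attribute names and a few ZOO-MEMBER rows (None stands for Null).
ZOO_MEMBER_ATTRS = ("MID", "MNAME", "MADDR", "ANID")
zoo_member = {
    (171, "N. Harrison", "1400 Blush Rd", "0002"),
    (144, "J. Montagano", "1108 5th Ave", "0005"),
    (194, "J. Spence", "1244 Lark Ln", "0001"),
    (303, "E. Wingate", "5222 Gains Dr", "0001"),
    (101, "H. Yarchun", "177 Beach Rd", None),
}

def project(relation, attrs, wanted):
    """Keep only the wanted attributes; the set removes duplicate rows."""
    positions = [attrs.index(name) for name in wanted]
    return {tuple(row[i] for i in positions) for row in relation}

results = project(zoo_member, ZOO_MEMBER_ATTRS, ("MID", "MNAME"))
anids = project(zoo_member, ZOO_MEMBER_ATTRS, ("ANID",))
print(len(results), len(anids))
```

Projecting on ANID alone yields fewer rows than the source table, because two members adopted animal 0001 and the duplicate collapses.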
• Relational Algebra: JOIN
– Join the data in two tables
• Concatenate one row from Table 1 with one row from Table 2
–Usually based on a common column called the join condition
• Example: JOIN T1 and T2 based on the AFAMILY column

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03

RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
• Different types of Joins
– Equijoin – means a row in T1 is joined with a row in T2 where the values in the common column(s) are equal
– This is the most common type of join
– Join T1 and T2 where T1.AFAMILY=T2.AFAMILY

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03

RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
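An equijoin can be sketched as a nested loop that concatenates a T1 row with a T2 row whenever the join columns are equal. The tuple layout and the function name `equijoin` are assumptions, and a real RDBMS would normally pick a more efficient join algorithm.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]      # (AFAMILY, AREA)

def equijoin(left, right, left_col, right_col):
    """Concatenate each left row with each right row whose join columns are equal."""
    return [l + r for l in left for r in right if l[left_col] == r[right_col]]

# Join T1 and T2 where T1.AFAMILY = T2.AFAMILY
result = equijoin(t1, t2, left_col=1, right_col=0)
for row in result:
    print(row)
```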
• Natural Join
– The rows of T1 are joined with the rows of T2 where the PK value in one table equals the FK value in the other table
• The join columns are the columns whose names are the same in both tables
• The common column appears only once in the result
• Don’t use this in a Production Database – renaming a column silently changes the join condition
– T1 NATURAL JOIN T2

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03

RESULT
ANID   AFAMILY   AREA
0001   Camel     01
0002   Zebra     03
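A natural join picks its join columns from the column names alone, which the sketch below makes explicit. It assumes rows are Python dicts keyed by column name; `natural_join` is an illustrative helper, not a library function.

```python
t1 = [{"ANID": "0001", "AFAMILY": "Camel"}, {"ANID": "0002", "AFAMILY": "Zebra"}]
t2 = [{"AFAMILY": "Camel", "AREA": "01"}, {"AFAMILY": "Zebra", "AREA": "03"}]

def natural_join(left, right):
    """Join on every column name the two tables share; keep one copy of each."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])  # columns with the same name
    return [
        {**l, **r}  # merging the dicts keeps a single copy of the shared columns
        for l in left
        for r in right
        if all(l[c] == r[c] for c in common)
    ]

result = natural_join(t1, t2)
for row in result:
    print(row)
```

Since `common` is computed from the names, renaming a column in either table changes the join condition without any change to the query. With no shared names, this sketch pairs every row with every row.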
• Inner Join
– The rows of T1 are joined with the rows of T2 based on the join condition specified
• Only rows from T1 with a matching row in T2 are in the result
• Often an Inner Join is both a Natural Join & an Equijoin
• Example: Inner Join
– T1 INNER JOIN T2 on T1.AFAMILY=T2.AFAMILY

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03

RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
• Outer Join
– The rows of T1 are joined with the rows of T2
• All rows from one of the tables are included in the result even if there is no matching row in the other table
• Example: Outer Join
– T1 RIGHT OUTER JOIN T2 on T1.AFAMILY=T2.AFAMILY

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03
Snake     05

RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
                 Snake     05
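The right outer join can be sketched by looping over T2 and padding with Null wherever T1 has no match. `None` stands for Null here, and the function name and its `left_width` parameter are assumptions made for the illustration.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]               # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03"), ("Snake", "05")]  # (AFAMILY, AREA)

def right_outer_join(left, right, left_col, right_col, left_width):
    """Keep every right row; pad with None where no left row matches."""
    result = []
    for r in right:
        matches = [l for l in left if l[left_col] == r[right_col]]
        if matches:
            result.extend(l + r for l in matches)
        else:
            result.append((None,) * left_width + r)  # unmatched right row
    return result

# T1 RIGHT OUTER JOIN T2 on T1.AFAMILY = T2.AFAMILY
result = right_outer_join(t1, t2, left_col=1, right_col=0, left_width=2)
for row in result:
    print(row)
```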
• Cross Join
– Every row in T1 is joined with every row in T2
• All possible combinations of rows in the two tables
• Also called a Cartesian Product
• Example: Cross Join
– T1 CROSS JOIN T2

T1
ANID   AFAMILY
0001   Camel
0002   Zebra

T2
AFAMILY   AREA
Camel     01
Zebra     03

RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0001   Camel     Zebra     03
0002   Zebra     Camel     01
0002   Zebra     Zebra     03
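A cross join is the Cartesian product of the two row sets, which Python's standard `itertools.product` computes directly. The tuple layout is an assumption carried over from the example tables.

```python
from itertools import product

t1 = [("0001", "Camel"), ("0002", "Zebra")]  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]      # (AFAMILY, AREA)

# T1 CROSS JOIN T2: every T1 row paired with every T2 row, no join condition.
result = [l + r for l, r in product(t1, t2)]
for row in result:
    print(row)
```

The result always has len(T1) × len(T2) rows, which is why an accidental cross join on large tables is expensive.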
• An RDBMS manipulates Data using Relational Algebra Operations
– There are (usually) several sequences of operations to answer a query
• One sequence may be more efficient than another
– A relational DBMS internally has routines that do the relational algebra
– A relational DBMS generates a sequence or plan of relational algebra operations to accomplish the request
– A relational DBMS has a query optimizer to develop an efficient query plan
• A least-cost optimizer generates several execution plans and chooses the least-cost one, e.g. the one with the least amount of I/O
• Union, Intersection, and Difference (Minus)
– Union – combine the result tables from two queries, keeping each distinct row once
– Intersect – take only the rows that are identical in the result tables from two queries
– Difference – take only the rows in the first result table that have no identical rows in the second result table
• Relational Algebra: UNION
– Union together the results of two queries
• Result contains every element in either one or both sets
– Query 1
• Select the rows from ANIMAL where WEIGHT > 2000 into T1
• Project from T1(ANID) into Result 1
– Query 2
• Select the rows from ANIMAL-FOOD where FOOD=PEOPLE into T2
• Project from T2(ANID) into Result 2
– Query 1 UNION Query 2

RESULT 1   RESULT 2   RESULT (UNION)
ANID       ANID       ANID
0004       0003       0003
           0005       0004
                      0005
• Relational Algebra: INTERSECTION
– Take only the rows (tuples) that are identical in the result tables of two queries
• Query 1
– Select out the rows from ANIMAL where WEIGHT > 1000 into T1
– Project from T1(ANID) into Result 1
• Query 2
– Project from ZOO-MEMBER(ANID) into Result 2
• Query 1 INTERSECT Query 2

RESULT 1   RESULT 2   RESULT (INTERSECT)
ANID       ANID       ANID
0001       0002       0001
0004       0005       0005
0005       0001
           0003
• Relational Algebra: Minus/Difference/Except
– Subtract the results of the second query from the results of the first query
• Query 1
– Project from ANIMAL(ANID) into Result 1
• Query 2
– Project from ZOO-MEMBER(ANID) into Result 2
• Query 1 EXCEPT Query 2

RESULT 1   RESULT 2   RESULT (EXCEPT)
ANID       ANID       ANID
0001       0002       0004
0002       0005
0003       0001
0004       0003
0005
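The three set operations correspond to Python's `|`, `&` and `-` operators on sets, so the UNION, INTERSECT and EXCEPT examples can be reproduced in a few lines. The variable names are illustrative. Null ANID values are dropped from the ZOO-MEMBER projection, matching Result 2 in the examples; that handling is a simplification.

```python
animal_weight = {  # ANID -> WEIGHT, from the ANIMAL table
    "0001": 1800, "0002": 900, "0003": 5, "0004": 5000, "0005": 1200,
}
animal_food = {
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
    ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
    ("0005", "People"), ("0005", "Meat"),
}
# ANID column of ZOO-MEMBER; None stands for Null.
zoo_member_anids = ["0002", "0005", "0001", "0001", None, "0002", "0003", None]

# UNION: animals over 2000, or animals that eat people.
heavy = {anid for anid, weight in animal_weight.items() if weight > 2000}
eats_people = {anid for anid, food in animal_food if food == "People"}
union_result = heavy | eats_people

# INTERSECT: animals over 1000 that have been adopted.
over_1000 = {anid for anid, weight in animal_weight.items() if weight > 1000}
adopted = {anid for anid in zoo_member_anids if anid is not None}
intersect_result = over_1000 & adopted

# EXCEPT: animals that nobody has adopted.
except_result = set(animal_weight) - adopted

print(sorted(union_result), sorted(intersect_result), sorted(except_result))
```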
• Strengths of the Relational Approach
– Simple
• People are familiar with tables
• Few rules
• Few operations
– Easy to learn
• Relational algebra is straightforward
• Multiple high-level, non-procedural languages are available
– SQL