Post on 20-Dec-2020
Relational Algebra
Topic 2, Lesson 8 – Relational Algebra
Chapter 5, Section 1, Connolly and Begg
What is Relational Algebra?
What is an algebra? A mathematical system consisting of:
• Operands: variables or values from which new values can be constructed.
• Operators: symbols denoting procedures that construct new values from given values.
What is relational algebra?
• Operands are relations, or variables that represent relations.
• Operators are designed to do the most common things that we need to do with relations in a database.
Properties of relational algebra
• Relational algebra operations work on one or more relations to define another relation without changing the original relations.
• Both operands and results are relations, so output from one operation can become input to another operation.
• This allows expressions to be nested, just as in arithmetic. This property is called closure.
Relational algebra: a collection of operations that users can perform on relations to obtain a desired result.
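The closure property can be sketched in Python. This is only an illustration: a relation is modeled as a list of dictionaries, and `select` and `project` are hypothetical helper names, not part of any library. Because each helper returns a relation, one call can be nested inside another, and the original relation is left unchanged.

```python
# Illustrative sketch: a relation is modeled as a list of dicts (one dict per tuple).
student = [
    {"sid": 1, "name": "Smith", "school": "Khoury"},
    {"sid": 2, "name": "Shah", "school": "D'Amore McKim"},
    {"sid": 3, "name": "Li", "school": "Khoury"},
]

def select(relation, predicate):
    """Return a new relation holding the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Return a new relation with only the named attributes, duplicates removed."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

# The output of select is itself a relation, so it can be the input of project.
nested = project(select(student, lambda r: r["school"] == "Khoury"), ["name"])
print(nested)
```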
Basic Operations of Relational Algebra
Basic operations
• SELECT: filter by rows
• PROJECT: filter by columns
• UNION: combine n compatible relations into one relation
• INTERSECT: (A ⋂ B) the tuples that are in both A and B
• SET DIFFERENCE: (A − B) the tuples in A but not in B
• CARTESIAN PRODUCT: (A × B) each tuple in A is concatenated with each tuple in B
Selection
Provides a filter on the tuples in the relation R. The operation is represented by 𝛔 (Greek letter sigma). AND (∧), OR (∨) and NOT (¬) are supported for creating compound predicates.
SYNTAX: 𝛔 predicate (R)
Example: 𝛔(id = 3) (student)
SQL representation?
RESULT
sid  Name  School  Credits_Earned  Credits_Req  Yr_grad
3    Li    Khoury  50              120          2020
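As a rough sketch of selection with simple and compound predicates, the Python below models each tuple as a dictionary. The helper name `select` is an assumption made for illustration, and only three of the student columns are kept to keep the example short.

```python
# Illustrative sketch: each tuple of the student relation is a dict.
student = [
    {"sid": 1, "name": "Smith", "school": "Khoury"},
    {"sid": 2, "name": "Shah", "school": "D'Amore McKim"},
    {"sid": 3, "name": "Li", "school": "Khoury"},
]

def select(relation, predicate):
    """sigma predicate (R): keep the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]

# sigma (sid = 3) (student)
by_id = select(student, lambda r: r["sid"] == 3)

# Compound predicates: AND, OR and NOT map onto Python's and, or, not.
khoury_not_smith = select(
    student, lambda r: r["school"] == "Khoury" and not r["name"] == "Smith"
)
id_2_or_3 = select(student, lambda r: r["sid"] == 2 or r["sid"] == 3)
```

In SQL terms the predicate plays the role of the WHERE clause.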
Projection
Provides a filter on the columns in the relation R. It defines a result that is a vertical subset of R, containing only the listed attributes, with duplicate tuples eliminated. The operation is represented by Π (Greek letter pi).
SYNTAX: Π col1, ..., coln (R)
Example: Π id, name, school (student)
SQL representation?
RESULT
sid  Name   School
1    Smith  Khoury
2    Shah   D’Amore McKim
3    Li     Khoury
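A minimal sketch of projection, assuming a relation is stored as a header of attribute names plus a list of tuples. The function name `project` and the lowercase attribute names are illustrative choices. Returning a set shows why projecting onto school alone yields two tuples rather than three.

```python
# Illustrative sketch: header of attribute names plus the tuples of the relation.
header = ("sid", "name", "school", "credits_earned", "credits_req", "yr_grad")
student = [
    (1, "Smith", "Khoury", 32, 120, 2019),
    (2, "Shah", "D'Amore McKim", 64, 128, 2019),
    (3, "Li", "Khoury", 50, 120, 2020),
]

def project(header, rows, attributes):
    """Pi attributes (R): keep the named columns and drop duplicate tuples."""
    positions = [header.index(a) for a in attributes]
    return {tuple(row[p] for p in positions) for row in rows}

id_name_school = project(header, student, ["sid", "name", "school"])
schools = project(header, student, ["school"])  # duplicates collapse to one tuple
```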
Union
• The union of two relations R and S defines a relation that contains all the tuples of R as well as the tuples of S, with duplicate tuples eliminated.
• R and S must be union-compatible: they have the same number of attributes, and corresponding attributes are drawn from the same domains.
• If R and S have I and J tuples, respectively, the union is obtained by concatenating them into one relation with a maximum of (I + J) tuples.
SYNTAX: R ⋃ S
Example: (student1) ⋃ (student2)
UNION Result
Example: (student1) ⋃ (student2)
Left operand:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
3    Li     Khoury  50              120          2020
1    Smith  Khoury  32              120          2019
Right operand:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
RESULT
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
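The union above can be sketched with Python sets. The `union` helper and its compatibility check are simplifying assumptions: the check compares attribute names only, whereas union compatibility is really about matching domains.

```python
# Illustrative sketch: a relation is a header (attribute names) plus a set of tuples.
header1 = ("sid", "name", "school", "credits_earned", "credits_req", "yr_grad")
student1 = {
    (3, "Li", "Khoury", 50, 120, 2020),
    (1, "Smith", "Khoury", 32, 120, 2019),
}
header2 = header1
student2 = {(1, "Smith", "Khoury", 32, 120, 2019)}

def union(header_r, r, header_s, s):
    """R union S, with a simplified check that compares attribute names only."""
    if header_r != header_s:
        raise ValueError("relations are not union-compatible")
    return r | s  # set union removes the duplicate Smith tuple

result = union(header1, student1, header2, student2)
print(len(result))  # 2, which is below the maximum of I + J = 3
```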
Set Difference
• Identifies the tuples that are in R but not in S.
• R and S must be union-compatible.
SYNTAX: R − S
Example: Π id, name, school (student1) − Π id, name, school (student2)
Set Difference Example
Example: Π id, name, school (student3) − Π id, name, school (student2)
Left relation:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
Right relation:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
3    Li     Khoury  50              120          2020
RESULT
sid  Name   School
1    Smith  Khoury
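A short sketch of the same difference using Python sets, with the two operands already projected onto (sid, name, school). The variable names `r` and `s` follow the R and S of the syntax and are otherwise arbitrary.

```python
# Illustrative sketch: both operands after projection onto (sid, name, school).
r = {(1, "Smith", "Khoury"), (3, "Li", "Khoury")}  # left operand
s = {(3, "Li", "Khoury")}                          # right operand

r_minus_s = r - s  # tuples in R but not in S
s_minus_r = s - r  # set difference is not commutative
```

Swapping the operands gives an empty relation here, which is a quick reminder that R − S and S − R are different expressions.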
Intersection
• Identifies the tuples that are in R as well as in S.
• R and S must be union-compatible.
• Can be derived from set difference: R ⋂ S = R − (R − S)
SYNTAX: R ⋂ S
Example: Π id, name, school (student1) ⋂ Π id, name, school (student2)
Example: Intersection
Example: Π id, name, school (student3) ⋂ Π id, name, school (student2)
Left relation:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
Right relation:
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
3    Li     Khoury  50              120          2020
RESULT
sid  Name   School
3    Li     Khoury
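The identity R ⋂ S = R − (R − S) can be checked on the example data with Python sets. This is a sketch on one small instance, not a proof, and the variable names are illustrative.

```python
# Illustrative sketch: both operands after projection onto (sid, name, school).
r = {(1, "Smith", "Khoury"), (3, "Li", "Khoury")}
s = {(3, "Li", "Khoury")}

direct = r & s         # intersection computed directly
derived = r - (r - s)  # intersection derived from set difference only
```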
Cartesian Product
• Defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
SYNTAX: R × S
Example: Π id, name, school (student1) × (student_major)
student_major
SID  MID
1    1
1    3
2    1
3    2
student1
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
Result Cartesian Product
Student.sid  Name   School  Student_major.sid  mid
1            Smith  Khoury  1                  1
1            Smith  Khoury  1                  3
1            Smith  Khoury  2                  1
1            Smith  Khoury  3                  2
3            Li     Khoury  1                  1
3            Li     Khoury  1                  3
3            Li     Khoury  2                  1
3            Li     Khoury  3                  2
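The eight-row result can be reproduced with a short sketch built on `itertools.product` from the Python standard library. Modeling tuples as plain Python tuples is an assumption made for brevity.

```python
from itertools import product

# Illustrative sketch: student1 projected onto (sid, name, school), and student_major.
student = [(1, "Smith", "Khoury"), (3, "Li", "Khoury")]
student_major = [(1, 1), (1, 3), (2, 1), (3, 2)]  # (sid, mid)

# Every student tuple is concatenated with every student_major tuple.
cartesian = [s + m for s, m in product(student, student_major)]
print(len(cartesian))  # 2 * 4 = 8
```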
Example: Cartesian Product
Example: Π id, name, school (student1) × (available_major)
Can I do a Cartesian product between relations with no relationship?
student1
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
Π id, name, school (student1)
sid  Name   School
1    Smith  Khoury
3    Li     Khoury
×
available_major
mid  major
1    CS
2    DS
3    Accounting
Result Cartesian Product
Yes, but the results are meaningless.
sid  Name   School  mid  Major
1    Smith  Khoury  1    CS
1    Smith  Khoury  2    DS
1    Smith  Khoury  3    Accounting
3    Li     Khoury  1    CS
3    Li     Khoury  2    DS
3    Li     Khoury  3    Accounting
RA and SQL practice work
• Select the id, name, school from the student1 and student2 tables.
• Select the id, name, school from student for the student with id 3.
• Select the id, name, school from student for the students with id 3 or id 2.
• Select the id, name, school from student where the name is Smith and the school name is Khoury.
• Select the id, name, school from student1, as well as the id, name, school from student2 for those students who are not found in the student3 table.
RA for examples
Π id, name, school (student1) ⋃ Π id, name, school (student2)
Π id, name, school (𝛔 (id = 3) (student))
𝛔 (id = 3 ∨ id = 2) (Π id, name, school (student))
𝛔 (school = ‘Khoury’ ∧ name = ‘Smith’) (Π id, name, school (student))
𝛔 (id = 3 ∧ school = ‘Khoury’) (Π id, name, school (student))
Π id, name, school (student1) ⋃ (Π id, name, school (student2) − Π id, name, school (student3))
Can you convert these RA expressions to SQL?
SQL representation (1)
Π id, name, school (student1) ⋃ Π id, name, school (student2)
SELECT id, name, school FROM student1
UNION
SELECT id, name, school FROM student2;
Π id, name, school (𝛔 (id = 3) (student))
SELECT id, name, school FROM student WHERE id = 3;
𝛔 (id = 3 ∨ id = 2) (Π id, name, school (student))
SELECT id, name, school FROM student
WHERE id = 3 OR id = 2;
𝛔 (school = ‘Khoury’ ∧ name = ‘Smith’) (Π id, name, school (student))
SELECT id, name, school FROM student
WHERE name = 'Smith' AND school = 'Khoury';
SQL representation (2)
𝛔 (id = 3 ∧ school = ‘Khoury’) (Π id, name, school (student))
SELECT id, name, school FROM student
WHERE id = 3 AND school = 'Khoury';
Π id, name, school (student1) ⋃ (Π id, name, school (student2) − Π id, name, school (student3))
SELECT id, name, school FROM student1
UNION
SELECT id, name, school FROM student2
WHERE (id, name, school) NOT IN (SELECT id, name, school FROM student3);
Relational Algebra Derived Operations
Derived operations
• THETA JOIN (conditional join, INNER JOIN): the inner join of A and B is the Cartesian product with a filter on the rows.
• NATURAL JOIN: the natural join of A and B is the Cartesian product with an equality filter on all the commonly named columns of the two tables.
• SEMIJOIN: the semijoin of A and B is the tuples in A filtered by some values from B.
• LEFT OUTER JOIN: the left outer join of A and B is the INNER JOIN result, plus the tuples in A that do not have a corresponding value in B.
• RIGHT OUTER JOIN: the right outer join of A and B is the INNER JOIN result, plus the tuples in B that do not have a corresponding value in A.
• DIVISION: for A ÷ B, A and B must have a field in common; call the field c. The result contains the values from A's remaining fields that are paired in A with every value of c that appears in B.
JOIN Variations
Various forms of join operation:
– Theta join
– Equijoin (a particular type of Theta join)
– Natural join
– Outer join
– Semijoin
JOIN operation
Join is a derivative of the Cartesian product. It is equivalent to performing a Selection, using the join predicate as the selection criterion, over the Cartesian product of the two operand relations.
JOIN is one of the most difficult operations to implement efficiently in an RDBMS and is one reason why RDBMSs have intrinsic performance problems.
Theta JOIN (𝜽-Join)
Provides a filter on the cross product of two tables. The filter may contain one of the following comparison operators: <, ≤, >, ≥, =, ≠.
Filters may contain conjunction or disjunction, AND (∧) and OR (∨).
Syntax: R ⋈ predicate S
Example: Student ⋈ student.id = available_major.id available_major
If the filter involves only equality, then it is called an equijoin.
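A theta join can be sketched as a selection over a Cartesian product. In the Python below, `theta_join` is a hypothetical helper, and the data is the student and student_major data from the Cartesian product slides. The predicate compares the sid columns, so this particular join is an equijoin.

```python
from itertools import product

# Illustrative sketch: tuples are (sid, name, school) and (sid, mid).
student = [(1, "Smith", "Khoury"), (3, "Li", "Khoury")]
student_major = [(1, 1), (1, 3), (2, 1), (3, 2)]

def theta_join(r, s, predicate):
    """R join S on a predicate: filter the Cartesian product of R and S."""
    return [a + b for a, b in product(r, s) if predicate(a, b)]

# Equijoin on student.sid = student_major.sid
equijoin = theta_join(student, student_major, lambda a, b: a[0] == b[0])

# A non-equality predicate (student.sid < student_major.sid) is also a theta join.
less_than = theta_join(student, student_major, lambda a, b: a[0] < b[0])
```

Real database engines avoid building the full product first, which is part of why join implementation is hard to do efficiently.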
Natural JOIN
A natural join performs an equijoin over all the field names that the two tables have in common. No filter needs to be provided, since the common fields determine the filter. The result does NOT contain two copies of the common fields: one copy of each common field is removed from the result.
Syntax: R ⋈ S
Example: Student ⋈ student_major
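The sketch below shows one way a natural join could be written when tuples are dictionaries keyed by attribute name. `natural_join` is a hypothetical helper. Merging the two dictionaries keeps a single copy of the shared sid attribute, which mirrors the removal of the duplicate common field.

```python
# Illustrative sketch: tuples are dicts, so common attributes are found by name.
student = [
    {"sid": 1, "name": "Smith", "school": "Khoury"},
    {"sid": 3, "name": "Li", "school": "Khoury"},
]
student_major = [
    {"sid": 1, "mid": 1},
    {"sid": 1, "mid": 3},
    {"sid": 2, "mid": 1},
    {"sid": 3, "mid": 2},
]

def natural_join(r, s):
    """Equijoin on all shared attribute names, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [{**a, **b} for a in r for b in s if all(a[c] == b[c] for c in common)]

joined = natural_join(student, student_major)
```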
OUTER JOIN
For the result to contain tuples that do not have matching values in the join column, use an outer join. There are LEFT OUTER and RIGHT OUTER joins. A (left) outer join is a join in which tuples from R that do not have matching values in the common columns of S are also included in the result relation; the missing values from S are set to null.
Syntax: R ⟕ S (LEFT), R ⟖ S (RIGHT)
Example: Student ⟕ student_major (LEFT) includes students who do not have a major.
Student ⋈ student_major ⟖ available_major (RIGHT) includes majors that no student has declared.
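A left outer join can be sketched as follows. The data is an assumption made for illustration: Shah is given no row in student_major so that there is an unmatched tuple to show. `left_outer_join` is a hypothetical helper, and Python's None stands in for null.

```python
# Illustrative sketch: tuples are dicts keyed by attribute name.
student = [
    {"sid": 1, "name": "Smith"},
    {"sid": 2, "name": "Shah"},
    {"sid": 3, "name": "Li"},
]
# Assumption for illustration: Shah has not declared a major.
student_major = [{"sid": 1, "mid": 1}, {"sid": 3, "mid": 2}]

def left_outer_join(r, s, key, s_attributes):
    """Join r and s on key, keeping unmatched tuples of r padded with None."""
    result = []
    for a in r:
        matches = [b for b in s if b[key] == a[key]]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            # No partner in s: keep the r tuple and fill s's attributes with None.
            result.append({**a, **{attr: None for attr in s_attributes}})
    return result

joined = left_outer_join(student, student_major, "sid", ["mid"])
```

A right outer join is the mirror image: swap the roles of the two relations.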
Semijoin
Defines a relation that contains the tuples of R that participate in the join of R with S. S provides a filter for the R tuples.
Syntax: R ⊳ predicate S
Example: student ⊳ student.sid = student_major.sid student_major
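A semijoin keeps only attributes of R, and each qualifying R tuple appears once no matter how many S tuples match it. The sketch below assumes, for illustration only, a student_major relation with no row for sid 2.

```python
# Illustrative sketch: tuples are (sid, name, school) and (sid, mid).
student = [
    (1, "Smith", "Khoury"),
    (2, "Shah", "D'Amore McKim"),
    (3, "Li", "Khoury"),
]
# Assumption for illustration: no major is recorded for sid 2.
student_major = [(1, 1), (1, 3), (3, 2)]

# Keep a student tuple if at least one student_major tuple has the same sid.
semijoin = [s for s in student if any(s[0] == m[0] for m in student_major)]
```

Smith matches two student_major tuples but still appears only once in the result.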
Applying Aggregate Operations
Applies a list of aggregate operations to a collection of tuples. The aggregate list (AL) contains one or more (<aggregate_function>, <attribute>) pairs.
SYNTAX: 𝕱 AL (R)
Supported aggregate functions are: COUNT, SUM, AVG, MIN, and MAX.
Example: Aggregate Function
Count the number of distinct schools in the student table.
𝕱(COUNT, school) (student)
sid  Name   School         Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury         32              120          2019
2    Shah   D’Amore McKim  64              128          2019
3    Li     Khoury         50              120          2020
RESULT
COUNT(school)
2
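The result of 2 corresponds to counting each school once. Counting every school value, duplicates included, gives 3 on this data. The sketch below shows both, plus SUM and MAX on other columns. The variable names are illustrative.

```python
# Illustrative sketch: tuples are
# (sid, name, school, credits_earned, credits_req, yr_grad).
student = [
    (1, "Smith", "Khoury", 32, 120, 2019),
    (2, "Shah", "D'Amore McKim", 64, 128, 2019),
    (3, "Li", "Khoury", 50, 120, 2020),
]

schools = [row[2] for row in student]
count_all = len(schools)            # every value counted, duplicates included
count_distinct = len(set(schools))  # each school counted once

total_earned = sum(row[3] for row in student)  # SUM over credits_earned
latest_year = max(row[5] for row in student)   # MAX over yr_grad
```

In SQL the two counts correspond to COUNT(school) and COUNT(DISTINCT school).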
Grouping Operation
• Groups tuples of R by grouping attributes, GA, and then applies aggregate function list, AL, to define a new relation.
• AL contains one or more (<aggregate_function>, <attribute>) pairs.
• Resulting relation contains the grouping attributes, GA, along with results of each of the aggregate functions.
– SYNTAX: GA 𝕱 AL (R)
Example: Aggregate with GROUP BY
Count the number of students per school.
school 𝕱(COUNT, sid) (student)
sid  Name   School         Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury         32              120          2019
2    Shah   D’Amore McKim  64              128          2019
3    Li     Khoury         50              120          2020
RESULT
school         COUNT(sid)
Khoury         2
D’Amore McKim  1
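Grouping followed by COUNT can be sketched with a dictionary keyed by the grouping attribute. `group_count` is a hypothetical helper, and only three columns of student are kept.

```python
# Illustrative sketch: tuples are (sid, name, school).
student = [
    (1, "Smith", "Khoury"),
    (2, "Shah", "D'Amore McKim"),
    (3, "Li", "Khoury"),
]

def group_count(relation, group_index):
    """Group tuples by one attribute and count the tuples in each group."""
    counts = {}
    for row in relation:
        key = row[group_index]
        counts[key] = counts.get(key, 0) + 1
    return counts

students_per_school = group_count(student, 2)  # group by school
```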
Rename Operation
For many operations, an alias can be assigned to either a relation or an attribute. The rename operation, also known as the rho operation (𝜌), provides this functionality.
Rename a relation:
Syntax 𝜌new_name(old_name)
Rename an attribute:
Syntax 𝜌old1 → new1, ..., oldn → newn (old_name)
Rename a Relation result
Syntax 𝜌new_name(old_name) or
𝜌(new_name, old_name)
𝜌khoury_students(𝛔 school = ‘Khoury’ (student))
khoury_students
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
Rename Fields
Syntax 𝜌old1 → new1, ..., oldn → newn (old_name)
𝜌id → student_id, school → college (khoury_student)
khoury_student
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
RESULT
Student_id  Name   college  Credits_Earned  Credits_Req  Yr_grad
1           Smith  Khoury   32              120          2019
3           Li     Khoury   50              120          2020
Rename Fields and Table
Syntax 𝜌new_name(old1 → new1, ..., oldn → newn) (old_name)
New_name(f1, f2, f3) ← (old_expression)
khoury_student (student_id, name, college) ← Π id, name, school (𝛔 school = ‘Khoury’ (student))
student
sid  Name   School  Credits_Earned  Credits_Req  Yr_grad
1    Smith  Khoury  32              120          2019
3    Li     Khoury  50              120          2020
RESULT
Student_id  Name   college
1           Smith  Khoury
3           Li     Khoury
Division Operation
Division is the inverse operation for the cross product. For a relation R defined over the attribute set A and a relation S defined over the attribute set B, with B ⊆ A and C = A − B, division defines a relation over the attributes C that consists of the set of tuples from R that match the combination of every tuple in S.
SYNTAX: R ÷ S
Definition of division using basic operations:
T1 ← Π C (R)
T2 ← Π C ((S × T1) − R)
T ← T1 − T2
More on this operation later.
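The three-step definition can be traced on a small instance. In this sketch R is the student_major data from the Cartesian product slides, with C = {sid}. The divisor S, holding majors 1 and 3, is a hypothetical choice made for illustration. The product is written with the sid column first so its tuples line up with R's (sid, mid) layout.

```python
# Illustrative sketch: R(sid, mid) and a hypothetical divisor S(mid).
r = {(1, 1), (1, 3), (2, 1), (3, 2)}
s = {(1,), (3,)}

# T1 = Pi C (R): every sid that appears in R.
t1 = {(sid,) for sid, _mid in r}

# T1 x S: each candidate sid paired with every mid in S.
candidates = {c + d for c in t1 for d in s}

# T2 = Pi C ((T1 x S) - R): sids that lack at least one required mid.
t2 = {(sid,) for sid, _mid in candidates - r}

# T = T1 - T2: sids paired in R with every mid in S.
t = t1 - t2
```

Only student 1 has both major 1 and major 3, so only sid 1 survives the division.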
Student, Register and Lecturer relations
1. Name of people who registered for ‘History 101’
2. Find the lecturers teaching History 101 whose students have a GPA > 3.2
Student
SID    Name   Login       DoB           GPA
55515  Smith  smith@ccs   Jan 10, 1990  3.82
55516  Jones  jones@hist  Feb 11, 1992  2.98
55517  Ali    ali@math    Sep 22, 1989  3.11
55518  Smith  smith@math  Nov 30, 1991  3.32
Lecturer
LID  Name    CID
45   Fisk    History 101
46   Alder   Biology 220
47   Wong    History 101
48   Foster  Music 101
Register
Sid    CId          LID  Grade
55515  History 101  45   C
5516   History 101  47   a
5515   Music 101    48   B
5516   Biology 220  46   C
55515  Biology 220  46   A
55517  History 101  45   B
55518  Music 101    48   A
Summary
In this module you learned:
Relational operators: selection, projection, union, intersect, set difference, cross product, natural join, theta join, equijoin, semijoin, aggregate operations, grouping operations, division.