Date posted: 22-Dec-2015
Murali Mani
What is Relational Algebra?
Relational algebra defines the operations (data retrieval) for the relational model.
SQL's DML (Data Manipulation Language) has data retrieval facilities that are equivalent to those of relational algebra.
SQL and relational algebra are not meant for complex computation; they support efficient, easy access to large data sets.
Basics
Relational algebra is defined on bags, rather than relations (sets).
A bag (multiset) allows duplicate values, but order is not significant.
We can write an expression using relational algebra operators with parentheses.
Closure is needed: an operator on bags returns a bag.
Relational algebra includes set operators and other operators specific to the relational model.
Set Operators
Union, Intersection, Difference, Cross product.
Union, Intersection and Difference are defined only for union-compatible relations.
Two relations are union compatible if they have the same set of attributes and the types (domains) of the attributes are the same.
E.g., two relations that are not union compatible: Student (sNumber, sName) and Course (cNumber, cName).
Union: ∪
Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1 m times, and in R2 n times. Then in the union, t appears m + n times.

R1:
A  B
1  2
3  4
1  2

R2:
A  B
1  2
3  4
5  6

R1 ∪ R2:
A  B
1  2
1  2
1  2
3  4
3  4
5  6
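The m + n rule can be sketched in Python. This is only an illustration: modeling a bag with `collections.Counter` (tuple mapped to multiplicity) is an assumption made here, not something the slides prescribe.

```python
from collections import Counter

# Each bag maps a tuple (A, B) to the number of times it occurs.
r1 = Counter({(1, 2): 2, (3, 4): 1})
r2 = Counter({(1, 2): 1, (3, 4): 1, (5, 6): 1})

# Counter addition sums multiplicities: t occurs m + n times in the union.
union = r1 + r2

for row, count in sorted(union.items()):
    print(row, "appears", count, "time(s)")
```

In SQL terms this corresponds to UNION ALL rather than UNION, which would also remove duplicates.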
Intersection: ∩
Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1 m times, and in R2 n times. Then in the intersection, t appears min (m, n) times.

R1:
A  B
1  2
3  4
1  2

R2:
A  B
1  2
3  4
5  6

R1 ∩ R2:
A  B
1  2
3  4
Difference: –
Consider two bags R1 and R2 that are union-compatible. Suppose a tuple t appears in R1 m times, and in R2 n times. Then in R1 – R2, t appears max (0, m - n) times.

R1:
A  B
1  2
3  4
1  2

R2:
A  B
1  2
3  4
5  6

R1 – R2:
A  B
1  2
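The min (m, n) and max (0, m - n) rules for bag intersection and difference can be checked with a small Python sketch. As an assumption for illustration, bags are modeled with `collections.Counter`, whose `&` and `-` operators happen to follow exactly these two rules.

```python
from collections import Counter

r1 = Counter({(1, 2): 2, (3, 4): 1})
r2 = Counter({(1, 2): 1, (3, 4): 1, (5, 6): 1})

# '&' keeps min(m, n) occurrences of each tuple.
intersection = r1 & r2

# '-' keeps max(0, m - n) occurrences and drops tuples whose count reaches 0.
difference = r1 - r2

print(sorted(intersection.elements()))
print(sorted(difference.elements()))
```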
Bag semantics vs Set semantics
Union is idempotent for sets: (R1 ∪ R2) ∪ R2 = R1 ∪ R2.
Union is not idempotent for bags.
Intersection is idempotent for sets and bags.
Difference is idempotent for sets, but not for bags.
For sets and bags, R1 ∩ R2 = R1 – (R1 – R2).
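These claims can be spot-checked on one pair of example bags. The sketch below is an illustration only (a single example cannot prove the general statements); bags are modeled with `collections.Counter` and sets with Python's built-in `set`, both choices made here.

```python
from collections import Counter

r1 = Counter({(1, 2): 2, (3, 4): 1})
r2 = Counter({(1, 2): 1, (3, 4): 1, (5, 6): 1})
s1, s2 = set(r1), set(r2)  # the same relations with duplicates removed

# Union: idempotent for sets, not for bags (R2's tuples get added again).
set_union_idempotent = ((s1 | s2) | s2) == (s1 | s2)
bag_union_idempotent = ((r1 + r2) + r2) == (r1 + r2)

# Intersection: idempotent for bags as well.
bag_intersection_idempotent = ((r1 & r2) & r2) == (r1 & r2)

# Difference: idempotent for sets, not for bags.
set_difference_idempotent = ((s1 - s2) - s2) == (s1 - s2)
bag_difference_idempotent = ((r1 - r2) - r2) == (r1 - r2)

# The identity R1 ∩ R2 = R1 – (R1 – R2) holds under both semantics.
bag_identity = (r1 & r2) == (r1 - (r1 - r2))
set_identity = (s1 & s2) == (s1 - (s1 - s2))
```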
Cross Product (Cartesian Product): X
Consider two bags R1 and R2. Suppose a tuple t1 appears in R1 m times, and a tuple t2 appears in R2 n times. Then in R1 X R2, t1t2 appears mn times.

R1:
A  B
1  2
1  2

R2:
B  C
2  3
4  5
4  5

R1 X R2:
A  R1.B  R2.B  C
1  2     2     3
1  2     2     3
1  2     4     5
1  2     4     5
1  2     4     5
1  2     4     5
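A minimal Python sketch of the mn rule, assuming (for illustration only) that a bag is stored as a plain list in which duplicates are repeated:

```python
from itertools import product

r1 = [(1, 2), (1, 2)]            # columns A, B
r2 = [(2, 3), (4, 5), (4, 5)]    # columns B, C

# Pair every occurrence in R1 with every occurrence in R2 and
# concatenate the tuples, giving columns A, R1.B, R2.B, C.
cross = [t1 + t2 for t1, t2 in product(r1, r2)]

for row in cross:
    print(row)
```

Here (1, 2) occurs twice in R1 and (4, 5) twice in R2, so (1, 2, 4, 5) occurs 2 × 2 = 4 times in the result.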
Basic Relational Operations
Select, Project, Join.
Select, denoted σ_C (R), selects the subset of tuples of R that satisfy the selection condition C. C can be any boolean expression; its clauses can be combined with AND, OR and NOT.

R:
A  B  C
1  2  5
3  4  6
1  2  7
1  2  7

σ_{C ≥ 6} (R):
A  B  C
3  4  6
1  2  7
1  2  7
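The selection in this example can be sketched as a filter over a list of tuples. The list-of-tuples representation is an assumption made here for illustration.

```python
# Relation R with columns A, B, C; duplicates are kept (bag semantics).
r = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 7)]

# Keep each tuple whose C value satisfies the condition C >= 6.
selected = [(a, b, c) for (a, b, c) in r if c >= 6]

print(selected)
```

Select never changes a tuple or its multiplicity; it only decides whether each occurrence is kept.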
Select
Select is commutative: σ_C2 (σ_C1 (R)) = σ_C1 (σ_C2 (R)).
Select is idempotent: σ_C (σ_C (R)) = σ_C (R).
We can combine multiple select conditions into one condition: σ_C1 (σ_C2 (… σ_Cn (R) …)) = σ_{C1 AND C2 AND … AND Cn} (R).
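The three properties can be spot-checked in Python. The helper `select` and the conditions `c1` and `c2` below are hypothetical names introduced for this sketch.

```python
r = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 7)]  # columns A, B, C

def select(condition, rows):
    """Return the tuples of rows that satisfy condition, keeping duplicates."""
    return [row for row in rows if condition(row)]

c1 = lambda row: row[2] >= 6   # C >= 6
c2 = lambda row: row[0] == 1   # A = 1

c1_after_c2 = select(c1, select(c2, r))
c2_after_c1 = select(c2, select(c1, r))
combined = select(lambda row: c1(row) and c2(row), r)
applied_twice = select(c1, select(c1, r))
```

On this data the two nesting orders and the single combined condition all return the same two tuples, and applying `c1` twice returns the same result as applying it once.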
Project: π_{A1, A2, …, An} (R)
Consider relation (bag) R with set of attributes AR. π_{A1, A2, …, An} (R), where {A1, A2, …, An} ⊆ AR, returns the tuples in R, but only with columns A1, A2, …, An.

R:
A  B  C
1  2  5
3  4  6
1  2  7
1  2  8

π_{A, B} (R):
A  B
1  2
3  4
1  2
1  2
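A short Python sketch of projection on this example. The helper `project` and the explicit `schema` tuple are assumptions introduced for illustration.

```python
schema = ("A", "B", "C")
r = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 8)]

def project(attrs, rows, schema):
    """Keep only the listed columns of each tuple; duplicates are not removed."""
    positions = [schema.index(a) for a in attrs]
    return [tuple(row[i] for i in positions) for row in rows]

bag_result = project(("A", "B"), r, schema)

# Under set semantics the duplicate (1, 2) tuples would collapse into one.
set_result = set(bag_result)

print(bag_result)
print(sorted(set_result))
```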
Project: Bag Semantics vs Set Semantics
For bags, cardinality of R = cardinality of π_{A1, A2, …, An} (R).
For sets, cardinality of R ≥ cardinality of π_{A1, A2, …, An} (R).
For sets and bags:
project is not commutative
project is idempotent
Natural Join: R ⋈ S
Consider relations (bags) R with attributes AR, and S with attributes AS. Let A = AR ∩ AS. R ⋈ S can be defined as
π_{AR – A, A, AS – A} (σ_{R.A1 = S.A1 AND R.A2 = S.A2 AND … AND R.An = S.An} (R X S))
where A = {A1, A2, …, An}.
The above expression says: select those tuples in R X S that agree in values for each of the A attributes, and project the resulting tuples such that we have only one value for each A attribute.
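The definition can be followed step by step in Python. In this sketch a tuple is modeled as a dict from attribute name to value; that representation, the helper `natural_join`, and the sample data are assumptions made here for illustration.

```python
r = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 5, "C": 6}]
s = [{"B": 2, "C": 3, "D": 10}, {"B": 2, "C": 3, "D": 11}]

def natural_join(r, s):
    """Pair every tuple of r with every tuple of s (the cross product),
    keep the pairs that agree on all common attributes (the select),
    and merge each pair so common attributes appear once (the project)."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
    return result

joined = natural_join(r, s)
for row in joined:
    print(row)
```

Only the tuple with B = 2 and C = 3 finds partners in S, so the result has two tuples, one per matching tuple of S.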
Theta Join: R ⋈_C S
Theta join is similar to the Cartesian product, except that we can specify any condition C. It is defined as
R ⋈_C S = σ_C (R X S)

R1:
A  B
1  2
1  2

R2:
B  C
2  3
4  5
4  5

R1 ⋈_{R1.B < R2.B} R2:
A  R1.B  R2.B  C
1  2     4     5
1  2     4     5
1  2     4     5
1  2     4     5
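Since a theta join is a selection over the cross product, it can be sketched in a few lines of Python. The helper name `theta_join` and the list-of-tuples representation are assumptions for illustration.

```python
from itertools import product

r1 = [(1, 2), (1, 2)]            # columns A, B
r2 = [(2, 3), (4, 5), (4, 5)]    # columns B, C

def theta_join(r, s, condition):
    """Cross product of r and s, filtered by condition(t1, t2)."""
    return [t1 + t2 for t1, t2 in product(r, s) if condition(t1, t2)]

# Condition R1.B < R2.B: second column of R1 against first column of R2.
result = theta_join(r1, r2, lambda t1, t2: t1[1] < t2[0])

for row in result:
    print(row)
```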
Outer Join: R ⋈o S
Similar to natural join; however, if there is a tuple in R that has no "matching" tuple in S, or a tuple in S that has no matching tuple in R, then that tuple also appears, with null values for the attributes of S (or R).

R1:
A  B  C
1  2  3
4  5  6
7  8  9

R2:
B  C  D
2  3  10
2  3  11
6  7  12

R1 ⋈o R2:
A     B  C  D
1     2  3  10
1     2  3  11
4     5  6  null
7     8  9  null
null  6  7  12
Left Outer Join: R ⋈oL S
Similar to natural join; however, if there is a tuple in R that has no "matching" tuple in S, then that tuple also appears, with null values for the attributes of S (note: a tuple in S that has no matching tuple in R does not appear).

R1:
A  B  C
1  2  3
4  5  6
7  8  9

R2:
B  C  D
2  3  10
2  3  11
6  7  12

R1 ⋈oL R2:
A  B  C  D
1  2  3  10
1  2  3  11
4  5  6  null
7  8  9  null
Right Outer Join: R ⋈oR S
Similar to natural join; however, if there is a tuple in S that has no "matching" tuple in R, then that tuple also appears, with null values for the attributes of R (note: a tuple in R that has no matching tuple in S does not appear).

R1:
A  B  C
1  2  3
4  5  6
7  8  9

R2:
B  C  D
2  3  10
2  3  11
6  7  12

R1 ⋈oR R2:
A     B  C  D
1     2  3  10
1     2  3  11
null  6  7  12
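The full, left and right outer joins differ only in which unmatched tuples are kept, which the following Python sketch makes explicit. The function `outer_join`, its `keep_left` and `keep_right` flags, the dict-per-tuple representation, and the use of `None` for null are all assumptions introduced for illustration.

```python
r1_attrs, r2_attrs = ["A", "B", "C"], ["B", "C", "D"]
r1 = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 5, "C": 6}, {"A": 7, "B": 8, "C": 9}]
r2 = [{"B": 2, "C": 3, "D": 10}, {"B": 2, "C": 3, "D": 11}, {"B": 6, "C": 7, "D": 12}]

def outer_join(r, s, r_attrs, s_attrs, keep_left=True, keep_right=True):
    """Natural join that can also keep unmatched tuples, padded with None."""
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = list(r_attrs) + [a for a in s_attrs if a not in r_attrs]
    result, matched_s = [], set()
    for t1 in r:
        matched = False
        for j, t2 in enumerate(s):
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
                matched_s.add(j)
                matched = True
        if keep_left and not matched:
            # Unmatched tuple of r: attributes only in s become None.
            result.append({a: t1.get(a) for a in all_attrs})
    if keep_right:
        for j, t2 in enumerate(s):
            if j not in matched_s:
                # Unmatched tuple of s: attributes only in r become None.
                result.append({a: t2.get(a) for a in all_attrs})
    return result

full = outer_join(r1, r2, r1_attrs, r2_attrs)
left = outer_join(r1, r2, r1_attrs, r2_attrs, keep_right=False)
right = outer_join(r1, r2, r1_attrs, r2_attrs, keep_left=False)
```

With both flags set to False the function reduces to the plain natural join.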
Renaming: ρ_{S(A1, A2, …, An)} (R)
ρ_{S(A1, A2, …, An)} (R) renames relation R to S, and the attributes of R are renamed to A1, A2, …, An.
ρ_S (R) renames relation R to S, keeping the attributes the same.

R2:
B  C  D
2  3  10
2  3  11
6  7  12

ρ_{S(X, C, D)} (R2), a relation named S:
X  C  D
2  3  10
2  3  11
6  7  12

ρ_S (R2), a relation named S:
B  C  D
2  3  10
2  3  11
6  7  12
Example: Introducing new relations
Find the semijoin of two relations R and S. The semijoin, denoted R ⋉ S, consists of the tuples t1 in R for which there exists a tuple t2 in S such that t1 and t2 agree on all attributes common to R and S.
R1 = R ⋈ S
R2 = π_AR (R1)
R ⋉ S = R2 ∩ R
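The three steps can be traced in Python. The sample relations, the helper `natural_join`, and the use of `collections.Counter` for the bag intersection are assumptions made for this sketch.

```python
from collections import Counter

r_attrs, s_attrs = ("A", "B"), ("B", "C")
r = [(1, 2), (3, 4), (5, 6)]
s = [(2, 7), (2, 8), (6, 9)]
common = [a for a in r_attrs if a in s_attrs]

def natural_join(r, s):
    """Join tuples of r and s that agree on the common attributes."""
    joined = []
    for t1 in r:
        for t2 in s:
            d1, d2 = dict(zip(r_attrs, t1)), dict(zip(s_attrs, t2))
            if all(d1[a] == d2[a] for a in common):
                joined.append({**d1, **d2})
    return joined

# Step 1: R1 = R ⋈ S
r1 = natural_join(r, s)

# Step 2: R2 = projection of R1 onto the attributes of R (as a bag)
r2 = Counter(tuple(row[a] for a in r_attrs) for row in r1)

# Step 3: R ⋉ S = R2 ∩ R, using bag intersection (min of multiplicities)
semijoin = r2 & Counter(r)

print(sorted(semijoin.elements()))
```

The final intersection matters under bag semantics: (1, 2) matches two tuples of S and so appears twice in R2, but the intersection with R brings it back to the single occurrence it has in R.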
Duplicate Elimination: δ (R)
Convert a bag to a set.

R:
A  B
1  2
3  4
1  2
1  2

δ (R):
A  B
1  2
3  4
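In Python, duplicate elimination over a list of tuples can be sketched as follows. Using `dict.fromkeys` is just one way to do it, chosen here because it keeps the first-seen order of the tuples.

```python
r = [(1, 2), (3, 4), (1, 2), (1, 2)]

# dict keys are unique and remember insertion order,
# so each distinct tuple survives exactly once.
distinct = list(dict.fromkeys(r))

print(distinct)
```

This corresponds to SELECT DISTINCT in SQL.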
Extended Projection: π_L (R)
Here L is a list whose elements can be:
An attribute (just like simple projection).
An expression x → y, where x and y are names of attributes; this renames attribute x to y.
An expression E → z, where E is any expression involving attributes, constants, and arithmetic and string operators; the result has an attribute called z whose values are given by E.

R:
B  C  D
2  3  10
2  3  11
6  7  12

π_{B→A, C+D→X, C, D} (R):
A  X   C  D
2  13  3  10
2  14  3  11
6  19  7  12
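The example projection can be written out in Python as a sketch, with each tuple modeled as a dict (an assumption made here so that attributes can be referred to by name):

```python
r = [
    {"B": 2, "C": 3, "D": 10},
    {"B": 2, "C": 3, "D": 11},
    {"B": 6, "C": 7, "D": 12},
]

# B → A renames B; C + D → X computes a new attribute; C and D pass through.
result = [
    {"A": row["B"], "X": row["C"] + row["D"], "C": row["C"], "D": row["D"]}
    for row in r
]

for row in result:
    print(row)
```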
Aggregation operators
MIN, MAX, COUNT, SUM, AVG.
AGG_B (R) considers only the non-null values of B in R.

R:
A  B
1  2
3  4
1  null
1  3

Expression    Result
MIN_B (R)     2
MAX_B (R)     4
COUNT_B (R)   3
SUM_B (R)     9
AVG_B (R)     3
COUNT* (R)    4
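The handling of nulls can be illustrated in Python. In this sketch `None` stands in for null, which is an assumption about representation rather than something the slides specify.

```python
r = [(1, 2), (3, 4), (1, None), (1, 3)]  # columns A, B

# Aggregates over B look only at the non-null B values.
b_values = [b for _, b in r if b is not None]

min_b = min(b_values)
max_b = max(b_values)
count_b = len(b_values)
sum_b = sum(b_values)
avg_b = sum_b / count_b

# COUNT* counts tuples, so the tuple with a null B is included.
count_star = len(r)
```

The null affects both COUNT_B and AVG_B: the average is 9 / 3, not 9 / 4.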
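Aggregation Operators
MIN, MAX, SUM and AVG must each be applied to exactly one attribute.
COUNT can be applied to any one attribute, or to whole tuples as COUNT* (R).
An aggregation operator returns a bag, not a single value! But SQL allows the result to be treated as a single value.

Example, with R (A, B) containing the tuples (1, 2), (3, 4), (1, null), (1, 3):

σ_{B = MAX_B (R)} (R):
A  B
3  4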
Grouping Operator: γ_{GL, AL} (R)
γ_{GL, AL} (R) groups the tuples of R by the attributes in GL, and performs the aggregations specified in AL on each group.

StarsIn:
title   year  starName
SW1     77    HF
Matrix  99    KR
6D&7N   93    HF
SW2     79    HF
Speed   94    KR

γ_{starName, MIN (year) → year, COUNT (title) → num} (StarsIn):
starName  year  num
HF        77    3
KR        94    2
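The grouping example can be reproduced with a short Python sketch. The dict-of-lists grouping used below is an implementation choice made here for illustration.

```python
stars_in = [
    ("SW1", 77, "HF"),
    ("Matrix", 99, "KR"),
    ("6D&7N", 93, "HF"),
    ("SW2", 79, "HF"),
    ("Speed", 94, "KR"),
]  # columns title, year, starName

# Group the tuples by starName (the grouping list GL).
groups = {}
for title, year, star in stars_in:
    groups.setdefault(star, []).append((title, year))

# Apply the aggregations (the list AL) to each group:
# MIN(year) → year and COUNT(title) → num, counting non-null titles only.
result = [
    (
        star,
        min(year for _, year in rows),
        sum(1 for title, _ in rows if title is not None),
    )
    for star, rows in groups.items()
]

print(result)
```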
Sorting Operator: τ_L (R)
It sorts the tuples in R. If L is the list A1, A2, …, An, it first sorts by A1, then by A2, and so on.
Sort is used as the last operator in an expression.

R:
A  B  C
1  2  5
3  1  6
1  2  7
1  3  8

τ_{A, B} (R):
A  B  C
1  2  5
1  2  7
1  3  8
3  1  6
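Sorting by a list of attributes maps directly onto a tuple-valued sort key in Python, as in this sketch (the list-of-tuples representation is an assumption for illustration):

```python
r = [(1, 2, 5), (3, 1, 6), (1, 2, 7), (1, 3, 8)]  # columns A, B, C

# Sort by A first, then by B; Python compares the key tuples left to right.
sorted_r = sorted(r, key=lambda row: (row[0], row[1]))

for row in sorted_r:
    print(row)
```

Python's sort is stable, so tuples that tie on both A and B keep their original relative order. Relational algebra itself leaves the order of such ties unspecified.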