Other Query Languages II
Winter 2006-2007
Lecture 12
Last Lecture
• Previously discussed tuple relational calculus
  – Purely declarative query language
  – Same expressive power as relational algebra
  – Could also express unsafe statements
    • Statements that generate infinite relations!
• Datalog language based on relational calculus
  – Very succinct, clean language for stating queries
  – Some powerful features, such as recursive queries
  – Not used in commercial database application development
Domain Relational Calculus
• Another form of relational calculus
• Instead of tuple variables, uses domain variables
  – Values range over an attribute’s domain
• Very similar to tuple relational calculus
• Queries have the form:
    { < x1, x2, …, xn > | P(x1, x2, …, xn) }
  – xi are domain variables
  – Schema of result specified by < x1, x2, …, xn >
  – P is a formula composed of atoms
Formal Definitions
• Valid atoms for P:
  – < x1, x2, …, xn > ∈ r
    • r is a relation with n attributes
  – x Θ y
    • x and y are domain variables
    • Θ is a comparison operator (e.g. <, =, >)
  – x Θ c
    • c is a constant
• All atoms are also formulas
Formal Definitions (2)
• Compositions of atoms and formulas:
  – If P1 is a formula, then so are ¬P1 and (P1)
  – If P1 and P2 are formulas, then so are:
    • P1 ∧ P2
    • P1 ∨ P2
    • P1 ⇒ P2
  – If P1(x) is a formula where x is a free domain variable, then so are:
    • ∀ x (P1(x)): P1(x) is true for all values of x
    • ∃ x (P1(x)): P1(x) is true for at least one value of x
  – Shorthand:
    • ∃ a, b, c (P1(a, b, c)) instead of ∃ a (∃ b (∃ c (P1(a, b, c))))
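Over a finite domain, the quantifiers and implication can be tried out directly. The following is a minimal sketch, assuming a small made-up domain for x; the names `domain` and `implies` are illustrative only.

```python
# Quantifiers over a finite domain, using Python's any() and all().
domain = {1, 2, 3, 4}  # hypothetical finite domain for the variable x


def implies(p, q):
    # P1 ⇒ P2 is equivalent to ¬P1 ∨ P2
    return (not p) or q


exists_gt_3 = any(x > 3 for x in domain)                   # ∃ x (x > 3)
forall_gt_3 = all(x > 3 for x in domain)                   # ∀ x (x > 3)
forall_impl = all(implies(x > 3, x == 4) for x in domain)  # ∀ x (x > 3 ⇒ x = 4)

print(exists_gt_3, forall_gt_3, forall_impl)
```

The implication holds for every x here because it is vacuously true whenever x > 3 is false.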
Example Queries
Find all details of loans over $1200.
• In tuple relational calculus:
    { t | t ∈ loan ∧ t[amount] > 1200 }
• In domain relational calculus:
    { < n, b, a > | < n, b, a > ∈ loan ∧ a > 1200 }
  – Very similar to tuple relational calculus form
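The domain relational calculus form maps closely onto a set comprehension. This is a rough sketch, assuming a relation is modeled as a Python set of tuples; the sample rows are invented for illustration.

```python
# Hypothetical sample data for loan(loan_number, branch_name, amount).
loan = {
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
}

# { < n, b, a > | < n, b, a > ∈ loan ∧ a > 1200 }
# The tuple pattern (n, b, a) plays the role of the domain variables.
big_loans = {(n, b, a) for (n, b, a) in loan if a > 1200}

print(sorted(big_loans))
```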
Example Queries (2)
Find loan numbers for loans over $1200.
• In tuple relational calculus:
    { t | ∃ s ∈ loan ( t[loan_number] = s[loan_number] ∧ s[amount] > 1200 ) }
• In domain relational calculus:
    { < n > | ∃ b, a ( < n, b, a > ∈ loan ∧ a > 1200 ) }
• Difference is when variables are constrained
  – s is bound to loan immediately, by s ∈ loan
  – b, a are initially unconstrained, until the formula < n, b, a > ∈ loan restricts them
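The difference in when variables get constrained can be made concrete. The sketch below assumes invented sample rows and restricts b and a to the values that actually occur in loan, so the "unconstrained" version stays finite. It evaluates the query both ways and checks that the answers agree.

```python
# Hypothetical sample data for loan(loan_number, branch_name, amount).
loan = {
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
}

# Values each domain variable may take (here: only values present in loan).
numbers = {n for (n, _, _) in loan}
branches = {b for (_, b, _) in loan}
amounts = {a for (_, _, a) in loan}

# DRC style: b and a range over their domains first, and only the atom
# < n, b, a > ∈ loan ties them to actual tuples.
drc_style = {
    (n,)
    for n in numbers
    if any((n, b, a) in loan and a > 1200 for b in branches for a in amounts)
}

# TRC style: the tuple variable is bound to loan from the start.
trc_style = {(s[0],) for s in loan if s[2] > 1200}

print(sorted(drc_style))
```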
Joining Relations
• This query requires multiple relations:
    “Find the names of customers with loans at the Perryridge branch, and the loan amounts.”
• Domain relational calculus:
    { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
                       ∃ b ( < n, b, a > ∈ loan ∧ b = “Perryridge” )) }
  – All customer names and amounts, such that:
    • Customer name appears in borrower, and…
    • Associated loan number in borrower also appears in loan, with a branch name of “Perryridge”
Joining Relations (2)
• Domain relational calculus:
    { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
                       ∃ b ( < n, b, a > ∈ loan ∧ b = “Perryridge” )) }
• Tuple relational calculus:
    { t | ∃ s ∈ borrower (
            t[customer_name] = s[customer_name] ∧
            ∃ u ∈ loan ( u[loan_number] = s[loan_number] ∧
                         t[amount] = u[amount] ∧
                         u[branch_name] = “Perryridge” )) }
• Join operation is more implicit in DRC form
  – Statement of relationships is very explicit in TRC
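The join query can be sketched as a nested comprehension over two relations. The sample rows are invented. In DRC the shared variable n appears in both atoms, whereas Python needs the equality `n == n2` spelled out, which is closer to the TRC form.

```python
# Hypothetical sample data.
borrower = {  # borrower(customer_name, loan_number)
    ("Adams", "L-16"),
    ("Hayes", "L-15"),
    ("Jones", "L-17"),
    ("Smith", "L-11"),
}
loan = {  # loan(loan_number, branch_name, amount)
    ("L-11", "Round Hill", 900),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
}

# { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
#                    ∃ b ( < n, b, a > ∈ loan ∧ b = "Perryridge" )) }
perryridge_loans = {
    (c, a)
    for (c, n) in borrower
    for (n2, b, a) in loan
    if n == n2 and b == "Perryridge"
}

print(sorted(perryridge_loans))
```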
Set Operations
“Find customers with an account or a loan.”
  – Set union operation
• Domain relational calculus:
    { < c > | ∃ ln ( < c, ln > ∈ borrower ) ∨
              ∃ an ( < c, an > ∈ depositor ) }
• Can change to set intersection, set difference with simple modifications
  – Replace ∨ with ∧ for intersection, or with ∧ ¬ for difference
  – Same approach as in tuple relational calculus
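A small sketch of the three set operations, assuming invented sample rows. Each variant changes only the connective between the two existential subformulas.

```python
# Hypothetical sample data.
borrower = {("Hayes", "L-15"), ("Jones", "L-17"), ("Smith", "L-11")}
depositor = {("Hayes", "A-102"), ("Johnson", "A-101"), ("Jones", "A-217")}


def has_loan(c):
    # ∃ ln ( < c, ln > ∈ borrower )
    return any(name == c for (name, _ln) in borrower)


def has_account(c):
    # ∃ an ( < c, an > ∈ depositor )
    return any(name == c for (name, _an) in depositor)


# Candidate values for c: every customer name that occurs in either relation.
names = {name for (name, _) in borrower | depositor}

union = {(c,) for c in names if has_loan(c) or has_account(c)}             # ∨
intersection = {(c,) for c in names if has_loan(c) and has_account(c)}     # ∧
difference = {(c,) for c in names if has_account(c) and not has_loan(c)}   # ∧ ¬

print(sorted(union), sorted(intersection), sorted(difference))
```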
Safety of Expressions
• Like tuple relational calculus, can specify unsafe expressions
    { < n, b, a > | ¬( < n, b, a > ∈ loan ) }
  – Same issue as before: result contains values outside the domain of the formula
• What about this:
    { < x > | ∃ y ( < x, y > ∈ r ) ∧
              ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) }
  – Domain variable z can range over infinitely many values!
  – Can’t evaluate second half of this formula
Safety of Expressions (2)
• For an expression:
    { < x1, x2, …, xn > | P(x1, x2, …, xn) }
• Considered safe if these rules hold:
  – All values that appear in expression’s result are from dom(P)
  – For every “there exists” subformula of form ∃ x (P1(x)), the subformula is true iff there is a value x in dom(P1) such that P1(x) is true
  – For every “for all” subformula of form ∀ x (P1(x)), the subformula is true iff P1(x) is true for all values x from dom(P1)
• These rules ensure that no domain variable has to range over an infinite set of values
Safety of Expressions (3)
• For this query:
    { < x > | ∃ y ( < x, y > ∈ r ) ∧
              ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) }
• Formula ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) can be true for values of z outside of the formula’s domain
  – Not safe.
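One way to see the problem is to evaluate the query over two different finite sets of candidate values and compare the answers. This is a sketch under stated assumptions: the relation r and the predicate P(x, z), here z > x, are made up, and `evaluate` is a hypothetical helper that lets every variable range over the given set.

```python
# Hypothetical binary relation r and predicate P.
r = {(1, 2), (2, 1)}


def P(x, z):
    # Made-up predicate for illustration: z > x
    return z > x


def evaluate(domain):
    # { < x > | ∃ y ( < x, y > ∈ r ) ∧ ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) }
    # with x, y, z all ranging over the given finite domain.
    return {
        (x,)
        for x in domain
        if any((x, y) in r for y in domain)
        and any((x, z) not in r and P(x, z) for z in domain)
    }


dom_r = {v for t in r for v in t}  # values that appear in r: {1, 2}

inside = evaluate(dom_r)           # z restricted to the formula's domain
outside = evaluate(dom_r | {3})    # z may also take the outside value 3

print(inside, outside)
```

With z restricted to values from r the result is empty, but allowing the single extra value 3 makes the second subformula true for both x = 1 and x = 2. The answer depends on values outside the formula's domain, which is what the safety rules forbid.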
Domain Relational Calculus
• Has same expressive power as tuple relational calculus, and relational algebra
  – (if restricted to safe expressions)
  – As before, grouping and aggregation are extended operations, and must be added
• Simpler than tuple relational calculus for some queries
  – Schema of an expression is more obvious
  – Expressing relationships is very easy
Query By Example
• QBE is a query language based on the domain relational calculus
• QBE syntax is two-dimensional
  – Queries actually look like tables
  – Query specifies “examples” of what to retrieve
• Two variants:
  – The original text-based version
  – A graphical version used in the Microsoft Access Query Design interface
QBE Skeleton Tables
• Skeleton tables specify results to retrieve
  – Same columns as the actual tables, but with details of what to retrieve
• In forming a query, only required tables are displayed
  – Limits clutter
• Example skeleton table:

    loan | loan_number | branch_name | amount
         |             |             |
Example Queries
• Find loan numbers of loans at Perryridge branch

    loan | loan_number | branch_name | amount
         | P._x        | Perryridge  |

  – P. means “print this value”
  – _x is a domain variable (not required here)
  – Perryridge is a literal value
• QBE eliminates duplicates automatically
  – Can specify ALL. to display all values

    loan | loan_number | branch_name | amount
         | P.ALL.      | Perryridge  |
Example Queries (2)
• To display all tuples in relation:

    loan | loan_number | branch_name | amount
    P.   |             |             |

• Can also specify conditions to apply
  – Find all loans over $700

    loan | loan_number | branch_name | amount
    P.   |             |             | > 700

• Can apply negation
  – Find names of all branches not located in Brooklyn

    branch | branch_name | branch_city | assets
           | P.          | ¬Brooklyn   |
QBE Domain Variables
• Can use variables to constrain results
• Example:
  – Find all customers who live in same city as Jones

    customer | customer_name | customer_street | customer_city
             | P._x          |                 | _y
             | Jones         |                 | _y

  – Second row constrains _y to cities associated with customer name “Jones”
  – First row prints customer names with same _y value
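The two-row skeleton table behaves like a self-join on the shared variable _y. Here is a rough Python equivalent, assuming invented customer rows. Each `for` clause corresponds to one row of the skeleton table.

```python
# Hypothetical sample data for customer(customer_name, customer_street, customer_city).
customer = {
    ("Hayes", "Main", "Harrison"),
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Turner", "Putnam", "Stamford"),
}

same_city_as_jones = {
    (x,)
    for (x, _street1, y) in customer          # first row:  P._x ... _y
    for (name, _street2, y2) in customer      # second row: Jones ... _y
    if name == "Jones" and y == y2            # same _y in both rows
}

print(sorted(same_city_as_jones))
```

Jones appears in the result too, since nothing in the query excludes the row that matched the second skeleton row.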
Multi-Relation Queries
• QBE supports multiple-relation queries
• Approach is simple:
  – Use same variable name in multiple skeleton tables
  – QBE constrains results to have matching values
• Example:
    Find names of customers with a loan at Perryridge branch

    loan | loan_number | branch_name | amount
         | _x          | Perryridge  |

    borrower | customer_name | loan_number
             | P._y          | _x
Multi-Relation Queries (2)
• Can also perform “not-in” queries
• Example:
    Find names of customers with an account, but no loan.

    borrower | customer_name | loan_number
    ¬        | _x            |

    depositor | customer_name | account_number
              | P._x          |
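The negated borrower row reads as "there is no borrower tuple with this customer name". A minimal sketch with invented sample rows:

```python
# Hypothetical sample data.
borrower = {("Hayes", "L-15"), ("Jones", "L-17"), ("Smith", "L-11")}
depositor = {("Hayes", "A-102"), ("Johnson", "A-101"), ("Lindsay", "A-222")}

# depositor row prints _x; the ¬ row on borrower requires that no
# borrower tuple has _x as its customer_name.
account_but_no_loan = {
    (x,)
    for (x, _account) in depositor
    if not any(name == x for (name, _loan) in borrower)
}

print(sorted(account_but_no_loan))
```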
The Condition Box
• QBE also has a “condition box” for more general constraints
  – Easier way to formulate some queries
• Example:
    Find loan numbers of loans made to Smith or Jones.

    borrower | customer_name | loan_number
             | _n            | P.

    conditions
    _n = Smith or _n = Jones

  – Can state this without condition box too, but it’s more confusing
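The condition box acts as an extra filter on the variables named in the skeleton table. A short sketch, with invented borrower rows:

```python
# Hypothetical sample data for borrower(customer_name, loan_number).
borrower = {
    ("Adams", "L-16"),
    ("Jones", "L-17"),
    ("Smith", "L-11"),
    ("Smith", "L-23"),
}

# Skeleton row: _n under customer_name, P. under loan_number.
# Condition box: _n = Smith or _n = Jones
smith_or_jones_loans = {
    (loan_number,)
    for (n, loan_number) in borrower
    if n == "Smith" or n == "Jones"
}

print(sorted(smith_or_jones_loans))
```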
Microsoft Access QBE
• MS Access includes a QBE interface
  – Called Graphical Query-by-Example (GQBE)
• Like text-based QBE, tables are selected for each query
  – Can also select other queries, to build a query against a derived relation
• Unlike QBE, joins are represented graphically, with links between tables
  – Don’t need to express relationships with variables
• Somewhat different layout than text-based QBE
Example GQBE Queries
Find loan numbers of loans at Perryridge branch:
Example Queries (2)
Report total account deposits per customer city:
Example GQBE Queries (3)
• GQBE queries are translated into SQL
• Previous query, in SQL:
    SELECT customer.customer_city,
           Sum(account.balance) AS SumOfbalance
    FROM (customer INNER JOIN depositor
            ON customer.customer_name = depositor.customer_name)
         INNER JOIN account
            ON depositor.account_number = account.account_number
    GROUP BY customer.customer_city;
  – Like all auto-generated SQL, it’s grungy…
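For comparison, a hand-written form of the same query can be tried against an in-memory SQLite database from Python. This is only a sketch: the column lists for customer and account, the sample rows, and the alias `total_balance` are assumptions, and the joins are written without the parentheses that Access generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schemas; only the columns used by the query matter here.
CREATE TABLE customer  (customer_name TEXT, customer_street TEXT, customer_city TEXT);
CREATE TABLE depositor (customer_name TEXT, account_number TEXT);
CREATE TABLE account   (account_number TEXT, branch_name TEXT, balance INTEGER);

INSERT INTO customer VALUES ('Hayes', 'Main', 'Harrison');
INSERT INTO customer VALUES ('Jones', 'Main', 'Harrison');
INSERT INTO customer VALUES ('Smith', 'North', 'Rye');

INSERT INTO depositor VALUES ('Hayes', 'A-102');
INSERT INTO depositor VALUES ('Jones', 'A-217');
INSERT INTO depositor VALUES ('Smith', 'A-215');

INSERT INTO account VALUES ('A-102', 'Perryridge', 400);
INSERT INTO account VALUES ('A-217', 'Brighton', 750);
INSERT INTO account VALUES ('A-215', 'Mianus', 700);
""")

# Total account deposits per customer city.
rows = conn.execute("""
    SELECT customer.customer_city, SUM(account.balance) AS total_balance
    FROM customer
         JOIN depositor ON customer.customer_name = depositor.customer_name
         JOIN account   ON depositor.account_number = account.account_number
    GROUP BY customer.customer_city
    ORDER BY customer.customer_city
""").fetchall()

print(rows)
conn.close()
```

The `ORDER BY` clause is added only to make the output order deterministic.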
Query By Example
• Another query language, based on domain relational calculus
• Language is inherently visual in nature
  – Allows queries to be stated in a reasonably simple, intuitive way
  – User gives “examples” of what they want to retrieve
• Not used in large-scale applications
  – MS Access is widely used, but primarily for small, simple databases
  – Graphical QBE interface is much simpler than SQL for casual database users
Upcoming Events
• Next lecture is the midterm review
  – Make sure to come to lecture
• Next week we start focusing on database schema design
  – Entity-relationship model
  – Dependencies and normal forms