Advance SQL

Date post: 04-Apr-2018
Upload: ahmad-shdifat
  • 7/30/2019 Advance SQL

    1/103

    SQL and More Databases Final


    Simple SQL Queries

    A SQL query has the form:

    SELECT . . .
    FROM . . .
    WHERE . . .;

    The SELECT clause indicates which attributes should appear in the output.

    The FROM clause gives the relation(s) the query refers to.

    The WHERE clause is a Boolean expression indicating which tuples are of interest.

    The query result is a relation

    Note that the result relation is unnamed.


    Example SQL Query

    Relation schema:

    Course (courseNumber, name, noOfCredits)

    Query: Find all the courses stored in the database

    Query in SQL:

    SELECT *
    FROM Course;

    Note: * means all the attributes in the relations involved.


    Example SQL Query

    Relation schema:

    Movie (title, year, length, filmType)

    Query: Find the titles of all movies stored in the database

    Query in SQL:

    SELECT title

    FROM Movie;


    Example SQL Query

    Relation schema:

    Student (ID, firstName, lastName, address, GPA)

    Query: Find the ID of every student who has GPA > 3

    Query in SQL:

    SELECT ID

    FROM Student

    WHERE GPA > 3;


    Example SQL Query

    Relation schema:

    Student (ID, firstName, lastName, address, GPA)

    Query: Find the ID and last name of every student with first name John who has GPA > 3

    Query in SQL:

    SELECT ID, lastName
    FROM Student
    WHERE firstName = 'John' AND GPA > 3;


    WHERE clause

    The expressions that may follow WHERE are conditions.

    Standard comparison operators include { =, <>, <, >, <=, >= }.

    The values that may be compared include constants and attributes of the relation(s) mentioned in the FROM clause.

    Simple expressions:

    A op Value
    A op B

    where A and B are attributes and op is a comparison operator.

    We may also apply the usual arithmetic operators (+, -, *, /, etc.) to numeric values before comparing them:

    (year - 1930) * (year - 1930) < 100

    The result of a comparison is a Boolean value, TRUE or FALSE.

    Boolean expressions can be combined by the logical operators AND, OR, and NOT.


    Example SQL Query

    Relation schema:

    Movie (title, year, length, filmType)

    Query: Find the titles of all color movies produced in 1990

    Query in SQL:

    SELECT title
    FROM Movie
    WHERE filmType = 'color' AND year = 1990;


    Example SQL Query

    Relation schema:

    Movie (title, year, length, filmType)

    Query: Find the titles of all color movies that are either made after 1970 or are less than 90 minutes long

    Query in SQL:

    SELECT title
    FROM Movie
    WHERE (year > 1970 OR length < 90) AND filmType = 'color';

    Note on precedence rules:

    AND takes precedence over OR, and NOT takes precedence over both.
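
The precedence rule above can be checked by running the query with and without the parentheses. The following is a minimal sketch using Python's standard sqlite3 module; the four rows are invented for illustration and are not part of the slides.

```python
import sqlite3

# In-memory database with the Movie schema from the slides; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INTEGER, length INTEGER, filmType TEXT)")
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?, ?)",
    [
        ("Old Short BW", 1950, 80, "blackAndWhite"),
        ("New Long Color", 1990, 120, "color"),
        ("New Long BW", 1985, 110, "blackAndWhite"),
        ("Old Long Color", 1960, 130, "color"),
    ],
)

# Parenthesized as on the slide: (year > 1970 OR length < 90) AND color.
with_parens = [row[0] for row in conn.execute(
    "SELECT title FROM Movie "
    "WHERE (year > 1970 OR length < 90) AND filmType = 'color' ORDER BY title"
)]

# Without parentheses AND binds tighter: year > 1970 OR (length < 90 AND color).
without_parens = [row[0] for row in conn.execute(
    "SELECT title FROM Movie "
    "WHERE year > 1970 OR length < 90 AND filmType = 'color' ORDER BY title"
)]

print(with_parens)     # only the recent color movie
print(without_parens)  # every movie made after 1970, color or not
```

Dropping the parentheses lets a black-and-white movie made after 1970 into the result, which is why the slide's query needs them.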


    Products and Joins

    SQL has a simple way to couple relations in one query: list each relevant relation in the FROM clause.

    All the relations in the FROM clause are coupled through the Cartesian product (×, in relational algebra).


    Cartesian Product

    From set theory:

    The Cartesian product of two sets R and S is the set of all pairs (a, b) such that a ∈ R and b ∈ S. It is denoted R × S.

    Note:

    In general, R × S ≠ S × R


    Example

    Instance R:

    A  B
    1  2
    3  4

    Instance S:

    B  C   D
    2  5   6
    4  7   8
    9  10  11

    R × S:

    A  R.B  S.B  C   D
    1  2    2    5   6
    1  2    4    7   8
    1  2    9    10  11
    3  4    2    5   6
    3  4    4    7   8
    3  4    9    10  11
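
The product table above can be reproduced mechanically. This is a small sketch using itertools.product from the Python standard library, with the two instances written as lists of tuples; the variable names are chosen here for illustration.

```python
from itertools import product

# Instances R(A, B) and S(B, C, D) from the slide, as lists of tuples.
R = [(1, 2), (3, 4)]
S = [(2, 5, 6), (4, 7, 8), (9, 10, 11)]

# Every tuple of R is concatenated with every tuple of S.
R_x_S = [r + s for r, s in product(R, S)]

# The other order yields tuples with the columns arranged differently.
S_x_R = [s + r for s, r in product(S, R)]

for row in R_x_S:
    print(row)
```

The result has |R| · |S| = 6 tuples, and S × R differs from R × S because the components appear in a different order.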


    Example

    Instance of Student:

    ID   firstName  lastName  GPA  Address
    111  Joe        Smith     4.0  45 Pine av.
    222  Sue        Brown     3.1  71 Main st.
    333  Ann        Johns     3.7  39 Bay st.

    Instance of Course:

    courseNumber  name             noOfCredits
    Comp352       Data structures  3
    Comp353       Databases        4

    SELECT * FROM Student, Course;

    ID   firstName  lastName  GPA  Address      courseNumber  name             noOfCredits
    111  Joe        Smith     4.0  45 Pine av.  Comp352       Data structures  3
    111  Joe        Smith     4.0  45 Pine av.  Comp353       Databases        4
    222  Sue        Brown     3.1  71 Main st.  Comp352       Data structures  3
    222  Sue        Brown     3.1  71 Main st.  Comp353       Databases        4
    333  Ann        Johns     3.7  39 Bay st.   Comp352       Data structures  3
    333  Ann        Johns     3.7  39 Bay st.   Comp353       Databases        4


    Example

    Instance of Student:

    ID   firstName  lastName  GPA  Address
    111  Joe        Smith     4.0  45 Pine av.
    222  Sue        Brown     3.1  71 Main st.
    333  Ann        Johns     3.7  39 Bay st.

    Instance of Course:

    courseNumber  name             noOfCredits
    Comp352       Data structures  3
    Comp353       Databases        4

    SELECT ID, courseNumber
    FROM Student, Course;

    ID   courseNumber
    111  Comp352
    111  Comp353
    222  Comp352
    222  Comp353
    333  Comp352
    333  Comp353


    Example

    Relation schemas:

    Student (ID, firstName, lastName, address, GPA)

    Ugrad (ID, major)

    Query: Find all information available about every undergraduate student

    We can try to compute the Cartesian product (×):

    SELECT * FROM Student, Ugrad;


    Example

    Instance of Student:

    ID   firstName  lastName  GPA  Address
    111  Joe        Smith     4.0  45 Pine av.
    222  Sue        Brown     3.1  71 Main st.
    333  Ann        Johns     3.7  39 Bay st.

    Instance of Ugrad:

    ID   major
    111  CS
    333  EE

    SELECT * FROM Student, Ugrad;

    ID   firstName  lastName  GPA  Address      ID   major
    111  Joe        Smith     4.0  45 Pine av.  111  CS
    111  Joe        Smith     4.0  45 Pine av.  333  EE
    222  Sue        Brown     3.1  71 Main st.  111  CS
    222  Sue        Brown     3.1  71 Main st.  333  EE
    333  Ann        Johns     3.7  39 Bay st.   111  CS
    333  Ann        Johns     3.7  39 Bay st.   333  EE

    Which tuples should be in the query result and which shouldn't?


    Example

    Instance of Student:

    ID   firstName  lastName  GPA  Address
    111  Joe        Smith     4.0  45 Pine av.
    222  Sue        Brown     3.1  71 Main st.
    333  Ann        Johns     3.7  39 Bay st.

    Instance of Ugrad:

    ID   major
    111  CS
    333  EE

    SELECT * FROM Student, Ugrad
    WHERE Student.ID = Ugrad.ID;

    ID   firstName  lastName  GPA  Address      ID   major
    111  Joe        Smith     4.0  45 Pine av.  111  CS
    333  Ann        Johns     3.7  39 Bay st.   333  EE
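
To see the effect of the WHERE condition on the product, the two instances above can be loaded into an in-memory SQLite database. This is only a sketch using Python's standard sqlite3 module; the column types are an assumption, since the slides do not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides only list attribute names.
conn.execute("CREATE TABLE Student (ID INTEGER, firstName TEXT, lastName TEXT, GPA REAL, address TEXT)")
conn.execute("CREATE TABLE Ugrad (ID INTEGER, major TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", [
    (111, "Joe", "Smith", 4.0, "45 Pine av."),
    (222, "Sue", "Brown", 3.1, "71 Main st."),
    (333, "Ann", "Johns", 3.7, "39 Bay st."),
])
conn.executemany("INSERT INTO Ugrad VALUES (?, ?)", [(111, "CS"), (333, "EE")])

# Plain Cartesian product: every student paired with every Ugrad tuple.
product_rows = conn.execute("SELECT * FROM Student, Ugrad").fetchall()

# The join keeps only the pairs whose IDs agree.
joined_rows = conn.execute(
    "SELECT * FROM Student, Ugrad WHERE Student.ID = Ugrad.ID ORDER BY Student.ID"
).fetchall()

print(len(product_rows))
for row in joined_rows:
    print(row)
```

Sue Brown (ID 222) has no matching Ugrad tuple, so she disappears from the joined result.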


    Joins in SQL

    The above query is an example of a join operation.

    There are various kinds of joins, and we will study them later in detail.

    To join relations R1, ..., Rn in SQL:

    List all these relations in the FROM clause

    Express the conditions in the WHERE clause in order to get the desired join


    Joining Relations

    Relation schemas:

    Movie (title, year, length, filmType)

    Owns (title, year, studioName)

    Query: Find the title, length, and studio name of every movie

    Query in SQL:

    SELECT Movie.title, Movie.length, Owns.studioName
    FROM Movie, Owns
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year;

    Is the qualifier Owns in Owns.studioName necessary?


    Joining Relations

    Relation schemas:

    Movie (title, year, length, filmType)

    Owns (title, year, studioName)

    Query: Find the title and length of every movie produced by Disney

    Query in SQL:

    SELECT Movie.title, length
    FROM Movie, Owns
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year AND studioName = 'Disney';


    Joining Relations

    Relation schemas:

    Movie (title, year, length, filmType)
    Owns (title, year, studioName)
    StarsIn (title, year, starName)

    Query: Find the title and length of each movie with Julia Roberts, produced by Disney

    Query in SQL:

    SELECT Movie.title, Movie.length
    FROM Movie, Owns, StarsIn
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year
    AND Movie.title = StarsIn.title AND Movie.year = StarsIn.year
    AND studioName = 'Disney' AND starName = 'Julia Roberts';


    Example

    StarsIn:

    title  year  starName
    T1     1990  JR
    T2     1991  JR

    Owns:

    title  year  studioName
    T1     1990  Disney
    T2     1991  MGM

    Movie:

    title  year  length  filmType
    T1     1990  124     color
    T2     1991  144     color

    SELECT Movie.title, Movie.length
    FROM Movie, Owns, StarsIn
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year AND
    Movie.title = StarsIn.title AND Movie.year = StarsIn.year AND
    studioName = 'Disney' AND starName = 'Julia Roberts';

    Result (JR abbreviates Julia Roberts):

    title  length
    T1     124


    Example

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)

    Query: Find the name of the producer of Star Wars

    Query in SQL:

    SELECT Exec.name
    FROM Movie, Exec
    WHERE Movie.title = 'Star Wars' AND
    Movie.producerC# = Exec.cert#;


    Example

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)

    Query: Find the name of the producer of Star Wars

    Query with subquery:

    SELECT name
    FROM Exec
    WHERE cert# = (SELECT producerC#
                   FROM Movie
                   WHERE title = 'Star Wars');


    Example

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)
    StarsIn (title, year, starName)

    Query: Find the names of the producers of Harrison Ford's movies

    Query in SQL:

    SELECT name
    FROM Exec
    WHERE cert# IN (SELECT producerC#
                    FROM Movie
                    WHERE (title, year) IN (SELECT title, year
                                            FROM StarsIn
                                            WHERE starName = 'Harrison Ford'));


    Example

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)
    StarsIn (title, year, starName)

    Query: Find the names of the producers of Harrison Ford's movies

    Query in SQL (as a join instead of nested subqueries):

    SELECT Exec.name
    FROM Exec, Movie, StarsIn
    WHERE Exec.cert# = Movie.producerC# AND
    Movie.title = StarsIn.title AND Movie.year = StarsIn.year AND
    starName = 'Harrison Ford';


    Correlated Subqueries

    Relation schema:

    Movie (title, year, length, filmType, studioName, producerC#)

    Query: Find movie titles that appear more than once

    Query in SQL:

    SELECT title
    FROM Movie Old
    WHERE year < ANY (SELECT year
                      FROM Movie
                      WHERE title = Old.title);

    Note the scopes of the variables in this query.


    Correlated Subqueries

    Query in SQL:

    SELECT title
    FROM Movie Old
    WHERE year < ANY (SELECT year
                      FROM Movie
                      WHERE title = Old.title);

    The condition in the outer WHERE is true only if there is a movie with the same title as Old.title that has a later year.

    The query will produce a title one fewer times than there are movies with that title.

    What would be the result if we used <> instead of <? For a movie title appearing 3 times, we would get 3 copies of the title in the output.
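
The counting argument above can be tried out on a small instance. SQLite, used here through Python's standard sqlite3 module, does not support the ANY quantifier, so this sketch rewrites `year < ANY (...)` as an equivalent correlated EXISTS subquery; that rewrite and the helper name `titles` are choices made for this illustration, not something taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO Movie VALUES (?, ?)", [
    ("King Kong", 1933), ("King Kong", 1976), ("King Kong", 2005),
    ("Star Wars", 1977),
])

def titles(op):
    """Run the correlated query with comparison `op` ("<" or "<>")."""
    # Old.year op ANY (inner years) is expressed as EXISTS over the inner tuples.
    sql = (
        "SELECT title FROM Movie Old WHERE EXISTS ("
        "SELECT 1 FROM Movie WHERE title = Old.title AND Old.year " + op + " year)"
    )
    return [row[0] for row in conn.execute(sql)]

later = titles("<")       # a later movie with the same title exists
different = titles("<>")  # any other-year movie with the same title exists

print(later)
print(different)
```

With three movies sharing one title, the `<` version returns the title twice and the `<>` version returns it three times; the title that occurs once is never returned.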


    Aggregation in SQL

    SQL provides five operators that apply to a column of a relation and produce some kind of summary.

    These operators are called aggregations.

    These operators are used by applying them to a scalar-valued expression, typically a column name, in a SELECT clause.


    Aggregation Operators

    SUM: the sum of values in the column

    AVG: the average of values in the column

    MIN: the least value in the column

    MAX: the greatest value in the column

    COUNT: the number of values in the column, including duplicates, unless the keyword DISTINCT is used explicitly


    Example

    Relation schema:

    Exec (name, address, cert#, netWorth)

    Query: Find the average net worth of all movie executives

    Query in SQL:

    SELECT AVG(netWorth)
    FROM Exec;

    The result is the sum of all values in the column netWorth divided by the number of these values.

    In general, if a tuple appears n times in a relation, it will be counted n times when computing the average.


    Example

    Relation schema:

    Exec (name, address, cert#, netWorth)

    Query: How many tuples are there in the Exec relation?

    Query in SQL:

    SELECT COUNT(*)
    FROM Exec;

    The use of * as a parameter is unique to COUNT; using * does not make sense for other aggregation operations.


    Example

    Relation schema:

    Exec (name, address, cert#, netWorth)

    Query: How many different names are there in the Exec relation?

    Query in SQL:

    SELECT COUNT(DISTINCT name)
    FROM Exec;

    At query processing time, the system first eliminates the duplicates from the column name, and then counts the number of values that remain.
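
The three aggregate queries above can be run side by side on a tiny instance. In this sketch, built on Python's standard sqlite3 module, the rows are invented and the attribute cert# is renamed certNo, because `#` is not accepted in an unquoted SQLite identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# certNo stands in for cert#; all rows are made up for illustration.
conn.execute("CREATE TABLE Exec (name TEXT, address TEXT, certNo INTEGER, netWorth INTEGER)")
conn.executemany("INSERT INTO Exec VALUES (?, ?, ?, ?)", [
    ("A. Lee", "1 Oak st.", 1, 10),
    ("B. Kim", "2 Elm st.", 2, 20),
    ("A. Lee", "3 Ash st.", 3, 60),  # same name as the first executive
])

tuple_count = conn.execute("SELECT COUNT(*) FROM Exec").fetchone()[0]
distinct_names = conn.execute("SELECT COUNT(DISTINCT name) FROM Exec").fetchone()[0]
average = conn.execute("SELECT AVG(netWorth) FROM Exec").fetchone()[0]

print(tuple_count, distinct_names, average)
```

COUNT(*) counts all three tuples, COUNT(DISTINCT name) counts the repeated name once, and AVG divides the sum 90 by the three values.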


    Aggregation -- Grouping

    Often we need to consider the tuples in an SQL query in groups, with regard to the value of some other column(s).

    Example: suppose we want to compute the total length in minutes of movies produced by each studio:

    Movie (title, year, length, filmType, studioName, producerC#)

    We must group the tuples in the Movie relation according to their studio, and get the sum of the length values within each group; the result would be something like:

    studio  SUM(length)
    Disney  12345
    MGM     54321


    Aggregation - Grouping

    Relation schema:

    Movie (title, year, length, filmType, studioName, producerC#)

    Query: What is the total length in minutes of the movies produced by each studio?

    Query in SQL:

    SELECT studioName, SUM(length)
    FROM Movie
    GROUP BY studioName;

    Whatever aggregation is used in the SELECT clause will be applied only within groups.

    Only those attributes mentioned in the GROUP BY clause may appear unaggregated in the SELECT clause.

    Can we use GROUP BY without using aggregation? (Yes/No)


    Aggregation -- Grouping

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)

    Query: For each producer (name), list the total length of the films produced

    Query in SQL:

    SELECT Exec.name, SUM(Movie.length)

    FROM Exec, Movie

    WHERE Movie.producerC# = Exec.cert#

    GROUP BY Exec.name;


    Aggregation: HAVING clause

    We might be interested not in all groups of tuples but only in those that satisfy certain conditions.

    We can follow a GROUP BY clause with a HAVING clause.

    HAVING is followed by some conditions about the group.

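    HAVING is normally used together with GROUP BY; standard SQL also accepts HAVING without GROUP BY, in which case the whole result is treated as a single group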


    Aggregation: HAVING clause

    Relation schemas:

    Movie (title, year, length, filmType, studioName, producerC#)
    Exec (name, address, cert#, netWorth)

    Query: For those producers who made at least one film prior to 1930, list the total length of the films produced

    Query in SQL:

    SELECT Exec.name, SUM(Movie.length)
    FROM Exec, Movie
    WHERE producerC# = cert#
    GROUP BY Exec.name
    HAVING MIN(Movie.year) < 1930;


    Aggregation: HAVING clause

    This query chooses the group based on a property of the group:

    SELECT Exec.name, SUM(Movie.length)
    FROM Exec, Movie
    WHERE producerC# = cert#
    GROUP BY Exec.name
    HAVING MIN(Movie.year) < 1930;

    This query chooses the movies based on a property of each movie tuple:

    SELECT Exec.name, SUM(Movie.length)
    FROM Exec, Movie
    WHERE producerC# = cert# AND Movie.year < 1930
    GROUP BY Exec.name;

    Note the difference!
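
The difference between the two queries shows up as soon as one producer has films on both sides of 1930. The sketch below uses Python's standard sqlite3 module; the rows are invented, the schemas are trimmed to the attributes the queries need, and certNo and producerC stand in for cert# and producerC#, which SQLite would not accept unquoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed schemas with assumed column names; all rows are made up.
conn.execute("CREATE TABLE Exec (name TEXT, certNo INTEGER)")
conn.execute("CREATE TABLE Movie (title TEXT, year INTEGER, length INTEGER, producerC INTEGER)")
conn.executemany("INSERT INTO Exec VALUES (?, ?)", [("P1", 1), ("P2", 2)])
conn.executemany("INSERT INTO Movie VALUES (?, ?, ?, ?)", [
    ("M1", 1925, 100, 1),  # P1, before 1930
    ("M2", 1940, 120, 1),  # P1, after 1930
    ("M3", 1950, 90, 2),   # P2, after 1930
])

# HAVING filters whole groups: P1 qualifies, and all of P1's films are summed.
having_rows = conn.execute(
    "SELECT Exec.name, SUM(Movie.length) FROM Exec, Movie "
    "WHERE producerC = certNo GROUP BY Exec.name HAVING MIN(Movie.year) < 1930"
).fetchall()

# WHERE filters tuples first: only P1's pre-1930 film is summed.
where_rows = conn.execute(
    "SELECT Exec.name, SUM(Movie.length) FROM Exec, Movie "
    "WHERE producerC = certNo AND Movie.year < 1930 GROUP BY Exec.name"
).fetchall()

print(having_rows)
print(where_rows)
```

Both queries report only P1, but with different totals: the HAVING version sums every film of a qualifying producer, while the WHERE version sums only the films that themselves predate 1930.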


    Order By

    The SQL queries we have looked at so far return an unordered relation/bag; to get the tuples in a specific order, add an ORDER BY clause

    Movie (title, year, length, filmType, studioName, producerC#)

    SELECT Exec.name, SUM(Movie.length)

    FROM Exec, Movie

    WHERE producerC# = cert#

    GROUP BY Exec.name

    HAVING MIN(Movie.year) < 1930

    ORDER BY Exec.name ASC;

    In general:

    ORDER BY A1 ASC, B DESC, C ASC;


    Database Modifications

    Next we will look at SQL statements that do not return something, but rather change the state of the database.

    There are three types of such SQL statements/transactions:

    Insert tuples into a relation

    Delete certain tuples from a relation

    Update values of certain components of certain existing tuples

    We refer to these types of operations collectively as database modifications, and refer to such requests as transactions.


    Insertion

    The insertion statement consists of:

    The keyword INSERT INTO

    The name of a relation R

    A parenthesized list of attributes of the relation R

    The keyword VALUES

    A tuple expression, that is, a parenthesized list of concrete values, one for each attribute in the attribute list

    The form of an insert statement:

    INSERT INTO R(A1, ..., An) VALUES (v1, ..., vn);

    A tuple is created and added, where vi is the value of attribute Ai, for i = 1, 2, ..., n


    Insertion

    Relation schema:

    StarsIn (title, year, starName)

    Update the database: Add Sydney Greenstreet to the list of stars of The Maltese Falcon

    In SQL:

    INSERT INTO StarsIn (title, year, starName)
    VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet');

    Another formulation of this statement, relying on the declared order of the attributes:

    INSERT INTO StarsIn
    VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet');


    Insertion

    The previous insertion statement was very simple: it added only one tuple to a relation.

    Instead of using explicit values for one tuple, we can compute a set of tuples to be inserted using a subquery.

    This subquery replaces the keyword VALUES and the tuple expression in the INSERT statement.


    Insertion

    Database schema:

    Studio (name, address, presC#)
    Movie (title, year, length, filmType, studioName, producerC#)

    Update the database: Add to Studio all studio names mentioned in the Movie relation

    If the list of attributes does not include all attributes of relation R, then the tuple created has default values for the missing attributes.

    Since there is no way to determine an address or a president for such a studio, NULL will be used for the attributes address and presC#.


    Insertion

    Database schema:

    Studio (name, address, presC#)
    Movie (title, year, length, filmType, studioName, producerC#)

    Update the database: Add to Studio all studio names mentioned in the Movie relation

    In SQL:

    INSERT INTO Studio(name)
    SELECT DISTINCT studioName
    FROM Movie
    WHERE studioName NOT IN (SELECT name
                             FROM Studio);
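
The INSERT with a subquery can be exercised on a small instance. This sketch uses Python's standard sqlite3 module; the schemas are trimmed, presC stands in for presC# (which SQLite would not accept unquoted), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed schemas with assumed column names; the rows are made up.
conn.execute("CREATE TABLE Studio (name TEXT, address TEXT, presC INTEGER)")
conn.execute("CREATE TABLE Movie (title TEXT, studioName TEXT)")
conn.execute("INSERT INTO Studio VALUES ('Disney', 'some address', 1)")
conn.executemany("INSERT INTO Movie VALUES (?, ?)", [
    ("M1", "Disney"), ("M2", "MGM"), ("M3", "MGM"),
])

# Insert every studio name that appears in Movie but not yet in Studio.
conn.execute(
    "INSERT INTO Studio(name) "
    "SELECT DISTINCT studioName FROM Movie "
    "WHERE studioName NOT IN (SELECT name FROM Studio)"
)

studios = conn.execute("SELECT name, address, presC FROM Studio ORDER BY name").fetchall()
print(studios)
```

MGM is added exactly once thanks to DISTINCT, and its address and presC are NULL (None in Python) because no default was declared for them.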


    Deletion

    A deletion statement consists of:

    The keyword DELETE FROM

    The name of a relation R

    The keyword WHERE

    A condition

    The form of a delete statement:

    DELETE FROM R WHERE condition;

    The effect of executing this statement is that every tuple in relation R satisfying the condition will be deleted from R

    Note: unlike INSERT, a DELETE normally carries a WHERE clause; if the clause is omitted, every tuple in R is deleted


    Deletion

    Relation schema:

    StarsIn (title, year, starName)

    Update: Delete the fact that Sydney Greenstreet was a star in The Maltese Falcon

    In SQL:

    DELETE FROM StarsIn
    WHERE title = 'The Maltese Falcon' AND starName = 'Sydney Greenstreet';


    Deletion

    Relation schema:

    Exec (name, address, cert#, netWorth)

    Update: Delete every movie executive whose net worth is < $10,000,000

    In SQL:

    DELETE FROM Exec
    WHERE netWorth < 10,000,000;

    Anything wrong here?!


    Deletion

    Relation schemas:

    Studio (name, address, presC#)
    Movie (title, year, length, filmType, studioName, producerC#)

    Update: Delete from Studio all studios that are not mentioned in Movie (i.e., we don't want to have non-producing studios)

    In SQL:

    DELETE FROM Studio
    WHERE name NOT IN (SELECT studioName
                       FROM Movie);


    Defining Database Schema

    To create a table in SQL:

    CREATE TABLE name (list of elements);

    Principal elements are attributes and their types, but key declarations and constraints may also appear.

    Example:

    CREATE TABLE Star (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );


    Defining Database Schema

    To delete a table:

    DROP TABLE name;

    Example:

    DROP TABLE Star;


    Data types

    INT or INTEGER

    REAL or FLOAT

    DECIMAL(n, d) or NUMERIC(n, d): for example, DECIMAL(6, 2) can hold 0123.45

    CHAR(n) / BIT(n): fixed-length character/bit string; the unused part is padded with the pad character

    VARCHAR(n) / BIT VARYING(n): variable-length strings of up to n characters


    Data types (cont'd)

    Time:

    SQL2 format is TIME 'hh:mm:ss[.ss...]'

    Date:

    SQL2 format is DATE 'yyyy-mm-dd' (the first m is 0 or 1)

    The default format of date in Oracle is dd-mon-yy

    Example:

    CREATE TABLE Days (d DATE);
    INSERT INTO Days VALUES ('08-aug-02');

    The Oracle function to_date converts a specified format into the default format, e.g.:

    INSERT INTO Days VALUES (to_date('2002-08-08', 'yyyy-mm-dd'));


    Altering Relation Schemas

    Adding Columns

    Add an attribute to a relation R with:

    ALTER TABLE R ADD column declaration;

    Example: Add attribute phone to table Star

    ALTER TABLE Star ADD phone CHAR(16);

    Removing Columns

    Remove an attribute from a relation R using DROP:

    ALTER TABLE R DROP COLUMN column_name;

    Example: Remove column phone from Star

    ALTER TABLE Star DROP COLUMN phone;

    Note: A column can't be dropped if it is the only column


    Attribute Properties

    We can declare the following properties for the value of an attribute:

    NOT NULL: every tuple must have a real (non-null) value for this attribute

    DEFAULT value: NULL is the default value for every attribute A, but the owner of the relation can define some other value as the default instead of NULL


    Attribute Properties

    CREATE TABLE Star (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1) DEFAULT '?',
        birthdate DATE NOT NULL
    );

    Example: Add an attribute with a default value:

    ALTER TABLE Star ADD phone CHAR(16) DEFAULT 'unlisted';

    INSERT INTO Star(name, birthdate) VALUES ('Sally', '0000-00-00');

    name   address  gender  birthdate   phone
    Sally  NULL     ?       0000-00-00  unlisted

    INSERT INTO Star(name, phone) VALUES ('Sally', '333-2255');

    This insertion could not be performed, since the value for birthdate is not given and it is disallowed to use NULL as the default.
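
The behaviour of DEFAULT and NOT NULL described above can be observed directly. This is a sketch with Python's standard sqlite3 module; SQLite does not validate date strings, so the placeholder date from the slide is stored as given, and other systems may reject it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Star (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1) DEFAULT '?',
        birthdate DATE NOT NULL
    )
""")
conn.execute("ALTER TABLE Star ADD COLUMN phone CHAR(16) DEFAULT 'unlisted'")

# Missing attributes are filled with their defaults (NULL when none is declared).
conn.execute("INSERT INTO Star(name, birthdate) VALUES ('Sally', '0000-00-00')")
row = conn.execute("SELECT name, address, gender, birthdate, phone FROM Star").fetchone()

# Omitting birthdate would require NULL, which NOT NULL forbids.
try:
    conn.execute("INSERT INTO Star(name, phone) VALUES ('Sally', '333-2255')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(row)
print(rejected)
```

The first insertion yields the tuple shown on the slide, and the second one fails with an integrity error instead of storing a NULL birthdate.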


    Schema Refinement

    Functional Dependencies: Essential to Normalization Theory


    Functional Dependencies

    Consider the relation:

    Movie (title, year, length, filmType, studioName, starName)

    What are the functional dependencies?

    title, year → length
    title, year → filmType
    title, year → studioName

    Combined: title, year → length, filmType, studioName

    Note that the FD title, year → starName does not hold


    Logical Implication: Reasoning with FDs

    Consider relation R(A, B, C) with the set of FDs:

    F = {A → B, B → C}

    We can deduce from F that A → C also holds on R. How? Apply the definition.

    To detect possible redundancy, is it necessary to consider all the given FDs? As shown above, there might be some additional hidden (nontrivial) FDs implied by a given set of FDs.


    Logical Implication (Cont'd)

    Consider R(A1, A2, A3, A4, A5) with FDs:

    F = { A1 → A2, A2 → A3, A2A3 → A4, A2A3A4 → A5 }

    Prove that F ⊭ A5 → A1

    Solution method: Provide a counter-example; give a relation instance r of R that satisfies every FD in F but not A5 → A1

         A1  A2  A3  A4  A5
    t1:  0   1   1   1   1
    t2:  1   1   1   1   1

    A desired instance r of R.
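
The counter-example can be verified mechanically by checking each FD against the two tuples. The sketch below is plain Python; the helper name `satisfies` and the representation of tuples as dictionaries are choices made here for illustration.

```python
def satisfies(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the FD lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# The instance r from the slide.
r = [
    {"A1": 0, "A2": 1, "A3": 1, "A4": 1, "A5": 1},  # t1
    {"A1": 1, "A2": 1, "A3": 1, "A4": 1, "A5": 1},  # t2
]

F = [
    (["A1"], ["A2"]),
    (["A2"], ["A3"]),
    (["A2", "A3"], ["A4"]),
    (["A2", "A3", "A4"], ["A5"]),
]

all_of_F_hold = all(satisfies(r, lhs, rhs) for lhs, rhs in F)
a5_determines_a1 = satisfies(r, ["A5"], ["A1"])

print(all_of_F_hold, a5_determines_a1)
```

Every FD in F holds on r, while t1 and t2 agree on A5 but differ on A1, so A5 → A1 cannot be implied by F.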


    Closure of a set of FDs

    Defn: The closure of F, denoted F+, is the set of FDs that are logically implied by F

    How can we compute F+? Definitely, F+ includes F, but possibly more FDs

    We need to know how to reason about FDs


    Equivalence

    Defn: Suppose R is a relation schema, and S and T are sets of functional dependencies on R.

    T and S are equivalent (S ≡ T) if each of them logically implies the other

    Example: Suppose R = {A, B, C}, and

    S = {A → B, B → C, A → C}

    T = {A → B, B → C}

    We can show that S ≡ T


    Armstrong's Axioms [1974]

    R is a relation schema, and X, Y and Z are subsets of R.

    Reflexivity

    If Y ⊆ X, then X → Y (trivial FDs)

    Augmentation

    If X → Y, then XZ → YZ, for every Z

    Transitivity

    If X → Y and Y → Z, then X → Z

    These are sound and complete inference rules for FDs


    Additional rules / axioms

    Other useful rules that follow from Armstrong's Axioms

    Union (Combining) Rule

    If X → Y and X → Z, then X → YZ

    Decomposition (Splitting) Rule

    If X → YZ, then X → Y and X → Z

    Pseudotransitivity Rule

    If X → Y and WY → Z, then XW → Z

    NOTE: X, Y, Z, and W are sets of attributes


    Example: Discovering hidden FDs

    Consider a relation schema R = {A, B, C, G, H, I} with FDs F = { A → B, A → C, CG → H, CG → I, B → H }

    Using these rules, we can derive the following FDs:

    Since A → B and B → H, then A → H, by transitivity

    Since CG → H and CG → I, then CG → HI, by union

    Since A → C, then AG → CG, by augmentation

    Now, since AG → CG and CG → I, then AG → I, by transitivity (exercise: do the same for AG → H)

    Many trivial dependencies can be derived(!) by augmentation


    Computing the Closure of Attributes

    Given a set F of FDs and a set X of attributes, how do we compute the closure of X w.r.t. F?

    Starting with X, we repeatedly expand the set by adding the right hand side (RHS) of every FD once we have included its left hand side (LHS) in the set.

    When the set cannot be expanded anymore, we have obtained the result, X+


    An Algorithm to Compute X+ under F

    X+ ← X   (initialization step)
    repeat
        for each FD W → Z in F do:
            if W ⊆ X+ then
                X+ ← X+ ∪ Z   // include Z in the result
    until X+ does not change anymore

    Complexity question: In the worst case, how many times will the repeat statement be executed?


    Example

    Consider a relation scheme R = { A, B, C, D, E, F } with the set of FDs { AB → C, BC → AD, D → E, CF → B }

    Compute {A, B}+

    Execution result at each iteration:

    X+ = {A, B}

    Using AB → C, we get X+ = {A, B, C}

    Using BC → AD, we get X+ = {A, B, C, D}

    Using D → E, we get X+ = {A, B, C, D, E}

    No more change to X+ is possible.

    X+ = {A, B}+ = {A, B, C, D, E}

    Does the order in which FDs appear matter in this computation?
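
The closure algorithm and the worked example above translate almost line by line into code. The following is a sketch in plain Python, assuming single-letter attribute names so that a string such as "AB" can stand for the set {A, B}; the function name `closure` is chosen here.

```python
def closure(attrs, fds):
    """Compute the closure of the attribute set `attrs` under the FDs `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names.
    """
    result = set(attrs)              # X+ <- X
    changed = True
    while changed:                   # repeat ... until X+ does not change
        changed = False
        for lhs, rhs in fds:
            # If W is contained in X+ and Z adds something new, include Z.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

ab_closure = closure("AB", F)
print(sorted(ab_closure))

# Listing the FDs in the opposite order gives the same closure.
reversed_closure = closure("AB", list(reversed(F)))
```

The order of the FDs can change how many passes of the loop are needed, but not the final set, because attributes are only ever added.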


    Implication Problem Revisited

    Is a given FD X → Y implied by a set F of FDs?

    That is to ask whether X → Y is in F+

    How to answer this question? An algorithm for this:

    Compute X+ under F, and

    Check if Y is in X+

    If yes, then F ⊨ X → Y

    Otherwise F ⊭ X → Y


    Example

    Consider a relation schema R = { A, B, C, D, E, F } with the FDs F = { AB → C, BC → AD, D → E, CF → B }

    True/false: F ⊨ AB → D?

    Two steps:

    Compute X+ = {A, B}+ = {A, B, C, D, E}

    Check if D ∈ X+

    Yes, AB → D is implied by F


    Example

    Consider a relation scheme R = { A, B, C, D, E, F } with FDs F = { AB → C, BC → AD, D → E, CF → B }

    Is D → A implied by F?

    Two steps:

    Compute X+ = {D}+ = {D, E}

    Check if A ∈ X+

    Since A is not in {D, E}, we conclude that D → A is not implied by F
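
The two-step implication test used in the last two examples can be written as a short function on top of the closure computation. This is a sketch in plain Python with single-letter attributes; `closure` and `implies` are names chosen here.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """Return True if the FD lhs -> rhs follows from `fds`."""
    # Step 1: compute X+; step 2: check that Y is contained in X+.
    return set(rhs) <= closure(lhs, fds)

F = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

ab_implies_d = implies(F, "AB", "D")
d_implies_a = implies(F, "D", "A")
print(ab_implies_d, d_implies_a)
```

This reproduces both answers above: AB → D follows from F, while D → A does not.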


    Closures and Keys

    When will X+ include all attributes of a relation R?

    Exactly when X is a key (superkey) of R

    To check if X is a candidate key of R, we may check if:

    1. X+ contains all attributes of R, i.e., X+ = R, and

    2. No proper subset of X has this property, i.e., ∀A ∈ X, (X − {A})+ ≠ R

    Knowledge about keys is essential for Normal forms
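
The two conditions for a candidate key can be checked with the attribute closure. The sketch below is plain Python with single-letter attributes; it reuses the schema R = {A, B, C, D, E, F} and the FDs from the earlier closure examples, and the function names are chosen here.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(x, schema, fds):
    """Condition 1: X+ contains every attribute of the schema."""
    return closure(x, fds) == set(schema)

def is_candidate_key(x, schema, fds):
    """Conditions 1 and 2: X is a superkey and dropping any attribute breaks that."""
    return is_superkey(x, schema, fds) and all(
        not is_superkey(set(x) - {a}, schema, fds) for a in x
    )

R = "ABCDEF"
F = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

for candidate in ["AB", "ABF", "CF", "ABCF"]:
    print(candidate, is_superkey(candidate, R, F), is_candidate_key(candidate, R, F))
```

Removing one attribute at a time is enough for condition 2, since any smaller subset of X is contained in one of those sets and closures only grow as the starting set grows.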


    Canonical Cover

    Number of iterations of the algorithm for computing the

    closure of a set of attributes depends on the number of

    FDs in F

    The same will be observed for other algorithms that we will study

    (such as the decomposition algorithms)

    Can we minimize F?


    Covers

    FDs can be represented in several different ways without changing the set of legal/valid instances of the relation.

    Let F and G be sets of FDs. We say G follows from F if every relation instance that satisfies F also satisfies G. In symbols: F ⊨ G.

    We may also say: G is implied by F, or G is covered by F.

    If both F ⊨ G and G ⊨ F hold, then we say that G and F are equivalent and denote this by F ≡ G.

    Note that F ≡ G iff F+ = G+.

    If F ≡ G, we may also say: G is a cover of F, and vice versa.


    Canonical Cover

    Let F be a set of FDs. A canonical / minimal cover of F is a set G of FDs that satisfies the following:

    1. G is equivalent to F; that is, G ≡ F

    2. G is minimal; that is, if we obtain a set H of FDs from G by deleting one or more of its FDs, or by deleting one or more attributes from some FD in G, then F ≢ H

    3. Every FD in G is of the form X → A, where A is a single attribute


    Canonical Cover

    A canonical cover G is minimal in two respects:

    1. Every FD in G is required in order for G to be equivalent to F

    2. Every FD in G is as small as possible, that is, each attribute on the left hand side is necessary.

    Recall: the RHS of every FD in G is a single attribute


    Computing Canonical Cover

    Given a set F of FDs, how to compute a canonical cover G of F?

    Step 1: Put the FDs in the standard form

    Initialize G := F

    Replace each FD X → A1A2...Ak in G with X → A1, X → A2, ..., X → Ak

    Step 2: Minimize the left hand side of each FD

    E.g., for each FD AB → C in G, check if A or B on the LHS is redundant, i.e., whether (G − {AB → C} ∪ {A → C})+ = F+

    Step 3: Delete redundant FDs

    For each FD X → A in G, check if it is redundant, i.e., whether (G − {X → A})+ = F+


    Computing Canonical Cover

    R = { A, B, C, D, E, H}

    F = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    Step 1: Put the FDs in the standard form

    All the given FDs are already in the standard form

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }


    Computing Canonical Cover

    Step 2: Check for left redundancy

    For every FD X → A in G, check if the closure of a subset of X determines A. If so, remove the redundant attribute(s) from X

    R = { A, B, C, D, E, H }

    F = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

  • 7/30/2019 Advance SQL

    81/103

    Computing Canonical Cover

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    A → B
    obviously OK (no left redundancy)

    DE → A
    D⁺ = D
    E⁺ = E
    OK (no left redundancy)

    Computing Canonical Cover

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    BC → E
    B⁺ = B
    C⁺ = C
    OK (no left redundancy)

    Computing Canonical Cover

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    AC → E
    A⁺ = AB
    C⁺ = C
    OK (no left redundancy)

    Computing Canonical Cover

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    BCD → A
    B⁺ = B, C⁺ = C, D⁺ = D
    BC⁺ = BCE, CD⁺ = CD, BD⁺ = BD
    OK (no left redundancy)

    Computing Canonical Cover

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, AED → B }

    AED → B
    A⁺ = AB
    E and D are redundant; we can remove them from AED → B

    G = { A → B, DE → A, BC → E, AC → E, BCD → A, A → B }

    After dropping the duplicate A → B:

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }
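
The left-redundancy checks of Step 2 can be reproduced mechanically by listing, for each FD in G, the proper non-empty subsets of its left-hand side whose closure already contains the right-hand side. The sketch below does this for the example set; the helper name `reducing_subsets` and the string encoding of FDs are assumptions for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def reducing_subsets(lhs, rhs, fds):
    """Proper, non-empty subsets of lhs whose closure contains rhs."""
    found = []
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if set(rhs) <= closure(subset, fds):
                found.append("".join(subset))
    return found


# G after Step 1 of the worked example.
G = [("A", "B"), ("DE", "A"), ("BC", "E"),
     ("AC", "E"), ("BCD", "A"), ("AED", "B")]

for lhs, rhs in G:
    print(lhs, "->", rhs, reducing_subsets(lhs, rhs, G))
```

Only AED → B yields a non-empty list. Among the reported subsets, A is the smallest, which matches the reduction to A → B chosen above. ED is also reported, because DE → A and A → B together give DE → B, so the order in which attributes are tried can affect the intermediate result.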

    Computing Canonical Cover

    Step 3: Check for redundant FDs

    For every FD X → A in G:

    Remove X → A from G; call the result G′

    Compute X⁺ under G′

    If A ∈ X⁺, then X → A is redundant and hence we remove
    the FD X → A from G (that is, we rename G′ to G)

    Computing Canonical Cover

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }

    Remove DE → A from G

    G′ = { BC → E, AC → E, BCD → A, A → B }

    Compute DE⁺ under G′: DE⁺ = DE

    Since A ∉ DE, the FD DE → A is not redundant

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }

    Computing Canonical Cover

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }

    Remove BC → E from G

    G′ = { DE → A, AC → E, BCD → A, A → B }

    Compute BC⁺ under G′: BC⁺ = BC

    Since E ∉ BC, the FD BC → E is not redundant

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }

    Computing Canonical Cover

    G = { DE → A, BC → E, AC → E, BCD → A, A → B }

    Remove AC → E from G

    G′ = { DE → A, BC → E, BCD → A, A → B }

    Compute AC⁺ under G′: AC⁺ = ACBE

    Since E ∈ ACBE, AC → E is redundant; remove it from G

    G = { DE → A, BC → E, BCD → A, A → B }

    Computing Canonical Cover

    G = { DE → A, BC → E, BCD → A, A → B }

    Remove BCD → A from G

    G′ = { DE → A, BC → E, A → B }

    Compute BCD⁺ under G′: BCD⁺ = BCDEA

    Since A ∈ BCDEA, this FD is redundant; remove it from G

    G = { DE → A, BC → E, A → B }

    Computing Canonical Cover

    G = { DE → A, BC → E, A → B }

    Remove A → B from G

    G′ = { DE → A, BC → E }

    Compute A⁺ under G′: A⁺ = A

    This FD is not redundant (Another reason why this is true?)

    G = { DE → A, BC → E, A → B }

    G is a minimal cover for F
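
Step 3 of the worked example can be replayed with a short loop: take each FD out, compute the closure of its left-hand side under the rest, and keep it out if the right-hand side is still reachable. This is a sketch only; `remove_redundant` and the string encoding of FDs are assumed names for illustration, and the FDs are visited in the order used above.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def remove_redundant(fds):
    """Drop every FD that follows from the FDs still kept."""
    g = list(fds)
    for fd in fds:
        lhs, rhs = fd
        rest = [x for x in g if x != fd]       # this plays the role of G'
        if set(rhs) <= closure(lhs, rest):     # is A in X+ under G'?
            g = rest                           # redundant: rename G' to G
    return g


# G as it stands after Step 2 of the worked example.
G = [("DE", "A"), ("BC", "E"), ("AC", "E"), ("BCD", "A"), ("A", "B")]

print(remove_redundant(G))
# [('DE', 'A'), ('BC', 'E'), ('A', 'B')]
```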

    Several Canonical Covers Possible?

    Relation R = {A, B, C} with F = { A → B, A → C, B → A, B → C, C → B, C → A }

    Several canonical covers exist:

    G = { A → B, B → A, B → C, C → B }

    G = { A → B, B → C, C → A }

    [Figure: three diagrams over the attributes A, B, and C]

    Can you find more?
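
That both sets above are covers of F, and that neither contains a redundant FD, can be checked with closures. The following is a sketch under the same assumptions as before (single-letter attributes, FDs as (lhs, rhs) pairs); the names `G1`, `G2`, `equivalent` and `no_redundant_fd` are illustrative and not taken from the slides. Since every left-hand side here is a single attribute, no left-hand-side reduction is possible and only redundancy needs to be tested.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def equivalent(f, g):
    """True if each set implies every FD of the other."""
    covers = lambda a, b: all(set(r) <= closure(l, a) for l, r in b)
    return covers(f, g) and covers(g, f)


def no_redundant_fd(g):
    """True if removing any single FD from g loses information."""
    return all(
        not set(r) <= closure(l, [x for x in g if x != (l, r)])
        for l, r in g
    )


F = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "B"), ("C", "A")]
G1 = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
G2 = [("A", "B"), ("B", "C"), ("C", "A")]

for g in (G1, G2):
    print(equivalent(F, g), no_redundant_fd(g))
# True True
# True True
```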

    How to Deal with Redundancy?

    Relation Schema:

    Star(name, address, representingFirm, spokesPerson)

    Relation Instance:

    Name           Address        RepresentingFirm   SpokesPerson
    Carrie Fisher  123 Maple      Star One           Joe Smith
    Harrison Ford  789 Palm dr.   Star One           Joe Smith
    Mark Hamill    456 Oak rd.    Movies & Co        Mary Johns

    F = { name → address, representingFirm, spokesPerson,
          representingFirm → spokesPerson }

    We can decompose this relation into two smaller relations

    How to Deal with Redundancy?

    Relation Schema:

    Star(name, address, representingFirm, spokesPerson)

    Decompose this relation into the following relations:

    Star(name, address, representingFirm) with F1 = { name → address, representingFirm }

    and

    Firm(representingFirm, spokesPerson) with F2 = { representingFirm → spokesPerson }

    F = { representingFirm → spokesPerson }

    Decomposition

    A decomposition of a relation schema R consists of replacing R by two or more non-empty relation schemas such that each one is a subset of R and together they include all attributes of R. Formally,
    ℛ = {R1, …, Rm} is a decomposition if all conditions below hold:

    (0) Ri ≠ ∅, for all i in {1, …, m}

    (1) R1 ∪ … ∪ Rm = R

    (2) Ri ≠ Rj, for different i and j in {1, …, m}

    When m = 2, the decomposition ℛ = {R1, R2} is called binary

    Not every decomposition of R is desirable

    Properties of a decomposition?

    (1) Lossless-join: this is a must

    (2) Dependency-preserving: this is desirable

    Explanation follows

    Example

    Relation Instance:

    A  B  C
    1  2  3
    4  2  5

    Decomposed into:

    A  B        B  C
    1  2        2  3
    4  2        2  5

    To recover information, we join the relations:

    A  B  C
    1  2  3
    4  2  5
    4  2  3
    1  2  5

    Why do we have new tuples?
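
The example above can be replayed with plain Python sets: project the instance on (A, B) and on (B, C), then join the two projections on B. This is only an illustrative sketch; the variable names are assumptions, and tuples stand in for rows.

```python
# Instance over (A, B, C) from the example.
r = {(1, 2, 3), (4, 2, 5)}

# Projections on (A, B) and on (B, C).
r1 = {(a, b) for a, b, c in r}
r2 = {(b, c) for a, b, c in r}

# Natural join of the two projections on the shared attribute B.
joined = {(a, b, c) for a, b in r1 for b2, c in r2 if b == b2}

print(sorted(joined))      # [(1, 2, 3), (1, 2, 5), (4, 2, 3), (4, 2, 5)]
print(sorted(joined - r))  # [(1, 2, 5), (4, 2, 3)]
```

The two extra tuples appear because both rows share B = 2, and in this instance B determines neither A nor C, so the join pairs every A value with every C value.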

    Lossless-Join Decomposition

    R is a relation schema and F is a set of FDs over R.

    A binary decomposition of R into relation schemas R1 and R2 with attribute sets X and Y is said to be a lossless-join decomposition with respect to F if, for every instance r of R that satisfies F, we have π_X(r) ⋈ π_Y(r) = r

    Thm: Let R be a relation schema and F a set of FDs on R.

    A binary decomposition of R into R1 and R2 with attribute sets X and Y is lossless iff X ∩ Y → X or X ∩ Y → Y is in F⁺, i.e., this binary decomposition is lossless if the common attributes of X and Y form a superkey of R1 or R2
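
The theorem gives a test that needs only one closure: compute (X ∩ Y)⁺ under F and check whether it contains all of X or all of Y. A minimal Python sketch follows; the function name `is_lossless` and the single-letter string encoding of attribute sets are assumptions for illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(x, y, fds):
    """Binary lossless-join test: do the common attributes determine X or Y?"""
    common_closure = closure(set(x) & set(y), fds)
    return set(x) <= common_closure or set(y) <= common_closure


# R(A, B, C) split into X = AB and Y = BC.
print(is_lossless("AB", "BC", [("B", "C")]))   # True: B -> C makes B a key of BC
print(is_lossless("AB", "BC", []))             # False: B determines neither side
```

The second call corresponds to a schema with no FDs at all, where an instance such as {(1, 2, 3), (4, 2, 5)} is legal and its projections join back to more tuples than the original.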

    Example: Lossless-join

    Relation Instance:

    A  B  C
    1  2  3
    4  2  3

    F = { B → C }

    Decomposed into:

    A  B        B  C
    1  2        2  3
    4  2

    To recover the original relation r, we join the two relations:

    A  B  C
    1  2  3
    4  2  3

    No new tuples!

    Example: Dependency Preservation

    Relation Instance:

    A  B  C  D
    1  2  5  7
    4  3  6  8

    F = { B → C, B → D, A → D }

    Decomposed into:

    A  B        B  C  D
    1  2        2  5  7
    4  3        3  6  8

    Can we enforce A → D? How?

    Dependency-Preserving Decomposition

    A dependency-preserving decomposition allows us to enforce every FD, on each insertion or modification of a tuple, by examining just one single relation instance

    Let R be a relation schema that is decomposed into two schemas with attribute sets X and Y, and let F be a set of FDs over R. The projection of F on X (denoted by F_X) is the set of FDs in F⁺ that involve only attributes in X

    Recall that an FD U → V in F⁺ is in F_X if all the attributes in U and V are in X; in this case we say this FD is relevant to X

    The decomposition of <R, F> into two schemas with attribute sets
    X and Y is dependency-preserving if (F_X ∪ F_Y)⁺ = F⁺
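
Checking (F_X ∪ F_Y)⁺ = F⁺ directly would require computing the projections, which can be expensive. A common textbook alternative, which is not taken from the slides, tests each FD U → V of F separately: starting from U, repeatedly add whatever can be derived inside a single schema, and accept the FD if V is eventually reached. The sketch below applies it to the example with F = { B → C, B → D, A → D } and the schemas AB and BCD; the function names and the string encoding are assumptions for illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(fd, schemas, fds):
    """True if `fd` follows from the projections of `fds` on `schemas`."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            s = set(schema)
            # Attributes derivable from what we have, staying inside s.
            gained = closure(result & s, fds) & s
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(schemas, fds):
    return all(preserved(fd, schemas, fds) for fd in fds)


F = [("B", "C"), ("B", "D"), ("A", "D")]
schemas = ["AB", "BCD"]

for fd in F:
    print(fd, preserved(fd, schemas, F))
# ('B', 'C') True
# ('B', 'D') True
# ('A', 'D') False
print(is_dependency_preserving(schemas, F))   # False
```

A → D is not preserved because A and D never appear together in one schema and no chain of projected FDs connects them, so enforcing it would require joining the two relations.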

    Normal Forms

    Given a relation schema R, we must be able to determine whether it is good or we need to decompose it into smaller relations, and if so, how?

    To address these issues, we need to study normal forms

    If a relation schema is in one of these normal forms, we know that it is in some good shape in the sense that certain kinds of problems (related to redundancy) cannot arise

    Normal Forms

    The normal forms based on FDs are:

    First normal form (1NF)
    Second normal form (2NF)
    Third normal form (3NF)
    Boyce-Codd normal form (BCNF)

    These normal forms have increasingly restrictive requirements

    [Figure: diagram of 1NF, 2NF, 3NF, and BCNF]