CS 265 Final Exam Spring 2016 Name: KEY - my.vanderbilt.edu · 2019-01-17 · CS 265 Final Exam...

CS 265 Final Exam Spring 2016 Name: _____________KEY_______________ I will not use a source other than my brain on this exam: _________________________________ (please sign)

1. Among other things, a bank’s database needs to store information about patrons (keyed by PId, with additional attributes Name and Address).
• A patron has some combination of one or more loans from the bank and/or one or more deposit accounts with the bank.
• Each loan and account number is associated with exactly one patron.
• Each loan is identified by a loan-number with a record of the current balance.
• An account is identified by an account-number, with an additional indication of the current balance.


© Douglas H. Fisher, Vanderbilt University. Licensed for personal use. Do not repost without permission

Loan LoanNum PK Balance

Account AcctNum PK Balance

Patron PId PK Name Address

1..*

1..1

1..*

1..1

a) Briefly state which of the constraints above, if any, that this UML violates (unless you really hacked something ugly up to make it work as is):

b) Give a UML that satisfies all constraints stated above (including possibly the one on left).

Loan Balance

Account Balance

Contract IDNum PK

1..1 1..*

Patron SSN PK PName PAddress

Disjoint, Full coverage

Violates: “A patron has some combination of one or more loans from the bank and/or one or more deposit accounts with the bank”, since the UML doesn’t allow a combination of zero of one type and one or more of the other.

The solution of the in-class exercise of February 18 (https://my.vanderbilt.edu/cs265/files/2016/02/InClassPracticeUMLKey.pdf), was to introduce a superclass.

+3 for identifying this problem, or another valid problem

+2 for this solution or another valid solution (simply changing the cardinalities at left won’t work)

ALL questions 5 points unless otherwise noted

2. Moving on to a retailer that sells items to customers on an installment plan. The following constraints should hold.
• A customer is identified by a unique identifier (CId) and has an associated Name and Address.
• Each installment plan is identified by a plan number (which is unique across ALL customers), and has the current balance.
• A complete history of payments is recorded giving the payment date and payment amount for each payment on each plan.
• There are never two payments for the same plan recorded for the same date.



InstallPlan PlanNum PK Balance

Payment PaymentDate PK Amount

1..1

0..*

a) Briefly state which of the constraints above, if any, that this UML violates (unless you hacked something ugly up to make it work as is):

b) Give a UML that satisfies all constraints stated above by making one simple addition to the UML on the left.

Customer CId PK Name Address

1..1

0..*

InstallPlan PlanNum PK Balance

Payment PaymentDate PK Amount

1..1

0..*

Customer SSN PK CName CAddress

1..1

0..* PK

The essential problem is that PaymentDate by itself is an insufficient key for Payment, excluding the possibility of a complete history of payments across all plans (i.e., no record of payments to different plans on the same date could be made).

+2 points

+3 points

Some seemed confused by the 0..* cardinality here. There is nothing in the English spec that prevents a Customer from having multiple plans, and most retailers would allow it (including none at all – I typically pay in one lump sum).

PlanNum is PK and associated 1..1 w Cust

3. Consider the following UML snippet below. Assume that the two classes are translated into two tables following the usual translation rules for subclasses and parents. Assume further that a VIEW is defined that gives all the attributes of Student (undoubtedly there would be many more than I have included here), to include those that are inherited from Individual. Write an INSTEAD OF TRIGGER that implements INSERTs to WholeStudentView by inserting into the relevant base tables.

Student

Individual

partial coverage

CREATE VIEW WholeStudentView (Id, Name, YearEntered) AS SELECT I.Id, I.Name, S.YearEntered FROM Individual I, Student S WHERE I.Id = S.Id;

Id PK Name

YearEntered



CREATE TRIGGER InsertIntoWholeStudentView INSTEAD OF INSERT ON WholeStudentView FOR EACH ROW /* implied by SQLite */
BEGIN
INSERT INTO Individual VALUES (NEW.Id, NEW.Name);
INSERT INTO Student VALUES (NEW.Id, NEW.YearEntered);
END;

Finish the trigger +3 for one of these INSERTs; +5 for both
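The trigger can be tried out with a minimal sketch like the one below, which runs it against an in-memory SQLite database through Python's sqlite3 module. The column types and the sample row are assumptions made for illustration; the exam leaves attribute types unspecified.

```python
import sqlite3

# In-memory database; column types are assumed, since the exam omits them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Individual (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Student (
    Id INTEGER PRIMARY KEY REFERENCES Individual(Id),
    YearEntered INTEGER
);
CREATE VIEW WholeStudentView (Id, Name, YearEntered) AS
    SELECT I.Id, I.Name, S.YearEntered
    FROM Individual I, Student S
    WHERE I.Id = S.Id;
CREATE TRIGGER InsertIntoWholeStudentView
INSTEAD OF INSERT ON WholeStudentView
FOR EACH ROW
BEGIN
    INSERT INTO Individual VALUES (NEW.Id, NEW.Name);
    INSERT INTO Student VALUES (NEW.Id, NEW.YearEntered);
END;
""")

# Inserting into the view fires the trigger, which fills both base tables.
conn.execute("INSERT INTO WholeStudentView VALUES (1, 'Hua', 2014)")

individuals = conn.execute("SELECT * FROM Individual").fetchall()
students = conn.execute("SELECT * FROM Student").fetchall()
view_rows = conn.execute("SELECT * FROM WholeStudentView").fetchall()
print(individuals, students, view_rows)
```

The order of the two INSERTs matters if foreign keys are enforced: the parent row in Individual should exist before the Student row that references it.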

4. Consider the following UML snippet. Debug the definitions of Section and TeachAsst that are given, so that they conform to the constraints of the UML specification, assuming that the other tables have been correctly translated. You may add text and/or strike out text in the current definitions. Be neat and clear. Note that we are not interested in attribute types in this problem.

TeachAsst Instructor

Course

Section 0..*

Cname PK

SectNo PK Term PK

PK

Teaches TAs

0..*

CREATE TABLE Section ( SectNo, Term, Cname, InstID NOT NULL, PRIMARY KEY (SectNo, Term, Cname), FOREIGN KEY (InstID) REFERENCES Instructor, FOREIGN KEY (Cname) REFERENCES Course )

0..*

CREATE TABLE TeachAsst ( SectNo NOT NULL, Term NOT NULL, Cname NOT NULL, StuID, PRIMARY KEY (SectNo, Term, Cname, StuID), FOREIGN KEY (SectNo, Term, Cname) REFERENCES Section

)

StuId PK YearEntered

InstId PK Rank



1..1

1..1

0..1 By declaring NOT NULL, 1..1 is enforced, not 0..1

Including the PK of Instructor is the Preferred way of enforcing 1..1, but without NOT NULL, 0..1 is enforced, not 1..1

Make link from Section to Course and Instructor explicit with FK constraints

YearEntered

1 point for each, Up to MAX of 5 points

The PK given initially allows a TeachAsst (StuID) to be associated

with multiple Sections

No points were taken off if NOT NULLs were added to attributes that were primary key attributes, though NOT NULL is implied already

5. University IT has contracted with an outside company to build a database that will support your university’s administrative applications, to include library applications like checkout and lost-book billing. In contrast, your local university IT group builds the application software for all administrative applications. In the lost-book billing application, for example, the application software (that is built locally) inserts a record into a view called the ChargesView with a lost book fee of $100 for each book (or more realistically, the price of the book) that is more than 30 days overdue. It makes these insertions daily, with the expectation that if a book/person lost fee has been entered once, subsequent attempts to insert the same book/person pair will be rejected because (they believe) the insertion would violate a DB constraint that a person can only be charged a lost-book fee for a given book once. They are operating under assumptions that are faulty.



CREATE VIEW LostBookView (PersonId, BookId) /* application programmers see this header */ AS SELECT CO.PersonId, CO.BookId FROM CheckedOut CO WHERE DiffDate('NOW', CO.DueDate) > 30; /* DiffDate computes the difference between two dates in terms of days */

As noted above, there is a second view called ChargesView that is maintained, with inserts into it occurring daily, as books become (and remain) sufficiently overdue. A monthly bill is sent out that sums the total charges to each patron (like this example, for 'Doug' only; the real application uses GROUP BY to sum up charges for everyone in the table):

SELECT SUM(C.LostBookFee) FROM Charges C WHERE C.PersonId = 'Doug';

Doug calls within a couple of days of the notification, complaining that though he has only one book out (i.e., 'I Robot'), he has been charged almost $3000! Looking more deeply, you find that whenever an insert into ChargesView is made, an INSTEAD OF TRIGGER is inserting a record into a Charges base table, which is just like the view, except that it has an additional AUTOCOUNTER PRIMARY KEY attribute, a value for which is “tacked” on just before insertion into the Charges base table. There are no constraints in this base table on PersonId or BookId or their pairing! Ugh!!!

How would you fix the database implementation (NOT the application software) in the simplest possible way to guard against the overcharging that can and is occurring?

Full credit for any of the following:
• Make PRIMARY KEY (PersonId, BookId), or
• Make UNIQUE (PersonId, BookId), or
• Add a WHEN clause to the INSTEAD OF trigger that checks to see if the (PersonId, BookId) pair is already in Charges

The first two require changes to the table schema, which can be clunky (e.g., see https://www.sqlite.org/lang_altertable.html#otheralter), but may be best. Because of the difficulty of making changes after a DB “goes live”, it's much better to think it through in advance!
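As a rough illustration of the UNIQUE (PersonId, BookId) option, the sketch below rebuilds the scenario in an in-memory SQLite database. The column names ChargeId and LostBookFee, the column types, the trigger name, and the 30-day loop are assumptions for the example; only the view name, the base table name, and the (PersonId, BookId) pairing come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Charges (
    ChargeId INTEGER PRIMARY KEY AUTOINCREMENT,  -- the "tacked on" counter
    PersonId TEXT NOT NULL,
    BookId TEXT NOT NULL,
    LostBookFee REAL NOT NULL,
    UNIQUE (PersonId, BookId)                    -- the fix: one fee per pair
);
CREATE VIEW ChargesView AS
    SELECT PersonId, BookId, LostBookFee FROM Charges;
CREATE TRIGGER InsertIntoChargesView
INSTEAD OF INSERT ON ChargesView
BEGIN
    INSERT INTO Charges (PersonId, BookId, LostBookFee)
    VALUES (NEW.PersonId, NEW.BookId, NEW.LostBookFee);
END;
""")

# The application re-inserts the same overdue book every day for a month.
rejected = 0
for day in range(30):
    try:
        conn.execute("INSERT INTO ChargesView VALUES ('Doug', 'I Robot', 100)")
    except sqlite3.IntegrityError:
        rejected += 1  # the constraint now rejects the repeats, as the app expects

total = conn.execute(
    "SELECT SUM(C.LostBookFee) FROM Charges C WHERE C.PersonId = 'Doug'"
).fetchone()[0]
print(total, rejected)
```

With the constraint in place the daily inserts behave the way the application programmers assumed: the first one succeeds and the rest are rejected.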


6. Circle the three tenets that you think are most relevant to the vignette of question 5, both from the perspective of the DB developers and the Application programmers. Briefly describe why they are most relevant to you. You may address more than three. While the particular scenario may not seem that compelling, even something that simple can be, if people believe a (poorly designed) technology over a complainant. Even if you think the previous example is contrived, analogues are not uncommon.

Below I simply list the number of answers that listed each tenet among the three (or in a few cases, four) most relevant tenets. I don’t give the reasons that students gave, but leave it as an exercise to see the relevance of each. Almost all (but not all) students received 5 points, and a few received a bonus of 0.5 – 1 point for an outstanding answer (5.5 – 6.0 pts total)

1. to accept responsibility in making engineering decisions consistent with the safety, health and welfare of the public, and to disclose promptly factors that might endanger the public or the environment; 20 chose this as among the three (or four) most relevant.
2. to avoid real or perceived conflicts of interest whenever possible, and to disclose them to affected parties when they do exist; 5 chose this as among the three (or four) most relevant.
3. to be honest and realistic in stating claims or estimates based on available data; 11 chose this as among the three (or four) most relevant.
4. to reject bribery in all its forms; 0 chose this as among the three (or four) most relevant.
5. to improve the understanding of technology, its appropriate application, and potential consequences; 19 chose this as among the three (or four) most relevant.
6. to maintain and improve our technical competence and to undertake technological tasks for others only if qualified by training or experience, or after full disclosure of pertinent limitations; 17 chose this as among the three (or four) most relevant.
7. to seek, accept, and offer honest criticism of technical work, to acknowledge and correct errors, and to credit properly the contributions of others; 20 chose this as among the three (or four) most relevant.
8. to treat fairly all persons regardless of such factors as race, religion, gender, disability, age, or national origin; 2 chose this as among the three (or four) most relevant.
9. to avoid injuring others, their property, reputation, or employment by false or malicious action; 10 chose this as among the three (or four) most relevant.
10. to assist colleagues and co-workers in their professional development and support them in following this code of ethics. 16 chose this as among the three (or four) most relevant.



7. Problems (a) and (b) use the same relational schema and the same FDs, but address each separately so that any misconceptions that you may have do not cascade.

a. Consider the relational schema [A B C D E F] with functional dependencies: {A → F, AC → B, D → E, A → C, B → F}. Give all minimal keys for this relational schema.

b. Consider the relational schema [A B C D E F] with Functional Dependencies {A → F, AC → B, D → E, A → C, B → F}. Give a minimal set of FDs equivalent to this set. If the set is already a minimal set, then say so. BE CLEAR!


Can the LHS of any FD be simplified? AC can be simplified because C can be inferred from A, so having both A and C on the LHS is redundant. AC → B can be replaced by A → B. 3 points for {A → B, D → E, A → C, B → F}

Consider FDs in left-to-right order given: {A → F, AC → B, D → E, A → C, B → F}.
Can F be inferred from A without A → F? YES: {A} → {AC} → {ACB} → {ACBF}. So, remove A → F.
Can B be inferred from AC without AC → B (and without A → F)? No – keep AC → B
Can E be inferred from D without D → E (and without A → F)? No – keep D → E
Can C be inferred from A without A → C (and without A → F)? No – keep A → C
Can F be inferred from B without B → F (and without A → F)? No – keep B → F

Minimal set: A → B, D → E, A → C, B → F

‘A’ must be part of any key, because it's not on the RHS of any FD. ‘D’ must be part of any key, because it's not on the RHS of any FD.

{AD} → {ADF} → {ADEF} → {ACDEF} → {ABCDEF}. AD is a minimal key. It is the ONLY minimal key, since any others would have to be supersets
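The key argument above can be checked mechanically with the standard attribute-closure algorithm. The sketch below is one way to do it; the helper names closure and minimal_keys are hypothetical, and the brute-force search over attribute subsets is only sensible for a schema this small.

```python
from itertools import combinations

# FDs from question 7, written as (LHS, RHS) strings of attribute letters.
FDS = [("A", "F"), ("AC", "B"), ("D", "E"), ("A", "C"), ("B", "F")]


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(schema, fds):
    """Try subsets smallest-first; skip supersets of keys already found."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a superset of a key is not minimal
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys


keys = minimal_keys("ABCDEF", FDS)
print(keys)
```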


2 points for AD; 0 for anything else

2.5 points for {AC → B, D → E, A → C, B → F}
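The two reduction steps for part (b), dropping redundant FDs and trimming extraneous LHS attributes, can also be sketched with attribute closures. The helper names closure and equivalent are hypothetical, and the redundancy test here checks each FD against all the others at once rather than in the sequential order used in the key; for this FD set the outcome is the same.

```python
# FDs from question 7, written as (LHS, RHS) strings of attribute letters.
FDS = [("A", "F"), ("AC", "B"), ("D", "E"), ("A", "C"), ("B", "F")]


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def equivalent(f, g):
    """Two FD sets are equivalent if each implies every FD of the other."""
    return (all(set(r) <= closure(l, g) for l, r in f)
            and all(set(r) <= closure(l, f) for l, r in g))


# An FD is redundant if its RHS still follows from its LHS without it.
redundant = [
    (lhs, rhs)
    for i, (lhs, rhs) in enumerate(FDS)
    if set(rhs) <= closure(lhs, FDS[:i] + FDS[i + 1:])
]

# C is extraneous in AC -> B if B is already in the closure of A alone.
c_is_extraneous = "B" in closure("A", FDS)

MINIMAL = [("A", "B"), ("D", "E"), ("A", "C"), ("B", "F")]
print(redundant, c_is_extraneous, equivalent(FDS, MINIMAL))
```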

8. Consider the relation [A B C D E F], with applicable functional dependencies: {A → B, B → C, C → B}. Give a dependency-preserving decomposition into BCNF relations, or show that there is not one. Be very clear about the decomposition that you are presenting as your answer (e.g., circle the decomposition).



[A B C D E F]

[A B D E F] [B C ]

B → C

B → C, C → B

A → B

[A B] [A D E F] A → B

ADEF is the (minimal) key

5 pts for [B C], [A B], and [A D E F]

[A B C D E F]

[A C D E F] [B C ]

C → B

B → C, C → B

A → C

[A C] [A D E F] A → C

No applicable FDs can decompose this further, so it appears this can’t lead to a dependency-preserving decomposition (because A → B is left hanging), but wait … {A → B, B → C, C → B} ⇒ {A → B, B → C, C → B, A → C} ⇒ {B → C, C → B, A → C} is an alternative minimal set

5 pts for [B C], [A C], and [A D E F]



[A B C D E F]

[A C D E F] [A B ]

A → B

A → B

ADEF is the (minimal) key

No further decomposition by FDs B → C or C → B possible, so both tables in BCNF but the decomposition into these two tables is not dependency preserving. 2 points total for [A B] [A C D E F]

9. (Inspired by a Widom practice exercise) Consider a database of researchers and their works (like ResearchGate.com), with relational schema

Researcher (ID, Name, Institution)
Collaborator (ID1, ID2) /* ID1 is a collaborator with ID2. Collaboration is symmetric, so if (abc, wxy) is in the Collaborator table, so is (wxy, abc) */
Follows (ID1, ID2) /* ID1 follows the posts of ID2, where Follows is not symmetric, so if (abc, wxy) is in Follows table, there is no guarantee that (wxy, abc) is also present. */

Consider finding all those researchers for whom all of those they Follow are at different institutions than themselves. Return the names and institutions of all such researchers. One way to write this query is

SELECT R.Name, R.Institution FROM Researcher R WHERE NOT EXISTS (SELECT * FROM Follows F, Researcher R2 WHERE R.ID = F.ID1 AND F.ID2 = R2.ID AND R2.Institution = R.Institution)

Write the query that satisfies the same English specification using the “NOT IN” phrase rather than “NOT EXISTS”:

SELECT R.Name, R.Institution FROM Researcher R WHERE R.ID NOT IN (SELECT R1.ID /* could be F.ID1 here */ FROM Follows F, Researcher R1, Researcher R2 WHERE R1.ID = F.ID1 AND F.ID2 = R2.ID AND R1.Institution = R2.Institution)

OR

SELECT R.Name, R.Institution FROM Researcher R WHERE R.ID NOT IN (SELECT R.ID /* could be F.ID1 here */ FROM Follows F, Researcher R2 WHERE R.ID = F.ID1 AND F.ID2 = R2.ID AND R.Institution = R2.Institution)



IDs of all researchers who follow anyone from the same institution

All researchers who don’t follow anyone from the same institution

0 points if literal replacement of NOT EXISTS by NOT IN

Basically same as above, but for each researcher being considered in turn; will return empty set if R.ID isn’t following anyone at same institution

5 pts -1 point if Name used instead of ID; 2-3 points only if they have * in inner SELECT
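To see that the NOT EXISTS and NOT IN formulations agree, they can be run side by side on a small data set. The sketch below uses an in-memory SQLite database; the four researchers and their Follows rows are made up for illustration, and the third query is an institution-based NOT IN variant of the same specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Researcher (ID TEXT PRIMARY KEY, Name TEXT, Institution TEXT);
CREATE TABLE Follows (ID1 TEXT, ID2 TEXT);
-- Hypothetical data: Ann follows a colleague at her own institution.
INSERT INTO Researcher VALUES
    ('r1', 'Ann', 'Vanderbilt'), ('r2', 'Bob', 'Vanderbilt'),
    ('r3', 'Cy', 'MIT'), ('r4', 'Di', 'Duke');
INSERT INTO Follows VALUES
    ('r1', 'r2'), ('r1', 'r3'), ('r2', 'r3'), ('r3', 'r4');
""")

QUERIES = {
    "not_exists": """
        SELECT R.Name, R.Institution FROM Researcher R
        WHERE NOT EXISTS (SELECT * FROM Follows F, Researcher R2
                          WHERE R.ID = F.ID1 AND F.ID2 = R2.ID
                            AND R2.Institution = R.Institution)""",
    "not_in_ids": """
        SELECT R.Name, R.Institution FROM Researcher R
        WHERE R.ID NOT IN (SELECT R1.ID
                           FROM Follows F, Researcher R1, Researcher R2
                           WHERE R1.ID = F.ID1 AND F.ID2 = R2.ID
                             AND R1.Institution = R2.Institution)""",
    "not_in_institutions": """
        SELECT R.Name, R.Institution FROM Researcher R
        WHERE R.Institution NOT IN (SELECT R2.Institution
                                    FROM Follows F, Researcher R2
                                    WHERE R.ID = F.ID1 AND R2.ID = F.ID2)""",
}

results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in QUERIES.items()}
print(results["not_exists"])
```

Di follows nobody and still appears in all three results, because an empty subquery makes both NOT EXISTS and NOT IN true.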


SELECT R.Name, R.Institution FROM Researcher R WHERE R.Institution NOT IN (SELECT R2.Institution FROM Follows F, Researcher R2 WHERE R.ID = F.ID1 AND R2.ID = F.ID2)



5 pts

Institutions of all researchers who are followed by a given R

All researchers who don’t follow anyone from the same institution

Correct answers also stem from variants on a different perspective, exemplified at left

10. Consider the same database, relational schema, and English query specification as the previous question. Repeating, consider finding all those researchers for whom all of those they Follow are at different institutions than themselves. Return the names and institutions of all such researchers. Explain why or why not the following satisfies the English specification of the previous question (look carefully, right to the end of the query)

SELECT R.Name, R.Institution FROM Researcher R WHERE EXISTS (SELECT * FROM Follows F, Researcher R2 WHERE R.ID = F.ID1 AND F.ID2 = R2.ID AND R2.Institution <> R.Institution)



The query returns researchers who follow at least one person from a different institution, and thus
• will include in its results researchers who also follow researchers at their own institution (we don’t want these researchers)
• will exclude researchers who don’t follow anyone (who we do want in the result), though I’m lenient on this point

4.5 points for some approximation of this

Additional 0.5 points for some approximation of this

11. This question uses the same relational schema as the last problem. What is the average number of Collaborators per Researcher? Importantly, compute this average over only those Researchers who have more than 1 Collaborator. (Your result should be just one number.)



SELECT AVG(Temp.Cnt) FROM (SELECT COUNT(*) AS Cnt FROM Researcher R, Collaborator C WHERE R.ID = C.ID1 /* or C.ID2 because Collaborator is symmetric */ GROUP BY C.ID1 /* or R.ID */ HAVING COUNT(*) > 1) AS Temp

Most will have something very much like this (5 pts) -2 for each main missing element

Don’t need rename as Temp

SELECT AVG(Temp.Cnt) FROM (SELECT COUNT(*) AS Cnt FROM Collaborator C GROUP BY C.ID1 /* or C.ID2 because Collaborator is symmetric */ HAVING COUNT(*) > 1) AS Temp

Some will have something more complicated, like this, but still (5 pts)
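The GROUP BY / HAVING answer can be checked on a tiny symmetric Collaborator table. In the sketch below the integer IDs and the three collaborations are made-up data; the unrestricted average is computed as well, only to show what the HAVING clause changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Collaborator (ID1 INTEGER, ID2 INTEGER);
-- Hypothetical symmetric data: 1-2, 1-3 and 2-4 collaborate.
INSERT INTO Collaborator VALUES
    (1, 2), (2, 1), (1, 3), (3, 1), (2, 4), (4, 2);
""")

# Average over researchers with more than one collaborator only.
restricted_avg = conn.execute("""
    SELECT AVG(Temp.Cnt)
    FROM (SELECT COUNT(*) AS Cnt
          FROM Collaborator C
          GROUP BY C.ID1
          HAVING COUNT(*) > 1) AS Temp
""").fetchone()[0]

# For contrast: the average over every researcher who has a collaborator.
overall_avg = conn.execute("""
    SELECT AVG(Temp.Cnt)
    FROM (SELECT COUNT(*) AS Cnt
          FROM Collaborator C
          GROUP BY C.ID1) AS Temp
""").fetchone()[0]

print(restricted_avg, overall_avg)
```

Researchers 1 and 2 each have two collaborators, while 3 and 4 have one each, so the restricted average is taken over the counts 2 and 2 only.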






Various hybrids and illegitimate shortcuts are also probable. So, something like this might be 3 points SELECT AVG(Temp.Cnt) FROM Researcher R, Collaborator C WHERE R.ID = C.ID1 GROUP BY R.ID /* or C.ID2 because Collaborator is symmetric */ HAVING COUNT(*) > 1) AS Temp






It's possible that you will see something built on this theme (below is 5 points): SELECT TotalCountOfQualifyingResearcherCollabs / CountOfQualifyingResearchers FROM (SELECT COUNT(*) AS CountOfQualifyingResearchers FROM (SELECT DISTINCT C1.ID1 FROM Collaborator C1, Collaborator C2 WHERE C1.ID1 = C2.ID1 AND C1.ID2 <> C2.ID2)), (SELECT SUM(Temp.Cnt) AS TotalCountOfQualifyingResearcherCollabs FROM (SELECT COUNT(*) AS Cnt FROM Collaborator C GROUP BY C.ID1 HAVING COUNT(*) > 1) AS Temp)

All researchers with at least two collaborators (could also use the GROUP BY approach)

All researchers with at least two collaborators



12. Write a trigger that inserts the symmetric pair (B, A) into Collaborator when (A, B) is inserted into Collaborator and (B, A) is not already there.

CREATE TRIGGER ensureCollaboratorSymmetry AFTER INSERT ON Collaborator FOR EACH ROW /* implied by SQLite */ WHEN NOT EXISTS (SELECT * FROM Collaborator C WHERE C.ID1 = new.ID2 AND C.ID2 = new.ID1) BEGIN INSERT INTO Collaborator VALUES (new.ID2, new.ID1); END;



Finish the trigger

2 points for this WHEN clause

3 points for this INSERT

Other answers possible, typically conceptually variants
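A quick way to exercise the symmetry trigger is to run it in an in-memory SQLite database. In the sketch below the TEXT column types and the composite primary key are assumptions added for the example; the primary key makes the role of the WHEN clause visible, since without it the second insert of (abc, wxy) would try to add a duplicate (wxy, abc).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Collaborator (
    ID1 TEXT, ID2 TEXT,
    PRIMARY KEY (ID1, ID2)   -- assumed, so duplicate pairs are rejected
);
CREATE TRIGGER ensureCollaboratorSymmetry
AFTER INSERT ON Collaborator
FOR EACH ROW
WHEN NOT EXISTS (SELECT * FROM Collaborator C
                 WHERE C.ID1 = new.ID2 AND C.ID2 = new.ID1)
BEGIN
    INSERT INTO Collaborator VALUES (new.ID2, new.ID1);
END;
""")


def rows():
    return sorted(conn.execute("SELECT ID1, ID2 FROM Collaborator").fetchall())


# Inserting (abc, wxy) makes the trigger add (wxy, abc).
conn.execute("INSERT INTO Collaborator VALUES ('abc', 'wxy')")
after_first_insert = rows()

# Remove one direction and re-insert it: (wxy, abc) is already there,
# so the WHEN clause is false and the trigger body does not run.
conn.execute("DELETE FROM Collaborator WHERE ID1 = 'abc' AND ID2 = 'wxy'")
conn.execute("INSERT INTO Collaborator VALUES ('abc', 'wxy')")
after_reinsert = rows()
print(after_first_insert, after_reinsert)
```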

13. Consider the following two transactions, T1 and T2, from different clients:

T1: Read(A), Op11(A), Write(A), Read(B), Op12(B), Write(B), Commit

T2: Read(B), Op21(B), Write(B), Read(A), Op22(A), Write(A), Commit

Look at the transactions carefully. A and B are distinct and independent objects, but are also shared across transactions (e.g., the A in T1 is the same A as in T2).

a)  Give any schedule (just showing disk reads and writes), left to right, that clearly results in serializable behavior even without knowing what

the particular Ops are. If you do not believe that there is a serializable schedule, then give the reason.

b) Give any schedule (just showing disk reads and writes) that will not exhibit serializable behavior in the general case. If you do not believe that there is such a non-serializable schedule, then give the reason.



In the in-class exercise (https://my.vanderbilt.edu/cs265/files/2016/03/InClassTransactionsKey.pdf), I wrote “Generally, consider T1 and T2 simplified into two transactions, T(A) and T(B), based on each shared object, A and B. If simplified, but still dependent, transactions follow same serial order, such as T1(A),T2(A) and T1(B),T2(B), then the schedule is serializable”

Inversely, if these simplified transactions follow different orderings, then in general the schedule will not match a serializable behavior. For example, a schedule in which

(a)  T1 operates on A BEFORE T2 operates on A, but in which (b)  T1 operates on B AFTER T2 operates on B,

would NOT be serializable.
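The ordering argument above corresponds closely to the textbook precedence-graph test for conflict serializability, which is a sufficient (not necessary) condition for serializable behavior. The sketch below is that standard test rather than anything taken from the in-class key; the tuple encoding of a schedule and the function names are assumptions.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # 1 = on the current DFS path, 2 = fully explored

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1 or (nxt not in state and visit(nxt)):
                return True
        state[node] = 2
        return False

    return any(node not in state and visit(node) for node in graph)


def conflict_serializable(schedule):
    """schedule: list of (transaction, 'R' or 'W', object) in execution order."""
    edges = set()
    for i, (t1, a1, o1) in enumerate(schedule):
        for t2, a2, o2 in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same object, and at least one of them is a write.
            if t1 != t2 and o1 == o2 and "W" in (a1, a2):
                edges.add((t1, t2))
    return not has_cycle(edges)


# T1 runs completely before T2: T1 precedes T2 on both A and B.
SERIAL = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"),
          ("T2", "R", "B"), ("T2", "W", "B"), ("T2", "R", "A"), ("T2", "W", "A")]

# T1 goes first on A, but T2 goes first on B: the orderings disagree.
INTERLEAVED = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B"),
               ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "A"), ("T2", "W", "A")]

print(conflict_serializable(SERIAL), conflict_serializable(INTERLEAVED))
```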


Simplest schedules (i.e., that are definitional of serializability) will be

T1: Read(A), Write(A), Read(B), Write(B), Commit; then T2: Read(B), Write(B), Read(A), Write(A), Commit

or vice versa, T2: Read(B), Write(B), Read(A), Write(A), Commit; then T1: Read(A), Write(A), Read(B), Write(B), Commit

Other answers possible b) Give any schedule (just showing disk reads and writes) that will not exhibit serializable behavior in the general case. If you do not believe that there is such a non-serializable schedule, then give the reason.



If operators, Opij, are included between the reads and writes, it's OK

Top to down is fine

5 pts

5 pts Either of these


T1: Read(A), Write(A),Read(B), Write(B), Commit T2: Read(B), Write(B), Read(A), Write(A), Commit

5 pts

5 pts

any of these

Other answers possible

COMMIT could happen anywhere after the WRITE(B); similar flexibility in other contexts too


5 pts



14. Circle all options that would correctly enforce the 1..* cardinality constraint that Tab1 participate at least once with a record of Tab2 in an SQL translation of the following UML fragment.

Tab1 Key1 PK

Tab2 Key2 PK

a) CREATE ASSERTION Tab1INTab2 CHECK (NOT EXISTS (SELECT * FROM Tab1 WHERE Tab1.Key1 NOT IN (SELECT Tab2.Key1 FROM Tab2)))

b) CREATE ASSERTION Tab1INTab2 CHECK (NOT EXISTS (SELECT Tab1.Key1 FROM Tab1 EXCEPT SELECT Tab2.Key1 FROM Tab2))

c) CREATE ASSERTION Tab1INTab2 CHECK (EXISTS (SELECT * FROM Tab1 WHERE Tab1.Key1 IN (SELECT Tab2.Key1 FROM Tab2)))

d) CREATE ASSERTION Tab1INTab2 CHECK (NOT EXISTS (SELECT Tab1.Key1 FROM Tab1 INTERSECT SELECT Tab2.Key1 FROM Tab2))

e) CREATE ASSERTION Tab1INTab2 CHECK (EXISTS (SELECT Tab1.Key1 FROM Tab1 IN SELECT Tab2.Key1 FROM Tab2))

f) None of the above

+3 points for one correct option (a, b); +5 points for both correct options

-1 for one incorrectly circled option (from c, d, e); -3 for two incorrectly circled options; -5 for three incorrectly circled options

1..* 1..1

0 if this option circled

0 is minimum score
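SQLite does not support CREATE ASSERTION, but the CHECK conditions of options a) through d) can still be evaluated as plain SELECT expressions to see which ones track the 1..* constraint. In the sketch below, the Tab2 foreign-key column Key1, the integer types, and the sample rows are assumptions for illustration; option e) is left out because it is not syntactically valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tab1 (Key1 INTEGER PRIMARY KEY);
CREATE TABLE Tab2 (Key2 INTEGER PRIMARY KEY,
                   Key1 INTEGER NOT NULL REFERENCES Tab1);
-- Hypothetical data: Tab1 row 2 has no Tab2 partner yet.
INSERT INTO Tab1 VALUES (1), (2);
INSERT INTO Tab2 VALUES (10, 1);
""")

CHECKS = {
    "a": "NOT EXISTS (SELECT * FROM Tab1 WHERE Tab1.Key1 NOT IN "
         "(SELECT Tab2.Key1 FROM Tab2))",
    "b": "NOT EXISTS (SELECT Tab1.Key1 FROM Tab1 "
         "EXCEPT SELECT Tab2.Key1 FROM Tab2)",
    "c": "EXISTS (SELECT * FROM Tab1 WHERE Tab1.Key1 IN "
         "(SELECT Tab2.Key1 FROM Tab2))",
    "d": "NOT EXISTS (SELECT Tab1.Key1 FROM Tab1 "
         "INTERSECT SELECT Tab2.Key1 FROM Tab2)",
}


def evaluate():
    """Evaluate each CHECK condition against the current table contents."""
    return {label: bool(conn.execute("SELECT " + cond).fetchone()[0])
            for label, cond in CHECKS.items()}


violating = evaluate()            # Tab1 row 2 does not participate
conn.execute("INSERT INTO Tab2 VALUES (11, 2)")
satisfying = evaluate()           # every Tab1 row now participates
print(violating, satisfying)
```

Only a) and b) flip from false to true when the constraint becomes satisfied; c) is already true while the constraint is violated, and d) stays false once it holds.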

15. Consider the following XML DTD

<!ELEMENT modules (module*)>
<!ELEMENT module (instructor, institution?, learner*)>
<!ATTLIST module name CDATA #REQUIRED subject CDATA #IMPLIED>
<!ELEMENT instructor (name, email?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT institution (#PCDATA)>
<!ELEMENT learner (#PCDATA)>
<!ATTLIST learner age CDATA #IMPLIED>

Here is a sample that adheres to the DTD (note the … at bottom, so don’t customize your answer to what you see here)

<modules>
<module name="Indexes" subject="DB"> <instructor><name>Doug</name> <email>doug@vandy</email> </instructor> <learner>Hua</learner> <learner age="30">Susan</learner> </module>
<module name="War1812" subject="Hist"> <instructor><name>Frank</name></instructor> <institution>Vanderbilt</institution> </module>
<module name="WildlifePreserves" subject="Eco"> <instructor><name>Mary</name></instructor> <learner age="36">Ravi</learner> </module>
…
</modules>

a) Write a query in XPath or XQuery to return the count of courses that have Doug as the instructor.

b) Write a query in XPath or XQuery to return the average age of all learners (for whom age is given).


doc("…")/modules/avg(module/learner/@age)

Wrong scope: would give a list of averages, one for each module

OK

let $a := doc("…")/modules/avg(module/learner/@age) return $a :)

count(doc("…")/modules/module/instructor[name = "Doug"])

OK

Wrong scope: would give a list of counts, one for each module


3pts

2pts or

Could be a ‘//’ instead of ‘/modules/’ but module/instructor[name = "Doug"] must remain the same

Could be a ‘//’ instead of ‘/modules/’ but avg(module/learner/@age) must remain the same
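For readers without an XQuery processor at hand, the same two results can be approximated with the limited XPath subset in Python's xml.etree.ElementTree. This is a sketch of the idea, not XQuery itself: ElementTree has no count() or avg(), so the aggregation is done in Python, and the sample below contains only the three modules shown in the question.

```python
import xml.etree.ElementTree as ET

# The three sample modules from the question (the elided "…" is left out).
SAMPLE = """
<modules>
  <module name="Indexes" subject="DB">
    <instructor><name>Doug</name><email>doug@vandy</email></instructor>
    <learner>Hua</learner>
    <learner age="30">Susan</learner>
  </module>
  <module name="War1812" subject="Hist">
    <instructor><name>Frank</name></instructor>
    <institution>Vanderbilt</institution>
  </module>
  <module name="WildlifePreserves" subject="Eco">
    <instructor><name>Mary</name></instructor>
    <learner age="36">Ravi</learner>
  </module>
</modules>
"""

root = ET.fromstring(SAMPLE.strip())  # root is the <modules> element

# a) One instructor per module, so counting matching instructors counts modules.
doug_count = len(root.findall("module/instructor[name='Doug']"))

# b) Collect ages across ALL modules first, then average once (global scope).
ages = [int(learner.get("age")) for learner in root.findall("module/learner[@age]")]
average_age = sum(ages) / len(ages)

print(doug_count, average_age)
```

Collecting the ages over the whole document before averaging mirrors the scope point made in the grading notes: averaging inside each module would give one value per module instead of a single number.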

16. Consider the B+ tree below.

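Root: [40 70]
  Internal node [25]: leaf entries 22*, 29*, 37*
  Internal node [47 62]: leaves [41* 45*] [53* 61*] [65*]
  Internal node [85]: leaf entries 73*, 81*, 89*, 94*

Answer on the next page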

Note that this tree does not show data nodes, and you do not need to see the data nodes to answer this question. At each leaf, N* is an index of the form <N, <page id, slot #>>, where N is the value of the search key.

a) How can there be internal node values in the B+ tree for which there are no corresponding leaves?

b) Show the tree that results from inserting a record with search key 46 (do not use redistribution). If you can do so clearly and unambiguously, then you can circle and label sub-trees in this diagram that do not change and use those labels in your answer on the next page.


-2 if 46 or 41 or 45 does not appear at leaf

-3 at least for any tree that isn’t 4 levels. Use discretion on partial credit. More points off if the result is not a B+ tree, to include cases where it's not a valid search tree (e.g., if 55 is in a subtree to the right of 72, then the 55 would never be found, using the rules for lookup!)

Only correct tree


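Root: [47]
  Internal node [40]
    [25] (Subtree A): leaf entries 22*, 29*, 37*
    [45]: leaves [41*] [45* 46*]
  Internal node [70]
    [62]: leaves [53* 61*] [65*]
    [85] (Subtree B): leaf entries 73*, 81*, 89*, 94*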

Answers can reference the two circled sub-trees (A and B)


Intermediate steps

46 belongs in leaf [41* 45*], giving [41* 45* 46*]. 45 will split the leaf and be passed up, since it’s the middle value of [41, 45, 46]. The leaf becomes [41*] and [45* 46*], and the parent becomes [45 47 62].

In turn, 45 will trigger a split of the parent, by the middle value of [45, 47, 62]. 47 is passed up, leaving [45] (over leaves [41*] [45* 46*]) and [62] (over leaves [53* 61*] [65*]).


In turn, 47 will trigger a split of the parent (root of tree), by the middle value of [40, 47, 70], and 47 will be installed at a new root:

Root: [47]
  Internal node [40]
    [25] (Subtree A): leaf entries 22*, 29*, 37*
    [45]: leaves [41*] [45* 46*]
  Internal node [70]
    [62]: leaves [53* 61*] [65*]
    [85] (Subtree B): leaf entries 73*, 81*, 89*, 94*

17. Consider the extendible hash table below. Assume Hash(x) = x.

00 → 64 4 16 12 (local depth 2)
01 and 11 (shared bucket) → 51 15 5 21 (local depth 1)
10 → 6 18 10 50 (local depth 2)

Show the result of inserting the following key: 67 = 2^6 + 2^1 + 2^0

Answer here

-1 for each misplaced key (order within a bin not important)

00 → 64 4 16 12 (local depth 2)
01 → 5 21 (local depth 2)
10 → 6 18 10 50 (local depth 2)
11 → 15 51 67 (local depth 2)
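The bucket split behind this answer can be traced with a small simulation. The sketch below assumes a bucket capacity of four keys (the most shown in any bucket of the figure), a directory indexed by the low-order bits of the key, and hypothetical names throughout; it handles only the case needed here, where the full bucket's local depth is smaller than the global depth, so the directory does not double.

```python
BUCKET_CAPACITY = 4   # assumption: no bucket in the figure holds more than 4 keys
GLOBAL_DEPTH = 2      # four directory entries: 00, 01, 10, 11


def low_bits(key, depth):
    """Hash(x) = x, so the directory index is the key's low-order bits."""
    return key & ((1 << depth) - 1)


# Entries 01 and 11 point at the same bucket, which has local depth 1.
shared = {"depth": 1, "keys": [51, 15, 5, 21]}
directory = {
    0b00: {"depth": 2, "keys": [64, 4, 16, 12]},
    0b01: shared,
    0b10: {"depth": 2, "keys": [6, 18, 10, 50]},
    0b11: shared,
}


def insert(key):
    bucket = directory[low_bits(key, GLOBAL_DEPTH)]
    if len(bucket["keys"]) < BUCKET_CAPACITY:
        bucket["keys"].append(key)
        return
    if bucket["depth"] == GLOBAL_DEPTH:
        raise NotImplementedError("directory doubling is not needed in this exercise")
    # Split the full bucket: one new bucket per distinct (depth + 1)-bit suffix.
    new_depth = bucket["depth"] + 1
    pending = bucket["keys"] + [key]
    fresh = {}
    for slot in [s for s, b in directory.items() if b is bucket]:
        suffix = low_bits(slot, new_depth)
        fresh.setdefault(suffix, {"depth": new_depth, "keys": []})
        directory[slot] = fresh[suffix]
    # Redistribute the old keys plus the new one through the updated directory.
    for k in pending:
        directory[low_bits(k, GLOBAL_DEPTH)]["keys"].append(k)


insert(67)  # 67 = 0b1000011, low-order bits 11
print(directory[0b01], directory[0b11])
```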


18. Consider the relational schema:

Supplier (sid: integer, sname: string, city: string)
Catalog (sid: integer, pid: integer, cost: real)
Part (pid: integer, pname: string, color: string)

Give two left-deep trees, representing two relational algebra queries, each of which finds the sids of Suppliers in the city of Nashville that supply a part named ‘allen-wrench-set’ for less than $25.00. For bonus points, select one of the trees to be one that you imagine would be highly efficient with the right set of indices, and explain your reasoning.

Recall that in a left-deep tree, the right child of every join is a base table (a leaf)


19. What is the one topic that you felt you were best prepared for that I did not include on the final exam sufficiently to your liking. Tell me about it.

20. What is the topic that you felt you were second best prepared for that I did not include on the final exam sufficiently to your liking. Tell me about it.

Collectively, many topics were covered in the answers to these two questions. Most answers conveyed an expectation, sometimes a desire, that more on a topic would have been covered on the final exam. Sometimes, however, a topic was introduced as an example of something that the respondent was well-prepared for. Here is a basic breakdown of the topics and the number of students (across 40 total) giving that topic.
• 5 functional dependencies and normal forms
• 8 UML (cardinality, translation to tables, and UML from English specification)
• 4 on Authorization (though this material was optional this year)
• 6 on relational algebra
• 14 on XML, evenly split between DTDs, and XML and relational/SQL rough equivalences (I lean towards this as something that I would want to emphasize more)
• 17 on various SQL topics, including 2 on SQLite specifically
• 2 on Views
• 5 on SELECT statements
• 2 on Updates/deletes/inserts
• 2 on triggers
• 1 assertions
• 7 on foreign key constraints, notably ON DELETE and UPDATE actions
• 4 on performance implications of indexing
• 5 on implementations of indexing, 4 on extendible hashing (a more challenging problem) 1 of B+ tree indexing
• 4 on transactions, and one of these on 2 phase locking with B+ tree in-class example specifically

Date post:	10-Mar-2020