
CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Page 1: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

CS240A: Databases and Knowledge Bases

Temporal Applications and SQL:1999

Carlo Zaniolo

Department of Computer Science

University of California, Los Angeles

February 2003

Page 2: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal Databases: Overview

Many applications
The problem is harder than you think
Support for time in SQL: the good and the bad
A time ontology
Many approaches proposed
TSQL2
The physical level: efficient storage and indexing techniques

Page 3: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

An Introduction to Temporal Databases

Applications abound
A case study using SQL: queries on time-varying data are hard to express in SQL.
Temporal databases provide built-in support for storing and querying time-varying information.

Page 4: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Applications Abound: Examples

Academic: Transcripts record courses taken in previous and current semesters or terms, and grades for previous courses.

Accounting: What bills were sent out and when, and what payments were received and when? Delinquent accounts, cash flow over time. Money management software such as Quicken can show, e.g., account balance over time.

Budgets: Previous and projected budgets, multi-quarter or multi-year budgets.

Page 5: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal DB Applications (cont.)

Data Warehousing: Historical trend analysis for decision support

Financial: Stock market data

Audit: Why were financial decisions made, and with what information available?

GIS: Geographic Information Systems
Land use over time: boundaries of parcels change over time, as parcels get partitioned and merged.
Title searches

Insurance: Which policy was in effect at each point in time, and what time periods did that policy cover?

Page 6: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal DB Applications (cont.)

Medical records: Patient records, drug regimes, lab tests. Tracking the course of a disease

Payroll: Past employees, employee salary history, salaries for future months, records of withholding requested by employees

Capacity planning for roads and utilities: Configuring new routes, ensuring high utilization

Project scheduling: Milestones, task assignments

Reservation systems: Airlines, hotels, trains

Scientific: Timestamping satellite images. Dating archeological finds

Page 7: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal DB Applications: Conclusion

It is difficult to identify applications that do not involve the management of temporal data.

These applications would benefit from built-in temporal support in the DBMS. Main benefits:
More efficient application development
Potential increase in performance

Page 8: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Reviewing the Situation

The importance of temporal applications has motivated much research on temporal DBs, but no satisfactory solution has been found yet: SQL3 does not support temporal queries, and temporal DBs remain an open research problem.

The problem is much more difficult than it appears at first: we have become so familiar with the time domain that we tend to overlook its intrinsic complexity.

Some of the solutions proposed by researchers lack ease of use and amenability to efficient implementation

Page 9: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Case Study

University of Arizona's Office of Appointed Personnel has some information in a database.

Employee(Name, Salary, Title)

Finding an employee's salary is easy.

The OAP wishes to add the date of birth:

Employee(Name, Salary, Title, DateofBirth DATE)

SELECT Salary, DateofBirth
FROM Employee
WHERE Name = 'Bob'

Page 10: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Converting to a Temporal Database

Now the OAP wishes to computerize the employment history.

Adding validity periods to tuples:

Employee (Name, Salary, Title, DateofBirth, Start DATE, Stop DATE)

Page 11: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Converting to a Temporal Database Example

Employee (Name, Salary, Title, DateofBirth, Start DATE, Stop DATE)

Name  Salary  Title             DateofBirth  Start       Stop
Bob   60000   AssistantProvost  1945-04-19   1993-01-01  1993-06-01
Bob   70000   AssistantProvost  1945-04-19   1993-06-01  1993-10-01
Bob   70000   Provost           1945-04-19   1993-10-01  1994-02-01
Bob   70000   Professor         1945-04-19   1994-02-01  1995-01-01

Page 12: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Extracting the Salary

Find the employee's salary at a given time, e.g., the current one:

SELECT Salary
FROM Employee
WHERE Name = 'Bob'
AND Start <= CURRENT_TIMESTAMP
AND CURRENT_TIMESTAMP <= Stop

Instead of CURRENT_TIMESTAMP we could have given any timestamp or date.
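A minimal sketch of the same lookup with an arbitrary date in place of CURRENT_TIMESTAMP. It assumes an in-memory SQLite copy of the slide's Employee table with dates stored as ISO strings; the helper name `salary_at` is hypothetical and only for illustration.

```python
import sqlite3

# Assumption: dates are ISO-8601 strings, which compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Name TEXT, Salary INTEGER, Title TEXT, "
    "DateofBirth TEXT, Start TEXT, Stop TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Bob", 60000, "AssistantProvost", "1945-04-19", "1993-01-01", "1993-06-01"),
        ("Bob", 70000, "AssistantProvost", "1945-04-19", "1993-06-01", "1993-10-01"),
        ("Bob", 70000, "Provost", "1945-04-19", "1993-10-01", "1994-02-01"),
        ("Bob", 70000, "Professor", "1945-04-19", "1994-02-01", "1995-01-01"),
    ],
)


def salary_at(conn, name, day):
    """Return the salaries whose validity period covers `day` (ISO string)."""
    rows = conn.execute(
        "SELECT Salary FROM Employee "
        "WHERE Name = ? AND Start <= ? AND ? <= Stop",
        (name, day, day),
    ).fetchall()
    return [r[0] for r in rows]


print(salary_at(conn, "Bob", "1993-03-15"))
print(salary_at(conn, "Bob", "1993-07-15"))
```

The predicate is the one from the slide; a date that falls exactly on a boundary between two periods matches both rows under it.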

Page 13: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Distributing the Salary History

OAP wants to distribute to all employees their salary history.

Output: For each person, maximal intervals at each salary

Employee could have arbitrarily many title changes between salary changes.

Name  Salary  Start       Stop
Bob   60000   1993-01-01  1993-06-01
Bob   70000   1993-06-01  1995-01-01

Page 14: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Extracting the Salary, cont.

Alternative 1: Give the user a printout of Salary and Title information, and have user determine when his/her salary changed.

Alternative 2: Use SQL as much as possible. Find those intervals that overlap or are adjacent and thus should be merged.

Page 15: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Bob’s Salary History in SQL

CREATE TABLE Temp(Salary, Start, Stop) AS
SELECT Salary, Start, Stop
FROM Employee
WHERE Name = 'Bob';

repeat
  UPDATE Temp AS T1
  SET (T1.Stop) = (SELECT MAX(T2.Stop)
                   FROM Temp AS T2
                   WHERE T1.Salary = T2.Salary
                   AND T1.Start < T2.Start
                   AND T1.Stop >= T2.Start
                   AND T1.Stop < T2.Stop)
  WHERE EXISTS (SELECT *
                FROM Temp AS T2
                WHERE T1.Salary = T2.Salary
                AND T1.Start < T2.Start
                AND T1.Stop >= T2.Start
                AND T1.Stop < T2.Stop)
until no tuples updated;

Page 16: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Example

[Figure: the Temp table initially, after one pass, and after two passes]

Page 17: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Salary History (cont.)

Intervals that are not maximal must be deleted:

DELETE FROM Temp AS T1
WHERE EXISTS (SELECT *
              FROM Temp AS T2
              WHERE T1.Salary = T2.Salary
              AND ((T1.Start > T2.Start AND T1.Stop <= T2.Stop)
                OR (T1.Start >= T2.Start AND T1.Stop < T2.Stop)));

The loop is executed lg N times in the worst case, where N is the number of tuples in a chain of overlapping or adjacent, value-equivalent tuples. Once the loop terminates, the extraneous, non-maximal intervals are deleted.
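To make the repeat/until loop and the final DELETE concrete, here is a rough Python emulation on an in-memory list rather than a table. It is only a sketch: the function name `coalesce_by_passes` is hypothetical, rows are assumed to be (salary, start, stop) tuples with ISO date strings, and each pass reads a snapshot of the previous state, as a set-oriented UPDATE would.

```python
def coalesce_by_passes(rows):
    """Extend Stop values pass by pass, then drop non-maximal periods."""
    temp = [list(r) for r in rows]
    while True:
        snapshot = [tuple(r) for r in temp]  # state before this pass
        updated = 0
        for t1 in temp:
            # Same salary, starts strictly after t1 starts, no later than
            # t1 stops, and stops strictly after t1 stops.
            reach = [
                stop2
                for (sal2, start2, stop2) in snapshot
                if sal2 == t1[0] and t1[1] < start2 <= t1[2] < stop2
            ]
            if reach:
                t1[2] = max(reach)
                updated += 1
        if updated == 0:  # "until no tuples updated"
            break

    # DELETE step: remove any period strictly covered by another one.
    snapshot = [tuple(r) for r in temp]
    maximal = [
        t1
        for t1 in snapshot
        if not any(
            t2[0] == t1[0]
            and ((t1[1] > t2[1] and t1[2] <= t2[2])
                 or (t1[1] >= t2[1] and t1[2] < t2[2]))
            for t2 in snapshot
        )
    ]
    return sorted(maximal)


bob = [
    (60000, "1993-01-01", "1993-06-01"),
    (70000, "1993-06-01", "1993-10-01"),
    (70000, "1993-10-01", "1994-02-01"),
    (70000, "1994-02-01", "1995-01-01"),
]
print(coalesce_by_passes(bob))
```

On Bob's rows, two passes change something and a third pass finds nothing left to update.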

Page 18: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Alternative 3: Entirely in SQL

CREATE TABLE Temp(Salary, Start, Stop) AS
SELECT Salary, Start, Stop
FROM Employee
WHERE Name = 'Bob';

SELECT DISTINCT F.Salary, F.Start, L.Stop
FROM Temp AS F, Temp AS L
WHERE F.Start < L.Stop
AND F.Salary = L.Salary
AND NOT EXISTS (SELECT *
                FROM Temp AS M
                WHERE M.Salary = F.Salary
                AND F.Start < M.Start
                AND M.Start < L.Stop
                AND NOT EXISTS (SELECT *
                                FROM Temp AS T1
                                WHERE T1.Salary = F.Salary
                                AND T1.Start < M.Start
                                AND M.Start <= T1.Stop))
AND NOT EXISTS (SELECT *
                FROM Temp AS T2
                WHERE T2.Salary = F.Salary
                AND ((T2.Start < F.Start AND F.Start <= T2.Stop)
                  OR (T2.Start <= L.Stop AND L.Stop < T2.Stop)))
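As a sanity check, the single-statement approach can be run against Bob's rows in SQLite. This is a sketch under a few assumptions: the table is called `SalaryPeriods` (a hypothetical stand-in for Temp), dates are ISO strings, and the last condition uses `T2.Start <= L.Stop` so that a period that merely meets a later one is not reported as maximal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalaryPeriods (Salary INTEGER, Start TEXT, Stop TEXT)")
conn.executemany(
    "INSERT INTO SalaryPeriods VALUES (?, ?, ?)",
    [
        (60000, "1993-01-01", "1993-06-01"),
        (70000, "1993-06-01", "1993-10-01"),
        (70000, "1993-10-01", "1994-02-01"),
        (70000, "1994-02-01", "1995-01-01"),
    ],
)

# F supplies the first period of a chain, L the last one.
COALESCE_QUERY = """
SELECT DISTINCT F.Salary, F.Start, L.Stop
FROM SalaryPeriods AS F, SalaryPeriods AS L
WHERE F.Start < L.Stop
  AND F.Salary = L.Salary
  -- no gap anywhere between F.Start and L.Stop
  AND NOT EXISTS (SELECT *
                  FROM SalaryPeriods AS M
                  WHERE M.Salary = F.Salary
                    AND F.Start < M.Start
                    AND M.Start < L.Stop
                    AND NOT EXISTS (SELECT *
                                    FROM SalaryPeriods AS T1
                                    WHERE T1.Salary = F.Salary
                                      AND T1.Start < M.Start
                                      AND M.Start <= T1.Stop))
  -- the candidate period cannot be extended on either side
  AND NOT EXISTS (SELECT *
                  FROM SalaryPeriods AS T2
                  WHERE T2.Salary = F.Salary
                    AND ((T2.Start < F.Start AND F.Start <= T2.Stop)
                      OR (T2.Start <= L.Stop AND L.Stop < T2.Stop)))
ORDER BY F.Salary, F.Start
"""

history = conn.execute(COALESCE_QUERY).fetchall()
print(history)
```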

Page 19: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Alternative 4: Using More Procedural Code

Use SQL only to open a cursor on the table.
Maintain a linked list of intervals, each with a salary.

Initialize this linked list to empty;

DECLARE emp_cursor CURSOR FOR
  SELECT Salary, Start, Stop
  FROM Employee;
OPEN emp_cursor;
loop:
  FETCH emp_cursor INTO :salary, :start, :stop;
  if no data returned then go to finished;
  find position in linked list to insert this information;
  go to loop;
finished:
CLOSE emp_cursor;
iterate through linked list, printing out dates and salaries
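The cursor-based outline above might look roughly as follows in Python with the standard `sqlite3` module. This is a sketch, not the slide's embedded-SQL program: a sorted Python list stands in for the linked list, the `Name` filter and the function name `salary_history` are assumptions added for illustration, and the table is a cut-down Employee with ISO date strings.

```python
import bisect
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Salary INTEGER, Start TEXT, Stop TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        ("Bob", 70000, "1993-10-01", "1994-02-01"),
        ("Bob", 60000, "1993-01-01", "1993-06-01"),
        ("Bob", 70000, "1994-02-01", "1995-01-01"),
        ("Bob", 70000, "1993-06-01", "1993-10-01"),
    ],
)


def salary_history(conn, name):
    """Fetch rows one at a time and merge them in application code."""
    periods = []  # kept sorted by (salary, start, stop)
    cur = conn.cursor()
    cur.execute("SELECT Salary, Start, Stop FROM Employee WHERE Name = ?", (name,))
    while True:
        row = cur.fetchone()
        if row is None:  # no data returned: finished
            break
        bisect.insort(periods, tuple(row))  # find position and insert
    cur.close()

    merged = []
    for salary, start, stop in periods:
        if merged and merged[-1][0] == salary and start <= merged[-1][2]:
            last = merged[-1]  # overlapping or adjacent: extend the last period
            merged[-1] = (salary, last[1], max(last[2], stop))
        else:
            merged.append((salary, start, stop))
    return merged


for salary, start, stop in salary_history(conn, "Bob"):
    print(salary, start, stop)
```

Because the list is kept sorted on insertion, the rows may arrive from the cursor in any order.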

Page 20: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

A More Drastic Alternative: Reorganize the Schema

Separate Salary, Title, and DateofBirth information:

Employee1 (Name, Salary, Start DATE, Stop DATE)

Employee2 (Name, Title, Start DATE, Stop DATE)

Getting the salary information is now easy:

SELECT Salary, Start, Stop
FROM Employee1
WHERE Name = 'Bob'

But what if we want a table with both salary and title?

Page 21: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal Joins

Employee1:

Name  Salary  Start       Stop
Bob   60000   1993-01-01  1993-06-01
Bob   70000   1993-06-01  1995-01-01

Employee2:

Name  Title             Start       Stop
Bob   AssistantProvost  1993-01-01  1993-10-01
Bob   Provost           1993-10-01  1994-02-01
Bob   FullProfessor     1994-02-01  1995-01-01

Their Temporal Join:

Name  Salary  Title             Start       Stop
Bob   60000   AssistantProvost  1993-01-01  1993-06-01
Bob   70000   AssistantProvost  1993-06-01  1993-10-01
Bob   70000   Provost           1993-10-01  1994-02-01
Bob   70000   FullProfessor     1994-02-01  1995-01-01

Page 22: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal Join in SQL

SELECT E1.Name, Salary, Title, E1.Start, E1.Stop
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name
AND E2.Start <= E1.Start AND E1.Stop <= E2.Stop

UNION ALL

SELECT E1.Name, Salary, Title, E1.Start, E2.Stop
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name
AND E1.Start > E2.Start AND E2.Stop < E1.Stop AND E1.Start < E2.Stop

UNION ALL

SELECT E1.Name, Salary, Title, E2.Start, E1.Stop
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name
AND E2.Start > E1.Start AND E1.Stop < E2.Stop AND E2.Start < E1.Stop

UNION ALL

SELECT E1.Name, Salary, Title, E2.Start, E2.Stop
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name
AND E2.Start >= E1.Start AND E2.Stop <= E1.Stop
AND NOT (E1.Start = E2.Start AND E1.Stop = E2.Stop)
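The four branches enumerate the ways two periods can overlap; in each case the result period is the intersection of the two inputs. A small Python sketch of that idea follows, assuming rows are plain tuples with ISO date strings and periods are half-open; the function name `temporal_join` is hypothetical.

```python
def temporal_join(salaries, titles):
    """Pair salary and title rows for the same name over their common period.

    salaries: (name, salary, start, stop) rows
    titles:   (name, title, start, stop) rows
    """
    result = []
    for name1, salary, start1, stop1 in salaries:
        for name2, title, start2, stop2 in titles:
            if name1 != name2:
                continue
            start = max(start1, start2)  # later of the two starts
            stop = min(stop1, stop2)     # earlier of the two stops
            if start < stop:             # non-empty intersection
                result.append((name1, salary, title, start, stop))
    return sorted(result, key=lambda row: (row[0], row[3]))


employee1 = [
    ("Bob", 60000, "1993-01-01", "1993-06-01"),
    ("Bob", 70000, "1993-06-01", "1995-01-01"),
]
employee2 = [
    ("Bob", "AssistantProvost", "1993-01-01", "1993-10-01"),
    ("Bob", "Provost", "1993-10-01", "1994-02-01"),
    ("Bob", "FullProfessor", "1994-02-01", "1995-01-01"),
]
for row in temporal_join(employee1, employee2):
    print(row)
```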

Page 23: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Extracting the Salary History in TSQL2

SELECT Salary

FROM Employee

WHERE Name = 'Bob'

There is no explicit mention of time in the query. By default the system returns the coalesced time history.

Page 24: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Temporal Joins in TSQL2

SELECT E1.Name, Salary, Title

FROM Employee1 AS E1, Employee2 AS E2

WHERE E1.Name = E2.Name

Page 25: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Summary

Coalescing and temporal joins are very difficult to express in SQL.

Solutions proposed:
Special operators for period-based representation
TSQL2: avoid explicit operations on periods (implicit model)
Point-based representation
Timestamp attributes rather than tuples (difficult on tables but not on structured XML documents)
Others, including combinations of the above (more than 40 counted)

Page 26: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Operators on Periods

A new aggregate called coalesce
An overlap operator for joins:

SELECT E1.Name, Salary, Title
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name AND
overlap(E1.Start, E1.Stop, E2.Start, E2.Stop)

Definition of overlap
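One common way to define such a predicate, shown here only as a sketch: two periods overlap when each starts before the other stops. The code assumes half-open periods [start, stop), which matches the sample data where one row's Stop equals the next row's Start; the slides themselves may intend a different convention.

```python
def overlap(start1, stop1, start2, stop2):
    """True when [start1, stop1) and [start2, stop2) share at least one instant."""
    return start1 < stop2 and start2 < stop1


# Salary 60000 and title AssistantProvost share 1993-01-01 .. 1993-06-01.
print(overlap("1993-01-01", "1993-06-01", "1993-01-01", "1993-10-01"))
# Salary 60000 stops exactly when title Provost starts four months later.
print(overlap("1993-01-01", "1993-06-01", "1993-10-01", "1994-02-01"))
```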

Page 27: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Allen Operators on Intervals

Overlap, Contains, Meets, Precedes, Follows. Contains is also applicable to sets of intervals.
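The remaining relations can be sketched in the same spirit. The definitions below assume half-open periods and a hypothetical `Period` tuple; `contains` is written as non-strict containment, whereas Allen's original relation is strict, so treat these as illustrative rather than as the operators' official semantics.

```python
from collections import namedtuple

# Hypothetical period type; stop is exclusive.
Period = namedtuple("Period", ["start", "stop"])


def precedes(a, b):
    """a ends before b starts, leaving a gap between them."""
    return a.stop < b.start


def meets(a, b):
    """a ends exactly where b starts."""
    return a.stop == b.start


def follows(a, b):
    """a starts after b has ended, i.e. b precedes a."""
    return precedes(b, a)


def contains(a, b):
    """Every instant of b also belongs to a (non-strict)."""
    return a.start <= b.start and b.stop <= a.stop


low = Period("1993-01-01", "1993-06-01")    # salary 60000
high = Period("1993-06-01", "1995-01-01")   # salary 70000
prof = Period("1994-02-01", "1995-01-01")   # title FullProfessor
print(meets(low, high), precedes(low, prof), contains(high, prof))
```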

Page 28: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Point-Based Model

Employee1 (Name, Sal, Day)

Bob  60000  1993-01-01
…
Bob  60000  1993-05-31
Bob  70000  1993-06-01
…
Bob  70000  1994-12-31

Internally we might still use the period-based representation for point-based and TSQL2:

Name  Salary  Start       Stop
Bob   60000   1993-01-01  1993-06-01
Bob   70000   1993-06-01  1995-01-01
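The relationship between the two representations can be sketched as a pair of conversions. This assumes a granularity of one day and half-open periods, so that the last day at 60000 is 1993-05-31; the function names `to_points` and `to_periods` are hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def to_points(name, salary, start, stop):
    """Expand the period [start, stop) into one (name, salary, day) row per day."""
    rows = []
    day = start
    while day < stop:
        rows.append((name, salary, day))
        day += ONE_DAY
    return rows


def to_periods(points):
    """Rebuild maximal (name, salary, start, stop) periods from point rows."""
    periods = []
    for name, salary, day in sorted(set(points)):
        last = periods[-1] if periods else None
        if last and last[:2] == (name, salary) and last[3] == day:
            periods[-1] = (name, salary, last[2], day + ONE_DAY)  # extend run
        else:
            periods.append((name, salary, day, day + ONE_DAY))    # start new run
    return periods


points = (
    to_points("Bob", 60000, date(1993, 1, 1), date(1993, 6, 1))
    + to_points("Bob", 70000, date(1993, 6, 1), date(1995, 1, 1))
)
print(len(points), points[0], points[-1])
print(to_periods(points))
```

Converting back yields coalesced periods automatically, since consecutive days with the same value collapse into one run.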

Page 29: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Queries in Point Based

No coalescing needed in the query, e.g., project out salary:

SELECT E1.Name, E1.Day
FROM Employee1 AS E1

Temporal joins are simple:

SELECT E1.Name, Sal, Title
FROM Employee1 AS E1, Employee2 AS E2
WHERE E1.Name = E2.Name AND E1.Day = E2.Day
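A quick way to see the point-based join at work is to load a few days around a salary change into SQLite and run the query as an ordinary equijoin. This is a sketch under assumptions: only four days of data are loaded, days are ISO strings, the helper `days` is hypothetical, and `E1.Day` is added to the select list so the result shows which day each row belongs to.

```python
import sqlite3
from datetime import date, timedelta


def days(start, stop):
    """Yield ISO strings for each day in [start, stop)."""
    day = start
    while day < stop:
        yield day.isoformat()
        day += timedelta(days=1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee1 (Name TEXT, Sal INTEGER, Day TEXT)")
conn.execute("CREATE TABLE Employee2 (Name TEXT, Title TEXT, Day TEXT)")
conn.executemany(
    "INSERT INTO Employee1 VALUES ('Bob', ?, ?)",
    [(60000, d) for d in days(date(1993, 5, 30), date(1993, 6, 1))]
    + [(70000, d) for d in days(date(1993, 6, 1), date(1993, 6, 3))],
)
conn.executemany(
    "INSERT INTO Employee2 VALUES ('Bob', ?, ?)",
    [("AssistantProvost", d) for d in days(date(1993, 5, 30), date(1993, 6, 3))],
)

joined = conn.execute(
    "SELECT E1.Name, Sal, Title, E1.Day "
    "FROM Employee1 AS E1, Employee2 AS E2 "
    "WHERE E1.Name = E2.Name AND E1.Day = E2.Day "
    "ORDER BY E1.Day"
).fetchall()
for row in joined:
    print(row)
```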

Page 30: CS240A: Databases and Knowledge Bases Temporal Applications and SQL:1999

Conclusions

Several alternatives exist, in terms of the data model and the SQL extensions to be used.

Internal representation often must be divorced from the external one, adding to the alternatives and complexity.

New temporal clustering and indexing schemes should be used for maximum performance.

