Date post: | 03-Jan-2016 |
Upload: | ross-fletcher |
Real-time Query Processing - Agenda
• Description of RTDBs
• Aspects of real-time query processing
• Real-time query approaches
• Query processing in conventional and real-time DBMS’s
Description of real-time databases
• In order to be a real-time database, it needs to have:
– Timing constraints (deadlines), and
– Temporal data (data validity related to time)
• But you already knew that… what may not have been so clear are some of the misconceptions about real-time databases…
Description of real-time databases
• Here are a few misconceptions about real-time databases that have persisted
– Must be special purpose and highly specialized
– Must be fast
– Must be in-memory
• There is not complete consensus, especially in practice, about what a real-time database even is
Aspects of real-time query processing
• Timing constraints (deadlines)
– Soft: A query missing a deadline results in less value than if the deadline were met
– Firm: A query missing a deadline means the results are useless
– Hard: A query missing a deadline means system failure
• Topology
– Distributed: A query references data on more than one machine
– Single node: A query uses data on a single machine, which is possibly at least partially on disk
– Embedded: The real-time database and queries are entirely in-memory and make use of a dedicated processor
• Periodicity and timings
– Dynamic: Queries and their timings are not known in advance, and cannot be estimated or bounded
– Extended Static: Queries and their timings are not known in advance but can either be bounded or estimated with a degree of certainty
– Static: Queries and their timings are known in advance and are deterministic
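The soft, firm, and hard deadline definitions above can be read as a value function over a query's completion time. The following is a minimal sketch under assumptions: the names `DeadlineKind`, `HardDeadlineMissed`, and `result_value` are hypothetical, and the linear decay for soft deadlines is only one possible reading of "less value".

```python
from enum import Enum


class DeadlineKind(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


class HardDeadlineMissed(Exception):
    """Signals that a hard-deadline query finished late (treated as system failure)."""


def result_value(kind, deadline, finish_time, full_value=1.0, decay_per_unit=0.1):
    """Return the value of a query result that completes at finish_time."""
    if finish_time <= deadline:
        return full_value  # deadline met: full value for every deadline kind
    lateness = finish_time - deadline
    if kind is DeadlineKind.SOFT:
        # Assumption: value decays linearly with lateness and never goes below zero.
        return max(0.0, full_value - decay_per_unit * lateness)
    if kind is DeadlineKind.FIRM:
        return 0.0  # a late result is useless
    # Hard deadline missed: modeled here as an exception.
    raise HardDeadlineMissed(f"finished {lateness} time units late")
```

For example, `result_value(DeadlineKind.FIRM, 10, 12)` yields `0.0`, while the same call with `DeadlineKind.HARD` raises `HardDeadlineMissed`.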
Real-time query approaches
• Priority Memory Management (PMM) algorithm
– Prioritizes queries by EDF
– Queues and admits queries
– Allocates memory to each query based on min/max
– Adjusts memory allocations based on resource utilization
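The first three PMM steps listed above can be illustrated roughly as follows. This is a simplified sketch and not the published PMM algorithm: the `Query` fields, the `admit_and_allocate` function, and the policy of handing spare memory out in earliest-deadline-first (EDF) order are all assumptions, and the utilization-driven adjustment step is not modeled.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Query:
    deadline: float                      # only field used for ordering (EDF)
    name: str = field(compare=False)
    min_mem: int = field(compare=False)  # least memory the query can run with
    max_mem: int = field(compare=False)  # most memory the query can use


def admit_and_allocate(queries, total_mem):
    """Admit queries in EDF order; return ({name: memory}, [names left waiting])."""
    heap = list(queries)
    heapq.heapify(heap)  # earliest deadline at the top
    admitted, waiting = [], []
    free = total_mem
    while heap:
        q = heapq.heappop(heap)
        if q.min_mem <= free:
            admitted.append(q)   # reserve the minimum for each admitted query
            free -= q.min_mem
        else:
            waiting.append(q.name)  # assumption: later queries may still be admitted
    alloc = {q.name: q.min_mem for q in admitted}
    for q in admitted:  # admitted is already in EDF order
        extra = min(q.max_mem - q.min_mem, free)
        alloc[q.name] += extra   # top up toward the maximum with spare memory
        free -= extra
    return alloc, waiting
```

With 100 units of memory and three queries whose minimums are 20, 50, and 40 (in deadline order), the first two are admitted and the third waits.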
Real-time query approaches
• B-tree indexing for real-time queries
– Starting point is a conventional B-tree with queries prioritized by EDF
Real-time query approaches
• B-tree indexing for real-time queries (cont’d)
– As each query searches a node, it locks that node and all its child nodes for either read or write.
– If there is a conflict, the query with the lowest priority restarts at the top of the tree
– Rebalancing operations have the lowest priority
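The conflict rule above can be sketched as a small decision function. This is an illustration under assumptions, not the slides' implementation: `IndexOp` and `resolve_conflict` are hypothetical names, two reads are assumed compatible, EDF priority is taken to mean "earlier deadline wins", and rebalancing is modeled with an infinite deadline so it always has the lowest priority.

```python
import math
from collections import namedtuple

# mode is "read" or "write"; deadline is the EDF priority (smaller = higher priority)
IndexOp = namedtuple("IndexOp", ["name", "deadline", "mode"])


def resolve_conflict(holder, requester):
    """Return the name of the operation that must restart at the root, or None."""
    if holder.mode == "read" and requester.mode == "read":
        return None  # assumption: shared read locks do not conflict
    # The operation with the later deadline has lower priority and restarts.
    # Assumption: on a tie, the requester restarts.
    loser = holder if holder.deadline > requester.deadline else requester
    return loser.name


# Rebalancing is given an infinite deadline so any query outranks it.
rebalance = IndexOp("rebalance", math.inf, "write")
```

For instance, `resolve_conflict(rebalance, IndexOp("q3", 100, "read"))` returns `"rebalance"`.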
Real-time query approaches
• R-tree – similar to algorithms used for non-real-time distributed queries
(Figure: 4 physical locations mapped by the R-tree)
Real-time query approaches
• Freshness for base and derived real-time data
– Prioritizes queries by EDF
– Calculates Access Update Ratio (AUR) for each data item (access frequency / update frequency)
– Uses (and adjusts) an AUR threshold for when data is updated
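The AUR calculation above is a simple ratio, and a threshold check can be sketched alongside it. The function names, the default threshold of 1.0, and the direction of the rule (items at or above the threshold are refreshed immediately, the rest on demand) are assumptions for illustration; the slides do not specify them.

```python
def access_update_ratio(access_count, update_count):
    """AUR = access frequency / update frequency for one data item."""
    if update_count == 0:
        return float("inf")  # never updated: treat as maximally access-dominated
    return access_count / update_count


def update_policy(aur, threshold=1.0):
    """Assumed rule: frequently accessed items are kept fresh eagerly."""
    return "immediate" if aur >= threshold else "on_demand"
```

An item accessed 30 times and updated 10 times has an AUR of 3.0 and would be updated immediately under this assumed rule; raising or lowering `threshold` shifts how many items fall on each side.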
Query processing in conventional and real-time DBMS’s
• A conventional DBMS is at cross purposes with a real-time DBMS in several ways
– Multiple platforms or versions: abstraction from hardware/OS
– Measured by TPC-D performance rules
– ACID properties are inviolate, which may be incompatible with real-time constraints
Query processing in conventional and real-time DBMS’s
• But conventional DBMS’s are sometimes used for data with temporal aspects, e.g. decision support systems (DSS)
select city.name, sum(package.weight)
from package /*real-time*/, city /*stored*/
where package.weight > 100
  and package.service = 'priority'
  and package.zip = city.zip
group by city.name
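To see what the query above returns, it can be run against a small in-memory SQLite database. Everything here beyond the query text is assumed for illustration: the table schemas, the sample rows, and the added `order by` (included only to make the output order deterministic). SQLite is a conventional engine, so this shows the query's result and not any real-time behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table city (zip text primary key, name text);
    create table package (id integer primary key, weight real, service text, zip text);
""")
# Hypothetical sample data.
conn.executemany("insert into city values (?, ?)",
                 [("10001", "New York"), ("60601", "Chicago")])
conn.executemany("insert into package (weight, service, zip) values (?, ?, ?)",
                 [(150, "priority", "10001"),
                  (120, "priority", "10001"),
                  (90, "priority", "60601"),    # filtered out: weight <= 100
                  (200, "ground", "60601"),     # filtered out: not priority
                  (300, "priority", "60601")])

rows = conn.execute("""
    select city.name, sum(package.weight)
    from package, city
    where package.weight > 100
      and package.service = 'priority'
      and package.zip = city.zip
    group by city.name
    order by city.name
""").fetchall()
print(rows)
```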
Query processing in conventional and real-time DBMS’s
Conventional DBMS's Real-time DBMS's
Timing constraints (deadlines): Soft, Firm, Hard
Topology: Distributed, Single node, Embedded
Periodicity and timings: Dynamic, Extended Static, Static
Real-time query processing
• Algorithms have been developed for many aspects of real-time query processing
• Conventional DBMS’s don’t handle real-time queries except by ‘brute force’ and possibly some of the algorithms we’ve seen
• Challenges remain for real-time query processing
– Security
– Fault tolerance
– Scalability
– Generalization