Database concurrency control & recovery (1)

Concurrency Control & Recovery

Database Consistency

Multi-User Environment (Data Sharing)

Transaction interference, system crash

Hardware failure, software failure

Concurrency Control: safeguard against transaction interference

Database Recovery: restore the database to an earlier consistent state

The Concept of Transaction

Action(s) by a user or program that read or write the database

Logical unit of work against a database

Either done entirely or not at all

Consists of SQL queries and/or programming instructions

DBMS/Database Architecture

The concept of Transaction

States of Transaction

The Concept of Transaction: ACID Properties

Atomicity: all or nothing

Consistency: transforms the database from one consistent state to the next

Isolation: transactions execute independently of each other

Durability: the effects of a committed transaction are permanent

Concurrency Control

• Database: shared data
• Multiple transactions, concurrent access, potential interference
• Multiple reads: no problem
• Multiple reads, at least one write: potential interference
• Concurrency control: managing concurrent access to avoid interference

Transactional Interference: Potential Problems

Lost Update


Uncommitted Dependency (Dirty Read)


Inconsistent Analysis
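
The lost update problem, the first in the list above, can be illustrated with a small simulation. This is only a sketch: the starting balance, the two amounts, and the helper `run_schedule` are hypothetical and chosen for illustration.

```python
def run_schedule(schedule, balance=100):
    """Run a schedule of (transaction, operation) steps on one shared balance.

    T1 adds 100 and T2 subtracts 10. Each transaction reads the shared
    value into a private copy, changes the copy, and later writes it back.
    """
    db = {"bal": balance}
    local = {}                          # private working copies
    delta = {"T1": +100, "T2": -10}     # hypothetical updates
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["bal"] + delta[txn]
        else:                           # "write"
            db["bal"] = local[txn]
    return db["bal"]


serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T2", "write"), ("T1", "write")]

serial_result = run_schedule(serial)            # both updates survive
lost_update_result = run_schedule(interleaved)  # T2's update is overwritten by T1
```

In the interleaved schedule both transactions read the original balance, so the later write by T1 silently discards the update made by T2.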

Serializability

Schedule: A sequence of the operations by a set of concurrent transactions that preserves the order of the operations in each of the individual transactions.

Serial Schedule: A schedule where the operations of each transaction are executed consecutively without any interleaved operations from other transactions.

Nonserial Schedule: A schedule where the operations from a set of concurrent transactions are interleaved.

Serializability

• Two transactions, same data item, only reads: no problem
• Two transactions, different data items, read/write: no problem
• Two transactions, same data item, at least one of them writes: potential problem, the order matters

Serializable Schedule: A nonserial schedule whose interleaved operations produce the same results as some serial execution of the same transactions.

Concurrency Control: Serializability

(a) and (b) are two equivalent serializable schedules

(c) is the serial schedule

Concurrency Control: Recoverability

Serializability
• Serializable schedules maintain consistency
• Assumption: no failure
• Potential problem: irrecoverable schedule

Irrecoverable Schedule

Concurrency Control: Recoverability

Recoverable Schedule: If an item a was updated by transaction Ti and later read by Tj, then Ti must commit before Tj commits.
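
The recoverability condition can be checked mechanically. The sketch below assumes a hypothetical schedule encoding, a list of `(transaction, operation, item)` tuples with operations `'r'`, `'w'` and `'c'` (commit), and it does not model aborts.

```python
def is_recoverable(schedule):
    """Return True if no transaction commits before a transaction it read from."""
    last_writer = {}      # item -> transaction that wrote it most recently
    reads_from = set()    # (reader, writer) pairs
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
        elif op == "c":
            # Every transaction that txn read from must already be committed.
            for reader, writer in reads_from:
                if reader == txn and writer not in committed:
                    return False
            committed.add(txn)
    return True


irrecoverable = [("T1", "w", "a"), ("T2", "r", "a"), ("T2", "c", None), ("T1", "c", None)]
recoverable = [("T1", "w", "a"), ("T2", "r", "a"), ("T1", "c", None), ("T2", "c", None)]
```

In the first schedule T2 commits after reading a value that T1 has not yet committed, so a later failure of T1 could not be undone cleanly.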

Concurrency Control Techniques
• Locking
• Timestamping
• Optimistic techniques

Concurrency Control Techniques: Locking

A data item locked by one transaction may be denied to other transactions until the lock is released.

Locking
• Shared lock (read): the holder can only read the item
• Exclusive lock (write): the holder can both read and write the item
• Some systems support upgrading and downgrading of locks

Two-Phase Locking (2PL) Protocol

All locking operations in a transaction must precede its first unlock operation.

Growing Phase: A lock (read or write) is acquired as soon as a data item is accessed. All needed locks are acquired; no lock is released.

Shrinking Phase: No new lock can be acquired after the first unlock. Locks are only released.

Upgrade allowed only in growing phase

Downgrade allowed only in shrinking phase
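
The two-phase rule is easy to state as a check over the lock and unlock operations issued by one transaction. This is a minimal sketch; the operation encoding and the name `obeys_2pl` are assumptions for illustration, and lock upgrades and downgrades are not modelled.

```python
def obeys_2pl(ops):
    """ops: list of ('lock' | 'unlock', item) issued by a single transaction."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True            # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False                # a lock after an unlock violates 2PL
    return True


two_phase = [("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]
not_two_phase = [("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]
```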

Two-Phase Locking (2PL) Protocol

Preventing the Lost Update Problem

Preventing the Uncommitted Dependency Problem

Preventing the Inconsistent Analysis Problem

Creating the Cascading Rollback Problem

Two-Phase Locking (2PL) Protocol: Possible Solutions to Cascading Rollback

Rigorous 2PL: hold all locks until the end of the transaction

Strict 2PL: hold only exclusive locks until the end of the transaction

Deadlock: A Locking Problem

Deadlock occurs when two (or more) transactions each wait for the other to release a lock it holds.

Problem: Deadlock

Deadlock: A Locking Problem

Solution: roll back certain transaction(s) and restart them. The user should be unaware of the deadlock and its resolution.

Three approaches: timeouts, deadlock prevention, deadlock detection and recovery

Deadlock: Solutions

Timeouts
• A transaction waits for a lock for at most a system-defined period
• If the request times out, the transaction is aborted and restarted automatically
• The transaction may not necessarily be in deadlock
• Simple protocol, used by many commercial DBMSs

Deadlock: Solutions

Deadlock Prevention
• Two schemes proposed by Rosenkrantz et al. (1978)
• A timestamp is assigned to each transaction
• Wait-Die: only an older transaction may wait for a younger one
• If a younger transaction requests a lock held by an older one, the younger is aborted (dies) and restarted with the same timestamp, so it eventually becomes the oldest
• Wound-Wait: only a younger transaction may wait for an older one
• If an older transaction requests a lock held by a younger one, the younger is aborted (wounded)
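
The two schemes differ only in what happens when the requester is older than the lock holder. A minimal sketch, assuming that a smaller timestamp means an older transaction; the function names and return strings are hypothetical.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits, a younger requester dies."""
    if requester_ts < holder_ts:        # requester is older
        return "requester waits"
    return "requester aborts"           # restarted later with the same timestamp


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds the holder, a younger one waits."""
    if requester_ts < holder_ts:        # requester is older
        return "holder aborts"
    return "requester waits"
```

In both schemes it is always the younger transaction that is aborted, which is why a wait-for cycle can never form.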

Conservative 2PL
• Acquire all locks at once, before the transaction starts executing
• Advantage under heavy lock contention: once the locks are granted, the transaction never blocks or waits
• Under low contention: locks are held longer than necessary
• High lock-setting overhead: all locks must be released if even one of them is not granted
• Not practical: requires advance knowledge of the locks needed

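Deadlock: Solutions

Deadlock Detection and Recovery
• An edge Ti → Tj in the wait-for graph (WFG) shows that Ti is dependent on Tj
• That is, Tj holds a resource required by Ti
• Deadlock exists if the WFG contains a cycle, e.g. Ti → Tj → Tk → Ti
• Frequency of deadlock detection
• Interval too large: deadlocks go undetected for a long time
• Interval too small: time is wasted on checking
• A dynamic approach adjusts the interval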

Wait-for-graph
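
Deadlock detection on a wait-for graph amounts to cycle detection in a directed graph. The following is a sketch using depth-first search; the dictionary representation and the name `has_deadlock` are assumptions, not part of the slides.

```python
def has_deadlock(wfg):
    """wfg maps each transaction to the set of transactions it waits for."""
    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on current path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wfg.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:           # back edge: u is on the current path
                return True
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wfg))


cyclic = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}
acyclic = {"T1": {"T2"}, "T2": {"T3"}}
```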

Deadlock: Solutions

Recovery
• Choice of deadlock victim
  • How long the transaction has been running
  • How many data items it has updated
  • How many data items it still has to update
• How far to roll back
• Avoiding starvation
  • Starvation: the same transaction is chosen as the victim repeatedly
  • Use a counter for the number of times a transaction has been rolled back
  • When the counter reaches an upper limit, use a different protocol

Timestamping (Another concurrency control protocol)

Timestamp: A unique identifier that represents the relative starting time/order of a transaction. A system clock or a logical counter is used.

Timestamping:

Older transactions get priority in case of conflict

A read or write by a transaction on a data item is allowed only if the last update to that data item was made by an older transaction

Timestamping (continued)

A transaction T has timestamp ts(T). A data item x has a read timestamp read_timestamp(x) and a write timestamp write_timestamp(x).

Transaction T wants to read x

Allowed only if ts(T) ≥ write_timestamp(x); otherwise an older (earlier) transaction is trying to read a value already updated by a younger (newer) transaction

In that case the older transaction is too late; it is rolled back and restarted with a new timestamp

If the read is allowed, set read_timestamp(x) = max(ts(T), read_timestamp(x))

Timestamping (continued)

Transaction T wants to write x
• Allowed only if ts(T) ≥ read_timestamp(x) and ts(T) ≥ write_timestamp(x)
• If ts(T) < read_timestamp(x), a younger (newer) transaction has already read x and is using it, so the older transaction is too late to update it
• Similarly, if ts(T) < write_timestamp(x), T is trying to update x to an obsolete value
• In both cases, restart T with a later timestamp
• Otherwise the write proceeds and write_timestamp(x) is set to ts(T)
• Timestamping guarantees serializability, but not recoverability

Timestamping (Thomas's Write Rule)

Transaction T wants to write x
• Performed only if ts(T) ≥ read_timestamp(x) and ts(T) ≥ write_timestamp(x)
• If ts(T) < read_timestamp(x), a younger (newer) transaction has already read x and is using it, so the older transaction is too late to update it
• Similarly, if ts(T) < write_timestamp(x), T is trying to update x to an obsolete value
• In the first case, restart T with a later timestamp, as before
• In the second case, simply ignore the write; this is called the ignore obsolete write rule
• Otherwise the write proceeds
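
The read and write rules of timestamp ordering, with Thomas's write rule as an option, can be sketched as follows. The class `Item`, the functions `read` and `write`, and the return strings are hypothetical; the sketch assumes that equal timestamps (a transaction touching its own earlier write) are allowed.

```python
class Item:
    """A data item with its read and write timestamps."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0
        self.value = None


def read(item, ts):
    if ts < item.write_ts:              # a younger transaction already wrote x
        return "restart"
    item.read_ts = max(item.read_ts, ts)
    return "ok"


def write(item, ts, value, thomas=False):
    if ts < item.read_ts:               # a younger transaction already read x
        return "restart"
    if ts < item.write_ts:              # the write is obsolete
        return "ignore" if thomas else "restart"
    item.value = value
    item.write_ts = ts
    return "ok"
```

With `thomas=True` an obsolete write is dropped instead of restarting the transaction, which is the only difference between the two variants.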

Optimistic Techniques
• Basic premise: conflict (interaction between transactions) is rare
• No conflict checking during execution, no delays
• Efficient policy where conflicts are infrequent
• Before commit, check for conflict; roll back if one is found
• Very efficient: no locks, no concurrency checks during execution
• According to the premise, few transactions are rolled back
• Intolerable in environments where conflicts are frequent; choose another concurrency control technique

Optimistic Techniques

Three phases in optimistic techniques

Read Phase
• Extends from the start of the transaction until just before commit
• All values are read and stored in local variables
• Changes are made to the local variables

Validation Phase
• Checks that serializability is not violated and the database remains consistent
• Restarts the transaction if a conflict occurred

Write Phase
• For an update transaction, the locally stored changes are applied to the database

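Optimistic Techniques

Assign timestamps start(T), validation(T), and finish(T) to each transaction T

Validation is passed only if, for every transaction E with an earlier timestamp, one of the following holds:

• E finished before T started, i.e. finish(E) < start(T)
• If finish(E) > start(T), then both:
  • the data items written by E are different from those read by T
  • E completed its writes before T entered validation (writes are done serially), i.e. start(T) < finish(E) < validation(T)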
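
The validation test can be written as a small predicate. This is a sketch under assumptions: the `Txn` record, its field names, and the use of integer timestamps are all hypothetical, and `finish` is left as `None` for a transaction that has not finished yet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Txn:
    start: int
    validation: int
    finish: Optional[int] = None
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validate(t, earlier):
    """Return True if t passes validation against all earlier transactions."""
    for e in earlier:
        # Rule 1: e finished before t started.
        if e.finish is not None and e.finish < t.start:
            continue
        # Rule 2: e wrote nothing that t read, and e finished writing
        # before t entered its validation phase.
        disjoint = not (e.write_set & t.read_set)
        wrote_in_time = e.finish is not None and t.start < e.finish < t.validation
        if disjoint and wrote_in_time:
            continue
        return False
    return True
```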

Granularity of Data Items

• The size of the data item used as the unit of protection
• Granularity has major performance implications for the concurrency control algorithm
• There is a tradeoff between coarse and fine granularity
• E.g. the appropriate granularity differs for updating a single record vs. 80% of the records of a table
• Coarse granularity: low degree of concurrency, less locking information to maintain
• Fine granularity: high degree of concurrency, more locking information to maintain
• A better approach: mixed granularity, with upgrading and downgrading of locks

Database Recovery

Restore the database to a correct state in case of failure

A DBMS is resilient if it is fault tolerant

Storage Media Types
• Volatile: primary memory; random access, fast, but expensive
• Non-volatile online: secondary storage on disk; random access
• Non-volatile offline: magnetic tape and optical disk; suited only to backup, slow, magnetic tape is sequential access

Types of Failures
• System crash
• Media failure
• Application software errors
• Natural physical disasters
• Carelessness
• Sabotage

All failures affect either main memory or the disk copy of the database

Transactions and Recovery

The unit of recovery is the transaction

The recovery manager guarantees atomicity and durability

The database buffer complicates the issue

Durability is guaranteed only once the database buffer has been flushed

Until then, updates of committed transactions may not have reached the database

The buffer is flushed either when it is full or when it is force-written

Transactions and Recovery (continued)

In the event of failure
o Active (incomplete) transactions are undone, i.e. rolled back
o Committed transactions are redone, called rollforward
o Partial undo: a single transaction is rolled back
o Global undo: all active transactions are rolled back

Transactions and Recovery (continued)

Example transactions rollback/rollforward

Buffer Management
• Pages are brought in as soon as they are requested
• When the buffer is full, old pages are replaced with new ones
• Page replacement policies: FIFO, LRU
• Two variables are associated with each page: pinCount and dirty
• On each request pinCount is incremented; the page is then said to be pinned
• pinCount is decremented by the system when the request is done
• Pinned pages cannot be replaced
• On replacement, the page is written to disk if dirty is set
• For a new page, dirty is set to 0
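
The pinCount and dirty bookkeeping can be sketched with a tiny LRU buffer pool. Everything here is an illustrative assumption: the class `BufferPool`, its method names, and the use of a plain dictionary as a stand-in for the disk.

```python
from collections import OrderedDict


class BufferPool:
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                # dict: page_id -> contents (stand-in for disk)
        self.frames = OrderedDict()     # page_id -> [contents, pin_count, dirty], LRU first

    def pin(self, page_id):
        """Bring the page in if needed and increment its pin count."""
        if page_id not in self.frames:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = [self.disk[page_id], 0, False]
        self.frames.move_to_end(page_id)        # mark as most recently used
        self.frames[page_id][1] += 1
        return self.frames[page_id][0]

    def unpin(self, page_id, new_contents=None):
        """Decrement the pin count; set dirty if the page was modified."""
        frame = self.frames[page_id]
        if new_contents is not None:
            frame[0], frame[2] = new_contents, True
        frame[1] -= 1

    def _evict(self):
        """Replace the least recently used unpinned page, writing it out if dirty."""
        victim = next((pid for pid, f in self.frames.items() if f[1] == 0), None)
        if victim is None:
            raise RuntimeError("all pages are pinned")
        contents, _pins, dirty = self.frames.pop(victim)
        if dirty:
            self.disk[victim] = contents
```

A page modified and then unpinned is written back only when it is chosen for replacement, which matches the behaviour described in the list above.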

Buffer Management (continued)

When writing pages, two policies are used

o Steal policy: pages modified by a transaction may be stolen, i.e. written to disk, before the transaction commits. The alternative is no-steal.

o Force policy: all pages updated by a transaction are written to disk immediately when it commits. The alternative is no-force.

No-steal and force
• Simple to implement
• With no-steal there is no need to roll back; with force there is no need to roll forward

Steal and no-force
• Steal obviates the need for a large buffer space
• No-force gives a later transaction the opportunity to update the same page before it is written

Recovery Facilities
o Backup mechanism
o Logging (also called journaling)
o Checkpointing
o Recovery manager

Backup
• Offline storage of data and log files
• Used if the database is destroyed or damaged
• Taken at regular intervals
• Either complete or incremental backup
• An incremental backup holds the changes since the last full or incremental backup

Log/Journal File

[Figure: sample log file. Callouts: "only for insert and update"; "type of record"; "only for delete and update"; "next record of this transaction"]

Log/Journal File (continued)

Important for both recovery and performance
• The log file is sometimes duplexed or triplexed
• For performance, the log is stored on a separate physical drive
• The log file is backed up offline where the volume of log data is huge
• Minor failures are recovered quickly from the online log
• Major failures are recovered from the offline log

Checkpoint
• The point of synchronization between the data and log files
• Buffers are force-written to disk
• The force-write covers the updates of all committed and active transactions
• A checkpoint record is also written (consisting of the IDs of the active transactions)

During Recovery
o Roll forward transactions that have a commit record after the last checkpoint
o Roll back transactions without a commit record
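
The recovery rule after a checkpoint can be sketched as a single pass over the log. The record format, tuples such as `("start", "T1")`, `("commit", "T1")` and `("checkpoint", [active ids])`, and the name `classify` are assumptions for illustration; abort records are not modelled.

```python
def classify(log):
    """Return (redo_list, undo_list) based on the last checkpoint in the log."""
    checkpoints = [i for i, rec in enumerate(log) if rec[0] == "checkpoint"]
    if checkpoints:
        last = checkpoints[-1]
        active = set(log[last][1])      # transactions active at the checkpoint
        tail = log[last + 1:]
    else:
        active, tail = set(), log
    redo = []
    for kind, txn in tail:
        if kind == "start":
            active.add(txn)
        elif kind == "commit":          # commit record after the checkpoint
            active.discard(txn)
            redo.append(txn)
    return redo, sorted(active)         # still-active transactions are undone


log = [
    ("start", "T1"), ("start", "T2"), ("commit", "T1"),
    ("checkpoint", ["T2"]),
    ("start", "T3"), ("commit", "T2"), ("start", "T4"), ("commit", "T4"),
]
redo_list, undo_list = classify(log)
```

Here T1 committed before the checkpoint and needs no work, T2 and T4 are rolled forward, and T3 is rolled back.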

Recovery Techniques

Two types
• Major recovery, if the database file is damaged
  • Restore the last backup
  • Apply the changes from the log file made after the last backup
  • Assumption: the log file is not damaged (it is kept on separate storage)
• Minor recovery, such as after a system crash
  • Roll forward or roll back certain transactions
  • Use the before- and after-images in the log file
  • Two protocols are used: deferred update and immediate update

Recovery Techniques (continued)

Deferred Update
• Nothing is written to the database until commit, so no undo is needed if the transaction aborts
• Requires redoing committed transactions

Log File Use
1. Write a start record at the start of the transaction
2. On each write, write a log record (without a before-image); do not write anything to the buffer or the database
3. If the transaction commits, write a commit log record, then apply the logged changes to the database buffer/database
4. If the transaction aborts, do nothing; just ignore its log records

During Recovery
• Only rollforward is needed
• Repeated failures during recovery are harmless because the logged write operations are idempotent


Immediate Update
• Every change is recorded immediately, so undo is needed if the transaction aborts
• Still requires redoing committed transactions

Log File Use
1. Write a start record at the start of the transaction
2. On each write, write a log record with both the before- and after-image
3. After writing the log record, apply the change to the buffer
4. The actual change reaches the database when the buffer is next flushed
5. If the transaction commits, write a commit record to the log
6. If the transaction aborts, undo is required, using the before-images in the log
7. The write-ahead log protocol is a must

During Recovery
• Both rollforward, using after-images, and rollback, using before-images
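
Recovery for the protocol above, in which every change is recorded immediately, can be sketched with before- and after-images. The log record layout, the dictionary used as the database, and the name `recover` are hypothetical; the sketch assumes that an uncommitted transaction's write is never overwritten by a committed one, and it does a redo pass followed by an undo pass.

```python
def recover(log, db):
    """Redo committed writes from after-images, undo the rest from before-images."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Rollforward: replay committed writes in log order.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _kind, _txn, item, _before, after = rec
            db[item] = after
    # Rollback: restore before-images of uncommitted writes in reverse order.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _kind, _txn, item, before, _after = rec
            db[item] = before
    return db


log = [
    ("start", "T1"), ("write", "T1", "y", 5, 8), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "x", 10, 20),
]
# State on disk at the crash: T2's uncommitted write to x was flushed,
# while T1's committed write to y was not.
db_after_crash = {"x": 20, "y": 5}
recovered = recover(log, dict(db_after_crash))
```

Running `recover` a second time on its own output leaves the database unchanged, which is the idempotence that makes repeated failures during recovery safe.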


Shadow Paging
o Log-less protocol
o Maintains two page tables for a transaction: the current page table and the shadow page table
o Both are the same at the start
o The shadow page table is never changed; it is used for recovery
o Changes are recorded in the current page table
o After the transaction completes, the current page table becomes the shadow page table

Advantages
o No log, so no log overhead
o Faster recovery: no undo/redo

Disadvantages
o Data fragmentation
o Periodic garbage collection of inaccessible blocks is needed

Date post:	07-Apr-2017
Category:	Education
Upload:	rashid-khan