Date post: | 16-Jan-2016 |
Upload: | adrian-riley |
Transaction Support
Chapter 9
Transaction
Transaction: An action or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database.
A transaction is a logical unit of work on the database.
It may involve any number of operations on the database.
The execution of an application program can be seen as a series of transactions with non-database processing done in between.
Transaction states
A transaction transforms a database from one consistent state into another. During the transaction inconsistent states are possible.
Outcomes:
Committed: the database reaches a new consistent state.
Aborted: the database must be restored to the consistent state it was in before the transaction started.
An aborted transaction is rolled back or undone (message independence).
A committed transaction cannot be aborted.
Transactions cannot be nested.
The DBMS has no way of finding the transaction boundaries, but most DMLs have appropriate keywords available: Begin Transaction, Commit, Rollback.
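As an illustration of explicit transaction boundaries, here is a minimal sketch using Python's built-in sqlite3 module. The `account` table, the `balance` helper and the `transfer` function are hypothetical names chosen for this example; they do not come from the slides.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transactions,
# so the BEGIN / COMMIT / ROLLBACK statements below mark the boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('x', 100), ('z', 25)")

def balance(acc):
    """Current balance of one (hypothetical) account."""
    return conn.execute(
        "SELECT balance FROM account WHERE id = ?", (acc,)).fetchone()[0]

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; True if committed."""
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                 (amount, src))
    if balance(src) < 0:
        conn.execute("ROLLBACK")   # aborted: back to the previous consistent state
        return False
    conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                 (amount, dst))
    conn.execute("COMMIT")         # committed: new consistent state
    return True

ok = transfer("x", "z", 10)        # commits
failed = transfer("x", "z", 1000)  # rolls back, balances stay as they were
```

Between BEGIN and COMMIT the database passes through an inconsistent state (money withdrawn but not yet deposited); the rollback branch shows how an aborted transaction leaves no trace.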
Properties of a transaction
Atomicity: the "all or nothing" property. A transaction is performed in its entirety or not performed at all.
Consistency: a transaction must transform a database from one consistent state into another consistent state.
Independence (also called isolation): transactions execute independently of one another. In other words, the partial effects of incomplete transactions should not be visible to other transactions.
Durability: the effects of committed transactions should not be lost because of subsequent failure.
Database Architecture
Figure: DBMS components: transaction manager, scheduler, buffer manager, recovery manager, access methods, file manager, system buffers, database and data dictionary.
Transaction Manager: coordinates transactions on behalf of the applications.
Scheduler: responsible for implementing a strategy for concurrency control.
Recovery Manager: ensures that the database remains in a consistent state.
Buffer Manager: responsible for the transfer of data between disk storage and main memory.
Concurrency Control
Concurrency control: The process of managing simultaneous operations on the database without having them interfere with one another.
Many users work simultaneously.
Database records are first transferred to central memory, where the modifications are performed, and then rewritten to the disk.
Transactions can be interleaved due to program interrupts.
This leads to concurrency problems: the lost update problem; the uncommitted dependency problem; the inconsistent analysis problem.
Lost Update Problem
Successfully completed update is overwritten by another user.
T1 withdraws €10 from an account with balance x, initially €100.
T2 deposits €100 into the same account. Serially, the final balance would be €190.
Time | Transaction T1 | Transaction T2 | value x
t1 | | begin transaction |
t2 | begin transaction | read R1(x) | x=100
t3 | read R1(x) | |
t4 | | x=x+100 |
t5 | x=x-10 | |
t6 | | write R(x) | x=200
t7 | write R(x) | | x=90
t8 | | commit |
t9 | commit | |
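The interleaving above can be replayed with a small simulation. This is only a sketch: the `run` function and the way a schedule is encoded as (transaction, action) pairs are assumptions made for the example.

```python
def run(schedule):
    """Execute (transaction, action) steps against a one-item database."""
    db = {"x": 100}
    local = {}                       # each transaction's private copy of x
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["x"]
        elif action == "write":
            db["x"] = local[txn]
        else:                        # an arithmetic step such as +100 or -10
            local[txn] += action
    return db["x"]

# Interleaved as in the table: both transactions read x=100 before either writes.
interleaved = run([("T2", "read"), ("T1", "read"), ("T2", 100), ("T1", -10),
                   ("T2", "write"), ("T1", "write")])

# Serial execution: T1 runs to completion, then T2.
serial = run([("T1", "read"), ("T1", -10), ("T1", "write"),
              ("T2", "read"), ("T2", 100), ("T2", "write")])
```

In the interleaved run T1's write replaces the value written by T2, so the deposit of €100 is lost; the serial run keeps both updates.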
Uncommitted Dependency Problem
Occurs when one transaction can see intermediate results of another transaction before it has committed.
T4 updates x to €200 but then aborts, so x should be back at its original value of €100.
T3 has read new value of x (€200) and uses value as basis of €10 reduction, giving a new balance of €190, instead of €90.
Time | Transaction T3 | Transaction T4 | value x
t1 | | begin transaction |
t2 | | read R1(x) | x=100
t3 | | x = x + 100 |
t4 | | write R1(x) | x=200
t5 | begin transaction | |
t6 | read R1(x) | |
t7 | | rollback | x=100
t8 | x = x - 10 | | x=190
t9 | write R1(x) | |
t10 | commit | | should be 90
Inconsistent Analysis Problem
Occurs when transaction reads several values but second transaction updates some of them during execution of first.
Sometimes referred to as a nonrepeatable (or fuzzy) read; a dirty read is the uncommitted dependency problem above.
T6 is totaling balances of account x (€100), account y (€50), and account z (€25).
Meantime, T5 has transferred €10 from x to z, so T6 now has wrong result (€10 too high).
Time | Transaction T5 | Transaction T6 | x | y | z | sum
t1 | | begin transaction | 100 | 50 | 25 |
t2 | begin transaction | sum=0 | 100 | 50 | 25 | 0
t3 | read x | read x | 100 | 50 | 25 | 0
t4 | x = x - 10 | sum = sum + x | 100 | 50 | 25 | 100
t5 | write x | read y | 90 | 50 | 25 | 100
t6 | read z | sum = sum + y | 90 | 50 | 25 | 150
t7 | z = z + 10 | | 90 | 50 | 25 | 150
t8 | write z | | 90 | 50 | 35 | 150
t9 | commit | read z | 90 | 50 | 35 | 150
t10 | | sum = sum + z | 90 | 50 | 35 | 185
t11 | | commit | 90 | 50 | 35 | 185
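The same schedule can be traced step by step in a few lines. This is a sketch with hypothetical variable names (`total`, `t5_x`); it only mirrors the order of reads and writes in the table.

```python
db = {"x": 100, "y": 50, "z": 25}
correct = sum(db.values())        # the true total, unchanged by a transfer

total = 0                         # T6's running sum
t5_x = db["x"]                    # t3: T5 reads x
total += db["x"]                  # t3-t4: T6 reads x (still 100) and adds it
db["x"] = t5_x - 10               # t4-t5: T5 writes x = 90
total += db["y"]                  # t5-t6: T6 reads y and adds it
db["z"] = db["z"] + 10            # t6-t8: T5 reads z, adds 10 and writes z = 35
total += db["z"]                  # t9-t10: T6 reads the already updated z

after_transfer = sum(db.values()) # total of the balances once T5 has committed
```

T6 saw x before the transfer and z after it, so the €10 is counted twice.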
Serializability
Objective of a concurrency control protocol is to schedule transactions in such a way as to avoid any interference.
Could run transactions serially, but this limits degree of concurrency or parallelism in system.
Serializability identifies those executions of transactions guaranteed to ensure consistency.
Schedule: A transaction consists of a sequence of reads and writes to the database. The sequence of reads and writes by a set of concurrent transactions taken together is the schedule.
Serial schedule: A schedule where the operations of each transaction are executed consecutively without any interleaved operation from other transactions.
Nonserial schedule: A schedule where the operations from a set of concurrent transactions are interleaved.
Objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.
In other words, want to find nonserial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.
Serializable Schedule
If a set of transactions executes concurrently, we say that the schedule is correct (serializable) if it produces the same result as some serial execution.
The ordering of reads and writes is important in serializability:
if two transactions only read a data item, they do not conflict and order is not important;
if two transactions either read or write completely separate data items, they do not conflict and order is not important;
if one transaction writes a data item and another either reads or writes the same data item, the order of execution is important.
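The three ordering rules above reduce to one test on a pair of operations. A minimal sketch, where the `Op` tuple and the `conflicts` function are names invented for this example:

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "r" for read, "w" for write
    item: str   # data item, e.g. "x"

def conflicts(a: Op, b: Op) -> bool:
    """True if the relative order of a and b can affect the result."""
    return (a.txn != b.txn            # operations of different transactions
            and a.item == b.item      # on the same data item
            and "w" in (a.kind, b.kind))  # at least one of them is a write

both_read = conflicts(Op("T1", "r", "x"), Op("T2", "r", "x"))
separate_items = conflicts(Op("T1", "w", "x"), Op("T2", "w", "y"))
write_then_read = conflicts(Op("T1", "w", "x"), Op("T2", "r", "x"))
```

Only the last pair is order-sensitive; the first two can be swapped in a schedule without changing its outcome.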
Recoverability
Serializability identifies schedules that maintain database consistency, assuming no transaction fails.
Could also examine recoverability of transactions within schedule.
If transaction fails, atomicity requires effects of transaction to be undone.
Durability states that once transaction commits, its changes cannot be undone (without running another, compensating, transaction).
Recoverable Schedule
A schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.
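The definition can be checked mechanically on a schedule. The sketch below assumes a schedule encoded as (transaction, action, item) triples with actions "r", "w" and "c" (commit); this encoding and the function name `is_recoverable` are assumptions for illustration, and aborts are deliberately not modelled.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, action, item); action is "r", "w" or "c"."""
    last_writer = {}      # item -> transaction that wrote it most recently
    reads_from = {}       # Tj -> set of Ti whose writes Tj has read
    committed = set()
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "c":
            # every Ti that txn read from must already have committed
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

# T3 reads x written by T4 and commits first: not recoverable.
bad = [("T4", "w", "x"), ("T3", "r", "x"), ("T3", "w", "x"),
       ("T3", "c", None), ("T4", "c", None)]
# Same operations, but T4 commits before T3 does: recoverable.
good = [("T4", "w", "x"), ("T3", "r", "x"), ("T3", "w", "x"),
        ("T4", "c", None), ("T3", "c", None)]
```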
Concurrency Control Techniques
Two basic concurrency control techniques: locking and timestamping.
Both are conservative approaches: delay transactions in case they conflict with other transactions.
Optimistic methods assume conflict is rare and only check for conflicts at commit.
Locking: A procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates.
Data items of various sizes, ranging from the entire database down to a field, may be locked.
The size of the item determines the granularity of the lock.
Implementation can be done by setting a bit in the data item, by keeping a list of locked parts, or by other techniques.
Lock Types
Read lock: If a transaction has a read lock on a data item, it can read the item but not update it.
Write lock: If a transaction has a write lock on the data item, it can both read and update the item.
Using Locks
Any transaction that needs to access the data item must first lock the item, requesting a read lock for read-only access or a write lock for both read and write access.
If the item is not already locked by another transaction, the lock will be granted.
If the item is currently locked, the DBMS determines whether the request is compatible with the existing lock:
a read request on an item with a read lock will be granted;
for other requests the transaction must wait until the existing lock is released.
A transaction continues to hold a lock until it explicitly releases it, either during execution or when it terminates.
It is only when the write lock has been released that the effects of the write operation will be made visible to other transactions.
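The granting rules above fit in a small lock table. This is a sketch under simplifying assumptions: the class name `LockTable`, the "R"/"W" mode codes and the True/False return convention are invented here, and waiting transactions are not queued, only told to wait.

```python
class LockTable:
    """Tracks read ("R") and write ("W") locks per data item."""

    def __init__(self):
        self.locks = {}   # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})   # not locked by anyone
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:                   # locked only by the requester
            if mode == "W":
                self.locks[item] = ("W", holders)
            return True
        if mode == "R" and held_mode == "R":   # read locks are compatible
            holders.add(txn)
            return True
        return False                           # every other request waits

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

table = LockTable()
first_read = table.request("T1", "x", "R")     # granted
second_read = table.request("T2", "x", "R")    # granted, shared with T1
blocked_write = table.request("T3", "x", "W")  # must wait for both readers
```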
Two-phase Locking
2PL: A transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction.
With this protocol every transaction has two phases:
growing phase: where no locks can be released;
shrinking phase: where no locks can be acquired.
Some systems allow upgrades (in the growing phase) or downgrades (in the shrinking phase) of a lock.
With 2PL, serializability of schedules can be guaranteed.
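Whether a single transaction obeys the protocol can be checked by scanning its operations once. A minimal sketch, assuming operations are written as strings such as "lock x" and "unlock x" (a notation chosen for this example):

```python
def follows_2pl(actions):
    """actions: one transaction's operations in order, e.g. ["lock x", "unlock x"]."""
    shrinking = False
    for action in actions:
        if action.startswith("unlock"):
            shrinking = True              # the first unlock ends the growing phase
        elif action.startswith("lock") and shrinking:
            return False                  # a lock acquired after an unlock
    return True

two_phase = follows_2pl(["lock x", "read x", "lock y", "write y",
                         "unlock x", "unlock y"])
not_two_phase = follows_2pl(["lock x", "write x", "unlock x",
                             "lock y", "write y", "unlock y"])
```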
2PL: Lost update solution
Time | Transaction T1 | Transaction T2 | value x
t1 | | begin transaction |
t2 | begin transaction | read R1(x) (WL) | x=100
t3 | request WL | |
t4 | wait | x=x+100 |
t5 | wait | write R(x) | x=200
t6 | wait | unlock(x) (UL) |
t7 | read R1(x) (WL) | commit |
t8 | x=x-10 | |
t9 | write R(x) | | x=190
t10 | unlock(x) (UL) | |
t11 | commit | |
2PL: Uncommitted Dependency
Time | Transaction T1 | Transaction T2 | value x
t1 | | begin transaction |
t2 | | read R1(x) (WL) | x=100
t3 | | x = x + 100 |
t4 | | write R1(x) | x=200
t5 | begin transaction | |
t6 | request WL | | x=200
t7 | wait | rollback (UL) | x=100
t8 | read R1(x) (WL) | | x=100
t9 | x = x - 10 | | x=90
t10 | write(x) | | x=90
t11 | commit | |
Deadlock
Deadlock: An impasse that may result when two or more transactions are each waiting for locks held by the other to be released.
Time | Transaction 1 | Transaction 2
t1 | begin transaction |
t2 | write-lock (x) | begin transaction
t3 | read (x) | write-lock (y)
t4 | x = x - 10 | read (y)
t5 | write (x) | y = y + 100
t6 | write-lock (y) | write (y)
t7 | wait | write-lock (x)
t8 | wait | wait
t9 | wait | wait
Only one way to break deadlock: abort one or more of the transactions.
Deadlock should be transparent to user, so DBMS should restart transaction(s).
Two general techniques for handling deadlock: deadlock prevention; deadlock detection and recovery.
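One common way to detect a deadlock, not spelled out in these slides, is to record which transaction waits for which and look for a cycle. The sketch below assumes that approach; the `waits_for` mapping and the function name `has_deadlock` are hypothetical.

```python
def has_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions it waits for.
    Returns True if some transactions wait for each other in a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in visiting:
            return True                   # back at a transaction on the current path
        if txn in done:
            return False                  # already explored, no cycle through it
        visiting.add(txn)
        cyclic = any(visit(other) for other in waits_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return cyclic

    return any(visit(txn) for txn in waits_for)

# Each transaction holds one lock and waits for the lock held by the other.
deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, but T2 waits for nobody.
just_waiting = has_deadlock({"T1": {"T2"}, "T2": set()})
```

Once a cycle is found, recovery means aborting at least one transaction on it so the others can proceed.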
Deadlock Prevention
DBMS looks ahead to see if transaction would cause deadlock and never allows deadlock to occur.
Could order transactions using transaction timestamps:
Wait-Die - only an older transaction can wait for a younger one; otherwise the transaction is aborted (dies) and restarted with the same timestamp.
Wound-Wait - only a younger transaction can wait for an older one. If an older transaction requests a lock held by a younger one, the younger one is aborted (wounded).
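Both schemes come down to comparing two timestamps when a lock request is refused. A minimal sketch, in which the function names and the returned strings are assumptions for illustration; a smaller timestamp means an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester is aborted (dies)."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Older requester aborts (wounds) the younger holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"

# Transaction with timestamp 1 is older than the one with timestamp 2.
old_asks_young = (wait_die(1, 2), wound_wait(1, 2))
young_asks_old = (wait_die(2, 1), wound_wait(2, 1))
```

In both schemes waiting is allowed in one age direction only, so a cycle of waiting transactions cannot form.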
Timestamping
Timestamp: A unique identifier created by the DBMS that indicates the relative starting time of a transaction.
Data items can get a read-timestamp or a write-timestamp.
Timestamping: A concurrency control protocol in which the fundamental goal is to order transactions globally in such a way that older transactions (with smaller timestamps) get priority in the event of conflict.
If a transaction attempts to read or write a data item, it can only proceed if the last update on that data item was carried out by an older transaction; otherwise, the transaction is restarted and given a new timestamp.
Also timestamps for data items:
read-timestamp - timestamp of last transaction to read item.
write-timestamp - timestamp of last transaction to write item.
Timestamping - Read(x)
Consider a transaction T with timestamp ts(T) that issues read(x):
ts(T) < write_timestamp(x)
x has already been updated by a younger (later) transaction.
The transaction must be aborted and restarted with a new timestamp.
Timestamping - Write(x)
Consider a transaction T with timestamp ts(T) that issues write(x):
ts(T) < read_timestamp(x)
x has already been read by a younger transaction.
Roll back the transaction and restart it using a later timestamp. (The younger transaction need not be aborted: it cannot have read a wrong version, since the write has not yet been done.)
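The timestamp checks above, together with the general rule that an operation proceeds only if the last update was made by an older transaction, can be sketched as follows. The `Item` class and the `read` and `write` functions are hypothetical names; the sketch returns False where the protocol would restart the transaction with a new timestamp.

```python
class Item:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest transaction that read the item
        self.write_ts = 0   # timestamp of the last transaction to write the item

def read(item, ts):
    """Return True if the read proceeds, False if T must restart."""
    if ts < item.write_ts:                # x already updated by a younger transaction
        return False
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    """Return True if the write proceeds, False if T must restart."""
    if ts < item.read_ts or ts < item.write_ts:   # a younger transaction got there first
        return False
    item.write_ts = ts
    return True

x = Item()
results = [
    write(x, 5),   # proceeds: nothing younger has touched x
    read(x, 3),    # restart: x was written by the younger transaction 5
    read(x, 7),    # proceeds: the last writer (5) is older
    write(x, 6),   # restart: x was already read by the younger transaction 7
    write(x, 8),   # proceeds
]
```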
Granularity of Data Items
Size of data items chosen as unit of protection by concurrency control protocol.
Ranging from coarse to fine: the entire database; a file; a page (or area or database space); a record; a field value of a record.
Tradeoff:
the coarser the item, the lower the degree of concurrency;
the finer the item, the more locking information needs to be stored.
Best item size depends on the types of transactions.
Levels of Locking
File Recovery
Two complementary techniques:
Backup: periodic copy of the database to an archive file (full backup or incremental backup).
Recovery (more than 10% of the code of a DBMS): after a failure, bring the database back to a reliable state.
Redundancy is needed.
Time factor is crucial.
Restore Actions (simplified)
In case of failure:
DB is unreadable: last archive plus logfile, REDO, new DB.
DB is unreliable: logfile, UNDO, new DB.
UNDO
UNDO of a started transaction can be needed in case of: insert, delete, update.
A direct access device is needed.
Active log, continuous transition, archive log.
Before-images are kept only on the active log.
Database Recovery
Process of restoring database to a correct state in the event of a failure.
Need for Recovery Control
Two types of storage: volatile (main memory) and nonvolatile.
Volatile storage does not survive system crashes.
Stable storage represents information that has been replicated in several nonvolatile storage media with independent failure modes.
Recovery Techniques
If database has been damaged:
Need to restore last backup copy of database and reapply updates of committed transactions using log file.
If database is only inconsistent:
Need to undo changes that caused inconsistency.
May also need to redo some transactions to ensure updates reach secondary storage.
Do not need backup, but can restore database using before- and after-images in the log file.
Types of failures
System crashes, resulting in loss of main memory.
Media failures, resulting in loss of parts of secondary storage.
Application software errors.
Natural physical disasters.
Carelessness or unintentional destruction of data or facilities.
Sabotage.
Transactions and Recovery
Transactions represent the basic unit of recovery.
Recovery manager is responsible for atomicity and durability.
If failure occurs between commit and database buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) the transaction's updates.
If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity.
Partial undo - only one transaction has to be undone.
Global undo - all transactions have to be undone.
System interruption
I/O buffers lost, DB intact.
UNDO of current transactions is needed, but those transactions are difficult to identify.
Concept of "checkpoint": periodically the following steps must be performed:
Step 1: Log buffers emptied to the logfile.
Step 2: Checkpoint record written to the logfile.
Step 3: Database buffers emptied to the database.
Step 4: Address of checkpoint record written to the "restart file".
Log File
Contains information about all updates to database: transaction records; checkpoint records.
Often used for other purposes (for example, auditing).
Transaction records contain:
Transaction identifier.
Type of log record (transaction start, insert, update, delete, abort, commit).
Identifier of data item affected by database action (insert, delete, and update operations).
Before-image of data item.
After-image of data item.
Log management information.
Sample Log File
Log file may be duplexed or triplexed.
Log file sometimes split into two separate random-access files.
Potential bottleneck; critical in determining overall performance.
Checkpointing
Checkpoint: Point of synchronization between database and log file. All buffers are force-written to secondary storage.
Checkpoint record is created containing identifiers of all active transactions.
When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash.
Algorithm to Define Transaction States
The algorithm starts with the creation of two lists:
UNDO list contains all transactions in the checkpoint record;
REDO list is empty.
Forward reading of the logfile starting at tc:
on encountering "BEGIN", add the transaction to the UNDO list;
on encountering "COMMIT", move the transaction from UNDO to REDO.
At the end of the logfile both lists are correct.
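The list-building algorithm translates almost directly into code. In this sketch the function name `classify` and the (record type, transaction) encoding of log records are assumptions, and the sample log is hypothetical: it is not taken from the slides, only chosen so that it has five transactions.

```python
def classify(active_at_checkpoint, log_after_checkpoint):
    """Return (undo, redo) lists from the log records written after the checkpoint."""
    undo = list(active_at_checkpoint)   # UNDO starts with the checkpoint's transactions
    redo = []                           # REDO starts empty
    for record, txn in log_after_checkpoint:
        if record == "BEGIN":
            undo.append(txn)            # started after the checkpoint
        elif record == "COMMIT":
            undo.remove(txn)            # committed before the failure
            redo.append(txn)
    return undo, redo

# Hypothetical log: T1 finished before the checkpoint, T2 and T3 were active at it.
undo, redo = classify(
    ["T2", "T3"],
    [("COMMIT", "T2"), ("BEGIN", "T4"), ("BEGIN", "T5"), ("COMMIT", "T4")],
)
```

Transactions that committed before the checkpoint never appear in either list, because their updates were already forced to the database at the checkpoint.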
Figure: timeline from checkpoint time tc to failure time tf, showing transactions T1 to T5.
Checkpointing
In previous example, with checkpoint at time tc, changes made by T2 and T3 have been written to secondary storage.
Thus: only redo T2 and T4; undo transactions T3 and T5.
Steps in recovery process
1. Define states of the transactions with specific algorithm.
2. Forward processing of the logfile to REDO transactions.
3. Backward processing of the logfile to UNDO transactions.
REDO and UNDO must be idempotent operations.
This technique can also be used in a deadlock situation.
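Why REDO and UNDO must be idempotent can be seen with before- and after-images: installing an image twice gives the same result as installing it once, so recovery can safely be repeated if the system fails again during recovery. A minimal sketch, assuming log records stored as dictionaries with hypothetical keys `txn`, `item`, `before` and `after`:

```python
def redo(db, record):
    db[record["item"]] = record["after"]     # install the after-image

def undo(db, record):
    db[record["item"]] = record["before"]    # restore the before-image

def recover(db, log, redo_list, undo_list):
    for record in log:                       # forward pass: REDO
        if record["txn"] in redo_list:
            redo(db, record)
    for record in reversed(log):             # backward pass: UNDO
        if record["txn"] in undo_list:
            undo(db, record)
    return db

log = [
    {"txn": "T2", "item": "x", "before": 100, "after": 200},  # T2 committed
    {"txn": "T3", "item": "y", "before": 50, "after": 40},    # T3 active at the crash
]

# State on disk after the crash: T2's update to x was lost, T3's update to y was not.
on_disk = {"x": 100, "y": 40}
once = recover(dict(on_disk), log, ["T2"], ["T3"])
twice = recover(dict(once), log, ["T2"], ["T3"])   # running recovery again changes nothing
```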