
Transaction Support

Chapter 9

Transaction

Transaction: An action or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database.

A transaction is a logical unit of work on the database.

It may involve any number of operations on the database.

The execution of an application program can be seen as a series of transactions with non-database processing done in between.

Transaction states

A transaction transforms a database from one consistent state into another. During the transaction inconsistent states are possible.

Outcomes:

Committed: the database reaches a new consistent state.

Aborted: the database must be restored to the consistent state it was in before the transaction started.

An aborted transaction is rolled back or undone (message independence).

A committed transaction cannot be aborted.

Transactions cannot be nested.

The DBMS has no way of finding the transaction boundaries by itself, but most DMLs have appropriate keywords available: Begin Transaction, Commit, Rollback.
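The effect of these keywords can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `account` table and the amounts are hypothetical, and SQLite spells the keywords `BEGIN`, `COMMIT` and `ROLLBACK`.

```python
import sqlite3

# In-memory database; isolation_level=None means we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")  # hypothetical account


def balance(c):
    return c.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# Aborted transaction: the update is undone, the old consistent state returns.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 10 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = balance(conn)

# Committed transaction: the database reaches a new consistent state.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 10 WHERE id = 1")
conn.execute("COMMIT")
after_commit = balance(conn)

print(after_rollback, after_commit)  # 100 90
```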

Properties of a transaction

Atomicity: The "all or nothing" property. A transaction is performed in its entirety or not performed at all.

Consistency: A transaction must transform a database from one consistent state into another consistent state.

Independence (more commonly called isolation): Transactions execute independently of one another. In other words, the partial effects of incomplete transactions should not be visible to other transactions.

Durability: The effects of committed transactions should not be lost because of subsequent failure.

Database Architecture

[Figure: DBMS architecture showing the transaction manager, scheduler, buffer manager, recovery manager, access methods, file manager, system buffers, and the database and data dictionary]

Transaction Manager: Coordinates transactions on behalf of the applications.

Scheduler: Responsible for implementing a strategy for concurrency control.

Recovery Manager: Ensures that the database remains in a consistent state.

Buffer Manager: Responsible for the transfer of data between disk storage and main memory.

Concurrency Control

Concurrency control: The process of managing simultaneous operations on the database without having them interfere with one another.

Many users work simultaneously.

Database records are first transferred to main memory, where the modifications are performed, and then rewritten to disk.

Transactions can be interleaved due to program interrupts.

This leads to concurrency problems:
• Lost update problem
• Uncommitted dependency problem
• Inconsistent analysis problem

Lost Update Problem

A successfully completed update is overwritten by another user.

T1 withdraws €10 from an account with balance x, initially €100.

T2 deposits €100 into the same account. Run serially, the final balance would be €190.

Lost Update Problem

Time  Transaction T1      Transaction T2      value x
t1                        Begin transaction
t2    Begin transaction   Read R1(x)          x=100
t3    Read R1(x)
t4                        x=x+100
t5    x=x-10
t6                        write R(x)          x=200
t7    write R(x)                              x=90
t8                        commit
t9    commit
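The interleaving in this example can be replayed with a few lines of Python. This is a minimal sketch, not a DBMS model: the `run` helper and the tuple encoding of the steps are assumptions made for illustration, and each transaction is taken to compute on its own private copy of x between its read and its write.

```python
def run(schedule, x=100):
    """Replay (transaction, step) pairs against a single shared value x."""
    local = {}  # private working copy of x per transaction
    for txn, step in schedule:
        if step == "read":
            local[txn] = x
        elif step == "write":
            x = local[txn]
        else:  # a function applied to the private copy
            local[txn] = step(local[txn])
    return x


def withdraw(v):
    return v - 10   # T1


def deposit(v):
    return v + 100  # T2


interleaved = [("T2", "read"), ("T1", "read"), ("T2", deposit),
               ("T1", withdraw), ("T2", "write"), ("T1", "write")]
serial = [("T1", "read"), ("T1", withdraw), ("T1", "write"),
          ("T2", "read"), ("T2", deposit), ("T2", "write")]

print(run(interleaved))  # 90: T2's deposit is overwritten
print(run(serial))       # 190
```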

Uncommitted Dependency Problem

Occurs when one transaction can see intermediate results of another transaction before it has committed.

T4 updates x to €200 but then aborts, so x should return to its original value of €100.

T3 has already read the new value of x (€200) and uses it as the basis of a €10 reduction, giving a new balance of €190 instead of €90.

Uncommitted Dependency Problem

Time  Transaction T3      Transaction T4      value x
t2                        Read R1(x)          x=100
t3                        x = x + 100
t4                        write R1(x)         x=200
t6    read R1(x)
t7                        Rollback            x=100
t8    x = x - 10                              x=190
t9    write R1(x)
t10   Commit                                  should be 90

Inconsistent Analysis Problem

Occurs when transaction reads several values but second transaction updates some of them during execution of first.

Closely related to the unrepeatable (fuzzy) read. The term dirty read is more commonly used for the uncommitted dependency problem.

T6 is totaling the balances of account x (€100), account y (€50), and account z (€25).

Meanwhile, T5 transfers €10 from x to z, so T6 ends up with the wrong result (€10 too high).

Inconsistent Analysis Problem

Time  Transaction T5      Transaction T6      x    y    z    sum
t1                        Begin transaction   100  50   25
t2    Begin transaction   sum=0               100  50   25   0
t3    read x              read x              100  50   25   0
t4    x = x - 10          sum = sum + x       100  50   25   100
t5    write x             read y              90   50   25   100
t6    read z              sum = sum + y       90   50   25   150
t7    z = z + 10                              90   50   25   150
t8    write z                                 90   50   35   150
t9    commit              read z              90   50   35   150
t10                       sum = sum + z       90   50   35   185
t11                       commit              90   50   35   185
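The arithmetic of this schedule can be checked with a short sketch. The dictionary `db` and the variable names are assumptions for illustration; the statements follow the order in which the reads and writes hit the database in the example above.

```python
db = {"x": 100, "y": 50, "z": 25}

total = 0
total += db["x"]   # T6 reads x before T5 writes it (100)
db["x"] -= 10      # T5 writes x = 90
total += db["y"]   # T6 reads y (50)
db["z"] += 10      # T5 writes z = 35 and commits
total += db["z"]   # T6 reads z after T5's commit (35)

correct = sum(db.values())  # what a consistent total would be
print(total, correct)       # 185 175
```

T6 sees x before the transfer and z after it, so the transferred €10 is counted twice.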

Serializability

Objective of a concurrency control protocol is to schedule transactions in such a way as to avoid any interference.

Could run transactions serially, but this limits degree of concurrency or parallelism in system.

Serializability identifies those executions of transactions guaranteed to ensure consistency.

Serializability

Schedule: A transaction consists of a sequence of reads and writes to the database. The sequence of reads and writes by a set of concurrent transactions taken together is the schedule.

Serial schedule: A schedule where the operations of each transaction are executed consecutively without any interleaved operation from other transactions.

Nonserial schedule: A schedule where the operations from a set of concurrent transactions are interleaved.

Nonserial Schedule

Schedule where operations from set of concurrent transactions are interleaved.

Objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.

In other words, want to find nonserial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.

Serializable Schedule

Serializable schedule: If a set of transactions executes concurrently, we say that the schedule is correct (serializable) if it produces the same result as some serial execution.

The ordering of reads and writes is important in serializability:
• if two transactions only read a data item, they do not conflict and order is not important;
• if two transactions either read or write completely separate data items, they do not conflict and order is not important;
• if one transaction writes a data item and another either reads or writes the same data item, the order of execution is important.
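The three ordering rules above reduce to a single test on a pair of operations. The following is a minimal sketch; the `Op` tuple and the `conflicts` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "r" for read, "w" for write
    item: str  # data item, e.g. "x"


def conflicts(a: Op, b: Op) -> bool:
    """Order matters only for different transactions touching the
    same item when at least one of the operations is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


print(conflicts(Op("T1", "r", "x"), Op("T2", "r", "x")))  # False: both only read
print(conflicts(Op("T1", "w", "x"), Op("T2", "r", "y")))  # False: separate items
print(conflicts(Op("T1", "w", "x"), Op("T2", "r", "x")))  # True: write vs read
```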

Recoverability

Serializability identifies schedules that maintain database consistency, assuming no transaction fails.

Could also examine recoverability of transactions within schedule.

If transaction fails, atomicity requires effects of transaction to be undone.

Durability states that once transaction commits, its changes cannot be undone (without running another, compensating, transaction).

Recoverable Schedule

A schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.
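This definition can be turned into a small checker. The sketch below makes simplifying assumptions: a schedule is a list of hypothetical `(transaction, action, item)` tuples with actions `"r"`, `"w"` and `"c"` (commit), and aborts are not modeled.

```python
def is_recoverable(schedule):
    """True if every transaction that reads an item written by another
    transaction commits only after that writer has committed."""
    commit_pos = {t: i for i, (t, a, _) in enumerate(schedule) if a == "c"}
    last_writer = {}
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and txn in commit_pos:
                # A writer that never commits counts as "later than any commit".
                if commit_pos.get(writer, float("inf")) > commit_pos[txn]:
                    return False
    return True


good = [("T1", "w", "x"), ("T2", "r", "x"), ("T1", "c", None), ("T2", "c", None)]
bad = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "c", None), ("T1", "c", None)]
print(is_recoverable(good), is_recoverable(bad))  # True False
```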

Concurrency Control Techniques

Two basic concurrency control techniques:
• Locking
• Timestamping

Both are conservative approaches: delay transactions in case they conflict with other transactions.

Optimistic methods assume conflict is rare and only check for conflicts at commit.

Concurrency Control Techniques

Locking: A procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates.

Data items of various sizes, ranging from the entire database down to a field, may be locked.

The size of the item determines the granularity of the lock.

Implementation can be done by:
• setting a bit in the data item;
• keeping a list of locked parts;
• other techniques.

Lock Types

Read lock: If a transaction has a read lock on a data item, it can read the item but not update it.

Write lock: If a transaction has a write lock on the data item, it can both read and update the item.

Using Locks

Any transaction that needs to access the data item must first lock the item, requesting a read lock for read only access or a write lock for both read and write access.

If the item is not already locked by another transaction, the lock will be granted.

If the item is currently locked, the DBMS determines whether the request is compatible with the existing lock:

a read request on an item with a read lock will be granted;

for other requests the transaction must wait until the existing lock is released.

A transaction continues to hold a lock until it explicitly releases it, either during execution or when it terminates.

It is only when the write lock has been released that the effects of the write operation will be made visible to other transactions.
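The granting rules above can be sketched as a tiny lock table. This is an illustration only: the `LockTable` class and its method names are hypothetical, waiting is represented by returning `False` rather than by blocking, and lock upgrades are not handled.

```python
class LockTable:
    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        held = self.locks.get(item)
        if held is None:                      # item not locked: grant
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "read" and held_mode == "read":
            holders.add(txn)                  # read on read: compatible
            return True
        return False                          # every other combination waits

    def release(self, txn, item):
        _, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
print(table.request("T1", "x", "read"))   # True
print(table.request("T2", "x", "read"))   # True: shared read lock
print(table.request("T3", "x", "write"))  # False: T3 must wait
```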

Two-phase Locking

2PL: A transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction.

With this protocol every transaction has two phases:
• growing phase: where no locks can be released;
• shrinking phase: where no locks can be acquired.

Some systems allow upgrades (in the growing phase) or downgrades (in the shrinking phase) of a lock.

With 2PL, serializability of schedules can be guaranteed.
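Whether a single transaction obeys the protocol can be checked by scanning its lock and unlock operations in order. A minimal sketch, assuming the operations are given as a list of the hypothetical strings `"lock"` and `"unlock"` (anything else is ignored):

```python
def follows_2pl(ops):
    """True if no lock is acquired after the first unlock."""
    shrinking = False
    for op in ops:
        if op == "unlock":
            shrinking = True      # the shrinking phase has begun
        elif op == "lock" and shrinking:
            return False          # acquiring in the shrinking phase
    return True


print(follows_2pl(["lock", "read", "lock", "write", "unlock", "unlock"]))  # True
print(follows_2pl(["lock", "unlock", "lock", "unlock"]))                   # False
```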

2PL: Lost update solution

(WL: write lock acquired; UL: lock released)

Time  Transaction T1      Transaction T2      value x
t2    Begin transaction   Read R1(x) WL       x=100
t3    request WL
t4    wait                x=x+100
t5    wait                write R(x)          x=200
t6    wait                unlock(x) UL
t7    Read R1(x) WL       commit
t8    x=x-10
t9    write R(x)                              x=190
t10   unlock(x) UL
t11   commit

2PL: Uncommitted Dependency

(WL: write lock acquired; UL: lock released)

Time  Transaction T3      Transaction T4      value x
t2                        Read R1(x) WL       x=100
t3                        x = x + 100
t4                        write R1(x)         x=200
t6    request WL                              x=200
t7    wait                Rollback UL         x=100
t8    read R1(x) WL                           x=100
t9    x = x - 10                              x=90
t10   write(x)                                x=90
t11   Commit

Deadlock

Deadlock: An impasse that may result when two or more transactions are each waiting for locks held by the other to be released.

Time  Transaction 1       Transaction 2
t1    begin transaction
t2    write-lock (x)      begin transaction
t3    read (x)            write-lock (y)
t4    x = x - 10          read (y)
t5    write (x)           y = y + 100
t6    write-lock (y)      write (y)
t7    wait                write-lock (x)
t8    wait                wait
t9    wait                wait

Deadlock

Only one way to break deadlock: abort one or more of the transactions.

Deadlock should be transparent to user, so DBMS should restart transaction(s).

Two general techniques for handling deadlock: Deadlock prevention. Deadlock detection and recovery.
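For the detection approach, one common technique (not described in these slides, so treat it as an assumption) is to keep a wait-for graph and look for a cycle in it. A minimal sketch with a hypothetical `has_deadlock` function:

```python
def has_deadlock(waits_for):
    """waits_for maps each transaction to the set of transactions whose
    locks it is waiting for. A cycle in this graph means deadlock."""
    visiting, done = set(), set()

    def visit(t):
        if t in done:
            return False
        if t in visiting:          # reached a transaction on the current path
            return True
        visiting.add(t)
        found = any(visit(u) for u in waits_for.get(t, ()))
        visiting.discard(t)
        done.add(t)
        return found

    return any(visit(t) for t in list(waits_for))


# T1 waits for y (held by T2) and T2 waits for x (held by T1).
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```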

Deadlock Prevention

DBMS looks ahead to see if transaction would cause deadlock and never allows deadlock to occur.

Could order transactions using transaction timestamps:

Wait-Die - only an older transaction can wait for a younger one; otherwise the transaction is aborted (dies) and restarted with the same timestamp.

Wound-Wait - only a younger transaction can wait for an older one. If an older transaction requests a lock held by a younger one, the younger one is aborted (wounded).
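Both schemes come down to one comparison of timestamps when a lock request meets a lock held by another transaction. A minimal sketch, assuming a smaller timestamp means an older transaction; the function names and returned strings are hypothetical.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester dies (is aborted)."""
    return "wait" if requester_ts < holder_ts else "abort requester"


def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (aborts) the holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"


# Transaction with timestamp 1 is older than the one with timestamp 2.
print(wait_die(1, 2), "|", wait_die(2, 1))      # wait | abort requester
print(wound_wait(1, 2), "|", wound_wait(2, 1))  # abort holder | wait
```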

Timestamping

Timestamp: A unique identifier created by the DBMS that indicates the relative starting time of a transaction. Data items can also carry a read-timestamp and a write-timestamp.

Timestamping: A concurrency control protocol in which the fundamental goal is to order transactions globally in such a way that older transactions (with smaller timestamps) get priority in the event of conflict.

If a transaction attempts to read or write a data item, it can only proceed if the last update on that data item was carried out by an older transaction; otherwise, the transaction is restarted and given a new timestamp.

Timestamping

Read/write proceeds only if last update on that data item was carried out by an older transaction.

Otherwise, transaction requesting read/write is restarted and given a new timestamp.

Also timestamps for data items:
• read-timestamp - timestamp of last transaction to read item.
• write-timestamp - timestamp of last transaction to write item.

Timestamping - Read(x)

Consider a transaction T with timestamp ts(T) that wants to read x:

ts(T) < write_timestamp(x)

x already updated by younger (later) transaction. Transaction must be aborted and restarted with a new timestamp.

Timestamping - Write(x)

Consider a transaction T with timestamp ts(T) that wants to write x:

ts(T) < read_timestamp(x)

x already read by younger transaction. Roll back transaction T and restart it using a later timestamp. (The younger transaction need not be aborted, because it cannot have read a wrong version: the write has not yet been done.)
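The timestamp checks can be sketched as two small functions. This is an illustration under assumptions: the `Item` class, the `Restart` exception and the function names are hypothetical, and the write rule also rejects a write when a younger transaction has already written the item, in line with the requirement that the last update must come from an older transaction.

```python
class Restart(Exception):
    """The transaction must be rolled back and restarted with a new timestamp."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the last writer


def read(item, ts):
    if ts < item.write_ts:            # already updated by a younger transaction
        raise Restart
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:  # younger reader or writer
        raise Restart
    item.value = value
    item.write_ts = ts


x = Item(100)
write(x, 5, 200)        # transaction with timestamp 5 updates x
print(read(x, 7))       # 200: a younger transaction may read it
try:
    read(x, 3)          # an older transaction arrives too late
except Restart:
    print("restart")
```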

Granularity of Data Items

Size of data items chosen as unit of protection by concurrency control protocol.

Ranging from coarse to fine:
• The entire database.
• A file.
• A page (or area or database space).
• A record.
• A field value of a record.

Granularity of Data Items

Tradeoff:
• the coarser the granularity, the lower the degree of concurrency;
• the finer the granularity, the more locking information needs to be stored.

Best item size depends on the types of transactions.

Levels of Locking

File Recovery

Two complementary techniques:

Backup: periodic copy of the database to an archive file.
• full backup.
• incremental backup.

Recovery (more than 10% of the code of a DBMS): after a failure, bring the database back to a reliable state.

Redundancy is needed.

Time factor is crucial.

Restore Actions (simplified)

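In case of failure, two situations are distinguished: the DB is unreadable, or the DB is unreliable.

[Figure: restore paths for both cases, showing the last archive, the logfile, REDO, UNDO and the new DB]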

UNDO

UNDO of a started transaction can be needed after an insert, delete or update.

A direct access device is needed for the active log.

There is a continuous transition from the active log to the archive log.

Before-images are kept only on the active log.

Database Recovery

Process of restoring database to a correct state in the event of a failure.

Need for Recovery Control

Two types of storage: volatile (main memory) and nonvolatile.

Volatile storage does not survive system crashes.

Stable storage represents information that has been replicated in several nonvolatile storage media with independent failure modes.

Recovery Techniques

If database has been damaged: Need to restore last backup copy of database and reapply updates of committed transactions using log file.

If database is only inconsistent: Need to undo changes that caused inconsistency.

May also need to redo some transactions to ensure updates reach secondary storage.

Do not need backup, but can restore database using before- and after-images in the log file.

Types of failures

System crashes, resulting in loss of main memory.

Media failures, resulting in loss of parts of secondary storage.

Application software errors.

Natural physical disasters.

Carelessness or unintentional destruction of data or facilities.

Sabotage.

Transactions and Recovery

Transactions represent basic unit of recovery.

Recovery manager responsible for atomicity and durability.

If failure occurs between commit and database buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) transaction's updates.

Transactions and Recovery

If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity.

Partial undo - only one transaction has to be undone.

Global undo - all transactions have to be undone.

System interruption

I/O buffers lost, DB intact: UNDO of current transactions is needed, but those transactions are difficult to identify.

Concept of "checkpoint": periodically the following steps must be performed:
Step 1: Log buffers emptied to the logfile.
Step 2: Checkpoint record written to the logfile.
Step 3: Database buffers emptied to the database.
Step 4: Address of checkpoint record written to the "restart file".

Log File

Contains information about all updates to database:
• Transaction records.
• Checkpoint records.

Often used for other purposes (for example, auditing).

Log File

Transaction records contain:
• Transaction identifier.
• Type of log record (transaction start, insert, update, delete, abort, commit).
• Identifier of data item affected by database action (insert, delete, and update operations).
• Before-image of data item.
• After-image of data item.
• Log management information.


Sample Log File

Log File

Log file may be duplexed or triplexed.

Log file sometimes split into two separate random-access files.

Potential bottleneck; critical in determining overall performance.

Checkpointing

Checkpoint: Point of synchronization between database and log file. All buffers are force-written to secondary storage.

Checkpoint record is created containing identifiers of all active transactions.

When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash.

Algorithm to Define Transaction States

The algorithm starts with the creation of two lists:
• UNDO list contains all transactions in the checkpoint record
• REDO list is empty

Forward reading of the logfile starting at tc:
• encounter "BEGIN": move transaction to UNDO list
• encounter "COMMIT": move transaction from UNDO to REDO
• at end of logfile both lists are correct

[Figure: timeline of transactions T1 to T5 relative to checkpoint time tc and failure time tf]

Checkpointing

In previous example, with checkpoint at time tc, changes made by T2 and T3 have been written to secondary storage.

Thus:
• only redo T2 and T4,
• undo transactions T3 and T5.
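The list-building algorithm can be sketched in a few lines. The function name and the log encoding are hypothetical, and the order of the log records after the checkpoint is an assumption chosen to match the stated outcome (T2 and T3 active at tc, T4 and T5 started afterwards, T2 and T4 committed before the failure).

```python
def classify(active_at_checkpoint, log_after_checkpoint):
    """Build the UNDO and REDO lists by reading the log forward from tc."""
    undo = list(active_at_checkpoint)  # UNDO starts with the checkpoint record
    redo = []                          # REDO starts empty
    for record, txn in log_after_checkpoint:
        if record == "BEGIN":
            undo.append(txn)
        elif record == "COMMIT":
            undo.remove(txn)
            redo.append(txn)
    return undo, redo


log = [("BEGIN", "T4"), ("COMMIT", "T2"), ("BEGIN", "T5"), ("COMMIT", "T4")]
undo, redo = classify(["T2", "T3"], log)
print("UNDO:", undo)  # UNDO: ['T3', 'T5']
print("REDO:", redo)  # REDO: ['T2', 'T4']
```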

Steps in recovery process

1. Define states of the transactions with specific algorithm.

2. Forward processing of the logfile to REDO transactions.

3. Backward processing of the logfile to UNDO transactions.

REDO and UNDO must be idempotent operations.
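Idempotence follows naturally when REDO and UNDO install the after-image or before-image from the log record instead of re-applying a computation. A minimal sketch, with a hypothetical record layout based on the log file fields listed earlier:

```python
def redo(db, rec):
    db[rec["item"]] = rec["after"]   # install the after-image


def undo(db, rec):
    db[rec["item"]] = rec["before"]  # restore the before-image


rec = {"txn": "T1", "item": "x", "before": 100, "after": 90}
db = {"x": 100}

redo(db, rec)
redo(db, rec)      # repeating REDO, e.g. after a crash during recovery
print(db["x"])     # 90, not 80

undo(db, rec)
undo(db, rec)      # repeating UNDO is equally harmless
print(db["x"])     # 100
```

Re-applying "x = x - 10" twice would give 80; installing the logged image gives the same result however often recovery is repeated.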

This technique can also be used in a deadlock situation.

Date post:	16-Jan-2016
Upload:	adrian-riley