advanced database notes

Date post: 03-Jun-2018
Upload: agrippa-mungazi
Transcript
Page 1: advanced database notes

8/11/2019 advanced database notes

http://slidepdf.com/reader/full/advanced-database-notes 1/21

Schedule (computer science) From Wikipedia, the free encyclopedia

In the fields of databases and transaction processing (transaction management), a schedule (or history) of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system. If order in time between certain operations is not determined by the system, then a partial order is used. Examples of such operations are requesting a read operation, reading, writing, aborting, committing, requesting lock, locking, etc. Not all transaction operation types should be included in a schedule, and typically only selected operation types (e.g., data access operations) are included, as needed to reason about and describe certain phenomena. Schedules and schedule properties are fundamental concepts in database concurrency control theory.

Contents

1 Formal description
2 Types of schedule
  o 2.1 Serial
  o 2.2 Serializable
      2.2.1 Conflicting actions
      2.2.2 Conflict equivalence
      2.2.3 Conflict-serializable
      2.2.4 Commitment-ordered
      2.2.5 View equivalence
      2.2.6 View-serializable
  o 2.3 Recoverable
      2.3.1 Unrecoverable
      2.3.2 Avoids cascading aborts (rollbacks)
      2.3.3 Strict
3 Hierarchical relationship between serializability classes
4 Practical implementations
5 See also
6 References

Formal description

The following is an example of a schedule:

In this example, the horizontal axis represents the different transactions in the schedule D. The vertical axis represents time order of operations. Schedule D consists of three transactions T1, T2, T3. The schedule describes the actions of the transactions as seen by the DBMS. First T1 Reads and Writes to object X, and then Commits. Then T2 Reads and Writes to object Y and Commits, and finally T3 Reads and Writes to object Z and Commits. This is an example of a serial schedule, i.e., sequential with no overlap in time, because the actions of all three transactions are sequential, and the transactions are not interleaved in time.

Representing the schedule D above by a table (rather than a list) is just for the convenience of identifying each transaction's operations at a glance. This notation is used throughout the article below. A more common way in the technical literature of representing such a schedule is by a list:

D = R1(X) W1(X) Com1 R2(Y) W2(Y) Com2 R3(Z) W3(Z) Com3

Usually, for the purpose of reasoning about concurrency control in databases, an operation is modeled as atomic, occurring at a point in time, without duration. When this is not satisfactory, start and end time-points and possibly other point events are specified (rarely). Real executed operations always have some duration and specified respective times of occurrence of events within them (e.g., "exact" times of beginning and completion), but for concurrency control reasoning usually only the precedence in time of the whole operations (without looking into the quite complex details of each operation) matters, i.e., which operation is before, or after another operation. Furthermore, in many cases the before/after relationships between two specific operations do not matter and should not be specified, while being specified for other pairs of operations.

In general, operations of transactions in a schedule can interleave (i.e., transactions can be executed concurrently), while time orders between operations in each transaction remain unchanged as implied by the transaction's program. Since time orders between all operations of all transactions do not always matter and need not always be specified, a schedule is, in general, a partial order between operations rather than a total order (where order for each pair is determined, as in a list of operations). Also, in the general case each transaction may consist of several processes, and itself be properly represented by a partial order of operations, rather than a total order. Thus in general a schedule is a partial order of operations, containing (embedding) the partial orders of all its transactions.

Time-order between two operations can be represented by an ordered pair of these operations (e.g., the existence of a pair (OP1,OP2) means that OP1 is always before OP2), and a schedule in the general case is a set of such ordered pairs. Such a set, a schedule, is a partial order which can be represented by an acyclic directed graph (or directed acyclic graph, DAG) with operations as nodes and time-order as a directed edge (no cycles are allowed, since a cycle means that any operation on the cycle can be both before and after any other operation on the cycle, which contradicts our perception of time). In many cases a graphical representation of such a graph is used to demonstrate a schedule.

Comment: Since a list of operations (and the table notation used in this article) always represents a total order between operations, schedules that are not a total order cannot be represented by a list (but always can be represented by a DAG).

Types of schedule

Serial

The transactions are executed non-interleaved (see example above), i.e., a serial schedule is one in which no transaction starts until a running transaction has ended.
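
The list notation and the serial property can be sketched in code. The following is a minimal illustration, not part of the original notes: the helper names `parse_schedule` and `is_serial` are hypothetical, and the `Abort` token is an assumed extension of the notation (the text above only shows `R`, `W` and `Com`).

```python
import re

def parse_schedule(text):
    """Parse list notation such as 'R1(X) W1(X) Com1' into (action, txn, item) tuples."""
    ops = []
    for token in text.split():
        m = re.fullmatch(r"(R|W)(\d+)\((\w+)\)", token)
        if m:
            ops.append((m.group(1), int(m.group(2)), m.group(3)))
            continue
        # 'Abort' is an assumed token; the source only uses 'Com'.
        m = re.fullmatch(r"(Com|Abort)(\d+)", token)
        if m is None:
            raise ValueError(f"unrecognised token: {token}")
        ops.append((m.group(1), int(m.group(2)), None))
    return ops

def is_serial(ops):
    """True if no transaction performs an operation after another one has taken over."""
    finished = set()   # transactions that have already had their turn
    current = None
    for _, txn, _ in ops:
        if txn != current:
            if txn in finished:
                return False          # txn resumes after being interrupted
            if current is not None:
                finished.add(current)
            current = txn
    return True

D = parse_schedule("R1(X) W1(X) Com1 R2(Y) W2(Y) Com2 R3(Z) W3(Z) Com3")
print(is_serial(D))
```

Schedule D passes the check because each transaction's operations form one contiguous block.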

Serializable

A schedule that is equivalent (in its outcome) to a serial schedule has the serializability property.

In schedule E, the order in which the actions of the transactions are executed is not the same as in D, but in the end, E gives the same result as D.

Conflicting actions

Two actions are said to be in conflict (conflicting pair) if:

1. The actions belong to different transactions.
2. At least one of the actions is a write operation.
3. The actions access the same object (read or write).

The following set of actions is conflicting:

R1(X), W2(X), W3(X) (3 conflicting pairs)

While the following sets of actions are not:

R1(X), R2(X), R3(X)
R1(X), W2(Y), R3(X)
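
The three conditions can be checked mechanically. The sketch below is illustrative only; it assumes each action is written as a hypothetical `(action, transaction, object)` tuple, which is a representation chosen here and not one the notes prescribe.

```python
from itertools import combinations

def conflicts(a, b):
    """Apply the three conditions: different transactions, same object, at least one write."""
    (act_a, txn_a, item_a), (act_b, txn_b, item_b) = a, b
    return txn_a != txn_b and item_a == item_b and "W" in (act_a, act_b)

def conflicting_pairs(ops):
    """Return every unordered pair of actions in ops that is in conflict."""
    return [(a, b) for a, b in combinations(ops, 2) if conflicts(a, b)]

print(len(conflicting_pairs([("R", 1, "X"), ("W", 2, "X"), ("W", 3, "X")])))
```

For the first set above the pairs are R1(X)/W2(X), R1(X)/W3(X) and W2(X)/W3(X); the two other sets yield no pairs.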

Conflict equivalence

The schedules S1 and S2 are said to be conflict-equivalent if the following two conditions are satisfied:

1. Both schedules S1 and S2 involve the same set of transactions (including ordering of actions within each transaction).

2. Both schedules have the same set of conflicting operation pairs, with each pair ordered the same way in both.

Conflict-serializable

A schedule is said to be conflict-serializable when the schedule is conflict-equivalent to one or more serial schedules.

Another definition for conflict-serializability is that a schedule is conflict-serializable if and only if its precedence graph/serializability graph, when only committed transactions are considered, is acyclic (if the graph is defined to include also uncommitted transactions, then cycles involving uncommitted transactions may occur without conflict serializability violation).

The example schedule is conflict-equivalent to the serial schedule <T1,T2>, but not to <T2,T1>.
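
The precedence-graph test can be sketched as follows. This is an illustrative implementation under stated assumptions: operations are hypothetical `(action, transaction, object)` tuples, only data accesses of committed transactions are passed in, and the function names are made up for this example.

```python
def precedence_graph(ops):
    """Edge (Ti, Tj) whenever an action of Ti precedes a conflicting action of Tj."""
    edges = set()
    for i, (act_a, txn_a, item_a) in enumerate(ops):
        for act_b, txn_b, item_b in ops[i + 1:]:
            if txn_a != txn_b and item_a == item_b and "W" in (act_a, act_b):
                edges.add((txn_a, txn_b))
    return edges

def has_cycle(edges):
    """Depth-first search that reports a cycle when it meets a node still being visited."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)

def is_conflict_serializable(ops):
    return not has_cycle(precedence_graph(ops))

# R1(A) W1(A) R2(A) W2(A) R1(B) W1(B) R2(B) W2(B): every conflict runs from T1 to T2.
s1 = [("R", 1, "A"), ("W", 1, "A"), ("R", 2, "A"), ("W", 2, "A"),
      ("R", 1, "B"), ("W", 1, "B"), ("R", 2, "B"), ("W", 2, "B")]
print(precedence_graph(s1), is_conflict_serializable(s1))
```

A schedule such as R1(A) W2(A) W1(A) produces edges in both directions between T1 and T2, so the graph has a cycle and the check fails.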

Commitment-ordered

A schedule is said to be commitment-ordered (commit-ordered), or commitment-order-serializable, if it obeys the Commitment ordering (CO; also commit-ordering or commit-order-serializability) schedule property. This means that the order in time of transactions' commitment events is compatible with the precedence (partial) order of the respective transactions, as induced by their schedule's acyclic precedence graph (serializability graph, conflict graph). This implies that it is also conflict-serializable. The CO property is especially effective for achieving Global serializability in distributed systems.

Comment: Commitment ordering, which was discovered in 1990, is obviously not mentioned in (Bernstein et al. 1987). Its correct definition appears in (Weikum and Vossen 2001), however the description there of its related techniques and theory is partial, inaccurate, and misleading. For an extensive coverage of commitment ordering and its sources see Commitment ordering and The History of Commitment Ordering.

View equivalence

Two schedules S1 and S2 are said to be view-equivalent when the following conditions are satisfied:

1. If the transaction Ti in S1 reads an initial value for object X, so does the transaction Ti in S2.

2. If the transaction Ti in S1 reads the value written by transaction Tj in S1 for object X, so does the transaction Ti in S2.

3. If the transaction Ti in S1 is the final transaction to write the value for an object X, so is the transaction Ti in S2.
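
The three conditions can be compared directly by computing, for each schedule, which write every read observes and which transaction writes each object last. The sketch below is a hedged illustration: the tuple representation and the names `view_profile` and `view_equivalent` are assumptions made for this example, and aborts are not modelled.

```python
def view_profile(ops):
    """Return (reads_from, final_writers) for a list of (action, txn, item) tuples."""
    last_writer = {}   # item -> transaction that most recently wrote it
    read_count = {}    # (txn, item) -> how many times txn has read item so far
    reads_from = set()
    for action, txn, item in ops:
        if action == "R":
            n = read_count.get((txn, item), 0)
            read_count[(txn, item)] = n + 1
            # A source of None means the read saw the initial value (condition 1).
            reads_from.add((txn, item, n, last_writer.get(item)))
        elif action == "W":
            last_writer[item] = txn
    # After the loop, last_writer holds the final writer of each item (condition 3).
    return reads_from, last_writer

def view_equivalent(s1, s2):
    """Same operations, same reads-from relation, same final writers."""
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)

# R1(A) W2(A) W1(A) W3(A) compared with the serial order T1, T2, T3.
s = [("R", 1, "A"), ("W", 2, "A"), ("W", 1, "A"), ("W", 3, "A")]
serial_123 = [("R", 1, "A"), ("W", 1, "A"), ("W", 2, "A"), ("W", 3, "A")]
print(view_equivalent(s, serial_123))
```

In both schedules T1 reads the initial value of A and T3 performs the final write, so they are view-equivalent even though the writes of T1 and T2 are swapped.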

View-serializable

A schedule is said to be view-serializable if it is view-equivalent to some serial schedule. Note that by definition, all conflict-serializable schedules are view-serializable.

Notice that the above example (which is the same as the example in the discussion of conflict-serializable) is both view-serializable and conflict-serializable at the same time.

There are however view-serializable schedules that are not conflict-serializable: those schedules with a transaction performing a blind write:

The above example is not conflict-serializable, but it is view-serializable since it has a view-equivalent serial schedule <T1, T2, T3>.

Since determining whether a schedule is view-serializable is NP-complete, view-serializability has little practical interest.

Recoverable

Transactions commit only after all transactions whose changes they read, commit.

These schedules are recoverable. F is recoverable because T1 commits before T2, which makes the value read by T2 correct. Then T2 can commit itself. In F2, if T1 aborted, T2 has to abort because the value of A it read is incorrect. In both cases, the database is left in a consistent state.

Unrecoverable

If a transaction T1 aborts, and a transaction T2 commits, but T2 relied on T1, we have an unrecoverable schedule.

In this example, G is unrecoverable, because T2 read the value of A written by T1, and committed. T1 later aborted, therefore the value read by T2 is wrong, but since T2 committed, this schedule is unrecoverable.

Avoids cascading aborts (rollbacks)

Also named cascadeless. A cascading abort occurs when a single transaction abort leads to a series of transaction rollbacks. The strategy to prevent cascading aborts is to disallow a transaction from reading uncommitted changes from another transaction in the same schedule.

The following examples are the same as the one from the discussion on recoverable:

In this example, although F2 is recoverable, it does not avoid cascading aborts. It can be seen that if T1 aborts, T2 will have to be aborted too in order to maintain the correctness of the schedule, as T2 has already read the uncommitted value written by T1.

The following is a recoverable schedule which avoids cascading abort. Note, however, that the update of A by T1 is always lost (since T1 is aborted).

Cascading aborts avoidance is sufficient but not necessary for a schedule to be recoverable.

Strict

A schedule is strict - has the strictness property - if for any two transactions T1, T2, if a write operation of T1 precedes a conflicting operation of T2 (either read or write), then the commit event of T1 also precedes that conflicting operation of T2.

Any strict schedule is cascadeless, but not the converse. Strictness allows efficient recovery of databases from failure.
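
The recoverable, cascadeless and strict properties can be tested on a concrete schedule. The following sketch is an illustration under assumptions introduced here: operations are hypothetical `(action, transaction, object)` tuples with actions `R`, `W`, `C` (commit) and `A` (abort), and an abort is modelled simply by discarding the aborted transaction's writes. The three example schedules are written to match the descriptions of F, G and a strict schedule in the text; the original tables are not reproduced in these notes.

```python
def classify(ops):
    """Report whether a schedule is recoverable, cascadeless and strict."""
    ended = {}       # txn -> "C" or "A" once it has committed or aborted
    writers = {}     # item -> stack of transactions whose writes are still in effect
    reads_from = {}  # reader txn -> set of writer txns whose values it read
    result = {"recoverable": True, "cascadeless": True, "strict": True}
    for action, txn, item in ops:
        if action == "C":
            # Recoverable: everything this transaction read from must already be committed.
            if any(ended.get(w) != "C" for w in reads_from.get(txn, ())):
                result["recoverable"] = False
            ended[txn] = "C"
        elif action == "A":
            ended[txn] = "A"
            for stack in writers.values():
                stack[:] = [w for w in stack if w != txn]  # undo the aborted writes
        else:
            stack = writers.setdefault(item, [])
            writer = stack[-1] if stack else None
            if writer is not None and writer != txn:
                if writer not in ended:
                    result["strict"] = False           # touches uncommitted data
                    if action == "R":
                        result["cascadeless"] = False  # dirty read
                if action == "R":
                    reads_from.setdefault(txn, set()).add(writer)
            if action == "W":
                stack.append(txn)
    return result

F = [("R", 1, "A"), ("W", 1, "A"), ("R", 2, "A"), ("W", 2, "A"), ("C", 1, None), ("C", 2, None)]
G = [("R", 1, "A"), ("W", 1, "A"), ("R", 2, "A"), ("W", 2, "A"), ("C", 2, None), ("A", 1, None)]
S = [("R", 1, "A"), ("W", 1, "A"), ("C", 1, None), ("R", 2, "A"), ("W", 2, "A"), ("C", 2, None)]
for name, sched in (("F", F), ("G", G), ("S", S)):
    print(name, classify(sched))
```

F is recoverable but neither cascadeless nor strict, G is unrecoverable, and S satisfies all three properties.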

Hierarchical relationship between serializability classes

The following expressions illustrate the hierarchical (containment) relationships between serializability and recoverability classes:

Serial ⊂ commitment-ordered ⊂ conflict-serializable ⊂ view-serializable ⊂ all schedules
Serial ⊂ strict ⊂ avoids cascading aborts ⊂ recoverable ⊂ all schedules

[ Image: Venn diagram for serializability and recoverability classes ]

Practical implementations

In practice, most general purpose database systems employ conflict-serializable and recoverable (primarily strict) schedules.

See also

schedule (project management)

References

Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman: Concurrency Control and Recovery in Database Systems, Addison Wesley Publishing Company, 1987, ISBN 0-201-10715-5

Gerhard Weikum, Gottfried Vossen: Transactional Information Systems, Elsevier, 2001, ISBN 1-55860-508-8

DBMS Transaction

A transaction can be defined as a group of tasks. A single task is the minimum processing unit of work, which cannot be divided further.

An example of a transaction is a transfer between the bank accounts of two users, say A and B. When a bank employee transfers Rs. 500 from A's account to B's account, a number of tasks are executed behind the scenes. This very simple and small transaction includes several steps. First, decrease A's account balance by 500:

    Open_Account(A)
    Old_Balance = A.balance
    New_Balance = Old_Balance - 500
    A.balance = New_Balance
    Close_Account(A)

In simple words, the transaction involves many tasks, such as opening the account of A, reading the old balance, subtracting 500 from it, saving the new balance to the account of A and finally closing it. To add the amount 500 to B's account, the same sort of tasks need to be done:

    Open_Account(B)
    Old_Balance = B.balance
    New_Balance = Old_Balance + 500
    B.balance = New_Balance
    Close_Account(B)

A simple transaction of moving an amount of 500 from A to B involves many low level tasks.

ACID Properties

A transaction may contain several low level tasks, and a transaction is itself a very small unit of any program. A transaction in a database system must maintain some properties in order to ensure the accuracy of its completeness and data integrity. These properties are referred to as the ACID properties and are described below:

Atomicity: Though a transaction involves several low level operations, this property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. There must be no state in the database where the transaction is left partially completed. States should be defined either before the execution of the transaction or after the execution/abortion/failure of the transaction.

Consistency: This property states that after the transaction is finished, its database must remain in a consistent state. There must not be any possibility that some data is incorrectly affected by the execution of the transaction. If the database was in a consistent state before the execution of the transaction, it must remain in a consistent state after the execution of the transaction.

Durability: This property states that in any case all updates made on the database will persist even if the system fails and restarts. If a transaction writes or updates some data in the database and commits, that data will always be there in the database. If the transaction commits but the data is not written to the disk and the system fails, that data will be updated once the system comes up.

Isolation: In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that every transaction will be carried out and executed as if it were the only transaction in the system. No transaction will affect the existence of any other transaction.
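
Atomicity can be observed with the transfer example from the start of this section. The sketch below is an illustration, not the notes' own method: it uses Python's standard `sqlite3` module with an in-memory database, and the table name `account`, the starting balances and the helper names are assumptions made for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction."""
    # 'with conn' commits if the block succeeds and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # triggers the rollback
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 800), ("B", 200)])
conn.commit()

transfer(conn, "A", "B", 500)
print(balances(conn))
```

If a later transfer raises an error after the debit has been applied, the rollback restores A's balance, so no partially completed state is left behind.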

Serializability

When more than one transaction is executed by the operating system in a multiprogramming environment, there is a possibility that the instructions of one transaction are interleaved with those of some other transaction.

Schedule: A chronological execution sequence of transactions is called a schedule. A schedule can have many transactions in it, each comprising a number of instructions/tasks.

Serial Schedule: A schedule in which transactions are aligned in such a way that one transaction is executed first. When the first transaction completes its cycle, the next transaction is executed. Transactions are ordered one after another. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.

In a multi-transaction environment, serial schedules are considered the benchmark. The execution sequence of instructions within a transaction cannot be changed, but the instructions of two transactions can be interleaved in arbitrary order. This execution does no harm if the two transactions are mutually independent and working on different segments of data, but if the two transactions are working on the same data, results may vary. This ever-varying result may leave the database in an inconsistent state.

To resolve the problem, we allow parallel execution of a transaction schedule if the transactions in it are either serializable or have some equivalence relation between or among transactions.

Equivalence schedules: Schedules can be equivalent in the following ways:

Result Equivalence:

If two schedules produce the same results after execution, they are said to be result equivalent. They may yield the same result for some values and different results for other values. That is why this equivalence is not generally considered significant.

View Equivalence:

Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner.

For example:

o If T reads initial data in S1 then T also reads initial data in S2
o If T reads value written by J in S1 then T also reads value written by J in S2
o If T performs final write on data value in S1 then T also performs final write on data value in S2

Conflict Equivalence:

Two operations are said to be conflicting if they have the following properties:

o Both belong to separate transactions
o Both access the same data item
o At least one of them is a "write" operation

Two schedules that have more than one transaction with conflicting operations are said to be conflict equivalent if and only if:

o Both schedules contain the same set of transactions
o The order of conflicting pairs of operations is maintained in both schedules

A schedule that is view equivalent to a serial schedule is view serializable, and a schedule that is conflict equivalent to a serial schedule is conflict serializable. All conflict serializable schedules are view serializable too.

States of Transactions:

A transaction in a database can be in one of the following states:

[ Image: Transaction States ]

Active: In this state the transaction is being executed. This is the initial state of every transaction.

Partially Committed: When a transaction executes its final operation, it is said to be in this state. After execution of all operations, the database system performs some checks, e.g., the consistency state of the database after applying the output of the transaction onto the database.

Failed: If any check made by the database recovery system fails, the transaction is said to be in the failed state, from where it can no longer proceed.

Aborted: If any of the checks fails and the transaction has reached the failed state, the recovery manager rolls back all its write operations on the database to return the database to the state it was in prior to the start of the transaction. Transactions in this state are called aborted. The database recovery module can select one of two operations after a transaction aborts:

o Re-start the transaction
o Kill the transaction

Committed: If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently made on the database system.

DBMS Concurrency Control

In a multiprogramming environment where more than one transaction can be executed concurrently, protocols are needed to control the concurrency of transactions and to ensure the atomicity and isolation properties of transactions.

Concurrency control protocols that ensure serializability of transactions are most desirable. Concurrency control protocols can be broadly divided into two categories:

o Lock based protocols
o Time stamp based protocols

Lock based protocols

Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it first. Locks are of two kinds:

Binary Locks: a lock on a data item can be in two states; it is either locked or unlocked.

Shared/exclusive: this type of locking mechanism differentiates locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock, because allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
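
The shared/exclusive distinction amounts to a small compatibility table. The sketch below is illustrative; the mode names `S` and `X` and the function `can_grant` are assumptions chosen for this example.

```python
# (held mode, requested mode) -> may both locks coexist on the same data item?
COMPATIBLE = {
    ("S", "S"): True,   # several readers may share an item
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # a reader must wait for the writer
    ("X", "X"): False,  # only one writer at a time
}

def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S", "S"]), can_grant("X", ["S"]), can_grant("X", []))
```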

There are four types of lock protocols available:

Simplistic

Simplistic lock based protocols allow a transaction to obtain a lock on every object before a 'write' operation is performed. As soon as the 'write' has been done, the transaction may unlock the data item.

Pre-claiming

In this protocol, a transaction evaluates its operations and creates a list of data items on which it needs locks. Before starting execution, the transaction requests from the system all the locks it needs. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If not all the locks are granted, the transaction rolls back and waits until all locks are granted.

[ Image: Pre-claiming ]

Two Phase Locking - 2PL

This locking protocol divides the transaction execution phase into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it needs as it executes. The second part is where the transaction has acquired all its locks and no other lock is required; the transaction keeps executing its operations. As soon as the transaction releases its first lock, the third part starts. In this part the transaction cannot demand any new lock; it only releases the acquired locks.

[ Image: Two Phase Locking ]

Two phase locking has two phases: a growing phase, in which all locks are acquired by the transaction, and a shrinking phase, in which the locks held by the transaction are released.

A transaction that holds a shared (read) lock on an item and later needs to write it can upgrade the lock to an exclusive (write) lock; such upgrades are allowed only during the growing phase.

Strict Two Phase Locking

The first phase of Strict-2PL is the same as in 2PL. After acquiring all locks in the first phase, the transaction continues to execute normally. In contrast to 2PL, however, Strict-2PL does not release a lock as soon as it is no longer required; it holds all locks until the commit point and releases them all at once at commit.

[ Image: Strict Two Phase Locking ]

Strict-2PL does not suffer from cascading aborts as 2PL does.
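
Whether one transaction's lock activity follows 2PL or Strict-2PL can be checked from its event sequence. The sketch below is an illustration under an assumed representation: each event is a hypothetical `(kind, item)` tuple with kind `lock`, `unlock` or `commit`, and the function names are made up for this example.

```python
def obeys_2pl(events):
    """No lock may be acquired once the first lock has been released."""
    shrinking = False
    for kind, _ in events:
        if kind == "unlock":
            shrinking = True          # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False              # growing again after shrinking
    return True

def obeys_strict_2pl(events):
    """As described above: every lock is held until the commit point."""
    committed = False
    for kind, _ in events:
        if kind == "commit":
            committed = True
        elif kind == "unlock" and not committed:
            return False              # released a lock before committing
    return obeys_2pl(events)

plain = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B"), ("commit", None)]
strict = [("lock", "A"), ("lock", "B"), ("commit", None), ("unlock", "A"), ("unlock", "B")]
print(obeys_2pl(plain), obeys_strict_2pl(plain), obeys_strict_2pl(strict))
```

The first sequence is two-phase but releases locks before the commit, so it is 2PL without being Strict-2PL.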

Time stamp based protocols

A widely used concurrency control protocol is the time-stamp based protocol. This protocol uses either the system time or a logical counter as a time-stamp.

Lock based protocols manage the order between conflicting pairs among transactions at the time of execution, whereas time-stamp based protocols start working as soon as a transaction is created.

Every transaction has a time-stamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all other transactions that come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger, and priority may be given to the older one.

In addition, every data item is given the latest read and write time-stamp. This lets the system know when the last read and write operations were made on the data item.

Time-stamp ordering protocol

The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. It is the responsibility of the protocol that each conflicting pair of tasks is executed according to the timestamp values of the transactions.

o Time-stamp of transaction Ti is denoted as TS(Ti).
o Read time-stamp of data-item X is denoted by R-timestamp(X).
o Write time-stamp of data-item X is denoted by W-timestamp(X).

Timestamp ordering protocol works as follows:

o If a transaction Ti issues a read(X) operation:
    o If TS(Ti) < W-timestamp(X): operation rejected.
    o If TS(Ti) >= W-timestamp(X): operation executed.
    o All data-item timestamps updated.

o If a transaction Ti issues a write(X) operation:
    o If TS(Ti) < R-timestamp(X): operation rejected.
    o If TS(Ti) < W-timestamp(X): operation rejected and Ti rolled back.
    o Otherwise, operation executed.

Thomas' Write rule:

Under the basic protocol, if TS(Ti) < W-timestamp(X), the write operation is rejected and Ti is rolled back. The timestamp-ordering rules can be modified to make the schedule view serializable: instead of rolling Ti back, the 'write' operation itself is ignored.
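
The timestamp-ordering rules and Thomas' write rule can be sketched together. This is a minimal illustration under stated assumptions: the class name `TimestampOrdering` and its return strings are hypothetical, timestamps are positive integers, and updating R-timestamp(X) to the maximum of its old value and TS(Ti) on a successful read follows the usual textbook formulation, which the notes only summarise as "timestamps updated".

```python
class TimestampOrdering:
    """Tracks R-timestamp(X) and W-timestamp(X) and decides each read or write."""

    def __init__(self, thomas_write_rule=False):
        self.read_ts = {}    # item -> R-timestamp
        self.write_ts = {}   # item -> W-timestamp
        self.thomas_write_rule = thomas_write_rule

    def read(self, ts, item):
        if ts < self.write_ts.get(item, 0):
            return "rejected"   # a younger transaction already overwrote the item
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return "ok"

    def write(self, ts, item):
        if ts < self.read_ts.get(item, 0):
            return "rejected"   # a younger transaction already read the item
        if ts < self.write_ts.get(item, 0):
            # Obsolete write: skipped under Thomas' rule, otherwise rejected.
            return "ignored" if self.thomas_write_rule else "rejected"
        self.write_ts[item] = ts
        return "ok"

basic = TimestampOrdering()
print(basic.write(5, "X"), basic.write(3, "X"))      # second write is too old

thomas = TimestampOrdering(thomas_write_rule=True)
print(thomas.write(5, "X"), thomas.write(3, "X"))    # obsolete write is skipped
```

With the rule enabled, the older write is dropped without aborting its transaction, and W-timestamp(X) keeps the value set by the younger writer.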

DBMS Deadlock

In a multi-process system, deadlock is a situation that arises in a shared resource environment where a process waits indefinitely for a resource held by some other process, which in turn is waiting for a resource held by some other process.

For example, assume a set of transactions {T0, T1, T2, ..., Tn}. T0 needs a resource X to complete its task. Resource X is held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource Z, which is held by T0. Thus, all processes wait for each other to release resources. In this situation, none of the processes can finish its task. This situation is known as 'deadlock'.

Deadlock is not a good phenomenon for a healthy system. To keep the system deadlock free, a few methods can be used. In case the system is stuck because of deadlock, the transactions involved in the deadlock are rolled back and restarted.

Deadlock Prevention

To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations which transactions are about to execute. The DBMS inspects the operations and analyzes whether they can create a deadlock situation. If it finds that a deadlock situation might occur, then that transaction is never allowed to be executed.

There are deadlock prevention schemes that use the time-stamp ordering mechanism of transactions in order to pre-decide a deadlock situation.

Wait-Die Scheme:

In this scheme, if a transaction requests to lock a resource (data item) that is already held with a conflicting lock by some other transaction, one of two possibilities may occur:

o If TS(Ti) < TS(Tj), that is Ti, which is requesting a conflicting lock, is older than Tj, Ti is allowed to wait until the data-item is available.
o If TS(Ti) > TS(Tj), that is Ti is younger than Tj, Ti dies. Ti is restarted later with a random delay but with the same timestamp.

This scheme allows the older transaction to wait but kills the younger one.

Wound-Wait Scheme:

In this scheme, if a transaction requests to lock a resource (data item) that is already held with a conflicting lock by some other transaction, one of two possibilities may occur:

o If TS(Ti) < TS(Tj), that is Ti, which is requesting a conflicting lock, is older than Tj, Ti forces Tj to be rolled back, that is Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
o If TS(Ti) > TS(Tj), that is Ti is younger than Tj, Ti is forced to wait until the resource is available.

This scheme allows the younger transaction to wait, but when an older transaction requests an item held by a younger one, the older transaction forces the younger one to abort and release the item.

In both schemes, the transaction that entered the system later (the younger one) is the one that is aborted.
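
The two schemes differ only in what happens to the requester and the holder, which a pair of small functions makes explicit. The function names and return strings below are hypothetical; a smaller timestamp means an older transaction, as in the text.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits, a younger requester dies (is rolled back)."""
    return "requester waits" if ts_requester < ts_holder else "requester dies"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder, a younger requester waits."""
    return "holder is rolled back" if ts_requester < ts_holder else "requester waits"

# Transaction with timestamp 2 is older than the one with timestamp 4.
print(wait_die(2, 4), "|", wait_die(4, 2))
print(wound_wait(2, 4), "|", wound_wait(4, 2))
```

In every outcome that involves a rollback, the transaction rolled back is the one with the larger timestamp.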

Deadlock Avoidance

Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can be used to detect any deadlock situation in advance. Methods like the "wait-for graph" are available, but they suit systems where transactions are lightweight and hold few resource instances. In a bulky system, deadlock prevention techniques may work well.

Wait-for Graph

This is a simple method for tracking whether a deadlock situation may arise. For each transaction entering the system, a node is created. When transaction Ti requests a lock on an item, say X, that is held by some other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is dropped and Ti locks the data item.

The system maintains this wait-for graph for every transaction waiting for data items held by others, and keeps checking whether there is a cycle in the graph. A cycle means that the transactions on it are deadlocked.


[ Image: Wait-for Graph ]
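The bookkeeping described above can be sketched in Python as follows. This is only an illustration under simple assumptions: the class and method names are hypothetical, transactions are identified by plain strings, and the cycle check is an ordinary depth-first search rather than whatever a particular DBMS uses.

```python
from collections import defaultdict


class WaitForGraph:
    """Directed graph with an edge Ti -> Tj when Ti waits for a lock held by Tj."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_wait(self, waiter, holder):
        # Ti requested an item held by Tj: create the edge Ti -> Tj.
        self.edges[waiter].add(holder)

    def remove_wait(self, waiter, holder):
        # Tj released the item: drop the edge Ti -> Tj.
        self.edges[waiter].discard(holder)

    def has_cycle(self):
        """Depth-first search; reaching a node already on the current path is a cycle."""
        visiting, done = set(), set()

        def visit(node):
            if node in done:
                return False
            if node in visiting:
                return True
            visiting.add(node)
            for nxt in list(self.edges.get(node, ())):
                if visit(nxt):
                    return True
            visiting.discard(node)
            done.add(node)
            return False

        return any(visit(node) for node in list(self.edges))
```

Adding the waits T1 -> T2, T2 -> T3 and T3 -> T1 makes has_cycle() return True; removing any one of those edges makes it return False again.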

Two approaches can be used. The first is not to allow any request for an item that is already locked by some other transaction. This is not always feasible and may cause starvation, where a transaction waits indefinitely for a data item and can never acquire it. The second option is to roll back one of the transactions.

It is not always feasible to roll back the younger transaction, as it may be more important than the older one. With the help of some relative algorithm, a transaction is chosen to be aborted; this transaction is called the victim and the process is known as victim selection.

DBMS Data Backup

Failure with loss of Volatile storage

What would happen if volatile storage like RAM abruptly crashes? All transactions that are being executed are kept in main memory. All active logs, disk buffers and related data are stored in volatile storage.

When storage like RAM fails, it takes away all the logs and the active copy of the database. This makes recovery almost impossible, as everything needed to recover is also lost. The following techniques may be adopted in case of loss of volatile storage.

A mechanism like checkpoint can be adopted, which saves the entire content of the database periodically.

The state of the active database in volatile memory can be dumped onto stable storage periodically; the dump may also contain logs, active transactions and buffer blocks.

<dump> can be marked in the log file whenever the database contents are dumped from volatile memory to stable storage.


Recovery:

When the system recovers from failure, it can restore the latest dump.

It can maintain a redo-list and an undo-list, as in checkpoints.

It can recover the system by consulting the undo-redo lists to restore the state of all transactions up to the last checkpoint.

Database backup & recovery from catastrophic failure

So far we have not discovered any other planet in our solar system that may have life on it, and our own earth is not that safe. In case of a catastrophic failure like an alien attack, the database administrator may still be forced to recover the database.

Remote backup, described next, is one solution. Alternatively, whole database backups can be taken on magnetic tapes and stored at a safer place. Such a backup can later be restored on a freshly installed database to bring it at least to the state it was in at the point of backup.

Grown-up databases are too large to be backed up frequently. Instead, there are techniques by which a database can be restored just by looking at its logs, so backing up the logs at a frequent rate is more feasible than backing up the entire database. The database can be backed up once a week, and the logs, being very small, can be backed up every day or as frequently as every hour.

Remote Backup

Remote backup provides a sense of security and safety in case the primary location where the database is located gets destroyed. Remote backup can be offline, or real-time and online. If it is offline, it is maintained manually.

[ Image: Remote Data Backup ]

Online backup systems are more real-time and are lifesavers for database administrators and investors. An online backup system is a mechanism where every bit of real-time data is backed up simultaneously at two distant places. One of them is directly connected to the system and the other is kept at a remote place as a backup.

As soon as the primary database storage fails, the backup system senses the failure and switches the user system to the remote storage. Sometimes this is so instant that users do not even notice a failure.

DBMS Data Recovery


Crash Recovery

Though we are living in a highly technologically advanced era where hundreds of satellites monitor the earth and at every second billions of people are connected through information technology, failure is expected but not always acceptable.

A DBMS is a highly complex system with hundreds of transactions being executed every second. The availability of a DBMS depends on its complex architecture and the underlying hardware or system software. If it fails or crashes while transactions are being executed, the system is expected to follow some algorithm or technique to recover from the crash or failure.

Failure Classification

To see where the problem has occurred, we generalize failures into various categories, as follows:

Transaction failure

When a transaction fails to execute, or reaches a point after which it cannot be completed successfully, it has to abort. This is called transaction failure, where only a few transactions or processes are hurt.

Reasons for transaction failure could be:

Logical errors: where a transaction cannot complete because it has some code error or an internal error condition.

System errors: where the database system itself terminates an active transaction because the DBMS is not able to execute it, or it has to stop because of some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.

System crash

There are problems external to the system that may cause it to stop abruptly and crash, for example an interruption in the power supply or a failure of the underlying hardware or software.

Examples may include operating system errors.

Disk failure

In the early days of technology evolution, it was a common problem that hard disk drives or storage drives failed frequently.

Disk failures include the formation of bad sectors, unreachability of the disk, a disk head crash, or any other failure that destroys all or part of disk storage.


Storage Structure

We have already described the storage system. In brief, the storage structure can be divided into various categories:

Volatile storage: As the name suggests, this storage does not survive system crashes and is mostly placed very close to the CPU, often by embedding it onto the chipset itself. Examples: main memory, cache memory. It is fast but can store only a small amount of information.

Nonvolatile storage: These memories are made to survive system crashes. They are huge in data storage capacity but slower in accessibility. Examples include hard disks, magnetic tapes, flash memory, and non-volatile (battery backed-up) RAM.

Recovery and Atomicity

When a system crashes, it may have several transactions being executed and various files opened for them to modify data items. As we know, transactions are made of various operations, which are atomic in nature. But according to the ACID properties of a DBMS, the atomicity of a transaction as a whole must be maintained, that is, either all of its operations are executed or none.

When a DBMS recovers from a crash, it should maintain the following:

It should check the states of all transactions that were being executed.

A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.

It should check whether the transaction can be completed now or needs to be rolled back.

No transaction should be allowed to leave the DBMS in an inconsistent state.

There are two types of techniques that can help a DBMS in recovering as well as maintaining the atomicity of transactions:

Maintaining the logs of each transaction, and writing them onto some stable storage before actually modifying the database.

Maintaining shadow paging, where the changes are done on volatile memory and later the actual database is updated.

Log-Based Recovery

A log is a sequence of records that maintains the record of actions performed by a transaction. It is important that the logs are written prior to the actual modification and stored on stable storage media, which is failsafe.

Log-based recovery works as follows:

The log file is kept on stable storage media.

When a transaction enters the system and starts execution, it writes a log about it:

<Tn, Start>

When the transaction modifies an item X, it writes logs as follows:

<Tn, X, V1, V2>

This reads: Tn has changed the value of X from V1 to V2.

When the transaction finishes, it logs:

<Tn, commit>

Database can be modified using two approaches:

1. Deferred database modification: All logs are written to stable storage and the database is updated when the transaction commits.

2. Immediate database modification: Each log record is followed immediately by the actual database modification. That is, the database is modified right after every operation is logged.
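The difference between the two approaches shows up when a transaction stops before its commit record is written. The following Python sketch illustrates this under simplifying assumptions: the database is a plain dictionary, the log is a list of tuples mirroring the <Tn, Start>, <Tn, X, V1, V2> and <Tn, commit> records, and the function names and the crash_before_commit flag are hypothetical helpers introduced only for illustration.

```python
def run_deferred(database, tn, updates, crash_before_commit=False):
    """Deferred modification: log every write, touch the database only at commit."""
    log = [(tn, "Start")]
    pending = {}  # new values held back until the transaction commits
    for item, new_value in updates:
        old_value = pending.get(item, database[item])
        log.append((tn, item, old_value, new_value))
        pending[item] = new_value
    if crash_before_commit:
        return log  # database untouched, so there is nothing to undo
    log.append((tn, "commit"))
    database.update(pending)
    return log


def run_immediate(database, tn, updates, crash_before_commit=False):
    """Immediate modification: each log record is followed by the actual write."""
    log = [(tn, "Start")]
    for item, new_value in updates:
        log.append((tn, item, database[item], new_value))
        database[item] = new_value
    if crash_before_commit:
        return log  # database already changed; the old values in the log allow undo
    log.append((tn, "commit"))
    return log
```

With a database of {"X": 1} and the single update ("X", 2), a crash before commit leaves X at 1 under the deferred approach, while under the immediate approach X is already 2 and has to be restored from the old value stored in the log.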

Recovery with concurrent transactions

When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery, it would be hard for the recovery system to backtrack through all the logs and then start recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.

Checkpoint

Keeping and maintaining logs in real time and in a real environment may fill up all the memory space available in the system. As time passes, the log file may become too big to be handled at all. Checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.

Recovery

When a system with concurrent transactions crashes and recovers, it behaves in the following manner:


[ Image: Recovery with concurrent transactions ]

The recovery system reads the logs backwards from the end to the last checkpoint.

It maintains two lists, an undo-list and a redo-list.

If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list.

If the recovery system sees a log with <Tn, Start> but no commit or abort log, it puts the transaction in the undo-list.

All transactions in the undo-list are then undone and their logs are removed. For all transactions in the redo-list, their previous logs are removed, and they are then redone and their logs saved.
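The classification into undo-list and redo-list can be sketched in Python as below. This is a simplified illustration, not a complete recovery manager: the log is assumed to be a list of tuples, the CHECKPOINT marker record and all function names are hypothetical, and only the records written after the last checkpoint are undone or redone.

```python
CHECKPOINT = ("checkpoint",)  # hypothetical marker record, not taken from the notes


def tail_after_last_checkpoint(log):
    """Return the log records written after the most recent checkpoint."""
    for i in range(len(log) - 1, -1, -1):
        if log[i] == CHECKPOINT:
            return log[i + 1:]
    return list(log)


def build_recovery_lists(log):
    """Scan backwards to the last checkpoint and classify the transactions."""
    undo_list, redo_list, finished = [], [], set()
    for record in reversed(tail_after_last_checkpoint(log)):
        if len(record) != 2:
            continue  # update records (tn, item, old, new) do not affect the lists
        tn, action = record
        if action == "commit":
            redo_list.append(tn)  # committed: its changes must be redone
            finished.add(tn)
        elif action == "abort":
            finished.add(tn)      # already aborted: belongs to neither list
        elif action == "Start" and tn not in finished:
            undo_list.append(tn)  # started but never committed or aborted
    return undo_list, redo_list


def recover(database, log):
    """Undo uncommitted writes (newest first), then redo committed ones (oldest first)."""
    tail = tail_after_last_checkpoint(log)
    undo_list, redo_list = build_recovery_lists(log)
    for record in reversed(tail):
        if len(record) == 4 and record[0] in undo_list:
            _, item, old_value, _ = record
            database[item] = old_value
    for record in tail:
        if len(record) == 4 and record[0] in redo_list:
            _, item, _, new_value = record
            database[item] = new_value
    return undo_list, redo_list
```

Because the scan runs backwards, a transaction's commit record is seen before its start record, which is why committed transactions are remembered in a set before the start records are examined.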

