
Distributed Database System


Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Information Technology

Chin Narong, 2008


INTRODUCTION: VISION & OBJECTIVES OF RUPP

The Royal University of Phnom Penh established its Master of Science in Information Technology degree in 2003. The initial aim of the degree was to build the knowledge and expertise of lecturers within the Department of Computer Science, to ensure the highest level of teaching at RUPP. The first generation of students consisted of fifteen lecturers from the Department of Computer Science and 28 candidates from other institutions. This first group graduated in August 2006. Presently, the Master of Science in Information Technology course boasts 54 students.

Graduates of the M.Sc. (Information Technology) have the knowledge, understanding, skills and attitude required to perform at an advanced level within both the public and private sectors. They are fully capable of assisting in the development of I.T. in the region and the world.

Teaching Methods

The course is taught through a combination of lectures, seminars, tutorials, workshops, practical modules and computer laboratory sessions. This variety of learning environments helps prepare students for professional situations outside of their degrees.

Objective

The Master of Science in Information Technology (M.Sc. (Information Technology)) degree aims to provide knowledge and understanding of modern information technology systems, in order to prepare students for practical careers within the Information Technology sector. To this end, the curriculum focuses on building understanding in the fields of computer networking, internet programming, software systems engineering, project management and high-performance computer systems.

Assessment

Students performing at a high level during their degree are selected to research and write a Master's thesis during their final year. In the first intake of students, six of the 48 students submitted a thesis. Those who are not selected to submit a thesis attend classes and sit examinations to attain the equivalent value of credits.


In accordance with the assessment described above, this report was prepared for the Department of Information Technology of the Royal University of Phnom Penh. The project was advised by Mr. Ouk Chhieng, Dean of the Department of Computer Science and Coordinator of the Royal University of Phnom Penh.

This report is in the public domain. Authorization to reproduce it in whole or in part is granted. Permission to reprint this publication is not necessary.

To order copies of this report, students should:

• Write to Dr. Ouk Chhieng, Coordinator of the Royal University of Phnom Penh, Room #101, Campus I, or call (855) 12 754-344; or
• Write to Mr. Chin Narong, the owner of this study report, at [email protected]; [email protected]

This report is not available on the RUPP website at http://rupp.edu.kh.

Upon request, this report is available in alternate formats. For more information, please contact the IT Center, RUPP, at 855-23-881-285.


Acknowledgements

The writer of this study report would like to express his gratitude to the university, its staff and its dean, and especially to Mr. Ouk Chhieng, who advised the preparation of this report. Their contributions to the study, such as assessments, observations and surveys in the field of Distributed Database Systems, are deeply appreciated. On behalf of the writer, I would like to say thanks for their generosity of time and spirit.

Finally, I would like to thank my friends, who gave their recommendations and some resources to support my research.


Abbreviations:

DBMS : Database Management System
DDBMS : Distributed Database Management System
MDBS : Multi-Database System
GCS : Global Conceptual Schema
DDTS : Distributed Defect Tracking System
API : Application Programming Interface
DHF : Derived Horizontal Fragmentation
PHF : Primary Horizontal Fragmentation
VF : Vertical Fragmentation
RS : Read Set
WS : Write Set
BS : Base Set
TM : Transaction Management
DCC : Distributed Concurrency Control
CC : Concurrency Control
TO : Timestamp Ordering
2PL : Two-Phase Locking
WFG : Wait-For Graph
GWFG : Global Wait-For Graph
DPS : Distributed Processing System
DPR : Distributed Program Reliability
DSR : Distributed System Reliability
FSTs : File Spanning Trees


Contents

I/ INTRODUCTION
    1 Objective of the Study
    2 Significance of the Study
    3 Layout of the Study

II/ REVIEW OF LITERATURE

Chapter 1: Introduction
    Course Outline
    What is a Distributed Database System?
    Implicit Assumptions
    Motivation
    Distributed Computing
    What is Distributed?
    What is a Site?
    DDBS Environment
    Distributed Database Graphic
    What is not a Distributed Database System?
    Shared-Memory Multiprocessor
    Centralized DBMS on a Network
    Why Distribute a Database?
    Advantages of DDBMSs
    Disadvantages of DDBMSs
    Applications
    Issues with DDBMS
    Distributed Transaction Management
    Data Fragmentation
    Fragmentation Independence
    DBMS Independence
    Operating System Independence
    Hardware Independence
    Who Should Provide Transparency
    Complexity
    Cost
    Distribution Control
    Security
    Distributed Database Design
    Distributed Query Processing
    Distributed Directory Management
    Distributed Concurrency Control
    Distributed Deadlock Management
    Reliability of Distributed Databases
    Operating System Support
    Heterogeneous Databases

Chapter 2: Distributed DBMS Architecture
    1/ Objective
    2/ Types of DDBMS Architecture
    3/ Distribution
    4/ Data Processor
    5/ Heterogeneity
    6/ Architectural Alternatives
    7/ Implementation Alternatives
    8/ Multi-DBS Architecture

Chapter 3: Distributed Database Design
    1/ Design Problem
    2/ Alternative Design Strategies
    3/ Horizontal Fragmentation
    4/ Primary Horizontal Fragmentation
    5/ Derived Horizontal Fragmentation
    6/ Minterm Fragments
    7/ Vertical Fragmentation
    8/ Hybrid Fragmentation
    9/ Allocation Alternatives
    10/ Allocation Problem
    11/ Reasons for Replication
    12/ Rule of Thumb

Chapter 4: Transaction Management
    1/ Definition of Transaction
        Unit of Computing
    2/ Database Consistency
    3/ Transaction Consistency
    4/ Replica Consistency
    5/ Reliability
    6/ Flat Transactions
    7/ Nested Transactions
    8/ Characterization of Transactions
    9/ Properties of Transactions
    10/ Transaction Manager
    11/ Scheduler
    12/ Local Recovery Manager

Chapter 5: Distributed Concurrency Control (DCC)
    1/ CC in Distributed DBMS
    2/ Key Issue
    3/ Serializability Theory
    4/ Taxonomy
    5/ Locking-Based CC Algorithms
    6/ Deadlock
    7/ Why Deadlocks
    8/ Methods
    9/ Methodology
    10/ Detection
    11/ Key Issues
        Solution

Chapter 6: Distributed Reliability
    1/ Fundamental Definitions
    2/ Distributed Reliability Protocols
    3/ Two-Phase Commit Protocol
    4/ State Transitions in 2PC
    5/ Three-Phase Commit
    6/ Quorum Protocols for Replicated Databases
    7/ Network Partitioning
    8/ Open Problems
    9/ In-Place Update Recovery Information
    10/ Out-of-Place Update Recovery Information
    11/ Execution Strategies
    12/ Checkpoints

III/ APPLIED METHOD ON ORACLE
    1/ Oracle Database Architecture
        Memory Components
    2/ Tasks of an Oracle Database Administrator
    3/ Database Planning
    4/ Oracle Management Framework
        4.1/ Startup Command
        4.2/ Shutdown Command
        4.3/ How Table Data is Stored
        4.4/ Automatic Storage Management
    5/ Database Concurrency
        5.1/ PL/SQL
        5.2/ Locks
    6/ Database Reliability
        6.1/ Principle of Least Privilege
        6.2/ Applying the Principle of Least Privilege
        6.3/ Monitoring for Suspicious Activity
        6.4/ Backup and Recovery
    7/ Database Efficiency
        7.1/ Listener
    8/ Database Performance

IV/ SUMMARY
V/ APPLY
VI/ CONCLUSION
VII/ REFERENCES


I/ INTRODUCTION

In earlier human generations, most knowledge management was based on documentation, written so that the next generation of learners could study and improve fields such as management, business technology, troubleshooting, architecture, law, regulation and so on. Since the introduction of cutting-edge computer science technology, improvements in business management, as well as in other fields, have helped to create new things that support human demands. Databases play the role of data keeper to support the business in terms of:

- Cost
- Time
- Accountability
- Effectiveness; and
- Transparency

First of all, I would like to let you know:

+ What is a distributed database?

The simple answer from www.webopedia.com is that it is a database consisting of two or more data files located at different sites on a computer network. Because the database is distributed, different users can access it without interfering with one another. However, the DBMS must periodically synchronize the scattered databases to make sure that they all have consistent data.

A database that is under the control of a central database management system (DBMS), but whose storage devices are not all attached to a common processor, is also called a distributed database; it may be stored on multiple computers located in the same physical location or dispersed over a network of interconnected computers. For example, the collections of data in a database can be distributed across multiple physical locations (partitions/fragments), and each partition of a distributed database may be replicated. Besides replication and fragmentation, there are many other distributed database design technologies, for example local autonomy and synchronous and asynchronous distributed database technologies. The implementation of these technologies can and does depend on the needs of the business and on the sensitivity/confidentiality of the data to be stored in the database, and hence on the price the business is willing to pay to ensure data security, consistency and integrity.

+ Basic architecture

Users access the distributed database through:

- Local applications: applications which do not require data from other sites.
- Global applications: applications which do require data from other sites.

+ Important considerations

Care must be taken with a distributed database to ensure the following:

- The distribution is transparent: users must be able to interact with the system as if it were one logical system. This applies to the system's performance and methods of access, among other things.


- Transactions are transparent: each transaction must maintain database integrity across the multiple databases. Transactions must also be divided into subtransactions, each subtransaction affecting one database system.

+ Advantages of distributed databases

- Reflects organizational structure: database fragments are located in the departments they relate to.
- Local autonomy: a department can control its own data (as it is most familiar with it).
- Improved availability: a fault in one database system will only affect one fragment, instead of the entire database.
- Improved performance: data is located near the site of greatest demand, and the database systems themselves are parallelized, allowing load on the databases to be balanced among servers. (A high load on one module of the database won't affect other modules of the database in a distributed database.)
- Economics: it costs less to create a network of smaller computers with the power of a single large computer.
- Modularity: systems can be modified, added and removed from the distributed database without affecting other modules (systems).
- Fault tolerance: the ability of a computer system or component to respond gracefully when a component fails, so that a backup component or procedure can immediately take its place with no loss of service. Fault tolerance can be provided in software, embedded in hardware, or provided by some combination of the two.

+ Disadvantages of distributed databases

- Complexity: extra work must be done by the DBAs to ensure that the distributed nature of the system is transparent. Extra work must also be done to maintain multiple disparate systems, instead of one big one. Extra database design work must also be done to account for the disconnected nature of the database; for example, joins become prohibitively expensive when performed across multiple systems.
- Economics: increased complexity and a more extensive infrastructure mean extra labour costs.
- Security: remote database fragments must be secured, and since they are not centralized the remote sites must be secured as well. The infrastructure must also be secured (e.g., by encrypting the network links between remote sites).
- Difficult to maintain integrity: in a distributed database, enforcing integrity over a network may require too much of the network's resources to be feasible.
- Inexperience: distributed databases are difficult to work with, and as a young field there is not much readily available experience on proper practice.
- Lack of standards: there are no tools or methodologies yet to help users convert a centralized DBMS into a distributed DBMS.
- Database design more complex: besides the normal difficulties, the design of a distributed database has to consider fragmentation of data, allocation of fragments to specific sites and data replication.

1 OBJECTIVE OF THE STUDY

My intention in this afterword is twofold. I wish to express my own knowledge, and also to identify some issues which I chose to omit from the body of the paper. The body of this document began as a brief account of why distributed database systems are more popular than previous databases. It was triggered by a number of events happening in close proximity. Preparing a paper on a qualitative evaluation led me to think about the sources of responsiveness and the architecture of database systems. I also volunteered to provide some documentation for coursework masters dissertations using Oracle 10g Database Administration Workshop I, to accompany a similar document for other courses which I had taken. This particular document originated as a document for people who wish to learn how to administer a database system. That was the urgent priority at the time. After completing it, I realized it was suitable for coursework masters dissertations too. By then it had become larger than intended, but perusing it persuaded me that the length was justified by the topic. I would, however, like to apologize to those readers who want more: this paper cannot be larger than this, due to my limited time. Even so, I had some very encouraging responses from outside people and other educational institutions. So here it is.

My experience suggests to me that the changing of computer technology requires a non-positivist approach. This was confirmed by my reading. It appears that many academics who find themselves in the role of change agents are led eventually towards a more flexible approach to their technical service. However, while in sympathy with the actual processes they used in field settings, I thought their supporting arguments were sometimes inadequate. Constructivism provides one example. The positivist view, or so it seems to me, depends upon reality being directly knowable. Many advisors oppose this with the view that our theories and language inevitably colour what we see. It seems apparent to me that my mental frameworks colour what I would like to describe. I was encouraged to find such views expressed in the literature. However, in this afterword let me try to make my own views clearer than I chose to in the body of this document. It seems to me that, to judge a good database paradigm, it is reasonable to take into account the purpose of choosing the right database, through my experience analysing in-source database development, such as:

• To introduce the important concepts, algorithms and techniques in the design of high-performance distributed database systems (DDBS, also called DDBMS).

• What are the differences between DDBS and DDB applications? (Database Systems vs. Distributed Database Systems)

An important purpose of the course is to introduce WHAT will happen after you have submitted your program (transaction) to a distributed database system for execution, and HOW the system meets the performance requirements (and what those performance requirements are).

We hope that the concepts, techniques and algorithms covered in this course will be useful to you:

• when you develop database applications (as an application programmer) with a distributed database system; and
• when you design a new distributed database system (as a database system designer).

2 SIGNIFICANCE OF THE STUDY

The database administrator and project manager should indicate and defend why it is necessary to undertake a feasibility study of the system requirements, and should explain the benefits that will result from it and from developing a proper database and application.

Page 12: Distributed Database System

4

3 Layout of the Study

This project paper consists of five chapters, as follows:

Chapter 1 - Introduction
Chapter 2 - Review of Literature
Chapter 3 - Research Methodology
Chapter 4 - Data Analysis
Chapter 5 - Conclusion and Recommendation

II/ REVIEW OF LITERATURE

Chapter 1: Introduction

Course Outline
• What is a distributed DBMS?
• Problems
• Current state of affairs
• Background
• Distributed DBMS Architecture
• Distributed Database Design
• Semantic Data Control
• Distributed Query Processing
• Distributed Transaction Management
• Parallel Database Systems
• Distributed Object DBMS
• Database Interoperability
• Current Issues

What is a Distributed Database System?
• A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
• A distributed database management system (D-DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

Distributed database system (DDBS) = DDB + D-DBMS

[Figure: A distributed database system. Several sites, each running its own DBMS over its own data, are interconnected by a network.]


[Figure: Database processing at one of the sites in a distributed DBMS environment.]

Implicit Assumptions
• Data are stored at a number of sites; each site logically consists of a single processor.
• Processors at different sites are interconnected by a computer network (not multiprocessor parallel database systems).
• A distributed database is a database, not a collection of files; the data are logically related, as exhibited in the users' access patterns (relational data model).
• A D-DBMS is a full-fledged DBMS, not a remote file system or a TP system.

Motivation
• Decentralization
• Disaster recovery and backup manipulation
• Technology integration
• Data replication


Distributed Computing
• A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.
• Synonymous terms: distributed function, distributed data processing, multiprocessors/multicomputers, satellite processing, backend processing, dedicated/special-purpose computers, timeshared systems, functionally modular systems.

What is Distributed?
• Processing logic: the definition of distributed computing implies that the processing elements, or processing logic, are distributed.
• Function: various functions of a computer system may be delegated to various pieces of hardware or software.
• Data: data used by a number of applications may be distributed to a number of processing sites.
• Control: the control of the execution of various tasks may be distributed instead of being performed by one computer system.

What is a Site?
Each site participating in a distributed database system has its own local real databases, its own local users, its own local DBMS and transaction management software (including its own local locking, logging, recovery, etc.), and its own local data communication manager.

DDBS Environment


Distributed Database Graphic

What is not a Distributed Database System?
• A timesharing computer system
• A tightly coupled (shared-memory) or loosely coupled (shared-disk) multiprocessor system
• A database system which resides at one of the nodes of a network of computers; this is a centralized database on a network node

Shared-Memory Multiprocessor

[Figure: A shared-memory multiprocessor, in which several processor units share a common memory and I/O system.]


Centralized DBMS on a Network

Why Distribute a Database?
• Organizational: geographic distribution (a multinational company with distributed departments, e.g. IBM)
• Interconnection of existing databases (merger of two companies, e.g. Singtel & Optus)
• Incremental growth/scalability (addition of new branches or warehouses)
• Reduced communication overhead
• Reliability and availability
• Performance considerations (parallelism)
• Local autonomy

Advantages of DDBMSs
• Reflects organizational structure
• Improved shareability and local autonomy
• Improved availability
• Improved reliability
• Improved performance
• Economics

Disadvantages of DDBMSs
• Complexity
• Cost
• Security
• Integrity control more difficult
• Lack of standards
• Lack of experience
• Database design more complex

Applications
• Manufacturing, especially multi-plant manufacturing
• Military command and control
• Corporate MIS
• Airlines
• Hotel chains
• Payment systems
• Any organization which has a decentralized organization structure


Issues with DDBMS
• Fragmentation: a relation may be divided into a number of sub-relations, which are then distributed.
  o Horizontal: a subset of rows
  o Vertical: a subset of columns (each fragment must contain the primary key; other columns can be replicated)
  o Mixed: both horizontal and vertical
  o It must be possible to reconstruct the original table.
  o Queries and updates can be made through fragments.
  o The fragmentation strategy should achieve: locality of reference; improved reliability and availability; improved performance; balanced storage capacities and costs; and minimal communication costs.
  o Correctness of fragmentation (see the SQL sketch after this list):
    Completeness
    ➠ Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri.
    Reconstruction
    ➠ If relation R is decomposed into fragments R1, R2, ..., Rn, then there should exist some relational operator ∇ such that R = ∇1≤i≤n Ri.
    Disjointness
    ➠ If relation R is decomposed into fragments R1, R2, ..., Rn, and data item di is in Rj, then di should not be in any other fragment Rk (k ≠ j).

• Allocation: each fragment is stored at the site with "optimal" distribution.
• Replication: a copy of a fragment may be maintained at several sites.
  o Storing data at multiple sites. Example: an Internet grocer with multiple warehouses.

    CUSTOMER (Cust#, Addr, Location)

    Customer information is kept at a central location; Location is the warehouse that makes deliveries.
  o Where do we store the tables? Fragment? Replicate? (A sketch of one possible design follows.)
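As a concrete illustration of the correctness rules and of the fragment/replicate decision above, here is a minimal SQL sketch. The EMP relation, the predicate on TITLE, and the warehouse names are assumptions introduced for illustration only.

-- Primary horizontal fragmentation of an assumed EMP(ENO, ENAME, TITLE)
-- relation into two fragments with complementary predicates (TITLE is
-- assumed NOT NULL, so completeness and disjointness hold by construction):
CREATE TABLE EMP1 AS SELECT * FROM EMP WHERE TITLE =  'Engineer';  -- site 1
CREATE TABLE EMP2 AS SELECT * FROM EMP WHERE TITLE <> 'Engineer';  -- site 2

-- Reconstruction: for horizontal fragments the operator ∇ is union.
SELECT * FROM EMP1
UNION ALL
SELECT * FROM EMP2;  -- yields the original EMP

-- Allocation/replication for the grocer example: fragment CUSTOMER by the
-- warehouse that delivers, and keep each fragment at that warehouse.
CREATE TABLE CUSTOMER_WH1 AS
  SELECT * FROM CUSTOMER WHERE Location = 'WH1';  -- stored at warehouse 1
CREATE TABLE CUSTOMER_WH2 AS
  SELECT * FROM CUSTOMER WHERE Location = 'WH2';  -- stored at warehouse 2
-- A fully replicated copy could also be kept at the central site for order
-- entry, at the cost of keeping the replicas synchronized.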


Distributed Transaction Management
• There are two major aspects to transaction management: recovery control and concurrency control.
• Both require an extended treatment in distributed systems.
• In a distributed system, a single transaction can involve the execution of code at many sites. It can involve updates at many sites.
• Each transaction is said to consist of several agents, where an agent is the process performed on behalf of a given transaction at a given site.
• The system needs to know when those agents are part of the same transaction. For instance, two agents that are part of the same transaction must obviously not be allowed to deadlock with each other.
• In order to ensure that a given transaction is atomic (all or nothing) in a distributed environment, the system must ensure that the set of agents for that transaction either all commit in unison or all roll back in unison.
• This effect can be achieved by means of the two-phase commit protocol.
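As a sketch of what this looks like at the SQL level, consider a transfer whose two agents update accounts at two sites. The ACCOUNT table and the SITE2 database link are hypothetical, and the syntax assumes Oracle-style database links (Oracle, which this report applies later, runs two-phase commit transparently for such transactions):

-- A distributed transaction with one local and one remote agent.
-- The single COMMIT below is executed internally as a two-phase commit:
-- every participating site first prepares (votes to commit), and only if
-- all vote yes do all sites commit in unison; otherwise all roll back.
UPDATE ACCOUNT       SET BAL = BAL - 100 WHERE ACCNO = 1;  -- local agent
UPDATE ACCOUNT@SITE2 SET BAL = BAL + 100 WHERE ACCNO = 2;  -- remote agent
COMMIT;  -- atomic across both sites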

Before we go on to the next step, here are a few transparency requirements that motivate the use of a distributed database system.

Network Transparency
• In a distributed database management environment, the network needs to be shielded in the same manner that data is shielded in a centralized DBMS.
• Preferably, the user should be protected from the operational details of the network. Furthermore, it is desirable to hide even the existence of the network.
• Then there would be no difference between database applications running on a centralized database and those running on a distributed database.

Distribution Transparency
• Location transparency: the command used to perform a task is independent of both the location of the data and the system on which the operation is carried out.
• Naming transparency: a unique name is provided for each object in the database. In the absence of naming transparency, users are required to embed the location name as part of the object name.
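A minimal sketch of the difference, assuming Oracle-style database links and a hypothetical NEWYORK site (all names are illustrative only):

-- Without naming/location transparency, the location is embedded in the
-- object name that users must write:
SELECT ENAME FROM EMP@NEWYORK WHERE TITLE = 'Engineer';

-- A synonym hides the location: the query is now written as if the data
-- were local, and the data can later move to another site by redefining
-- the synonym, without changing any application code.
CREATE SYNONYM EMP FOR EMP@NEWYORK;
SELECT ENAME FROM EMP WHERE TITLE = 'Engineer';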


Location Independence
• Users should not have to know where data is physically stored, but rather should be able to behave, at least from a logical standpoint, as if the data were all stored at their own local site.
• Location independence is desirable as it simplifies user programs and terminal activities.
• It allows data to migrate from one site to another without invalidating any of those programs or activities. Such migratability is desirable because it allows data to be moved around the network in response to changing performance requirements.
• Location independence is just an extension, to the distributed case, of the familiar concept of physical data independence.

Data Replication
• For performance, reliability and availability reasons, it is usually desirable to be able to distribute data in a replicated fashion across the machines in a network.
• Such replication helps performance, since diverse and conflicting user requirements can more easily be accommodated.
• Data that is commonly accessed by one user can be placed on that user's local machine as well as on the machine of another user with the same access requirements.
• This increases the locality of reference. Further, if one machine fails, a copy of the same data is still available on another machine on the network.

Replication Transparency
• Assuming that data is replicated, the transparency issue that needs to be addressed is whether the users should be aware of the existence of copies, or whether the system should handle the management of copies so that the user can act as if there were a single copy of the data.
• From the user's perspective the answer is obvious; from the system's perspective it is not that simple. It is not the system but the user application that decides whether or not to have copies and how many copies to have.
• It is desirable that replication transparency be provided as a standard feature of the DBMS. Distributing these replicas across a network in a transparent manner is the domain of network transparency.

Data Fragmentation
• In a distributed database environment it is commonly desirable to divide each database relation into smaller fragments and treat each fragment as a separate database object.
• This is commonly done for reasons of performance, reliability and availability.
• Furthermore, fragmentation reduces the negative effects of replication. Each replica is not the full relation but only a subset of it; thus less space is required and fewer data items need to be managed.


• There are two general types of fragmentation alternatives (see the sketch below):
  o Horizontal fragmentation: a relation is partitioned into a set of sub-relations, each of which has a subset of the tuples (rows) of the original relation.
  o Vertical fragmentation: each sub-relation is defined on a subset of the attributes (columns) of the original relation.
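A minimal sketch of vertical fragmentation, reusing the assumed EMP(ENO, ENAME, TITLE) relation from the earlier example; the key is repeated in every fragment so that the original relation can be rebuilt:

-- Two vertical fragments, each carrying the key ENO:
CREATE TABLE EMP_V1 AS SELECT ENO, ENAME FROM EMP;  -- site 1
CREATE TABLE EMP_V2 AS SELECT ENO, TITLE FROM EMP;  -- site 2

-- Reconstruction: for vertical fragments the operator ∇ is a join on the key.
SELECT V1.ENO, V1.ENAME, V2.TITLE
FROM   EMP_V1 V1, EMP_V2 V2
WHERE  V1.ENO = V2.ENO;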

Fragmentation Independence
• When database objects are fragmented, user queries that were specified on the entire relation now have to be dealt with on the sub-relations.
• Typically this requires a translation from what is called a global query into several fragment queries, as sketched below.
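A sketch of this translation, using the horizontal fragments assumed earlier (EMP1 holding the 'Engineer' rows, EMP2 holding the rest):

-- Global query, written as if EMP were not fragmented:
SELECT ENAME FROM EMP WHERE TITLE = 'Engineer';

-- Translated fragment queries: the union of the per-fragment queries.
-- Because EMP2's defining predicate (TITLE <> 'Engineer') contradicts the
-- query predicate, the second branch can be eliminated before execution,
-- so only the fragment at site 1 is ever touched.
SELECT ENAME FROM EMP1 WHERE TITLE = 'Engineer'
UNION ALL
SELECT ENAME FROM EMP2 WHERE TITLE = 'Engineer';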

DBMS Independence
• Under this heading, all that is really needed is that the DBMS instances at different sites all support the same interface; they do not necessarily all have to be copies of the same DBMS software.
• If Ingres and Oracle both supported the official SQL standard, then it might be possible to get an Ingres site and an Oracle site to talk to each other in the context of a distributed system.
• Support for heterogeneity is definitely desirable. Real-world computer installations typically run not only many different machines and many different operating systems; they very often run different DBMSs as well, and it would be nice if those different DBMSs could all participate somehow in a distributed system.
• In other words, an ideal distributed system should provide DBMS independence.

Operating System Independence
This objective is partly a corollary of the previous one and is obviously desirable: not only to run the same DBMS on different hardware platforms, but also to be able to run it on different operating system platforms (including different operating systems on the same hardware) and have an MVS version, a UNIX version and an NT version all participate in the same distributed system.

Hardware Independence
Real-world computer installations typically involve a multiplicity of different machines: IBM machines, ICL machines, HP machines, PCs and workstations of various kinds, and so on. There is a real need to be able to integrate the data on all of those systems and present the user with a single system image. Thus it is desirable to run the same DBMS on different hardware platforms and, furthermore, to have all those machines participate as equal partners in the distributed system.

Who Should Provide Transparency
The first mode of operation below is the most common method today:
• The responsibility of providing transparent access can be left with the access layer. The transparency features can be built into the user language, which then translates the required services into the required operations. In other words, the compiler or interpreter takes over the task and frees the user from these concerns.


• The second layer at which transparency can be provided is the operating system level. State-of-the-art operating systems provide some level of transparency to system users. Providing transparent access to resources at the operating system level can obviously be extended to the distributed environment, where the management of the network resources is taken over by the distributed operating system.
• The third layer at which transparency can be supported is within the DBMS. It is the responsibility of the DBMS to make all the necessary translations from the operating system to the higher-level user interface.

Complexity
• Distributed DBMS problems are inherently more complex than centralized database management ones, as they include not only the problems found in a centralized environment but also a new set of unresolved problems.

Cost
• Distributed systems require additional hardware (communication mechanisms, etc.) and thus have increased hardware costs. However, the trend towards decreasing hardware costs means this is not a significant factor.
• The most important cost component is the replication of effort (manpower), which usually results in an increase in personnel in the data processing operations.
• Therefore, the trade-off between increased profitability, due to more efficient and timely use of information, and increased personnel costs has to be analyzed carefully.

Distribution Control
• Distribution was stated previously as an advantage of DDBSs. Unfortunately, distribution also creates problems of synchronization and coordination, which makes it a disadvantage as well.
• Distributed control can therefore easily become a liability if care is not taken to adopt policies to deal with these issues.

Security
• One of the major benefits of centralized databases has been the control they provide over access to data. Security can easily be controlled in a central location, with the DBMS enforcing the rules.
• However, in a distributed database system a network is involved, which is a medium that has its own security requirements. It is well known that there are serious problems in maintaining adequate security over computer networks.
• Thus the security problems in distributed database systems are by nature more complicated than in centralized ones.


The main distributed database system issues are the following:

Distributed Database Design
• How should the database, and the applications that run against it, be placed across the sites? There are two basic alternatives for placing data: partitioned and replicated.
  o In the partitioned scheme the database is divided into a number of disjoint partitions, each of which is placed at a different site.
  o Replicated designs can be fully replicated (fully duplicated), where the entire database is stored at each site, or partially replicated, where each partition of the database is stored at more than one site, but not at all the sites.
• The research in this area mostly involves mathematical programming, in order to minimize the cost of storing the database, processing transactions against it, and communication.

Distributed Query Processing
Query processing deals with designing algorithms that analyze queries and convert them into a series of data manipulation operations. The problem is how to decide on a strategy for executing each query over the network in the most cost-effective way (an example follows).
  o The factors to be considered are the distribution of data, communication costs, and the lack of sufficient locally available information.
  o The objective is to optimize so that the inherent parallelism is used to improve the performance of executing the transactions.
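As an illustration of strategy choice, consider a global query that joins data stored at two different sites; the relations and sites are the running assumptions from the examples above:

-- Global query: which employees are paid which salary?
-- Assume EMP is stored at site 1 and PAY at site 2.
SELECT EMP.ENAME, PAY.SAL
FROM   EMP, PAY
WHERE  EMP.TITLE = PAY.TITLE;

-- Possible execution strategies, with very different communication costs:
--   1. Ship all of PAY to site 1 and join there (cheap if PAY is small).
--   2. Ship all of EMP to site 2 and join there (cheap if EMP is small).
--   3. Ship only the columns actually needed for the join, then send the
--      result to the site where the answer is wanted.
-- The distributed query processor must choose among such plans using the
-- data distribution and the communication costs.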

Distributed Directory Management
• A directory contains information (descriptions and locations) about the data items in the database. The problems related to directory management are similar in nature to the database placement problem.
• A directory may be global to the entire DDBS or local to each site; it can be centralized at one site or distributed over several sites; there can be a single copy or multiple copies; and so on.


Distributed Concurrency Control
• Concurrency control involves the synchronization of access to the distributed database, such that the integrity of the database is maintained.
• The concurrency control problem in a distributed context is somewhat different than in a centralized framework.
• One not only has to worry about the integrity of a single database but also about the consistency of multiple copies of the database.
• There are two independent approaches to concurrency-related problems: pessimistic (synchronizing user requests before execution starts) and optimistic (executing the requests and then checking whether the execution has compromised the consistency of the database).
• The two basic mechanisms are locking, which is based on mutual exclusion of access to data items, and timestamping, where the transactions are executed in some predetermined order.
• There are variations of these schemes, as well as hybrid algorithms that attempt to combine the two basic mechanisms.

Distributed Deadlock Management
• The competition among users for access to a set of resources (data) can result in deadlocks if the synchronization mechanism is based on locking; the sketch below shows how.
• The well-known alternatives of prevention, avoidance, and detection/recovery also apply to distributed databases.
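A schematic of a distributed deadlock under locking; the ACCOUNT table, the row keys and the two sites are hypothetical, and SELECT ... FOR UPDATE is used as the row-locking primitive:

-- Transaction T1 (at site A):
SELECT * FROM ACCOUNT WHERE ACCNO = 1 FOR UPDATE;  -- T1 locks row 1
-- Meanwhile transaction T2 (at site B) runs:
--   SELECT * FROM ACCOUNT WHERE ACCNO = 2 FOR UPDATE;   (T2 locks row 2)
SELECT * FROM ACCOUNT WHERE ACCNO = 2 FOR UPDATE;  -- T1 blocks, waiting for T2
-- T2 now requests row 1 and blocks, waiting for T1. The global wait-for
-- graph (GWFG) contains the cycle T1 -> T2 -> T1; since no single site sees
-- the whole cycle locally, a distributed detector must assemble the GWFG
-- and break the cycle by aborting one of the two transactions.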

Reliability of Distributed Databases
• It is important that mechanisms be provided to ensure the consistency of the database as well as to detect failures and recover from them.
• The implication for DDBSs is that when a failure occurs and various sites become either inoperable or inaccessible, the databases at the operational sites remain consistent and up to date.
• Furthermore, when the computer systems or network recover from the failure, the DDBS should be able to recover and bring the databases at the failed sites up to date.
• This may be especially difficult in the case of network partitioning, where the sites are divided into two or more groups with no communication among them.


The following issues are also related to distributed database systems:

Operating System Support
• Current implementations of distributed database systems on top of conventional operating systems suffer from a performance bottleneck: the support provided by operating systems for database operations does not correspond properly to the requirements of the database management software.
• The major operating-system-related problems in single-processor systems are memory management, file systems and access methods, crash recovery and process management.
• In distributed environments there is the additional problem of having to deal with multiple layers of network software.
• The work in this area is on finding solutions to the dichotomy of providing adequate and simple support for distributed database operations, while also providing general operating system support for other applications.

Heterogeneous Databases
• When there is no homogeneity among the databases at the various sites, either in terms of the way data is logically structured (data model) or in terms of the mechanisms provided for accessing it (data language), it becomes necessary to provide a translation mechanism between the database systems.
• This translation mechanism usually involves a canonical form to facilitate data translation, as well as program templates for translating data manipulation instructions.


Chapter 2: Distributed DBMS Architecture

1/ Objective
• To structure the distributed DBMS such that it provides the requisite functionality.
• To understand clearly the levels of transparency.
• To study the design space for distributed DBMSs and the implementation details.
• To identify alternatives and give examples.
• To segregate the participating components and to discuss their optimal utilization.

2/ Types of DDBMS Architecture
Basically, there exist three reference architectures for distributed DBMSs:
• Client/server systems
• Peer-to-peer distributed systems
• Multi-database systems

Regarding DBMS standardization:

First Attempts
In 1972, the Computer & Information Processing Committee of the American National Standards Institute (ANSI) established a study group on DBMS under the auspices of its Standards Planning and Requirements Committee (SPARC).

The Mission
• To study the feasibility of setting up standards in the area of database management systems.
• To determine all areas that can be standardized, if feasible.

Proposal
The architectural framework proposed came to be known as the ANSI/SPARC architecture. The study proposed that the interfaces be standardized, and defined an architectural framework that contained 43 interfaces, 14 of which would deal with the physical storage subsystem of the computer.

Standardization Approaches
A reference model can be described according to three different approaches.

+ Based on Components
The components of the system are defined, together with the interrelationships between the components. Thus a DBMS consists of a number of components, each of which provides some functionality. The orderly and well-defined interaction of these components provides the total system functionality.


+ Based on Functions
The different classes of users are identified, and the functions that the system will perform for each class are defined. This results in a hierarchical system architecture with well-defined interfaces between the functionalities of the different layers. The ISO architecture falls into this category.

+ Based on Data
The different types of data are identified, and an architectural framework is specified which defines the functional units that will realize or use data according to these different views. Since data is the central resource that a DBMS manages, this approach (the data-logical approach) is claimed to be the preferable choice for standardization activities.

Based on Data Organization
The ANSI/SPARC architecture is claimed to be based on data organization. It recognizes three views of data:
  o the external view: that of the user, who might be a programmer;
  o the internal view: that of the system or machine; and
  o the conceptual view: that of the enterprise.
For each of these views, an appropriate schema definition is required.

+ Internal View
At the lowest level of the architecture is the internal view, which deals with the physical definition and organization of data. The location of data on different storage devices and the access mechanisms used to reach and manipulate the data are the issues dealt with at this level.


Internal Schema
At the internal level, the storage details of the relations are described. Assume that the EMP relation is stored in an indexed file, where the index is defined on the key attribute (ENO) and called EMINX. It is also assumed that an associated HEADER field might contain flags (delete, update, etc.) and other control information. Then the internal schema definition of the relation may be as follows:

INTERNAL_REL EMPL [
  INDEX ON E# CALL EMINX
  FIELD = {
    HEADER : BYTE(1)
    E#     : BYTE(9)
    ENAME  : BYTE(15)
    TIT    : BYTE(10)
  }
]

+ External View
At the other extreme is the external view, which is concerned with how users view the database. An individual user's view represents the portion of the database that will be accessed by that user, as well as the relationships that the user would like to see among the data. A view can be shared among a number of users, with the collection of user views making up the external schema.

The external views can be described using SQL notation. Consider two applications as examples: one that calculates the payroll payments for engineers, and a second that produces a report on the budget of each project. They can be defined as follows.

External Schema

CREATE VIEW PAYROLL (ENO, ENAME, SAL)
AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
   FROM   EMP, PAY
   WHERE  EMP.TITLE = PAY.TITLE

The second application is simply a projection of the PROJ relation, which can be specified as:

CREATE VIEW BUDGET (PNAME, BUD)
AS SELECT PNAME, BUDGET
   FROM   PROJ


+ Conceptual Schema

In between these two ends is the conceptual schema, which is an abstract definition of the database.

It is the real world view of the enterprise being modeled in the

database.

As such, it is supposed to represent the data and the relationship among data without considering the requirements of individual applications or restrictions of physical storage media

An Example:

Considering the Engineering Database example for the four relations o EMP, o PROJ, o ASG, and o PAY,

The conceptual schema should describe each relation with respect to its attributes and its key. The description might look like the following:

1/ RELATION PAY (Conceptual)
RELATION PAY [
  KEY = {TITLE}
  ATTRIBUTES = {
    TITLE : CHARACTER(10)
    SAL   : NUMERIC(6)
  }
]

2/ RELATION PROJ (Conceptual)
RELATION PROJ [
  KEY = {PNO}
  ATTRIBUTES = {
    PNO    : CHARACTER(7)
    PNAME  : CHARACTER(20)
    BUDGET : NUMERIC(7)
  }
]
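The remaining two relations of the example can be described in the same notation. The following is a sketch in which the attribute names and sizes are assumptions:

3/ RELATION EMP (Conceptual)
RELATION EMP [
  KEY = {ENO}
  ATTRIBUTES = {
    ENO   : CHARACTER(9)
    ENAME : CHARACTER(15)
    TITLE : CHARACTER(10)
  }
]

4/ RELATION ASG (Conceptual)
RELATION ASG [
  KEY = {ENO, PNO}
  ATTRIBUTES = {
    ENO  : CHARACTER(9)
    PNO  : CHARACTER(7)
    RESP : CHARACTER(10)
    DUR  : NUMERIC(3)
  }
]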

ANSI/SPARC Architecture

The investigation of the ANSI/SPARC architecture with respect to functions results in a considerably more complicated view.

Conventions:
o Square Boxes : Processing Functions
o Hexagons : Administrative Roles
o Arrows : Data, Command, Program
o "I" shaped Bars : Interfaces
o Triangle : Data Dictionary


[Figure: Partial schematic of the ANSI/SPARC architectural model. It shows the enterprise administrator, database administrator, application system administrator, system programmer and application programmer roles; the internal, conceptual and external database schema processors; the internal and external databases with their application programs; the GD/D (data dictionary/directory); and the transformations between internal storage and the internal database, between the internal database and the conceptual level, and between the conceptual and external databases.]

Data Dictionary/Directory

The major component that permits mappings between the different data organizational views is the data dictionary/directory (depicted as a triangle), which is a meta-database containing the schema and mapping definitions. It also contains usage statistics, access control information, and the like. It serves as the central component both in processing the different schemas and in providing the mappings among them.

Roles

Database Administrator: is responsible for maintaining the internal schema definition.

Enterprise Administrator: is responsible for defining the conceptual schema.

Application Administrator: is responsible for preparing the external schemas for the applications.


Architectural Models

Classification

Considering the possible ways in which multiple databases may be put together, systems can be classified with respect to:

o The autonomy of the local systems
o Their distribution
o Their heterogeneity

Autonomy

Autonomy refers to the distribution of control, not of data. It indicates the degree to which individual DBMSs can operate independently.

Autonomy is a function of a number of factors, such as whether the component systems exchange information, whether they can independently execute transactions, and whether one is allowed to modify them.

Requirements of Autonomous Systems

According to Gligor and Popescu-Zeletin:

o The local operations of the individual DBMSs are not affected by their participation in the multidatabase system.

o The manner in which the individual DBMSs process queries and optimize them should not be affected by the execution of global queries that access multiple databases.

o System consistency or operation should not be compromised when individual DBMSs join or leave the multidatabase confederation.


According to Du and Elmagarmid:

o Design Autonomy: individual DBMSs are free to use the data models and transaction management techniques that they prefer.

o Communication Autonomy: each individual DBMS is free to make its own decision as to what type of information it wants to provide to the other DBMSs or to the software that controls their global execution.

o Execution Autonomy: each DBMS can execute the transactions that are submitted to it in any way that it wants to.

Aspects of Classification

Tight Integration

In tightly integrated systems, the data managers are implemented so that one of them is in control of the processing of each user request, even if that request is serviced by more than one data manager. The data managers do not operate as independent DBMSs, even though they usually have the functionality to do so.

o A single image of the entire database is available to any user who wants to share the information, which may reside in multiple databases.

o From the user's perspective, the data is logically centralized in one database.

Semiautonomous Systems

o These consist of DBMSs that can operate independently, but have decided to participate in a federation to make their local data shareable.

o Each DBMS determines what parts of its own database it will make accessible to users of other DBMSs.

o They are not fully autonomous systems, because they need to be modified to enable them to exchange information with one another.

Total Isolation

o The individual systems are stand-alone DBMSs, which know neither of the existence of other DBMSs nor how to communicate with them.

o In such systems, the processing of user transactions that access multiple databases is especially difficult, since there is no global control over the execution of the individual DBMSs.


3/ Distribution

Whereas autonomy refers to the distribution of control, the distribution dimension deals with data. There are a number of ways in which DBMSs have been distributed; mainly they are of two types.

3.1/ Client-Server Distribution

A database server is the Oracle software managing a database, and a client is an application that requests information from a server. Each computer in a system is a node. A node in a distributed database system acts as a client, a server, or both, depending on the situation.

3.1.1/ Distributed DBMS Architecture

o Client/server DBMSs entered the computing scene at the beginning of the 1990s and have made a significant impact on both DBMS technology and the way we do computing.

o They provide a two-level architecture, which makes it easier to manage the complexity of modern DBMSs and the complexity of distribution.

3.1.2/ General Idea

The general idea is simple and elegant: distinguish the functionality that needs to be provided and divide these functions into two classes, server functions and client functions.

3.1.3/ Client-Server Reference Architecture

3.1.4/ Process Centric View

o Any process that requests the services of another process is its client, and vice versa.

o However, it is important to note that client/server computing and client/server DBMSs, as the terms are used in their modern context, do not refer to processes but to actual machines.

o Thus the focus is on what software should run on the client machines and what software should run on the server machine.


3.1.5/ Task Management (Server)

o The first and most important point in a client/server architecture is that the server does most of the data management work.

o This means that all query processing, query optimization, transaction management, and storage management is done at the server.

3.1.6/ Task Management (Client)

o The client provides the application and the user interface.

o There is a DBMS client module that is responsible for managing the data cached at the client, managing the transaction locks, and possibly checking the consistency of user queries at the client side.

o Of course, there is operating system and communication software that runs on both the clients and the server, but communication between the client and the server is at the level of SQL statements.

o In other words, the client passes SQL queries to the server without trying to understand or optimize them. The server does most of the work and returns the result relation to the client.
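As an illustration (a sketch with a hypothetical query), the client might ship the following statement unchanged over the network; the server parses, optimizes and executes it, and returns only the result relation:

SELECT ENO, ENAME
FROM EMP
WHERE TITLE = "Programmer"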

Advantages

o More efficient division of labor
o Horizontal and vertical scaling of resources
o Better price/performance on client machines
o Ability to use familiar tools on client machines
o Client access to remote data (via standards)
o Full DBMS functionality provided to client workstations
o Overall better system price/performance

Classification

There are a number of different types of client/server architecture:

o Multiple client / single server
o Multiple client / multiple server

Multiple Client-Single Server

o The simplest case is the one where there is only one server, which is accessed by multiple clients.

o From a data management perspective, this is not much different from centralized databases, since the database is stored on only one machine, which also hosts the software to manage it.

o However, there are some important differences in the way transactions are executed and caches are managed.


Problems

o The server forms a bottleneck
o The server forms a single point of failure
o Database scaling is difficult

Multiple Client-Multiple Server

o A more sophisticated client/server architecture is one where there are multiple servers in the system.

o In this case two alternative management strategies are possible (called direct and indirect connection in Oracle):

either each client manages its own connection to the appropriate server, or

each client knows only its home server, which then communicates with the other servers as required.


3.2/ Peer-to-Peer Distributed Systems

In a peer-to-peer setting, a client can connect directly to a database server. When the client application issues a statement that references remote data, it is connected directly to the database that contains that data.

3.2.1/ Data Organizational View

o Looking at the data organizational view, the physical data organization on each machine may be different.

o This means that there needs to be an individual internal schema definition at each site, termed the local internal schema (LIS).

o The enterprise view of the data is described by the global conceptual schema (GCS), which is global because it describes the logical structure of the data at all the sites.

3.2.2/ Three Layer Architecture

o Since the data in a distributed database is usually fragmented and replicated, the logical organization of data at each site needs to be described in order to handle fragmentation and replication.

o Therefore there needs to be a third layer in the architecture, the local conceptual schema (LCS).

o The global conceptual schema (GCS) is then the union of the local conceptual schemas.

o Finally, user applications and user access to the database are supported by external schemas (ESs).


3.2.3/ What Does It Support

o This architectural model provides all the necessary levels of transparency.

o Data independence is supported since the model is an extension of

ANSI/SPARC, which provides such independence naturally.

o Location and replication transparencies are supported by the definition of the local and global conceptual schemas and the mapping in between.

o Network transparency on the other hand is supported by the definition

of the global conceptual schema.

o The user queries data irrespective of its location or of which local component of the distributed database system services it: the distributed DBMS translates global queries into groups of local queries, which are executed by distributed DBMS components at different sites that communicate with one another.

3.2.4/ Functional Description

In terms of a detailed functional description of the model, the ANSI/SPARC model is extended by the addition of a global directory/dictionary (GD/D) that permits the required global mappings; the local mappings are still performed by a local directory/dictionary. The local database management components are thus integrated by means of global DBMS functions. Since the local conceptual schemas are mappings of the global schema onto each site, such databases are designed in a top-down fashion, and therefore all external view definitions are made globally.

There is a local database administrator at each site in order to retain local control over the administration of data, which is one of the primary motivations of distributed processing.

3.2.5/ Component Description

A distributed DBMS consists of a number of components. One component handles the interaction with users, and another deals with storage. The first major component, called the user processor, consists of four elements:

3.2.5.1/ User Interface Handler: is responsible for interpreting user commands as they come in, and for formatting the result data as it is sent back to the user. This component establishes the link between the system on one end and the user on the other.

3.2.5.2/ Semantic Data Controller: uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check whether the user query can be processed. This component is responsible for authorization and other such functions.

3.2.5.3/ Global Query Optimizer and Decomposer: determine an execution strategy that minimizes a cost function, and translate the global queries into local ones using the global and local conceptual schemas as well as the global directory. The global query optimizer is responsible, among other things, for generating the best strategy to execute distributed join operations.


3.2.5.4/ Distributed Execution Monitor: coordinates the distributed execution of requests; it is also called the distributed transaction manager. In executing queries in a distributed fashion, the execution monitors at the various sites may, and usually do, communicate with one another.

[Figure: Components of the user processor. The user submits requests and receives system responses through the user interface handler, which is connected to the semantic data controller, the global query optimizer and the global execution monitor; these consult the external schema, the global conceptual schema and the GD/D.]

4/ Data Processor

The second major component of a distributed DBMS, and the primary component Oracle uses, is the data processor. It consists of three elements:

4.1/ The Local Query Optimizer

It acts as the access path selector and is responsible for choosing the best access path to any data item.

4.2/ Local Recovery Manager

The local recovery manager is responsible for making sure that the local database remains consistent even when failures occur.

4.3/ Runtime Support Processor

It physically accesses the database according to the physical commands in the schedule generated by the query optimizer. It is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing data accesses.

[Figure: Components of the data processor. The local query processor, the local recovery manager and the runtime support processor operate against the local conceptual schema, the local internal schema and the system log.]


4.4/ Component Architecture

5/ Heterogeneity

• Heterogeneity may occur in various forms in distributed systems, ranging from hardware heterogeneity and differences in networking protocols to variations in data managers.

• The important ones relate to data models, query languages, and transaction management protocols.

• Representing data with different modeling tools creates heterogeneity because of the inherent expressive powers and limitations of the individual data models.

• Heterogeneity in query languages involves not only the use of different data access paradigms in different data models but also differences in languages even when the individual systems use the same data model.

• Different query languages that use the same data model often select very different methods for expressing identical requests (e.g., DB2 uses SQL, while INGRES uses QUEL).

6/ Architectural Alternatives

• The alternatives along each dimension are identified by the numbers 0, 1, or 2, and these numbers have different meanings along each dimension.

• Along the autonomy dimension, 0 represents tight integration, 1 represents semiautonomous systems, and 2 represents total isolation.

• Along the distribution dimension, 0 represents no distribution, 1 represents client/server distribution, and 2 represents peer-to-peer (full) distribution.

• Along the heterogeneity dimension, 0 identifies homogeneous systems while 1 stands for heterogeneous systems.


7/ Implementation Alternatives

7.1/ (A0, D0, H0): The first class of systems are those that are logically integrated. Such systems can be given the generic name composite systems. If there is neither distribution nor heterogeneity, the system is a set of multiple DBMSs that are logically integrated. There are not many examples of such systems; shared-everything multiprocessor systems may qualify.

7.2/ (A0, D0, H1): If heterogeneity is introduced, one has multiple data managers that are heterogeneous but provide an integrated view to the user. In the past, some work was done in this class, where systems were designed to provide integrated access to network, hierarchical, and relational databases residing on a single machine.

7.3/ (A0, D1, H0): The more interesting case is where the database is distributed even though an integrated view is provided to the users. This alternative represents client/server distribution.

7.4/ (A0, D2, H0): This point in the design space represents a scenario where the same type of transparency is provided to the user in a fully distributed environment. There is no distinction between clients and servers, each site providing identical functionality.

7.5/ (A1, D0, H0): The next points along the autonomy dimension are semiautonomous systems, commonly termed federated systems. The component systems in a federated environment have significant autonomy in their execution, but their participation in a federation indicates that they are willing to cooperate with others in executing user requests that access multiple databases.

7.6/ (A1, D0, H1): These are systems that introduce heterogeneity as well as autonomy; we might call them heterogeneous federated DBMSs. Examples are easy to find in everyday use.


7.7/ (A1, D1, H1): Systems of this type introduce distribution by placing the component systems on different machines. They may be referred to as distributed heterogeneous federated DBMSs. It is fair to state that the distribution aspects of these systems are less important than their autonomy and heterogeneity. Distribution introduces some new problems, but generally the techniques developed for homogeneous and autonomous distributed DBMSs can be applied to deal with them.

7.8/ (A2, D0, H0): If we move to full autonomy, we get what we call the class of multidatabase system (MDBS) architectures. The identifying characteristic of these systems is that the components have no concept of cooperation and do not even know how to "talk to" each other. Without heterogeneity or distribution, an MDBS is an interconnected collection of autonomous databases.

7.9/ (A2, D0, H1): This case is realistic, maybe even more so than (A1, D0, H1), in that we often want to build applications that access data from multiple storage systems with different characteristics. Some of these storage systems may not even be DBMSs, and they have certainly not been designed and developed with a view to interoperating with other software.

7.10/ (A2, D1, H1) and (A2, D2, H1): These are MDBSs that are distributed over a number of sites; we call them distributed MDBSs. The solutions to the distribution issues in the two cases are similar, and the general approach to interoperability does not differ much. The major difference is that in the case of client/server distribution (A2, D1, H1), most of the interoperability concerns are delegated to a middleware layer, which can itself be homogeneous or heterogeneous.

8/ Multi-DBS Architecture

The differences in the level of autonomy between distributed multi-DBMSs and distributed DBMSs are also reflected in their architectural models. The fundamental difference relates to the definition of the global conceptual schema.

8.1/ Fundamental Difference

In the case of logically integrated distributed DBMSs, the global conceptual schema defines the conceptual view of the entire database, while in the case of distributed multi-DBMSs it represents only the collection of those parts of the local databases that each local DBMS wants to share.

Thus the definition of a global database is different in MDBSs than in distributed DBMSs. In the latter, the global database is equal to the union of the local databases, whereas in the former it is only a subset of that union.

8.2/ Models Using a GCS

In an MDBS, the GCS is defined by integrating either the external schemas of the local autonomous databases or parts of their local conceptual schemas.

Furthermore, users of a local DBMS define their own views on the local database and do not need to change their applications if they do not want to access data from another database. This is again an issue of autonomy.


8.3/ Designing the GCS

Designing the global conceptual schema in multidatabase systems involves the integration of either the local conceptual schemas or the local external schemas.

A major difference between the design of the GCS in multi-DBMSs and in logically integrated distributed DBMSs is that in the former the mapping is from the local conceptual schemas to the global schema, while in the latter the mapping is in the reverse direction.

Hence the former is usually a bottom-up process, whereas the latter is usually a top-down procedure.

8.4/ Multi-DBMS Architecture

Once the GCS has been designed, views over the global schema can be defined for users who require global access. It is not necessary for the GES (global external schema) and the GCS to be defined using the same data model and language; whether they are determines whether the system is homogeneous or heterogeneous.

8.5/ Classification of Multi-DBMSs

If heterogeneity exists in the system, two implementation alternatives exist:

8.5.1/ Unilingual multi DBMS

A unilingual multi-DBMS requires the users to utilize possibly different data models and languages when a local database and the global database are accessed. The identifying characteristic of unilingual systems is that any application that accesses data from multiple databases must do so by means of an external view defined on the global conceptual schema. This means that the user of the global database is effectively a different user from those who access only a local database, utilizing a different data model and a different data language. Thus one application may have a local external schema (LES) defined on the local conceptual schema as well as a global external schema (GES) defined on the global conceptual schema. The different external view definitions may require the use of different access languages. Examples of such architectures are the MULTIBASE system and DDTS.


8.5.2/ Multilingual multi DBMS

An alternative is the multilingual architecture, where the basic philosophy is to permit each user to access the global database by means of an external schema defined using the language of the user's local DBMS. The GCS definition is quite similar in the multilingual and unilingual approaches, the major difference being the definition of the external schemas of the local databases. Queries against the global database are made using the language of the local DBMS, but they generally require some processing to be mapped to the global conceptual schema. The multilingual approach obviously makes querying the database easier from the user's perspective, but it is more complicated because translation of queries is required at run time. The multilingual approach is used in Sirius-Delta and in the HD-DBMS project.

8.6/ Component-Based Model

The component-based architectural model of a multi-DBMS is significantly different from that of a distributed DBMS. The fundamental difference is the existence of full-fledged DBMSs, each of which manages a different database. The MDBS provides a layer of software that runs on top of these individual DBMSs and provides users with facilities for accessing the various databases. Depending on the existence (or lack) of a global conceptual schema and on the existence (or lack) of heterogeneity, the contents of this layer change significantly. If the system is distributed, the multidatabase layer must be replicated at each site where there is a local DBMS that participates in the system.

As far as the individual DBMSs are concerned, the MDBS layer is simply another application that submits requests and receives answers.

8.7/ Directory Issues


Chapter 3 Distributed Database Design

1/ Design Problem

In the general setting, the design problem is to make decisions about the placement of data and programs across the sites of a computer network, as well as possibly about the design of the network itself. In the case of distributed DBMSs, the distribution of applications involves two things: distribution of the distributed DBMS software and distribution of the application programs that run on it. The former is not a significant problem, since we assume that a copy of the distributed DBMS software exists at each site where data are stored, and we are not concerned here with the placement of application programs. Furthermore, we assume that the network has already been designed, or will be designed at a later stage, according to the decisions made during distributed database design. We concentrate only on the distribution of data. It has been suggested that the organization of distributed systems can be investigated along three orthogonal dimensions:

Dimensions of the Problem

1.1/ Level of Sharing

In terms of the level of sharing, there are three possibilities:

1.1.1/ No Sharing

First, there is no sharing: each application and its data execute at one site, and there is no communication with any other program or access to any data files at other sites. This characterizes the very early days of networking and is not very common today.

1.1.2/ Data Sharing

Next is the level of data sharing, where all the programs are replicated at all the sites but data files are not. Accordingly, user requests are handled at the site where they originate, and the necessary data files are moved around the network.

1.1.3/Data + Program Sharing

Finally, in data-plus-program sharing, both data and programs may be shared, meaning that a program at a given site can request a service from another program at a second site, which, in turn, may access a data file located at a third site. In a heterogeneous environment it is usually very difficult, and sometimes impossible, to execute a given program on different hardware under a different operating system. It might, however, be possible to move data around relatively easily.


1.2/ Access Pattern Behavior

Along the second dimension of access pattern behavior, it is possible to identify two alternatives:

1.2.1/ Static

The access patterns of user requests may be static, so that they do not change over time. It is obviously considerably easier to plan for and manage static environments. Unfortunately, it is difficult to find many real-life distributed applications that could be classified as static.

1.2.2/ Dynamic

Alternatively, the access patterns may change over time. The significant question, then, is not whether a system is static or dynamic, but how dynamic it is. Incidentally, it is along this dimension that the relationship between distributed database design and query processing is established.

1.3/ Level of Knowledge

The third dimension of classification is the level of knowledge about the access pattern behavior. The possible alternatives are:

1.3.1/ No Information

Here the designers have no information about how users will access the database. This is a theoretical possibility, but it is very difficult, if not impossible, to design a distributed DBMS that can effectively cope with this situation.

1.3.2/ Complete Information

A more practical alternative is that the designers have complete information, where the access patterns can reasonably be predicted and do not deviate significantly from these predictions.

1.3.3/ Partial Information

Finally, the designers may have only partial information about how users will access the database, in which case there is likely to be significant deviation from the predictions.

2/ Alternative Design Strategies

2.1/ Alternative Design Approaches

Design is a joint function of the database, enterprise, and application system administrators. Two major strategies have been identified for designing distributed databases: the top-down and the bottom-up design processes. As the names indicate, they constitute very different approaches. However, real applications are rarely simple enough to fit neatly into either alternative, so it is important to keep in mind that in most database designs the two approaches may need to be applied to complement one another.


2.1.1/ Top Down Design Process

[Figure: The top-down design process, with observation and monitoring providing feedback to the earlier design steps.]

Requirement Analysis

The top-down design process begins with requirements analysis, which defines the environment of the system and elicits both the data and the processing needs of all potential database users. The requirements study also specifies where the final system is expected to stand with respect to the objectives of a distributed DBMS.

View Design

The requirements document is input to two parallel activities: view design and conceptual design. The view design activity deals with defining the interfaces for the end users.

Conceptual Design

Conceptual design is the process by which the enterprise is examined to determine entity types and the relationships among these entities. It can be divided into two related activity groups:

+ Entity Analysis: concerned with determining the entities, their attributes, and the relationships among them.

+ Functional Analysis: concerned with determining the fundamental functions in which the modeled enterprise is involved.

Relationship

There is a relationship between view design and conceptual design: view integration should be used to ensure that the entity and relationship requirements of all the views are covered in the conceptual schema. The conceptual model should support not only the existing applications but future applications as well.


Activities

In conceptual design and view design, the user needs to specify the data entities and must determine the applications that will run on the database, as well as statistical information about these applications. Statistical information includes the specification of the frequency of user applications, the volume of various information items, and the like.

GCS & Access Pattern Information

From the conceptual design step come the definition of the global conceptual schema (GCS) and the access pattern information. Both are inputs to the distribution design step.

Distribution Design

The objective at this stage is to design the local conceptual schemas by distributing the entities over the sites of the distributed system. It is possible to treat each entity as a unit of distribution; in the relational model, an entity corresponds to a relation. Rather than distributing whole relations, however, it is quite common to divide them into subrelations, called fragments, which are then distributed. Thus the distribution design activity consists of two steps: fragmentation and allocation.

Physical Design

Physical design is the last step in the design process; it maps the local conceptual schemas onto the physical storage devices available at the corresponding sites. The inputs to this process are the local conceptual schemas and the access pattern information about their fragments.

Observation and Monitoring

Design and development is an ongoing activity requiring constant monitoring and periodic adjustment and tuning. One monitors not only the behavior of the database implementation but also the suitability of the user views. The result is some form of feedback, which may require backing up to one of the earlier steps in the design.

2.1.2/ Bottom Up Design

Top-down design is a suitable approach when a database system is being designed from scratch. Commonly, however, a number of databases already exist, and the design task involves integrating them into one database. The bottom-up approach is suitable for this type of environment.

Design Approach

The starting point of bottom-up design is the individual local conceptual schemas. The process consists of integrating the local schemas into the global conceptual schema. This type of environment exists primarily in the context of heterogeneous databases.


Distribution Design Issues:

+ Why fragment at all?
+ How to fragment?
+ How much to fragment?
+ How to test correctness of decomposition?
+ How to allocate?
+ What are the information requirements?

Unit of Fragmentation

With respect to fragmentation, the important issue is the appropriate unit of distribution. A relation is not a suitable unit, for a number of reasons.

Relation Subsets

First, application views are usually subsets of relations. Therefore, the locality of access of applications is defined not on entire relations but on their subsets, and it is natural to consider subsets of relations as distribution units. If the applications whose views are defined on a given relation reside at different sites, two alternatives can be followed with the entire relation as the unit of distribution: either the relation is not replicated and is stored at only one site, or it is replicated at all or some of the sites where the applications reside.

Problem Areas

The former results in an unnecessarily high volume of remote data accesses. The latter, on the other hand, involves unnecessary replication, which causes problems in executing updates and may not be desirable if storage is limited.

Advantages

Fragments, each being treated as a unit, permit a number of transactions to execute concurrently. In addition, the fragmentation of relations typically results in the parallel execution of a single query by dividing it into a set of subqueries that operate on fragments. Thus fragmentation typically increases the level of concurrency and therefore the system throughput.

Disadvantages

If the applications have conflicting requirements that prevent decomposition of the relation into mutually exclusive fragments, those applications whose views are defined on more than one fragment may suffer performance degradation. It might be necessary to retrieve data from two fragments and then take their union or their join, which is costly; avoiding this is a fundamental fragmentation issue. A second problem is related to semantic data control, specifically to integrity checking. As a result of fragmentation, attributes participating in a dependency may be decomposed into different fragments that are allocated to different sites. In this case, even the simpler task of checking for dependencies results in chasing after data at a number of sites.


Fragmentation Alternatives

Relation instances are essentially tables, so the issue is finding alternative ways of dividing a table into smaller ones. There are clearly two alternatives: dividing it horizontally or dividing it vertically.

Degree of Fragmentation

The extent to which the database should be fragmented is an important decision that affects the performance of query execution. The degree of fragmentation goes from one extreme, not fragmenting at all, to the other extreme, fragmenting to the level of individual tuples (in the case of horizontal fragmentation) or to the level of individual attributes (in the case of vertical fragmentation). What is needed is a suitable level of fragmentation that is a compromise between the two extremes. Such a level can only be defined with respect to the applications that will run on the database: each application is characterized with respect to a number of parameters, and according to the values of these parameters individual fragments can be identified.

Correctness of Fragmentation

The following three rules, enforced during fragmentation, together ensure that the database does not undergo semantic changes during fragmentation.

+ Completeness: This property, which is identical to the lossless decomposition property of normalization, is important in fragmentation since it ensures that the data in a global relation is mapped into fragments without any loss. If a relation instance R is decomposed into fragments R1, R2, …, Rn, each data item that can be found in R can also be found in one or more of the Ri's.

+ Reconstruction: If a relation R is decomposed into fragments FR = {R1, R2, …, Rn}, then there should exist some relational operator ∇ such that R = ∇Ri, ∀Ri ∈ FR. The operator ∇ will be different for different forms of fragmentation. The reconstructability of the relation from its fragments ensures that the constraints defined on the data in the form of dependencies are preserved.

+ Disjointness: If a relation R is decomposed into fragments R1, R2, …, Rn, and data item di is in Rj, then di should not be in any other fragment Rk (k ≠ j). This criterion ensures that the horizontal fragments are disjoint. If relation R is vertically decomposed, its primary key attributes are typically repeated in all of its fragments; therefore, in the case of vertical partitioning, disjointness is defined only on the non-primary-key attributes of a relation.
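As a quick worked check, consider decomposing PROJ by the two predicates BUDGET ≤ 200000 and BUDGET > 200000 (as is done in the next section). Every budget value satisfies exactly one of the two predicates, so every tuple of PROJ appears in some fragment (completeness); PROJ1 ∪ PROJ2 reconstructs PROJ (reconstruction); and no tuple can satisfy both predicates, so no tuple appears in both fragments (disjointness).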


3/ Horizontal Fragmentations

There are two versions of horizontal partitioning: primary and derived. Primary horizontal partitioning of a relation is performed using predicates that are defined on that relation. Derived horizontal partitioning is the partitioning of a relation that results from predicates defined on another relation.

As an example, consider the relation instance PROJ and its horizontal fragmentation by budget:

PROJ
PNO   PNAME              BUDGET   LOC
P1    Instrumentation    150000   Montreal
P2    Database Develop   135000   New York
P3    CAD/CAM            250000   New York
P4    Maintenance        310000   Paris
P5    CAD/CAM            500000   Boston

PROJ1: projects with budgets less than $200,000
PNO   PNAME              BUDGET   LOC
P1    Instrumentation    150000   Montreal
P2    Database Develop   135000   New York

PROJ2: projects with budgets greater than or equal to $200,000
PNO   PNAME              BUDGET   LOC
P3    CAD/CAM            250000   New York
P4    Maintenance        310000   Paris
P5    CAD/CAM            500000   Boston

3.1/ Database Information Requirements

The database information concerns the global conceptual schema. The important issue is how the database relations are connected to one another, especially with joins. The quantitative information required about the database is the cardinality of each relation R, denoted card(R). For example, card(PROJ) = 5 for the instance above.

3.2/ Application Information

The fundamental qualitative information consists of the predicates used in user queries. Since it is not possible to analyze all of the user applications to determine these predicates, one should at least investigate the most important ones. It has been suggested that, as a rule of thumb, the most active 20% of user queries account for 80% of the total data accesses; this 80/20 rule may be used as a guideline.


3.2.1/ Quantitative Information

In terms of quantitative information about user applications, two sets of data are required:

3.2.1.1/ Minterm Selectivity, sel(mi): the number of tuples of a relation that would be accessed by a user query specified according to a given minterm predicate mi. For example, the selectivity of m1 below is 0, since no tuple of PROJ satisfies it, while the selectivity of m2 is 2.

3.2.1.2/ Access Frequency, acc(qi): the frequency with which user applications access data. If Q = {q1, q2, …, qq} is a set of user queries, acc(qi) indicates the access frequency of query qi in a given period.

3.3/ Simple Predicates

Even though simple predicates are quite elegant to deal with, user queries often include more complicated predicates, which are Boolean combinations of simple predicates. One combination of particular interest, called a minterm predicate, is a conjunction of simple predicates. Given a relation R(A1, A2, …, An), where Ai is an attribute defined over domain Di, a simple predicate pj defined on R has the form

pj : Ai θ Value

where θ ∈ {=, <, ≤, >, ≥, ≠} and Value is chosen from the domain of Ai (Value ∈ Di). For relation Ri we define the set of simple predicates Pri = {pi1, pi2, …, pim}.

Examples: Given the relation instances PROJ and PAY, the following are simple predicates:

p1: PNAME = "Maintenance"
p2: BUDGET ≤ 200000
p3: TITLE = "Elect.Eng."
p4: TITLE = "Programmer"
p5: SAL ≤ 30000
p6: SAL > 30000

3.3.1/ Minterm Predicates

Given Pri, the set of minterm predicates Mi = {mi1, mi2, …, miz} is defined as

Mi = {mij | mij = ∧pik∈Pri pik*}, 1 ≤ k ≤ m, 1 ≤ j ≤ z

where pik* = pik or pik* = ¬pik; that is, each simple predicate occurs in a minterm predicate either in its natural form or negated.

Examples:
m1: PNAME = "Maintenance" ∧ BUDGET ≤ 200000
m2: NOT(PNAME = "Maintenance") ∧ BUDGET ≤ 200000
m3: PNAME = "Maintenance" ∧ NOT(BUDGET ≤ 200000)
m4: NOT(PNAME = "Maintenance") ∧ NOT(BUDGET ≤ 200000)

3.4/ Completeness of Simple Predicates

A set of simple predicates Pri is said to be complete if and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment defined according to Pri.


3.5/ Minimality of Simple Predicates

If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), then there should be at least one application that accesses fi and fj differently. In other words, the simple predicate should be relevant in determining a fragmentation.

4/ Primary Horizontal Fragmentation

A primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Given a relation R, its horizontal fragments are given by

Rj = σFj(R), 1 ≤ j ≤ w

where Fj is the selection formula used to obtain fragment Rj (a minterm predicate) and w is the number of fragments.

Example

The decomposition of the relation PROJ into the horizontal fragments PROJ1 and PROJ2 shown earlier is defined as follows:

PROJ1 = σBUDGET≤200000(PROJ)
PROJ2 = σBUDGET>200000(PROJ)

Example

Horizontal fragmentation based on project location:

PROJ1 = σLOC="Montreal"(PROJ)
PROJ2 = σLOC="New York"(PROJ)
PROJ3 = σLOC="Paris"(PROJ)

PROJ1
PNO   PNAME              BUDGET   LOC
P1    Instrumentation    150000   Montreal

PROJ2
PNO   PNAME              BUDGET   LOC
P2    Database Develop   135000   New York
P3    CAD/CAM            250000   New York

PROJ3
PNO   PNAME              BUDGET   LOC
P4    Maintenance        310000   Paris
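In SQL terms, these location-based fragments could be sketched as views (a real distributed DBMS would define fragments through its own data definition facilities rather than as views):

CREATE VIEW PROJ1 AS SELECT * FROM PROJ WHERE LOC = "Montreal";
CREATE VIEW PROJ2 AS SELECT * FROM PROJ WHERE LOC = "New York";
CREATE VIEW PROJ3 AS SELECT * FROM PROJ WHERE LOC = "Paris";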

5/ Derived Horizontal Fragmentation

A derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on its owner. The link between the owner and member relations is defined as an equijoin, which can be implemented by means of semijoins. Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as

Ri = R ⋉ Si, 1 ≤ i ≤ w

where w is the maximum number of fragments and Si = σFi(S), Fi being the formula according to which the primary horizontal fragment Si is defined.


Example

Given a link L1 where owner(L1) = PAY and member(L1) = EMP, we can define:

EMP1 = EMP ⋉ PAY1
EMP2 = EMP ⋉ PAY2

where

PAY1 = σSAL≤30000(PAY)
PAY2 = σSAL>30000(PAY)

EMP1
ENO   ENAME      TITLE
E3    A.Lee      Mech.Eng.
E4    J.Miller   Programmer
E7    R.Davis    Mech.Eng.

EMP2
ENO   ENAME      TITLE
E1    J.Doe      Elect.Eng.
E2    M.Smith    Syst.Anal.
E5    B.Casey    Syst.Anal.
E6    L.Chu      Elect.Eng.
E8    J.Jones    Syst.Anal.
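Since the semijoin keeps exactly those EMP tuples whose TITLE value matches a tuple of the corresponding PAY fragment, EMP1 can be sketched in SQL as follows (EMP2 is analogous, with SAL > 30000):

SELECT *
FROM EMP
WHERE TITLE IN (SELECT TITLE
                FROM PAY
                WHERE SAL <= 30000)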

Needed

To carry out a derived horizontal fragmentation, three inputs are needed:

+ The set of partitions of the owner relation
+ The member relation
+ The set of semijoin predicates between the owner and the member (e.g., EMP.TITLE = PAY.TITLE)

5.1/ Checking for Correctness

It is important to check the fragments for correctness with respect to the following three criteria:

5.1.1/ Completeness

The completeness of horizontal fragmentation rests on the selection predicates used: as long as the selection predicates are complete, the resulting fragmentation is guaranteed to be complete. Let R be the member relation of a link whose owner is relation S, fragmented as FS = {S1, S2, …, Sw}, and let A be the join attribute between R and S. Then for each tuple t of R, there should be a tuple t′ of S such that

t[A] = t′[A]

5.1.2/ Reconstruction

Reconstruction of a global relation from its fragments is performed by the union operator in both primary and derived horizontal fragmentation. Thus, for a relation R with fragmentation FR = {R1, R2, …, Rw},

R = ∪Ri, ∀Ri ∈ FR
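For the budget-based fragments of PROJ defined earlier, for example, reconstruction amounts to the SQL union (a sketch):

SELECT * FROM PROJ1
UNION
SELECT * FROM PROJ2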

5.1.3/ Disjointness

It is easier to establish disjointness for primary than for derived horizontal fragmentation. In the former case, disjointness is guaranteed as long as the minterm predicates determining the fragmentation are mutually exclusive.

6/ Minterm Fragments

A horizontal fragment Ri of relation R consists of all the tuples of R that satisfy a minterm predicate mi. Given a set of minterm predicates M, there are as many horizontal fragments of relation R as there are minterm predicates. This set of horizontal fragments is also referred to as the set of minterm fragments.


7/ Vertical Fragmentations

PROJ
PNO   PNAME              BUDGET   LOC
P1    Instrumentation    150000   Montreal
P2    Database Develop   135000   New York
P3    CAD/CAM            250000   New York
P4    Maintenance        310000   Paris
P5    CAD/CAM            500000   Boston

PROJ1: information about project budgets
PNO   BUDGET
P1    150000
P2    135000
P3    250000
P4    310000
P5    500000

PROJ2: information about project names and locations
PNO   PNAME              LOC
P1    Instrumentation    Montreal
P2    Database Develop   New York
P3    CAD/CAM            New York
P4    Maintenance        Paris
P5    CAD/CAM            Boston

7.1/ Need

Vertical fragmentation of a relation R produces fragments R1, R2, …, Rr, each of which contains a subset of R's attributes as well as the primary key of R. The objective of vertical fragmentation is to partition a relation into a set of smaller relations so that many of the user applications will run on only one fragment. In this context, an optimal fragmentation is one that produces a fragmentation scheme minimizing the execution time of the user applications that run on these fragments.
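For the fragmentation of PROJ shown above, both fragments are projections that retain the key PNO, and joining them on PNO reconstructs the relation. A SQL sketch:

-- the two vertical fragments, each retaining the key PNO
CREATE VIEW PROJ1 AS SELECT PNO, BUDGET FROM PROJ;
CREATE VIEW PROJ2 AS SELECT PNO, PNAME, LOC FROM PROJ;

-- reconstruction by joining the fragments on the shared key
SELECT P1.PNO, P2.PNAME, P1.BUDGET, P2.LOC
FROM PROJ1 P1, PROJ2 P2
WHERE P1.PNO = P2.PNO;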

7.2/ Motivation

Vertical fragmentation, in the context of a design tool, allows user queries to deal with smaller relations, thus causing a smaller number of page accesses. It has also been suggested that the most active subrelations can be identified and placed in a faster memory subsystem where memory hierarchies are supported.

7.3/ More Complicated

Vertical partitioning is more complicated than horizontal partitioning because of the larger number of alternatives. In horizontal partitioning, if the total number of simple predicates in Pr is n, there are 2^n possible minterm predicates, some of which are contradicted by the existing implications. In vertical partitioning, if a relation has m non-primary-key attributes, the number of possible fragments is given by B(m), the mth Bell number, which for large m approaches m^m.

7.4/ Heuristic Approaches

To obtain (near-)optimal solutions to the vertical partitioning problem, two types of heuristic approaches are usually preferred:


7.4.1/ Grouping

Grouping starts by assigning each attribute to one fragment and, at each step, joins some of the fragments until some criterion is satisfied. This technique was first suggested for centralized databases and was later used for distributed databases.

7.4.2/ Splitting

Splitting starts with a relation and decides on beneficial partitionings based on the access behavior of applications to the attributes. Splitting is the preferred approach, as it fits more naturally within the top-down design methodology. Furthermore, splitting generates non-overlapping fragments, whereas grouping typically results in overlapping fragments; of course, non-overlapping refers only to the non-primary-key attributes.

7.5/ Information Requirements

The major information required for vertical fragmentation is related to applications. Since vertical partitioning places in one fragment those attributes that are usually accessed together, there is a need for some measure that defines more precisely the notion of togetherness.

7.6/ Attribute Affinity

Attribute affinity is the measure of the togetherness of attributes, indicating how closely related the attributes are. Unfortunately, it is not realistic to expect the designer to be able to specify these values directly; one way to obtain them is from more primitive data, such as attribute usage values.

7.7/ Attribute Usage Values

Another major piece of application-related data is the access frequencies of the queries. Let Q = {q1, q2, …, qq} be the set of user queries that will run on relation R(A1, A2, …, An). Then, for each query qi and each attribute Aj, an attribute usage value, denoted use(qi, Aj), is defined as follows:

use(qi, Aj) = 1 if attribute Aj is referenced by query qi, and 0 otherwise.
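For example, for a hypothetical query q1 that finds the budget of a given project,

q1: SELECT BUDGET
    FROM PROJ
    WHERE PNO = Value

the attribute usage values over PROJ(PNO, PNAME, BUDGET, LOC) are use(q1, PNO) = 1, use(q1, PNAME) = 0, use(q1, BUDGET) = 1, and use(q1, LOC) = 0, i.e., the usage vector (1, 0, 1, 0).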

8/ Hybrid Fragmentation

In most cases a simple horizontal or vertical fragmentation of a database schema will not be sufficient to satisfy the requirements of the user applications. In such cases a vertical fragmentation may be followed by a horizontal one, or vice versa, producing a tree-structured partitioning. Since the two types of partitioning strategies are applied one after the other, this alternative is called hybrid fragmentation.


9/ Allocation Alternatives

Assuming that the database is fragmented properly, one has to decide on the allocation of the fragments to the various sites on the network. When data is allocated, it may either be replicated or maintained as a single copy. The allocation alternatives are:

9.1/ Non-replicated
Partitioned: each fragment resides at only one site.

9.2/ Replicated
Fully replicated: each fragment resides at every site.
Partially replicated: each fragment resides at some of the sites.

10/ Allocation Problem

Assume there are a set of fragments F = {F1, F2, …, Fn} and a network consisting of sites S = {S1, S2, …, Sm} on which a set of applications Q = {q1, q2, …, qq} is running. The allocation problem involves finding the "optimal" distribution of F to S, where optimality can be defined with respect to cost and performance.

10.1/ Minimal Cost

The allocation problem then attempts to find an allocation scheme that minimizes a combined cost function. The cost function consists of:

- the cost of storing each Fi at site Sj
- the cost of querying Fi at site Sj
- the cost of updating Fi at all the sites where it is stored
- the cost of data communication
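A combined cost function can then be sketched as follows (the notation here is illustrative, not standard):

Total Cost = Σi Σj STO(Fi, Sj) + Σi Σj QRY(Fi, Sj) + Σi UPD(Fi) + COM

where STO, QRY, UPD and COM denote the storage, query, update and communication cost components, respectively. The allocation problem then becomes finding the assignment of fragments to sites that minimizes this sum.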

10.2/ Performance

Alternatively, the allocation strategy can be designed to maintain a performance metric: the objective is to minimize the response time and to maximize the system throughput at each site.


10.3/ Information Requirements

10.3.1/ Database Information

- selectivity of fragments
- size of a fragment

10.3.2/ Application Information

- number of read accesses of a query to a fragment
- number of update accesses of a query to a fragment
- a matrix indicating which queries update which fragments
- a similar matrix for retrievals
- originating site of each query

10.3.3/ Site Information

- unit cost of storing data at a site
- unit cost of processing at a site

10.3.4/ Networking Information

- communication cost per frame between two sites
- frame size

11/ Reasons for Replication

The reasons for replication are reliability and the efficiency of read-only queries. If there are multiple copies of a data item, there is a good chance that some copy of the data will remain accessible even when failures occur. On the other hand, the execution of update queries causes trouble, since the system has to ensure that all copies of the data are updated properly.

12/ Thumb Rule

Hence the decision regarding replication is a trade-off that depends on the ratio of read-only queries to update queries. If

read-only queries / update queries ≥ 1

then replication is advantageous; otherwise replication may cause problems.
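For instance, if a given fragment receives 80 read-only queries and 20 update queries in a typical period, the ratio is 80/20 = 4 ≥ 1, and replicating that fragment is likely to pay off; a fragment with 10 reads against 40 updates (ratio 0.25) is better kept as a single copy.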


Chapter 4 Transaction Management

1/ Definition of Transaction

In computer programming, a transaction usually means a sequence of information exchange and related work (such as database updating) that is treated as a unit for the purposes of satisfying a request and ensuring database integrity. For a transaction to be completed and its database changes to be made permanent, it has to be completed in its entirety.

A typical transaction is a catalog merchandise order phoned in by a customer and entered into a computer by a customer representative. The order transaction involves checking an inventory database, confirming that the item is available, placing the order, and confirming that the order has been placed along with the expected time of shipment. If we view this as a single transaction, then all of the steps must be completed before the transaction is successful and the database is actually changed to reflect the new order. If something happens before the transaction is successfully completed, any changes made to the database must be kept track of so that they can be undone.

A program that manages or oversees the sequence of events that are part of a transaction is sometimes called a transaction monitor. Transactions are supported by Structured Query Language (SQL), the standard database user and programming interface. When a transaction completes successfully, the database changes are said to be committed; when a transaction does not complete, the changes are rolled back.

A transaction has been characterized as a program that includes queries accessing the database [Papadimitriou, 1986] and as a basic unit of reliable computation [Ullman, 1988]. Intuitively, a transaction takes a database, performs an action on it, and generates a new version of the database, causing a state transition. A transaction is thus a collection of actions that make consistent transformations of system states while preserving system consistency. Example: two sample transactions.

a) T1:               b) T2:
   Read(X);             Read(X);
   X := X - N;          X := X + M;
   Write(X);            Write(X);
   Read(Y);
   Y := Y + N;
   Write(Y);

Example: A Simple SQL Query

Consider an SQL query that increases the budget of the CAD/CAM project by 10%:

UPDATE PROJ
SET BUDGET = BUDGET * 1.1
WHERE PNAME = "CAD/CAM"


Let BUDGET_UPDATE be the name of this transaction. Using embedded SQL, the transaction can be structured as follows:

Begin_transaction BUDGET_UPDATE
begin
  EXEC SQL UPDATE PROJ
           SET BUDGET = BUDGET * 1.1
           WHERE PNAME = "CAD/CAM"
end.

Example: An airline database with the relations:

- FLIGHT(FNO, DATE, SRC, DEST, STSOLD, CAP)
- CUST(CNAME, ADDR, BAL)
- FC(FNO, DATE, CNAME, SPECIAL)

Begin_transaction Reservation
begin
  input(flight_no, date, customer_name);
  EXEC SQL UPDATE FLIGHT
           SET STSOLD = STSOLD + 1
           WHERE FNO = flight_no AND DATE = date;
  EXEC SQL INSERT INTO FC(FNO, DATE, CNAME, SPECIAL)
           VALUES (flight_no, date, customer_name, null);
  output("reservation completed")
end. {Reservation}

Unit of Computing

In earlier database systems, there was no concept of "consistent execution" or "reliable computation" associated with the concept of a query. Two questions arise from the experience of database implementers and developers:

What happens if two queries attempt to update the same data item concurrently?

What happens when a system failure occurs during the execution of a query?

Thus, the concept of a transaction is used within the database domain as the basic unit of consistent and reliable computing. The related concepts that a database specialist needs are described below.

2/ Database Consistency
A database is in a consistent state if it obeys all of the consistency (integrity) constraints defined over it. State changes occur due to modifications, insertions, and deletions (together called updates). Ideally, a database should never enter an inconsistent state. An important property is that the database can be temporarily inconsistent during the execution of a transaction, but it should be consistent again when the transaction terminates.


3/ Transaction Consistency
Transaction consistency refers to the actions of concurrent transactions. The database should remain in a consistent state even if a number of user requests are concurrently accessing the database. A complication arises when replicated databases are considered.

4/ Replica Consistency
A replicated database is in a mutually consistent state if all the copies of every data item in it have identical values. This is called one-copy equivalence, since all replica copies are forced to assume the same state at the end of a transaction's execution.

5/ Reliability
Reliability refers both to the resiliency of a system to various system failures and to its capability to recover from them. A resilient system is tolerant of system failures and can continue to provide services even when failures occur. A recoverable DBMS is one that can get to a consistent state (by moving backward or forward) following various types of failures.

5.1/ Transaction Termination
A transaction always terminates, even when there are failures. If a transaction can complete its task successfully, we say the transaction commits; if a transaction stops without completing its task, we say it aborts.

5.2/ Transaction Aborts
A transaction may abort for a number of reasons. A transaction may abort itself because of a condition that would prevent it from completing its task successfully; additionally, the DBMS may abort a transaction due to, say, a deadlock or other conditions.

When a transaction is aborted, its execution is stopped and all of its already executed actions are undone by returning the database to the state prior to the transaction's execution; this is known as rollback.

5.3/ Transaction Commit
The importance of commit is twofold:

The commit command signals to the DBMS that the effects of the transaction should now be reflected in the database, thereby making them visible to other transactions.

The point at which the transaction is committed is a point of no return: the results of the committed transaction are permanently stored in the database and cannot be undone.


Example:

Begin_transaction Reservation
begin
    input(flight_no, date, customer_name);
    EXEC SQL SELECT STSOLD, CAP
        INTO temp1, temp2
        FROM FLIGHT
        WHERE FNO = flight_no AND DATE = date;
    if temp1 = temp2 then
        output("no free seats");
        Abort
    else
        EXEC SQL UPDATE FLIGHT
            SET STSOLD = STSOLD + 1
            WHERE FNO = flight_no AND DATE = date;
        EXEC SQL INSERT INTO FC(FNO, DATE, CNAME, SPECIAL)
            VALUES (flight_no, date, customer_name, null);
        Commit
        output("reservation completed")
    endif
end. {Reservation}

6/ Flat Transactions
Flat transactions have a single start point (Begin_transaction) and a single termination point (End_transaction). A flat transaction consists of a number of primitive operations embraced between the begin and end markers.

7/ Nested Transactions
The operations of a transaction may themselves be transactions; that is, several flat transactions may be embedded within an enclosing one:

Begin_transaction Reservation
    ...
    Begin_transaction Airline
        ...
    end. {Airline}
    Begin_transaction Hotel
        ...
    end. {Hotel}
end. {Reservation}

8/ Characterization of Transactions
The data items that a transaction reads are said to constitute its read set (RS). Similarly, the data items that a transaction writes are said to constitute its write set (WS). Finally, the union of the read set and write set of a transaction constitutes its base set (BS = RS ∪ WS). Example:

RS[Reservation] = {FLIGHT.STSOLD, FLIGHT.CAP}
WS[Reservation] = {FLIGHT.STSOLD, FC.FNO, FC.DATE, FC.CNAME, FC.SPECIAL}
BS[Reservation] = {FLIGHT.STSOLD, FLIGHT.CAP, FC.FNO, FC.DATE, FC.CNAME, FC.SPECIAL}
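These sets can be computed mechanically from a transaction's operation list; here is a small sketch in Python (the (action, item) encoding of operations is an assumption made for illustration):

def base_set(ops):
    # ops: list of (action, item) pairs, e.g. ("read", "FLIGHT.STSOLD")
    rs = {item for action, item in ops if action == "read"}    # read set
    ws = {item for action, item in ops if action == "write"}   # write set
    return rs | ws                                             # BS = RS U WS

reservation = [("read", "FLIGHT.STSOLD"), ("read", "FLIGHT.CAP"),
               ("write", "FLIGHT.STSOLD"), ("write", "FC.FNO")]
print(base_set(reservation))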


9/ Properties of Transactions
The consistency and reliability aspects of transactions are due to four properties, commonly referred to as the ACID properties of transactions.

9.1/ Atomicity
Atomicity refers to the fact that a transaction is treated as a single unit of operation: either all of the transaction's actions are completed, or none of them are. This is also called the all-or-nothing property. Atomicity requires that if the execution of a transaction is interrupted by any sort of failure, the DBMS is responsible for determining what to do with the transaction upon recovery from the failure. The transaction can either be terminated by completing its remaining actions, or be terminated by undoing all of the actions that have already been executed.

9.1.1/ Failure Classification

Generally, a transaction may fail in two ways. The first is transaction failure, due to input data errors, deadlocks, and other factors; in this case either the transaction aborts itself or the DBMS aborts it while handling deadlocks. Maintaining transaction atomicity in the presence of this type of failure is called transaction recovery.

9.1.2/ Crash recovery

The second is system crashes, such as storage media failures, processor failures, communication link breakages, power outages, and so on. Ensuring transaction atomicity in the presence of system crashes is called crash recovery.

9.2/ Consistency
The consistency of a transaction is simply its correctness. In other words, a transaction is a correct program that maps one consistent database state to another. Transaction consistency is ensured by semantic data control and concurrency control mechanisms.

9.2.1/ Consistency Classification

This classification groups databases into four levels of consistency. It uses the concept of dirty data, which refers to data values that have been updated by a transaction prior to its commitment. Based on the concept of dirty data, the four levels, called consistency degrees, are defined as follows:

+ Degree 0: Transaction T does not overwrite dirty data of other transactions.
+ Degree 1: Degree 0 holds, and T does not commit any writes before EOT (end of transaction).
+ Degree 2: Degree 1 holds, and T does not read dirty data from other transactions.
+ Degree 3: Degree 2 holds, and other transactions do not dirty any data read by T before T completes.


9.3/ Isolation
Isolation is the property of transactions which requires each transaction to see a consistent database at all times. In other words, an executing transaction cannot reveal its results to other concurrent transactions before its commitment. Based on the phenomena defined below (dirty read, fuzzy read, and phantom), the isolation levels are defined as:

+ Read Uncommitted: all three phenomena are possible for transactions operating at this level.
+ Read Committed: fuzzy reads and phantoms are possible, but dirty reads are not.
+ Repeatable Read: only phantoms are possible.
+ Anomaly Serializable: none of the phenomena is possible.

9.3.1/ Serializability

If several transactions are executed concurrently, the results must be the same as if they were executed serially in some order.

9.3.2/ Incomplete results

An incomplete transaction cannot reveal its results to other transactions before its commitment. This is necessary to avoid cascading aborts.


9.3.3/ Dirty Read

A dirty read refers to reading data items whose values have been modified by a transaction that has not yet committed. Suppose T1 modifies x, which is then read by T2 before T1 terminates; if T1 aborts, T2 has read a value that never existed in the database. Example:

…, W1(x), …, R2(x), …, C1 (or A1), …, C2 (or A2)
or
…, W1(x), …, R2(x), …, C2 (or A2), …, C1 (or A1)

9.3.4/ Non-repeatable or Fuzzy Read

Transaction T1 reads a data item value. Another transaction T2 then modifies or deletes that data item and commits. If T1 then attempts to reread the data item, it either reads a different value or it cannot find the data item at all; thus two reads within the same transaction T1 return different results.


Example:
…, R1(x), …, W2(x), …, C1 (or A1), …, C2 (or A2)
or
…, R1(x), …, W2(x), …, C2 (or A2), …, C1 (or A1)

9.3.5/ Phantom

The phantom condition occurs when T1 does a search with a predicate and T2 inserts new tuples that satisfy the predicate. Example:

…, R1(P), …, W2(y in P), …, C1 (or A1), …, C2 (or A2)
or
…, R1(P), …, W2(y in P), …, C2 (or A2), …, C1 (or A1)

9.4/ Durability
Durability refers to the property of transactions which ensures that once a transaction commits, its results are permanent and cannot be erased from the database. The DBMS therefore ensures that the results will survive subsequent system failures. The durability property brings forth the issue of database recovery, that is, how to recover the database to a consistent state in which all the committed actions are reflected.

Transaction Model

10/ Transaction Manager
On behalf of an application, the transaction manager (TM) is responsible for coordinating the execution of the database operations, recording the transaction name and the name of the originating application.

11/ Scheduler
The scheduler is responsible for the implementation of a specific concurrency control algorithm: it synchronizes database access by coordinating the various database operations with the data processors. (In Oracle, for comparison, running in archive mode keeps a record of all activities that have been performed.)

12/ Local Recovery Manager
A local recovery manager resides at each site and participates in the management of distributed transactions. It implements the local procedures by which the local database can be recovered to a consistent state following a failure.


12.1/ Transaction Management
Each transaction originates at one site, which is called its originating site. The execution of the database operations of a transaction is coordinated by the TM at that transaction's originating site. The TM implements an interface for the application programs.

12.2/ Application Program
Each transaction manager implements an interface for application programs which consists of five commands, as follows:

12.2.1/ Begin-transaction

This is an indicator to the TM that a new transaction is starting.

12.2.2/ Read

If the data item x is stored locally, its value is read and returned to the transaction. Otherwise, the TM selects one copy of x and requests that its value be returned.

12.2.3/ Write

The TM coordinates the updating of X’s value at each site where it resides.

12.2.4/ Commit

The TM coordinates the physical updating of all databases that contain copies of each data item for which a previous write was issued.

12.2.5/ Abort

The TM makes sure that no effects of the transaction are reflected in the database.

12.3/ Centralized Transaction Execution

12.4/ Distributed Transaction Execution


Chapter 5 Distributed Concurrency Control (DCC)

1/ CC in Distributed DBMS
Concurrency control deals with the isolation and consistency properties of transactions. The distributed DBMS ensures that the consistency of the database is maintained in a multi-user distributed environment. If transactions are internally consistent, the simplest way of achieving this objective is to execute each transaction alone, one after another.

2/ Key Issue
The level of concurrency, i.e., the number of concurrent transactions, is probably the most important parameter in distributed systems. Therefore, the concurrency control mechanism attempts to find a suitable trade-off between maintaining the consistency of the database and maintaining a high level of concurrency. Isolating transactions from one another in terms of their effects on the database is an important issue for a distributed DBMS. If the concurrent execution of transactions leaves the database in a state that could also be achieved by their serial execution in some order, problems such as lost updates are resolved.

3/ Serializability Theory

3.1/ Serializability If several transactions are executed concurrently, the results must be the same as if they were executed serially in some order.

3.2/ Schedule
A schedule S is defined over a set of transactions T = {T1, T2, …, Tn} and specifies an interleaved order of execution of these transactions' operations. A prefix of a complete schedule, containing only some of the operations and only some of the ordering relationships, is itself a schedule.

3.2.1/ Serial Schedule

A schedule S is serial if, for every transaction T participating in the schedule, all the operations of T are executed consecutively in the schedule.

3.2.2/ Serializable Schedule

A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions. Result equivalence:
- Two schedules are result equivalent if they produce the same final state of the database.
- However, two different schedules may accidentally produce the same final state.

3.3/ Problem
- Input: a schedule S created by a set of transactions T = {T1, T2, …, Tn}
- Output: determine whether S is serializable or non-serializable; if S is serializable, find a serial schedule that is equivalent to S.


Algorithm to Test a Schedule for Serializability by Lock
This algorithm looks only at the Lock and Unlock operations, from which it constructs a precedence graph (also called a serialization graph): a directed graph G = (N, E) that consists of a set of nodes N = {T1, T2, …, Tn} and a set of directed edges E = {e1, e2, …, em}. The edges can optionally be labeled with the name of the data item that led to creating the edge.

3.3.1/ Lock

All transactions indicate their intentions by requesting locks from the scheduler (called the lock manager). Locks are either read locks (rl), also called shared locks, or write locks (wl), also called exclusive locks. Locking works nicely to allow concurrent processing of transactions because a transaction locks an object before using it; when an object is locked by another transaction, the requesting transaction must wait; and when a transaction releases a lock, it may not request another lock. Note: read locks and write locks conflict, because Read and Write operations are incompatible:

            Read Lock   Write Lock
Read Lock   yes         no
Write Lock  no          no

3.3.2/ Cycle

A cycle in a directed graph is a sequence of edges C = (T1 → T2 → … → Tn-1 → Tn → T1) with the property that the starting node of each edge, except the first, is the same as the ending node of the previous edge, and the starting node of the first edge is the same as the ending node of the last edge.

3.3.3/ Linear order or Topo order

(1) Look for a node Ti in G that has no incoming directed edges. Delete Ti and its outgoing edges from G, and append Ti to the order. (2) Repeat step (1) until no node remains. The resulting sequence is a linear order, or topological (topo) order. Given a schedule S, we can then determine whether or not it is serializable.


Algorithm to Test a Schedule for Serializability by RLock, WLock
- Input: a schedule S created by a set of transactions T = {T1, T2, …, Tn}
- Output: determine whether S is serializable or non-serializable; if S is serializable, find a serial schedule that is equivalent to S.
There are some differences in how the edges are determined:

If transaction Ti in schedule S executes RLock(X) or WLock(X), and Tj (j ≠ i) is the next transaction to execute WLock(X), create an edge {Ti → Tj} in the precedence graph.

If transaction Ti executes WLock(X) in schedule S, and transaction Tm (m ≠ i) executes RLock(X) after Ti executes Unlock(X) but before any other transaction executes WLock(X), create an edge {Ti → Tm} in the precedence graph.

If G has a cycle, schedule S is non-serializable; otherwise S is serializable, and its linear (topological) order gives an equivalent serial schedule. A sketch of such a test follows.
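The sketch below, in Python, builds the precedence graph directly from the read/write operations rather than from explicit lock requests (an equivalent conflict-based formulation; the (txn, op, item) encoding is an assumption for illustration) and then attempts the topological ordering described above:

from collections import defaultdict

def precedence_graph(schedule):
    # schedule: time-ordered list of (txn, op, item), with op "r" or "w"
    edges = defaultdict(set)
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # a later conflicting op on the same item by another txn adds Ti -> Tj
            if x == y and ti != tj and "w" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def topo_order(edges, nodes):
    # repeatedly remove a node with no incoming edges (step (1) above);
    # if none exists while nodes remain, the graph contains a cycle
    order, remaining = [], set(nodes)
    while remaining:
        free = [n for n in remaining
                if all(n not in edges[m] for m in remaining if m != n)]
        if not free:
            return None                    # cycle: schedule is not serializable
        order.append(free[0])
        remaining.remove(free[0])
    return order                           # an equivalent serial order

S = [("T1", "r", "x"), ("T2", "w", "x"), ("T2", "r", "y"), ("T1", "w", "y")]
print(topo_order(precedence_graph(S), {t for t, _, _ in S}))  # None: T1 and T2 form a cycle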

4/ Taxonomy There are a number of ways that the concurrency control approaches can be classified. The concurrency control mechanisms can be classified into two broad classes.

4.1/ Pessimistic Concurrency Control
Pessimistic algorithms synchronize the concurrent execution of transactions early in their execution life cycle. The pessimistic group consists of locking-based algorithms, ordering-based algorithms, and hybrid algorithms.

4.2/ Optimistic Concurrency Control
Optimistic algorithms delay the synchronization of transactions until their termination. The optimistic group can likewise be classified as locking-based or timestamp-ordering-based.


4.2.1/ Lock based Approach

In the locking-based approach, the synchronization of transactions is achieved by employing physical or logical locks on some portion, or granule, of the database. The size of these portions (the locking granularity) is an important issue. There are three modes of lock-based synchronization:

4.2.1.1/ Centralized Locking
In centralized locking, one of the sites in the network is designated as the primary site. It is the site where the lock tables for the entire database are stored, and it is charged with the responsibility of granting locks to transactions.

4.2.1.2/ Primary Copy Locking
In primary copy locking, one of the copies of each lock unit is designated as the primary copy, and it is this copy that has to be locked for the purpose of accessing data. If the database is not replicated, the primary copy locking mechanism distributes the lock management among all the sites.

4.2.1.3/ Decentralized Locking
In decentralized locking, the lock management duty is shared by all the sites of the network. In this case the execution of a transaction involves the participation and coordination of schedulers at more than one site. Each local scheduler is responsible for the lock units local to that site.

4.2.2/ Time Stamp Ordering

The timestamp ordering (TO) class involves organizing the execution order of transactions so that they maintain mutual consistency. The ordering is maintained by assigning timestamps to both the transactions and the data items. The various types of timestamp ordering algorithms are as follows:

- Basic Timestamp Ordering
- Multiversion Timestamp Ordering
- Conservative Timestamp Ordering
- Hybrid Algorithms

5/ Locking-Based CC Algorithms
The main idea of locking-based concurrency control is to ensure that data shared by conflicting operations is accessed by one operation at a time. This is accomplished by associating a lock with each lock unit: a lock unit cannot be accessed by an operation if it is already locked by another. The lock is set by a transaction before the unit is accessed and is reset at the end of its use. Thus, a lock request by a transaction is granted only if the associated lock is not being held by any other transaction. There are two types of locks, commonly called lock modes: read lock (RLock) and write lock (WLock). A distributed DBMS not only manages locks but also handles the lock management responsibilities on behalf of the transactions; in other words, users do not need to specify when data needs to be locked, because the distributed DBMS takes care of that every time the transaction issues a read or write operation. In locking-based systems, the scheduler is the lock manager (LM). The transaction manager passes to the lock manager the database operation (read or write) and associated information (such as the item that is accessed and the identifier of the transaction that issues the database operation).


The lock manager then checks whether the lock unit that contains the data item is already locked. If so, and if the existing lock mode is incompatible with that of the current transaction, the current operation is delayed. Otherwise, the lock is set in the desired mode and the database operation is passed on to the data processor for actual database access. The transaction manager is then informed of the result of the operation. The termination of a transaction results in the release of its locks and the initiation of another transaction that might be waiting for access to the same data item. A toy sketch of this check follows.
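A minimal lock manager capturing the grant-or-delay check, written in Python; the class and method names are assumptions for illustration, and a real lock manager also handles lock upgrades, waiting queues, and latching:

class LockManager:
    COMPATIBLE = {("R", "R")}          # only read locks are mutually compatible

    def __init__(self):
        self.locks = {}                # item -> list of (txn, mode) holders

    def request(self, txn, item, mode):
        # grant only if no other txn holds an incompatible lock on the item
        for holder, held_mode in self.locks.get(item, []):
            if holder != txn and (held_mode, mode) not in self.COMPATIBLE:
                return False           # conflict: the operation is delayed
        self.locks.setdefault(item, []).append((txn, mode))
        return True                    # granted: pass the op to the data processor

    def release_all(self, txn):
        # called at transaction termination: drop all of txn's locks
        for item in list(self.locks):
            self.locks[item] = [(t, m) for t, m in self.locks[item] if t != txn]

lm = LockManager()
print(lm.request("T1", "x", "W"))      # True: lock granted
print(lm.request("T2", "x", "R"))      # False: RLock conflicts with the held WLock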

5.1/ 2PL (Two-Phase Locking)
The two-phase locking rule simply states that no transaction should request a lock after it releases one of its locks. Alternatively, a transaction should not release a lock until it is certain that it will not request another lock. 2PL algorithms execute transactions in two phases: each transaction has a growing phase, during which it obtains locks and accesses data items, and a shrinking phase, during which it releases its locks.

5.2/ Lock Point
The lock point is the moment when the transaction has obtained all its locks but has not yet started to release any of them. Thus, the lock point determines the end of the growing phase and the beginning of the shrinking phase of the transaction, and any schedule generated by a concurrency control algorithm that obeys the 2PL rule is serializable.

5.3/ Lock Graph
In this variant the lock manager releases each lock as soon as access to the corresponding data item has been completed. This permits other transactions awaiting access to go ahead and lock the item, thereby increasing the degree of concurrency; however, it is a much more difficult task for the concurrency control manager.

5.4/ Requisites
The lock manager has to know that the transaction has obtained all its locks and will not need to lock another data item. It also needs to know that the transaction no longer needs to access the data item in question, so that the lock can be released.

5.5/ Lock Compatibility
Two lock modes are compatible if two transactions that access the same data item can obtain these locks on that data item at the same time:

            RLock(X)         WLock(X)
RLock(X)    compatible       not compatible
WLock(X)    not compatible   not compatible

6/ Deadlock
A transaction is deadlocked if it is blocked and will remain blocked until there is intervention. For instance, if transaction Ti waits for another transaction Tj to release a lock on an entity, then there is an edge Ti → Tj in the wait-for graph (WFG). Any locking-based concurrency control algorithm may result in deadlocks, since there is mutual exclusion of access to shared resources (data) and transactions may wait on locks. Furthermore, TO-based algorithms that require the waiting of transactions may also cause deadlocks.
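A deadlock corresponds exactly to a cycle in the WFG, so detection reduces to a cycle search; a minimal sketch in Python (the graph encoding is an assumption for illustration):

def has_deadlock(wfg):
    # wfg: dict mapping a transaction to the set of transactions it waits for
    def on_cycle(node, path):
        if node in path:
            return True                         # revisited a node: cycle found
        return any(on_cycle(nxt, path | {node}) for nxt in wfg.get(node, ()))
    return any(on_cycle(t, frozenset()) for t in wfg)

# T1 waits for T2 and T2 waits for T1: a deadlock
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True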


7/ Why Deadlocks
A deadlock is a permanent phenomenon: if one exists in a system, it will not go away without outside intervention. A deadlock occurs because transactions wait for one another; informally, a deadlock situation is a set of requests that can never be granted by the concurrency control mechanism. The outside intervention may come from the user, the system operator, or the software system (the operating system or the distributed DBMS).

8/ Methods There are three known methods for handling deadlocks.

8.1/ Deadlock Prevention
Deadlock prevention methods guarantee that deadlocks cannot occur in the first place. The transaction manager checks a transaction when it is first initiated and does not permit it to proceed if it may cause a deadlock.

8.1.1/ Requirements

To perform this check, it is required that all the data items that will be accessed by the transaction be pre-declared. The TM then permits a transaction to proceed only if all the data items that it will access are available; otherwise, it is not permitted to proceed.

8.1.2/ Problem Area

Unfortunately, such systems are not suitable for database environments: it is very difficult to predict in advance which data items will be accessed by a transaction, since access to certain data may depend on conditions that are not resolved until run time. Pre-declaration would certainly reduce concurrency, and there is additional overhead in evaluating whether a transaction can proceed safely. On the other hand, with prevention it is never necessary to abort and restart a transaction due to deadlocks.

8.2/ Deadlock Avoidance
Deadlock avoidance schemes employ concurrency control techniques that never result in deadlocks: the schedulers detect potential deadlock situations in advance and ensure that they never occur.

8.2.1/ Methods

Avoiding deadlocks requires ordering the resources and insisting that each process request access to these resources in that order. Accordingly, the lock units in the distributed database are ordered, and transactions always request locks in that order. This ordering of lock units may be done either globally or locally at each site; in the latter case it is also necessary to order the sites, and transactions that access data items at multiple sites must visit the sites in the predefined order. Another alternative is to make use of transaction timestamps to prioritize transactions and resolve deadlocks by aborting some of the transactions involved. To implement this, the lock manager is modified as follows:

- If a lock request of a transaction Ti is denied, the lock manager does not automatically force Ti to wait.
- Instead, it applies a prevention test to the requesting transaction and to the transaction that currently holds the lock (Tj).
- If the test is passed, Ti is permitted to wait for Tj; otherwise, one transaction or the other is aborted.


1/ Wait-Die Rule: If Ti requests a lock on a data item that is already locked by Tj, Ti is permitted to wait if and only if Ti is older than Tj. If Ti is younger than Tj, then Ti is aborted (dies) and restarted with the same timestamp.

begin
    Ti requests a lock on a data item currently held by Tj
    if ts(Ti) < ts(Tj) then    {Ti is older than Tj}
        Ti waits for Tj
    else
        Ti dies (is rolled back)
    end-if
end

2/ Wound-Wait Rule: If Ti requests a lock on a data item that is already locked by Tj, then Ti is permitted to wait if and only if it is younger than Tj. Otherwise, Tj is aborted (wounded) and the lock is granted to Ti.

begin
    Ti requests a lock on a data item currently held by Tj
    if ts(Ti) < ts(Tj) then    {Ti is older than Tj}
        Tj is wounded (rolled back)
    else
        Ti waits for Tj
    end-if
end
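Both rules compare the same pair of timestamps and differ only in which transaction is sacrificed; a compact sketch in Python (smaller timestamp means older):

def wait_die(ts_i, ts_j):
    # Ti requests a lock held by Tj: only an older requester may wait
    return "Ti waits" if ts_i < ts_j else "Ti dies (restarted with same timestamp)"

def wound_wait(ts_i, ts_j):
    # Ti requests a lock held by Tj: an older requester wounds the holder
    return "Tj is wounded (rolled back)" if ts_i < ts_j else "Ti waits"

print(wait_die(1, 2), "|", wound_wait(1, 2))   # older requester: waits | wounds Tj
print(wait_die(2, 1), "|", wound_wait(2, 1))   # younger requester: dies | waits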

8.3/ Deadlock Detection and Resolution
Deadlock detection and resolution is the most popular and best-studied method. Detection is done by examining the GWFG (global wait-for graph) for the formation of cycles.

8.3.1/ Resolution

Resolution of deadlocks is done by selecting one or more victim transactions that will be preempted and aborted in order to break the cycles in the GWFG. The cost of preempting each member of a set of deadlocked transactions is calculated, and the one whose cost is minimum is selected as the victim.

8.3.2/ Cost Factor

The cost factors include:
- The amount of effort that has been invested in the transaction. This effort is lost if the transaction is aborted, and this cost generally depends on the number of updates the transaction has already performed.
- The amount of effort it will take to finish executing the transaction. The scheduler wants to avoid aborting a transaction that is almost finished.
- The number of cycles that contain the transaction. Since aborting a transaction breaks all cycles that contain it, it is best to abort transactions that are part of more than one cycle.


8.3.3/ Deadlock Detection

There are three fundamental methods of detecting distributed deadlocks. They are commonly called:

1- Centralized Deadlock Detection
In the centralized deadlock detection approach, one site is designated as the deadlock detector for the entire system. Periodically, each lock manager transmits its LWFG (local wait-for graph) to the deadlock detector, which then forms the GWFG and looks for cycles in it. Centralized deadlock detection has been proposed for distributed INGRES. This method is simple and natural if the concurrency control algorithm is centralized 2PL.

2- Hierarchical Deadlock Detection
An alternative is to build a hierarchy of deadlock detectors. Deadlocks that are local to a single site are detected at that site using the local WFG. Each site also sends its local WFG to the deadlock detector at the next level; thus, a distributed deadlock involving two or more sites is detected by the lowest-level deadlock detector that has control over those sites. The hierarchical method reduces the dependence on a central site, thus reducing the communication cost. Note: it is, however, more complicated to implement and would involve nontrivial modifications to the lock and transaction manager algorithms.

3- Distributed Deadlock Detection
Distributed deadlock detection algorithms delegate the responsibility of detecting deadlocks to the individual sites. Thus, there are local deadlock detectors at each site, which communicate their local WFGs with one another.

9/ Methodology
The local WFG at each site is formed and then modified as follows: since each site receives the potential deadlock cycles from other sites, the corresponding edges are added to its local WFG; and the edges in the local WFG showing that local transactions are waiting for transactions at other sites are joined with the edges showing that remote transactions are waiting for local ones.

10/ Detection
Local deadlock detectors look for two things. First, if there is a cycle that does not include the external edges, there is a local deadlock that can be handled locally. Second, if there is a cycle involving the external edges, there is a potential distributed deadlock, and the cycle information has to be communicated to the other deadlock detectors.

11/ Key Issues
The distributed deadlock detection algorithms require uniform modification of the lock managers at each site. However, there is a potential for excessive message transmission, and each site may choose a different victim to abort.


Solution: Let the path that has the potential of causing a distributed deadlock in the LWFG of a site be Ti → … → Tj. A local deadlock detector forwards the cycle information only if ts(Ti) < ts(Tj). This reduces the average number of message transmissions by one half.


Chapter 6 Distributed Reliability
The reliability of a distributed processing system (DPS) can be expressed through the analysis of distributed program reliability (DPR) and distributed system reliability (DSR). One good approach to formulating these reliability performance indexes is to generate all disjoint file spanning trees (FSTs) in the DPS graph, so that the DPR and DSR can be expressed as the probability that at least one of these FSTs is working. A unified algorithm has been proposed in the literature that efficiently generates disjoint FSTs by cutting different links; the DPR and DSR are then computed by a simple and consistent union operation on the probability space of the FSTs, and related DPS reliability problems are discussed. To speed up the reliability evaluation, node merging and series and parallel reduction concepts are incorporated in the algorithm. Based on a comparison of the number of subgraphs (or FSTs) generated by that algorithm and by existing evaluation algorithms, it was concluded to be much more economical in time and space than the existing algorithms. Such reliability networks frame this chapter; for database maintainability, transactions ought to preserve atomicity and durability.

1/ Fundamental Definitions
» How to maintain the atomicity and durability of transactions

1.1/ Reliability
- A measure of success with which a system conforms to some authoritative specification of its behavior.
- The probability that the system has not experienced any failures within a given time period.
- Typically used to describe systems that cannot be repaired or where the continuous operation of the system is critical.

1.2/ Availability
- The fraction of the time that a system meets its specification.
- The probability that the system is operational at a given time t.

1.3/ Failure
The deviation of a system from the behavior described in its specification is called a failure. There are four main types of failures:

1.3.1/ Transaction failures

This failure mostly occurs when a transaction aborts, for example due to deadlock. Research on database reliability suggests that about 3% of transactions abort abnormally.

1.3.2/ System failures

These are normally failures of the processor, main memory, or power supply. Main memory contents are lost, but secondary storage contents remain safe.

1.3.3/ Media failures
Failures of secondary storage devices, in which the stored data is lost; examples are a head crash or a controller failure.

1.3.4/ Communication failures

These are sometimes called technical failures; they are generally caused by lost or undeliverable messages or by network partitioning.


1.4/ Erroneous state
The internal state of a system is erroneous if there exist circumstances in which further processing, by the normal algorithms of the system, will lead to a failure that is not attributed to a subsequent fault. This state involves two notions:
- Error: the part of the state which is incorrect.
- Fault: an error in the internal states of the components of a system or in the design of a system.

2/ Distributed Reliability Protocols
Generally, the reliability techniques of a distributed database system consist of commit protocols, termination protocols, and recovery protocols.

2.1/ Commit protocols
How to commit a transaction properly when more than one site is involved in the commitment; this differs from the centralized DB case. The issue is how to ensure atomicity and durability.

2.2/ Termination protocols
Designed for the sites that are affected by a failed site: they tell a site how to commit or abort properly when another site fails, and how the remaining operational sites deal with the failure.
Non-blocking: the occurrence of failures should not force the sites to wait until the failure is repaired in order to terminate the transaction.

2.3/ Recovery protocols
Address the recovery procedure for a failed site once it restarts; they are the opposite of the termination protocols. When a failure occurs, they specify how the site where the failure occurred deals with it.
Independent: a failed site can determine the outcome of a transaction without having to obtain remote information.

3/ Two-Phase Commit Protocol
2PC ensures the atomic commitment of a distributed transaction. 2PC involves one coordinator at the originating site and one or more participants at the other sites.

Phase 1: The coordinator gets the participants ready to write the results into the database.
Phase 2: Everybody writes the results into the database.
Coordinator: the process at the site where the transaction originates and which controls the execution.
Participant: a process at one of the other sites that participate in executing the transaction.

Global Commit Rule:
1- The coordinator aborts a transaction if and only if at least one participant votes to abort it.
2- The coordinator commits a transaction if and only if all of the participants vote to commit it.
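The global commit rule can be sketched as a coordinator loop in Python; the messages and participant votes are simulated in memory, which is an assumption for illustration, and a real implementation also force-writes log records at each state transition:

def two_phase_commit(participants):
    # Phase 1: send "prepare" and collect the votes
    votes = [p.prepare() for p in participants]
    decision = "GLOBAL-COMMIT" if all(v == "yes" for v in votes) else "GLOBAL-ABORT"
    # Phase 2: broadcast the decision; everybody writes the result
    for p in participants:
        p.commit() if decision == "GLOBAL-COMMIT" else p.abort()
    return decision

class Participant:
    def __init__(self, vote):
        self.vote = vote
    def prepare(self):
        return self.vote          # answering "yes" means entering READY
    def commit(self):
        pass                      # write the results into the local database
    def abort(self):
        pass                      # undo any local effects

print(two_phase_commit([Participant("yes"), Participant("yes")]))  # GLOBAL-COMMIT
print(two_phase_commit([Participant("yes"), Participant("no")]))   # GLOBAL-ABORT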


(Figure: 2PC state transition diagrams for the coordinator and a participant.)

Observations:

1. A participant can unilaterally abort before it answers "yes".
2. Once a participant answers "yes", it must prepare for commit and cannot change its vote.
3. While a participant is in the READY state, it can either abort or commit, depending on the decision of the coordinator.
4. The global decision is commit if all participants vote "yes", or abort if any participant votes "no".
5. The coordinator and participants may be in some waiting state; a time-out method can be used to exit.

4/ State Transitions in 2PC

4.1/ Site Failures - 2PC Termination
The main cases are:
+ Coordinator:
- Timeout in INITIAL: nothing needs to be done.
- Timeout in WAIT: the coordinator cannot unilaterally commit, but it can unilaterally abort.
- Timeout in ABORT or COMMIT: stay blocked and wait for the acknowledgements.


+ Participants:
- Timeout in INITIAL: the coordinator must have failed in the INITIAL state, so unilaterally abort.
- Timeout in READY: stay blocked.

(Figure: 2PC termination protocols for the coordinator and the participants.)

4.2/ Site Failures - 2PC Recovery
+ Coordinator:
- Failure in INITIAL: start the commit process upon recovery.
- Failure in WAIT: restart the commit process upon recovery.
- Failure in ABORT or COMMIT: nothing special if all the acks have been received; otherwise, the termination protocol is invoked.

+ Participants:
- Failure in INITIAL: unilaterally abort upon recovery.
- Failure in READY: the coordinator has been informed about the local decision; treat it as a timeout in the READY state and invoke the termination protocol.
- Failure in ABORT or COMMIT: nothing special needs to be done.

(Figure: 2PC recovery protocols for the coordinator and the participants.)


4.3/ 2PC Recovery Protocols – Additional Cases

These cases arise due to the non-atomicity of the log write and message send actions:

- The coordinator site fails after writing the "begin_commit" log record but before sending the "prepare" command: treat it as a failure in the WAIT state; send the "prepare" command upon recovery.
- A participant site fails after writing the "ready" record in the log but before the "vote-commit" is sent: treat it as a failure in the READY state; alternatively, the participant can send "vote-commit" upon recovery.
- A participant site fails after writing the "abort" record in the log but before the "vote-abort" is sent: nothing needs to be done upon recovery.
- The coordinator site fails after logging its final decision record but before sending its decision to the participants: the coordinator treats it as a failure in the COMMIT or ABORT state; the participants treat it as a timeout in the READY state.
- A participant site fails after writing the "abort" or "commit" record in the log but before the acknowledgement is sent: the participant treats it as a failure in the COMMIT or ABORT state; the coordinator handles it by timeout in the COMMIT or ABORT state.

4.4/ Variations of 2PC
These variations aim to reduce the number of messages and the number of times the logs are force-written:
- Presumed abort protocol
- Presumed commit protocol

4.4.1/ Presumed abort protocol

When the coordinator is asked by a participant about a transaction's outcome and there is no information about the transaction's status, the response is abort. Each site can therefore forget about a transaction immediately after it decides to abort it:
o abort ACKs are not necessary
o the abort log record need not be forced
at both the coordinator and the participants.

4.4.2/ Presumed commit protocol

If no information about the transaction exists at the coordinator, the transaction should be considered committed. When the coordinator decides to global-commit, it writes a commit log record, sends the global-commit, and forgets about the transaction; commit ACKs are not necessary. ACKs are required only for abort: commit log records need not be forced by the participants, while both commit and abort log records are forced by the coordinator.

The coordinator force-writes a participant list prior to sending the prepare message
» this is for making sure of a global abort
The coordinator can forget about the transaction after
» sending a global-commit, or
» getting all the ACKs for a global abort
Presumed commit is not an exact dual of presumed abort
» if it were a dual, COMMIT log records at the coordinator would not be forced


4.5/ Problem with 2PC: Blocking

o READY implies that the participant waits for the coordinator.
o If the coordinator fails, the site is blocked until recovery.
o Blocking reduces availability.

Independent recovery is not possible with 2PC. Moreover, it is known that independent recovery protocols exist only for single-site failures; no independent recovery protocol exists that is resilient to multiple-site failures. So we search for such protocols – 3PC.

5/ Three-Phase Commit
3PC is non-blocking. A commit protocol is non-blocking if it is synchronous within one state transition, and its state transition diagram contains:
- no state which is "adjacent" to both a commit and an abort state, and
- no non-committable state which is "adjacent" to a commit state.
Adjacent: it is possible to go from one state to the other with a single state transition.
Committable: all sites have voted to commit the transaction (e.g., the COMMIT state).

6/ Quorum Protocols for Replicated Databases
Network partitioning is handled by the replica control protocol. One implementation:
- Assign a vote to each copy of a replicated data item (say Vi), such that Σi Vi = V.
- Each operation has to obtain a read quorum (Vr) to read, and a write quorum (Vw) to write, a data item.
Then the following rules have to be obeyed in determining the quorums:
- Vr + Vw > V: a data item is not read and written by two transactions concurrently.
- Vw > V/2: two write operations from two transactions cannot occur concurrently on the same data item.
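A tiny sketch in Python that checks whether a given vote assignment satisfies the two quorum rules (the function name and encoding are assumptions for illustration):

def valid_quorums(votes, v_r, v_w):
    # votes: the vote Vi assigned to each copy; V is the total
    v = sum(votes)
    read_write_excl = v_r + v_w > v       # rule 1: no concurrent read and write
    write_write_excl = v_w > v / 2        # rule 2: no two concurrent writes
    return read_write_excl and write_write_excl

print(valid_quorums([1, 1, 1], v_r=2, v_w=2))   # True:  2+2 > 3 and 2 > 1.5
print(valid_quorums([1, 1, 1], v_r=1, v_w=2))   # False: 1+2 = 3 is not > 3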

7/ Network Partitioning
A simple modification of the ROWA (read-one/write-all) rule: when the replica control protocol attempts to read or write a data item, it first checks whether a majority of the sites are in the same partition as the site the protocol is running on (by counting their votes). If so, it executes the ROWA rule within that partition.
This assumes that failures are "clean", which means:
- failures that change the network's topology are detected by all sites instantaneously, and
- each site has a view of the network consisting of all the sites it can communicate with.

8/ Open Problems
Replication protocols:
- experimental validation
- replication of computation and communication
Transaction models: changing requirements
- cooperative sharing vs. competitive sharing
- interactive transactions
- longer duration
- complex operations on complex data
- relaxed semantics: non-serializable correctness criteria


9/ In-Place Update Recovery Information
Every action of a transaction must not only perform the action but must also write a log record to an append-only file called the database log.

Logging: the log contains information used by the recovery process to restore the consistency of the system. This information may include:
- the transaction identifier
- the type of operation (action)
- the items accessed by the transaction to perform the action
- the old value (state) of the item (the before image)
- the new value (state) of the item (the after image)

9.1/ Why Logging?
Upon recovery (assuming T1 committed before the failure and T2 did not):
- all of T1's effects should be reflected in the database (REDO if necessary, due to a failure)
- none of T2's effects should be reflected in the database (UNDO if necessary)

9.1.1/ REDO Protocol

REDOing an action means performing it again. The REDO operation uses the log information and performs the action, which might have been done before or might not have been done due to failures. The REDO operation generates the new image.

9.1.2/ UNDO Protocol

UNDOing an action means restoring the object to its before image. The UNDO operation uses the log information and restores the old value of the object.

9.2/ When to Write Log Records into Stable Store
Assume a transaction T updates a page P.
Fortunate case:
- The system writes P into the stable database.
- The system updates the stable log for this update.
- SYSTEM FAILURE OCCURS! (before T commits)
We can recover (undo) by restoring P to its old state using the log.
Unfortunate case:
- The system writes P into the stable database.
- SYSTEM FAILURE OCCURS! (before the stable log is updated)
We cannot recover from this failure, because there is no log record with which to restore the old value.
Solution: the Write-Ahead Log (WAL) protocol.


9.2.1/ Note:
- If the system crashes before a transaction is committed, then all of its operations must be undone; for this, only the before images are needed (the undo portion of the log).
- Once a transaction is committed, some of its actions might have to be redone; for this, the after images are needed (the redo portion of the log).

9.2.2/ WAL protocol:
- Before a page of the stable database is updated, the undo portion of the log must be written to the stable log.
- When a transaction commits, the redo portion of the log must be written to the stable log prior to the updating of the stable database.
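The WAL ordering can be sketched as follows in Python; in-memory lists stand in for stable storage, so this only illustrates the ordering of the two rules, not a real recovery manager:

stable_log, stable_db = [], {}

def wal_update(txn, page, old, new):
    # Rule 1: the undo record reaches the stable log BEFORE the page is written
    stable_log.append(("undo", txn, page, old))
    stable_db[page] = new

def wal_commit(txn, redo_records):
    # Rule 2: redo records are forced to the stable log before commit completes
    stable_log.extend(("redo", txn, page, new) for page, new in redo_records)
    stable_log.append(("commit", txn))

wal_update("T1", "P", old=10, new=11)
wal_commit("T1", [("P", 11)])
print(stable_log)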

9.2.3/ Logging Interface

10/ Out-of-Place Update Recovery Information

10.1/ Shadowing
When an update occurs, the old page is not changed; instead, a shadow page with the new values is created and written into the stable database. The access paths are updated so that subsequent accesses go to the new shadow page, and the old page is retained for recovery.

10.2/ Differential files
For each file F, maintain:
- a read-only part FR
- a differential file consisting of an insertions part DF+ and a deletions part DF−
Thus, F = (FR ∪ DF+) − DF−.
Updates are treated as: delete the old value, insert the new value.
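A sketch of materializing such a file in Python, following F = (FR ∪ DF+) − DF−, with files modeled as sets of (key, value) pairs (an assumption for illustration):

def current_file(fr, df_plus, df_minus):
    # F = (FR U DF+) - DF-
    return (fr | df_plus) - df_minus

fr = {("k1", "old"), ("k2", "b")}          # read-only part
df_plus = {("k1", "new")}                  # update: new value inserted ...
df_minus = {("k1", "old")}                 # ... and old value deleted
print(current_file(fr, df_plus, df_minus)) # {("k1", "new"), ("k2", "b")}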

11/ Execution Strategies
The execution strategy depends on two decisions:
- Can the buffer manager decide to write some of the buffer pages being accessed by a transaction into stable storage, or does it wait for the LRM to instruct it? (the fix/no-fix decision)
- Does the LRM force the buffer manager to write certain buffer pages into the stable database at the end of a transaction's execution? (the flush/no-flush decision)
Possible execution strategies:
- no-fix/no-flush
- no-fix/flush
- fix/no-flush
- fix/flush


11.1/ No-Fix/No-Flush

Abort:
- The buffer manager may have written some of the updated pages into the stable database.
- The LRM performs transaction undo (or partial undo).
Commit:
- The LRM writes an "end_of_transaction" record into the log.
Recover:
- For those transactions that have both a "begin_transaction" and an "end_of_transaction" record in the log, a partial redo is initiated by the LRM.
- For those transactions that have only a "begin_transaction" record in the log, a global undo is executed by the LRM.

11.2/ No-Fix/Flush

Abort:
- The buffer manager may have written some of the updated pages into the stable database.
- The LRM performs transaction undo (or partial undo).
Commit:
- The LRM issues a flush command to the buffer manager for all updated pages.
- The LRM writes an "end_of_transaction" record into the log.
Recover:
- There is no need to perform redo.
- Perform global undo.

11.3/ Fix/No-Flush

Abort:
- None of the updated pages have been written into the stable database.
- Release the fixed pages.
Commit:
- The LRM writes an "end_of_transaction" record into the log.
- The LRM sends an unfix command to the buffer manager for all pages that were previously fixed.
Recover:
- Perform partial redo.
- There is no need to perform global undo.


11.4/ Fix/Flush

Abort:
- None of the updated pages have been written into the stable database.
- Release the fixed pages.
Commit (the following have to be done atomically):
- The LRM issues a flush command to the buffer manager for all updated pages.
- The LRM sends an unfix command to the buffer manager for all pages that were previously fixed.
- The LRM writes an "end_of_transaction" record into the log.
Recover:
- There is no need to do anything.

12/ Checkpoints
Checkpoints simplify the task of determining which actions of which transactions need to be undone or redone when a failure occurs. A checkpoint record contains a list of the active transactions. The steps are:
- Write a begin_checkpoint record into the log.
- Collect the checkpoint data into stable storage.
- Write an end_checkpoint record into the log.
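Continuing the in-memory logging sketch from above, the three steps might look like this in Python (active_txns and the dirty-page flush callback are assumptions for illustration):

def take_checkpoint(log, active_txns, flush_dirty_pages):
    log.append(("begin_checkpoint", sorted(active_txns)))  # step 1
    flush_dirty_pages()           # step 2: collect checkpoint data to stable store
    log.append(("end_checkpoint",))                        # step 3

log = []
take_checkpoint(log, {"T7", "T9"}, flush_dirty_pages=lambda: None)
print(log)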


IV/ APPLIED METHOD ON ORACLE
As this paper mentions, a distributed DBMS should provide a number of features which make the distributed nature of the DBMS transparent to the user. These include the following:

Location transparency Replication transparency Performance transparency Transaction transparency Catalog transparency

Based on my experience in database administration, I would like to introduce a good example of a database whose architecture and performance follow the relational method: the Oracle database. I hope that, after completing this paper, the reader will be able to:

Install, create, and administer Oracle Database 10g Configure the database for an application Employ basic monitoring procedures Implement a backup and recovery strategy Move data between databases and files

1/ Oracle Database Architecture An Oracle server is a database management system that provides an open, comprehensive, integrated approach to information management. It consists of an Oracle instance and an Oracle database. Oracle 10g requirements:

Memory: 1 GB for the instance with Database Control
Disk space:
o 1.5 GB of swap space
o 400 MB of disk space in the /tmp directory
o Between 1.5 GB and 3.5 GB for the Oracle software
o 1.2 GB for the preconfigured database (optional)
o 2.4 GB for the flash recovery area (optional)

Memory Components
The Shared/System Global Area (SGA) is shared among several components, as follows:

1. Shared pool: the size of the shared pool is maintained by an initialization parameter called shared_pool_size. The larger the shared pool, the better the performance will generally be. The shared pool is divided into two parts: the library cache and the data dictionary cache. Note: correct configuration of the shared pool affects performance.
2. Buffer cache (db_cache_size): it is basically made up of buffers. The parameters that determine the buffers are db_block_size (e.g., 8 KB) and db_cache_size (e.g., 80 MB).
3. Log buffer: sized by the log_buffer parameter.
4. Java pool: java_pool_size.
5. Streams pool: streams_pool_size; used for Oracle Streams.
6. Redo log buffer: used for instance recovery.


There are five mandatory background processes. If any of the following processes is killed or goes down, the Oracle database stops working:

1. DBWR (database writer)
2. LGWR (log writer)
3. SMON (system monitor)
4. PMON (process monitor)
5. CKPT (checkpoint)

Note: Tablespaces consist of one or more data files; a data file belongs to only one tablespace. The SYSTEM and SYSAUX tablespaces are mandatory tablespaces; they are created at the time of database creation, and they must be online.
- The SYSTEM tablespace is used for core functionality (for example, the data dictionary tables).
- The auxiliary SYSAUX tablespace is used for additional database components (such as the Enterprise Manager repository).
Segments exist within a tablespace. Segments are made up of a collection of extents. Extents are a collection of data blocks. Data blocks are mapped to disk blocks.
Memory structures:

o System Global Area (SGA): database buffer cache, redo buffer, and various pools
o Program Global Area (PGA)
Process structures:
o User process and server process
o Background processes: SMON, PMON, DBWn, CKPT, LGWR, ARCn, and so on
Storage structures:
o Logical: database, schema, tablespace, segment, extent, and Oracle block
o Physical: files for data, parameters, and redo, and OS blocks



2/ Tasks of an Oracle Database Administrator A prioritized approach for designing, implementing, and maintaining an Oracle database involves the following tasks:

1. Evaluating the database server hardware
2. Installing the Oracle software
3. Planning the database and security strategy
4. Creating, migrating, and opening the database
5. Backing up the database
6. Enrolling system users and planning for their Oracle Network access
7. Implementing the database design
8. Recovering from database failure
9. Monitoring database performance

Tools Used to Administer an Oracle Database:

- Oracle Universal Installer
- Database Configuration Assistant
- Database Upgrade Assistant
- Oracle Net Manager
- Oracle Enterprise Manager
- SQL*Plus and iSQL*Plus
- Recovery Manager
- Oracle Secure Backup
- Data Pump
- SQL*Loader
- Command-line tools

Setting Environment Variables:

- ORACLE_BASE: the base of the Oracle directory structure for OFA
- ORACLE_HOME: the directory containing the Oracle software
- ORACLE_SID: the initial instance name (by default, ORCL)
- NLS_LANG: the language, territory, and client character set settings


3/ Database Planning As a DBA, you must plan:

The logical storage structure of the database and its physical implementation:
o How many disk drives do you have for this?
o How many data files will you need? (Plan for growth.)
o How many tablespaces will you use?
o Which type of information will be stored?
o Are there any special storage requirements due to type or size?
The overall database design
A backup strategy for the database

4/ Oracle management framework The three components of the Oracle Database 10g management framework are:

- Database instance
- Listener
- Management interface:
  o Database Control
  o Management agent (when using Grid Control)


Dynamic Performance Views

These views are owned by the SYS user. Different views are available at different times:
o after the instance has been started
o after the database is mounted
o after the database is open
You can query V$FIXED_TABLE to see all the view names. These views are often referred to as "v-dollar views". Read consistency is not guaranteed on these views, because the data is dynamic.

In Oracle, SQL*Plus and iSQL*Plus provide additional interfaces to your database: they let you perform database management operations and execute SQL commands to query, insert, update, and delete data in your database. SQL*Plus is a command-line tool used interactively or in batch mode. iSQL*Plus is not a command-line tool; it is a web-based interface.


Example: Using iSQL*Plus to Start Up instance

Example: Using SQL*Plus to Start Up and Shut Down

[oracle@EDRSR9P1 oracle]$ sqlplus dba1/oracle as sysdba

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  285212672 bytes
Fixed Size                  1218472 bytes
Variable Size             250177624 bytes
Database Buffers           33554432 bytes
Redo Buffers                 262144 bytes
Database mounted.
Database opened.
SQL>


After creating a database, an instance, and a listener, the sysdba user has to create the database owner user, and then create the initialization parameter file. The following are the steps to create a database manually on Windows:

1. Create folders for the database files:
   1. data1 (disk 1)
   2. data2 (disk 2)
   3. cdump (core dump)
   4. bdump (background dump)
   5. udump (user dump)
   6. backup (for backup and recovery)
2. Create the service (oradim) * only for Windows
3. Configure the pfile
4. Start up the instance
5. Create the database
6. Create the data dictionary views (@oracle_home\rdbms\admin\catalog.sql) - as sysdba
7. Create the built-in PL/SQL packages (@oracle_home\rdbms\admin\catproc.sql) - as sysdba
8. Create the user profile information (@oracle_home\sqlplus\admin\pupbld.sql) - as system/manager

Initialization Parameter Files
Example: create a sample pfile, initdb1.ora:

db_name = db1
shared_pool_size = 100m
db_cache_size = 120m
log_buffer = 5000000
background_dump_dest = d:\bdump
user_dump_dest = d:\udump
core_dump_dest = d:\cdump
control_files = d:\data1\control01.ctl


Example steps:
* Switch to the target database, since there may be many Oracle instances running:

C:\>set oracle_sid=dba1
C:\>sqlplus / as sysdba

* Start up the instance only, with no mount or open:

sql> startup nomount
>> This gives an error, since it cannot locate initdba1.ora.

sql> startup nomount pfile=d:\data1\initdba1.ora
>>> Successful. Then create the database manually:

sql> create database dba1
  2  datafile 'd:\data1\system01.dbf' size 200m
  3  logfile group 1 'd:\data1\log1a.rdo' size 5m,
  4          group 2 'd:\data1\log2a.rdo' size 5m
  5  sysaux datafile 'd:\data2\sysaux01.dbf' size 80m;
>>> Database created.

*** Note: the larger the size, the longer it takes to create the database, because the space has to be allocated on the physical disk.

sql> select name from v$database;

NAME
--------
DBA1

sql> show parameter shared_pool;

* Run oracle_home\rdbms\admin\catalog.sql:
sql> @d:\oracle\product\10.2.0\db_1\rdbms\admin\catalog.sql

* Run oracle_home\rdbms\admin\catproc.sql:
sql> @d:\oracle\product\10.2.0\db_1\rdbms\admin\catproc.sql

* Run the user profile information script, connected as system:
sql> conn system/manager
sql> @d:\oracle\product\10.2.0\sqlplus\admin\pupbld.sql


sql>select group#, member from v$logfile;
*** Note: each member is a mirrored copy of its group's log file.

sql>alter database add logfile member
  2> 'd:\data2\log1b.rdo' to group 1,
  3> 'd:\log2b.rdo' to group 2;
>>> Database altered.

sql>select group#, member from v$logfile;

* Switch LGWR to write to a different log group, because a member of the current group cannot be dropped:

sql>alter system switch logfile;
sql>alter database drop logfile member 'd:\log2b.rdo';

sql>alter database add logfile member
  2> 'd:\data2\log2b.rdo' to group 2;

sql>select group#, member from v$logfile;

RESULT:

GROUP#  MEMBER
1       D:\DATA1\LOG1A.RDO
2       D:\DATA1\LOG2A.RDO
1       D:\DATA2\LOG1B.RDO
2       D:\DATA2\LOG2B.RDO

>>> This shows that the log files are now mirrored to another location, D:\DATA2.

* Connect as SYSDBA to shut down and restart the instance and database:

sql>conn / as sysdba
sql>shutdown immediate

* Create a second control file by copying d:\data1\control01.ctl to d:\data2\control02.ctl (copy and paste).
* Modify the 'control_files' parameter in initdba1.ora:

control_files=d:\data1\control01.ctl, d:\data2\control02.ctl

* Start up the instance and the database using 'startup' alone:

sql>startup
>>> ORACLE instance started.
sql>select name from v$controlfile;
NAME
--------
d:\data1\control01.ctl
d:\data2\control02.ctl


*** Note: the control files must be identical in size; otherwise, there is a problem.

[Figure: Database startup stages. STARTUP takes the database from SHUTDOWN through the following states:
SHUTDOWN -> NOMOUNT: instance started
NOMOUNT -> MOUNT: control file opened for this instance
MOUNT -> OPEN: all files opened as described by the control file for this instance]

4.1/ Startup command
In Oracle, there are two ways to start up an instance: STARTUP (a normal instance startup) and STARTUP FORCE (restarts the instance without a clean shutdown; internally it performs a SHUTDOWN ABORT followed by a STARTUP).

4.2/ Shutdown command
• shutdown [normal]: waits until all users disconnect
• shutdown transactional: waits until users complete their current transactions
• shutdown immediate: terminates user sessions and rolls back uncommitted transactions
• shutdown abort: terminates user sessions immediately; files are not closed properly

Note: on the first startup after a 'shutdown abort', Oracle performs instance recovery automatically: the committed, logged transactions recorded in the redo log files are written into the data files, and uncommitted changes are rolled back.

4.3/ How table data is stored

[Figure: How table data is stored. A database consists of tablespaces, and each tablespace is made up of one or more data files.]
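To see how tablespaces map to data files, the data dictionary can be queried (a minimal sketch; DBA_DATA_FILES is the standard dictionary view for this):

SQL> SELECT tablespace_name, file_name, bytes FROM dba_data_files;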


4.4/ Automatic Storage Management
In Oracle, Automatic Storage Management (ASM):

• Is a portable, high-performance cluster file system
• Manages Oracle database files
• Spreads data across disks to balance load
• Mirrors data
• Solves many storage management challenges

ASM Key Features and Benefits:
• Stripes files, but not logical volumes
• Provides online disk reconfiguration and dynamic rebalancing
• Allows for adjustable rebalancing speed
• Provides redundancy on a per-file basis
• Supports only Oracle database files
• Is cluster aware
• Is automatically installed
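For illustration, an ASM disk group could be created as follows (a minimal sketch, assuming an ASM instance is already running; the disk group name dgroup1 and the raw device paths are hypothetical):

SQL> CREATE DISKGROUP dgroup1 NORMAL REDUNDANCY
  2  DISK '/dev/raw/raw1', '/dev/raw/raw2';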

[Figure: ASM concept. A database consists of tablespaces; a tablespace contains segments, which consist of extents, which consist of Oracle data blocks. A data file maps to physical blocks and can reside on a file system file or raw device, or be stored as an ASM file made of allocation units on ASM disks within an ASM disk group.]

5/ Database concurrency
Because time and space in this paper are limited, I will only briefly introduce how to create users, roles, and privileges, and will focus instead on Oracle concurrency. If you want to know more about these topics, please contact me: [email protected].
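As a brief illustration of users, roles, and privileges, the following could be run as a DBA (a minimal sketch; the names dev_user, dev_role, and the password are hypothetical):

SQL> CREATE USER dev_user IDENTIFIED BY oracle;
SQL> CREATE ROLE dev_role;
SQL> GRANT CREATE SESSION, CREATE TABLE TO dev_role;
SQL> GRANT dev_role TO dev_user;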

5.1/ PL/SQL
There are many types of PL/SQL database objects: package, package body, type body, procedure, function, and trigger. Oracle's Procedural Language extension to SQL (PL/SQL) is a fourth-generation programming language (4GL). It provides:

• Procedural extensions to SQL
• Portability across platforms and products
• A higher level of security and data integrity protection
• Support for object-oriented programming
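For example, a simple stored procedure might look like this (a minimal sketch; the procedure name raise_salary is hypothetical, and employees2 is the table used in the lock examples below):

CREATE OR REPLACE PROCEDURE raise_salary (
  p_employee_id IN NUMBER,
  p_percent     IN NUMBER
) AS
BEGIN
  -- Increase the given employee's salary by p_percent percent
  UPDATE employees2
     SET salary = salary * (1 + p_percent / 100)
   WHERE employee_id = p_employee_id;
END raise_salary;
/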

[Figure: With ASM, the traditional file system and volume manager layer between the database and the operating system is replaced by ASM.]


5.2/ Locks
• Locks prevent multiple sessions from changing the same data at the same time.
• They are automatically obtained at the lowest possible level for a given statement. They do not escalate.

Transaction 1:
SQL> UPDATE employees2 SET salary=salary*1.13 WHERE employee_id=100;

Transaction 2:
SQL> UPDATE employees2 SET salary=salary+1003 WHERE employee_id=100;

>>> Both transactions target the same row, so the second must wait until the first commits or rolls back.

5.2.1/ Lock Mechanism
• High level of data concurrency:
  - Row-level locks for inserts, updates, and deletes
  - No locks required for queries
• Automatic queue management
• Locks are held until the transaction ends (with the COMMIT or ROLLBACK operation)

Transaction 1:
SQL> UPDATE employees2 SET salary=salary*1.13 WHERE employee_id=101;

Transaction 2:
SQL> UPDATE employees2 SET salary=salary+1003 WHERE employee_id=100;

>>> Here the two transactions update different rows, so neither blocks the other.

5.2.2/ Enqueue Mechanism
The enqueue mechanism keeps track of:
• Sessions waiting for locks
• The requested lock mode
• The order in which sessions requested the lock

5.2.3/ Possible Causes of Lock Conflicts
• Uncommitted changes
• Long-running transactions
• Unnecessarily high locking levels

To resolve a lock conflict:
• Have the session holding the lock commit or roll back
• Terminate the session holding the lock as a last resort (see the sketch below)
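Terminating the blocking session can be done as follows (a minimal sketch; the SID 123 and serial# 456 are hypothetical values looked up in V$SESSION):

SQL> SELECT sid, serial#, username FROM v$session WHERE username IS NOT NULL;
SQL> ALTER SYSTEM KILL SESSION '123,456';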


6/ Database Reliability
A secure system ensures the confidentiality of the data it contains. There are several aspects of security:
• Restricting access to data and services
• Authenticating users
• Monitoring for suspicious activity

6.1/ Principle of Least Privilege
• Install only required software on the machine.
• Activate only required services on the machine.
• Give OS and database access to only those users that require access.
• Limit access to the root or administrator account.
• Limit access to the SYSDBA and SYSOPER accounts.
• Limit users' access to only the database objects required to do their jobs.

6.2/ Applying the Principle of Least Privilege
• Protect the data dictionary:
  O7_DICTIONARY_ACCESSIBILITY=FALSE
• Revoke unnecessary privileges from PUBLIC:
  REVOKE EXECUTE ON UTL_SMTP, UTL_TCP, UTL_HTTP, UTL_FILE FROM PUBLIC;
• Restrict the directories accessible by users.
• Limit users with administrative privileges.
• Restrict remote database authentication:
  REMOTE_OS_AUTHENT=FALSE

6.3/ Monitoring for Suspicious Activity
Monitoring or auditing must be an integral part of your security procedures. Review the following:
• Mandatory auditing
• Standard database auditing
• Value-based auditing
• Fine-grained auditing (FGA)
• DBA auditing
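For example, standard database auditing could be enabled like this (a minimal sketch, assuming the instance uses an spfile; audit_trail=db stores audit records in the database, and hr.employees2 is just an example object):

SQL> ALTER SYSTEM SET audit_trail=db SCOPE=SPFILE;
-- Restart the instance so the static parameter takes effect, then:
SQL> AUDIT UPDATE ON hr.employees2 BY ACCESS;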


6.4/ Backup and Recovery
In Oracle, several background processes cooperate to make recovery possible during normal database operation.
+ Checkpoint (CKPT) is responsible for:
  • Signaling Database Writer (DBWn) at checkpoints
  • Updating data file headers with checkpoint information
  • Updating control files with checkpoint information
+ Redo log files:
  • Record changes to the database
  • Should be multiplexed to protect against loss
+ Log Writer (LGWR) writes:
  • At commit
  • When the redo log buffer is one-third full
  • Every three seconds
  • Before DBWn writes
+ Archiver (ARCn):
  • Is an optional background process
  • Automatically archives online redo log files when ARCHIVELOG mode is set for the database
  • Preserves the record of all changes made to the database
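Enabling ARCHIVELOG mode might look like this (a minimal sketch; the database must be mounted but not open when the mode is changed):

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;
SQL> ARCHIVE LOG LIST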


For an Oracle database, Enterprise Manager uses Recovery Manager (RMAN) to perform backup and recovery operations. RMAN:
• Is a command-line client for advanced functions
• Has a powerful control and scripting language
• Has a published API that enables it to interface with most popular backup software
• Backs up files to disk or tape
• Backs up data files, control files, archived log files, and server parameter files

After the instance is open, it fails in the case of the loss of: any control file, a data file belonging to the SYSTEM or UNDO tablespaces, or an entire redo log group. As long as at least one member of the group is available, the instance remains open.

Starting with Oracle 9i, Flashback technology is a revolutionary advance in recovery:
• Traditional recovery techniques are slow: the entire database or a file (not just the incorrect data) has to be restored, and every change in the database log must be examined.
• Flashback is fast: changes are indexed by row and by transaction, and only the changed data is restored.
• Flashback commands are easy: no complex multiple-step procedures are involved.
• Flashback Database brings the database to an earlier point in time by undoing all changes made since that time.
• Flashback Table recovers a table to a point in time in the past without having to restore from a backup.
• Flashback Drop restores accidentally dropped tables.
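As a brief illustration of Flashback Drop (a minimal sketch; test_tab is a hypothetical table, and the recycle bin must be enabled, as it is by default in Oracle 10g):

SQL> DROP TABLE test_tab;
SQL> FLASHBACK TABLE test_tab TO BEFORE DROP;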

Note: If a control file is lost or corrupted, the instance normally aborts, at which time you must perform the following steps:

1. Shut down the instance, if it is still open.
2. Restore the missing control file by copying an existing control file.
3. Start the instance.

If a member of a redo log file group is lost, as long as the group still has at least one member, then:

1. Normal operation of the instance is not affected.
2. You receive a message in the alert log notifying you that a member cannot be found.
3. You can restore the missing log file by copying one of the remaining files from the same group.


If the database is in NOARCHIVELOG mode, and any data file is lost, perform the following tasks:

1. Shut down the instance if it is not already down.
2. Restore the entire database, including all data and control files, from the backup.
3. Open the database.
4. Have users reenter all changes made since the last backup.

If a data file is lost or corrupted, and that file does not belong to the SYSTEM or UNDO tablespace, then restore and recover the missing data file.

If a data file is lost or corrupted, and that file belongs to the SYSTEM or UNDO tablespace:

1. The instance may or may not shut down automatically. If it does not, use SHUTDOWN ABORT to bring the instance down.
2. Mount the database.
3. Restore and recover the missing data file.
4. Open the database.
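With RMAN, restoring and recovering such a data file might look like this (a minimal sketch; data file 1, the first SYSTEM file, is used as an example):

RMAN> STARTUP MOUNT;
RMAN> RESTORE DATAFILE 1;
RMAN> RECOVER DATAFILE 1;
RMAN> ALTER DATABASE OPEN;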

7/ Database Efficiency

7.1/ Listener
To make a client or middle-tier connection, Oracle Net requires the client to know the:
• Host where the listener is running
• Port that the listener is monitoring
• Protocol that the listener is using
• Name of the service that the listener is handling

Commands from the listener control utility can be issued from the command line or from the LSNRCTL prompt. UNIX or Linux command-line syntax:

$ lsnrctl <command name>

$ lsnrctl start

$ lsnrctl status

Prompt syntax:

LSNRCTL> <command name>

LSNRCTL> start

LSNRCTL> status

Names Resolution


Oracle Net supports several methods of resolving connection information:

Easy connect naming: uses a TCP/IP connect string, requires no client-side configuration, and is enabled by default. However, it does not support advanced connection options such as connect-time failover, source routing, and load balancing. No Oracle Net configuration files are needed:

SQL> CONNECT hr/[email protected]:1521/dba10g

Local naming: uses a local configuration file on the client (a names resolution file such as tnsnames.ora); it supports all Oracle Net protocols and advanced connection options:

SQL> CONNECT hr/hr@orcl
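A local naming entry in tnsnames.ora might look like the following (a minimal sketch; the alias ORCL matches the connect string above, and the host and service names mirror the easy connect example):

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = db.us.oracle.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = dba10g)
    )
  )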

Directory naming: uses a centralized LDAP-compliant directory server; it supports all Oracle Net protocols and advanced connection options, and requires an LDAP server loaded with Oracle Net names resolution information:
o Oracle Internet Directory
o Microsoft Active Directory Services

SQL> CONNECT hr/hr@orcl

External naming: uses a supported non-Oracle naming service and includes:
o Network Information Service (NIS) External Naming
o Distributed Computing Environment (DCE) Cell Directory Services (CDS)

8/ Database performance
In Oracle, tuning advisors provide advice on problems such as complicated SQL structure, data access issues, missing indexes, and so on; following this advice leads to improved database performance.

[Figure: Common performance problem areas a DBA must diagnose: memory allocation issues, input/output device contention, application code problems, resource contention, and network bottlenecks.]

[Figure: SQL Tuning Advisor. The Automatic Tuning Optimizer performs comprehensive SQL tuning in four modes: statistics check (detects stale or missing statistics), SQL analysis (restructures SQL), access analysis (adds missing indexes or runs the access advisor), and plan tuning (tunes the SQL plan with a SQL profile).]
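Invoking the SQL Tuning Advisor manually might look like the following (a minimal sketch using the DBMS_SQLTUNE package; the task name my_sql_task and the SQL text are hypothetical):

DECLARE
  l_task VARCHAR2(30);
BEGIN
  -- Create a tuning task for a single problem statement
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_text  => 'SELECT * FROM employees2 WHERE employee_id = 100',
              task_name => 'my_sql_task');
  -- Execute the task to generate recommendations
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'my_sql_task');
END;
/
-- Display the advisor's findings and recommendations (SET LONG first in SQL*Plus)
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('my_sql_task') FROM dual;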


[Figure: performance statistics that a database administrator should know (not reproduced).]

IV/ SUMMARY
The material above shows students and newcomers to software development the qualities that matter in their application development: efficiency, flexibility, availability, reliability, incremental growth, and a powerful database system. Moreover, a distributed database can be a shareable system, acting as a real-time, replicable multi-database system.

V/ APPLY
A distributed database system is a cutting-edge form of business computerization that supports business transactions, management, control, evaluation, and recovery around the world. Developers and students are introduced to interoperable, distributed data processing architectures for accessing heterogeneous data sources, and traditional distribution issues are addressed in the context of relational database systems, e.g. distributed query processing and distributed database design. Hence, this subject can be applied in many fields of use in society, in order to:

- Provide a strong foundation for addressing issues of distributed database processing.
- Understand the essentials of reliability and concurrency control in database systems.
- Meet the requirements of both centralization and decentralization of distributed databases.
- Achieve good communication at lower cost.

VI/ CONCLUSION
From my point of view, the distributed database system is not only a new computer technology that supports business management, evaluation, and decision making; it also helps keep data manageable in cases of crashes. Moreover, it will bring good results for development in the human community. Hence, I would like to recommend to the dean of the Royal University of Phnom Penh and the dean of the Computer Science department that this course be expanded, to give students more opportunities to apply it in real experiments rather than studying only its theory.


VII/ REFERENCES
This report was prepared using the following references:

- Lecture presentation slides by Pok Leakmony
- Oracle 10g Database Administration Workshop I
- M. T. Ozsu and P. Valduriez, Principles of Distributed Database Systems (2nd edition), Prentice-Hall, ISBN 0-13-659707-6
- R. Elmasri and S. B. Navathe, Fundamentals of Database Systems (3rd edition), Addison-Wesley Longman, ISBN 0-201-54263-3

