DOCUMENT RESUME

ED 274 315                                                   IR 012 292

AUTHOR          Yao, S. Bing; Hevner, Alan R.
TITLE           A Guide to Performance Evaluation of Database Systems.
INSTITUTION     Software Systems Technology, College Park, MD.
SPONS AGENCY    National Bureau of Standards (DOC), Washington, D.C. Inst. for Computer Sciences and Technology.
REPORT NO       NBS/SP-500-118
PUB DATE        Dec 84
NOTE            58p.; Part of the Series, Reports on Computer Science and Technology.
AVAILABLE FROM  Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402.
PUB TYPE        Guides - Non-Classroom Use (055) -- Reports - Research/Technical (143)
EDRS PRICE      MF01/PC03 Plus Postage.
DESCRIPTORS     *Database Management Systems; *Databases; *Evaluation Criteria; *Evaluation Methods; Indexing; Performance; Reaction Time; Research Design
IDENTIFIERS     *Benchmarking

ABSTRACT
Benchmarking is one of several alternate methods of performance evaluation, which is a key aspect in the selection of database systems. The purpose of this report is to provide a performance evaluation methodology, or benchmarking framework, to assist in the design and implementation of a wide variety of benchmark experiments. The methodology, which identifies criteria to be utilized in the design, execution, and analysis of a database system benchmark, has been applied to three different database systems representative of current minicomputer, microcomputer, and database machine architectures. This generalized methodology can apply to most database system designs. In addition to presenting a wide variety of possible considerations in the design and implementation of the benchmark, this methodology can be applied to the evaluation of either a single system with several configurations, or to the comparison of several systems. A summary of the report identifies the three principal phases of a database system benchmark (benchmark design, execution, and analysis) and notes that no generalized methodology can provide a complete list of considerations for the design of an actual experiment. Seventy references are listed. (DJR)

Reproductions supplied by EDRS are the best that can be made from the original document.

U.S. Department of Commerce
National Bureau of Standards

Computer Science and Technology

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or organization originating it.

Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

NBS Special Publication 500-118

A Guide to Performance Evaluation of Database Systems


The National Bureau of Standards¹ was established by an act of Congress on March 3, 1901. The Bureau's overall goal is to strengthen and advance the nation's science and technology and facilitate their effective application for public benefit. To this end, the Bureau conducts research and provides: (1) a basis for the nation's physical measurement system, (2) scientific and technological services for industry and government, (3) a technical basis for equity in trade, and (4) technical services to promote public safety. The Bureau's technical work is performed by the National Measurement Laboratory, the National Engineering Laboratory, the Institute for Computer Sciences and Technology, and the Center for Materials Science.

The National Measurement Laboratory

Provides the national system of physical and chemical measurement; coordinates the system with measurement systems of other nations and furnishes essential services leading to accurate and uniform physical and chemical measurement throughout the Nation's scientific community, industry, and commerce; provides advisory and research services to other Government agencies; conducts physical and chemical research; develops, produces, and distributes Standard Reference Materials; and provides calibration services. The Laboratory consists of the following centers: Basic Standards; Radiation Research; Chemical Physics; Analytical Chemistry.

The National Engineering Laboratory

Provides technology and technical services to the public and private sectors to address national needs and to solve national problems; conducts research in engineering and applied science in support of these efforts; builds and maintains competence in the necessary disciplines required to carry out this research and technical service; develops engineering data and measurement capabilities; provides engineering measurement traceability services; develops test methods and proposes engineering standards and code changes; develops and proposes new engineering practices; and develops and improves mechanisms to transfer results of its research to the ultimate user. The Laboratory consists of the following centers: Applied Mathematics; Electronics and Electrical Engineering²; Manufacturing Engineering; Building Technology; Fire Research; Chemical Engineering².

The Institute for Computer Sciences and Technology

Conducts research and provides scientific and technical services to aid Federal agencies in the selection, acquisition, application, and use of computer technology to improve effectiveness and economy in Government operations in accordance with Public Law 89-306 (40 U.S.C. 759), relevant Executive Orders, and other directives; carries out this mission by managing the Federal Information Processing Standards Program, developing Federal ADP standards guidelines, and managing Federal participation in ADP voluntary standardization activities; provides scientific and technological advisory services and assistance to Federal agencies; and provides the technical foundation for computer-related policies of the Federal Government. The Institute consists of the following centers: Programming Science and Technology; Computer Systems Engineering.

The Center for Materials Science

Conducts research and provides measurements, data, standards, reference materials, quantitative understanding and other technical information fundamental to the processing, structure, properties and performance of materials; addresses the scientific basis for new advanced materials technologies; plans research around cross-cutting scientific themes such as nondestructive evaluation and phase diagram development; oversees Bureau-wide technical programs in nuclear reactor radiation research and nondestructive evaluation; and broadly disseminates generic technical information resulting from its programs. The Center consists of the following Divisions: Inorganic Materials; Fracture and Deformation³; Polymers; Metallurgy; Reactor Radiation.

¹Headquarters and Laboratories at Gaithersburg, MD, unless otherwise noted; mailing address Gaithersburg, MD 20899.
²Some divisions within the center are located at Boulder, CO 80303.
³Located at Boulder, CO, with some elements at Gaithersburg, MD.

Computer Science and Technology

NBS Special Publication 500-118

A Guide to Performance Evaluation of Database Systems

Daniel R. Benigni, Editor
Center for Programming Science and Technology
Institute for Computer Sciences and Technology
National Bureau of Standards
Gaithersburg, MD 20899

Prepared by:

S. Bing Yao
Alan R. Hevner
Software Systems Technology, Inc.
7100 Baltimore Avenue, Suite 206
College Park, MD 20740

U.S. DEPARTMENT OF COMMERCE
Malcolm Baldrige, Secretary

National Bureau of Standards
Ernest Ambler, Director

Issued December 1984

Reports on Computer Science and Technology

The National Bureau of Standards has a special responsibility within the Federal Government for computer science and technology activities. The programs of the NBS Institute for Computer Sciences and Technology are designed to provide ADP standards, guidelines, and technical advisory services to improve the effectiveness of computer utilization in the Federal sector, and to perform appropriate research and development efforts as foundation for such activities and programs. This publication series will report these NBS efforts to the Federal computer community as well as to interested specialists in the academic and private sectors. Those wishing to receive notices of publications in this series should complete and return the form at the end of this publication.

Library of Congress Catalog Card Number: 84-601144

National Bureau of Standards Special Publication 500-118
Natl. Bur. Stand. (U.S.), Spec. Publ. 500-118, 54 pages (Dec. 1984)

CODEN: XNBSAV

U.S. GOVERNMENT PRINTING OFFICE

WASHINGTON: 1984

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402

TABLE OF CONTENTS

Page

FOREWORD 2

1. INTRODUCTION 3

2. PERFORMANCE EVALUATION TECHNIQUES 6

2.1 Analytic Modelling 6

2.1.1 Queueing Models 6
2.1.2 Cost Models 6

2.2 Simulation Modelling 7

2.3 Benchmarking 8

3. A BENCHMARK METHODOLOGY FOR DATABASE SYSTEMS 12

3.1 Benchmark Design 12

3.1.1 System Configuration 14
3.1.2 Test Data 14
3.1.3 Benchmark Workload 14
3.1.4 Experimental Design 15

3.2 Benchmark Execution 15

3.3 Benchmark Analysis 16

4. BENCHMARK DESIGN 17

4.1 System Configuration 17

4.1.1 Hardware Parameters 17
4.1.2 Software Parameters 17

4.2 Test Data 18

4.2.1 Constructing the Database 19
4.2.2 Database Size 21
4.2.3 Indexing 22

4.3 Benchmark Workload 23

4.3.1 Transactions 23
4.3.2 User-System Environment 24

4.3.3 Job-Scripts Model 25
4.3.4 Background Load 27

4.4 Experimental Design 28

4.4.1 Performance Measurement 28
4.4.2 Experimental Variables 29

5. BENCHMARK EXECUTION 33

5.1 Benchmark Initialization 33

5.1.1 Loading 33
5.1.2 Timing 34

5.2 Benchmark Verification 34

5.3 Benchmark Testing 36

6. BENCHMARK ANALYSIS 38

7. SUMMARY AND CONCLUSIONS 41

A GUIDE TO PERFORMANCE EVALUATION OF DATABASE SYSTEMS

Daniel R. Benigni, Editor

This guide presents a generalized performance analysis methodology for the benchmarking of database systems. The methodology identifies criteria to be utilized in the design, execution, and analysis of a database system benchmark. This generalized methodology can apply to most database system designs. In addition to presenting a wide variety of possible considerations in the design and implementation of the benchmark, this methodology can be applied to the evaluation of either a single system with several configurations, or to the comparison of several systems.

Key words: Benchmark execution; benchmark methodology; benchmark workload; database systems; DBMS; indexing; performance evaluation; query complexity; response time.

FOREWORD

This report is one of a continuing series of NBS publications in the area of data management technology. It concentrates on performance evaluation, which is a key aspect in the selection of database systems.

Benchmarking is one of several alternate methods of performance evaluation. It can be an expensive undertaking. However, this expense may be necessary for some applications, e.g., those involving large databases or where response time requirements are critical.

The purpose of this report is to provide a performance evaluation methodology, or benchmarking framework, to assist in the design and implementation of a wide variety of benchmark experiments. The methodology has been applied to three different database systems representative of current minicomputer, microcomputer, and database machine architectures. Detailed results can be found in [YAO 84].

Other NBS publications addressing various aspects of data management system selection include FIPS PUB 77 [NBS 80], NBS Special Publication 500-108 [GALL 84], and a forthcoming NBS publication on "Choosing a Data Management Approach." The advantages and disadvantages of benchmarking and other techniques for evaluating computer systems are discussed in NBS SP-500-113 [LETM 84].

References to commercial products, as necessary to survey results of previous work on performance evaluation, are contained in this guideline. In no case does this imply recommendation or endorsement by NBS.

1. INTRODUCTION

The rising popularity of database systems for the management of data has resulted in an increasing number of new systems entering the marketplace. As the number of available systems grows, the difficulty in choosing the system which will best meet the requirements of a particular application environment also increases. Database systems have been implemented on many different computer architectures: mainframes, minicomputers, microcomputers, and stand-alone database machines. The selection of a database system from among these varied alternatives requires a structured and comprehensive evaluation approach.

A complete evaluation methodology for database systems must integrate both feature analysis and performance analysis phases. The range of features and capabilities that a database system may support is very large. Feature lists for database systems have appeared in a number of articles [CODA 76, AUER 81, WEIS 81a, WEIS 81b, BARL 81, DATE 81, SU 81a, SU 81b, BROD 82].

A feature analysis performs two functions: it first serves as a winnowing process to eliminate those database systems which are completely unsuitable for answering the needs of a particular application; second, it provides a ranking of the surviving candidate systems. Feature analysis is a widely used method of database system evaluation. It has a number of significant advantages over other methods of system evaluation.

1. A database system implementation is not required. Analysis is based upon documentation. Little or no system costs are involved in performing a feature analysis. This is critical for users with no system access.

2. Feature analysis provides a structured first cut for narrowing the range of potential database systems. A large number of systems can be evaluated effectively at one time. The result of a feature analysis should be a small number of candidate systems. Performance analysis, which is much more costly, need be performed on only this small number of systems.

3. The list of features evaluated can be customized to an organization's application environment and presented at the level of detail desired by the designer. Among these features are qualitative

aspects of a database system that cannot be quantified in terms of system performance. Examples include vendor support, documentation quality, security, "user friendliness", and reliability. Benchmark analysis cannot directly test the performance of these features. Thus, feature analysis remains the best method for their analysis.

In spite of these advantages, feature analysis should not be used in isolation to evaluate and select database systems. There are several reasons:

1. Feature analysis is a subjective exercise. Feature importance coefficients and the system support ratings needed in feature analysis are values which must be provided by a knowledgeable design expert. However, no two experts will come up with the same values given the same application environment. At best, the feature analysis scores among different database systems should be viewed as rough indicators of the systems' applicability.

2. To obtain consistent scoring among different database systems, the evaluator must be equally well acquainted with all systems. This places a great burden upon one person to acquire this knowledge. If, instead, different persons grade different systems, then the scoring consistency problems increase because grading standards must be set and closely controlled.

3. The greatest disadvantage of feature analysis is that no true system performance is measured. Feature analysis is a paper exercise that cannot truly evaluate how a system will perform in an organization's application environment.

The limitations of feature analysis introduce the need for a more rigorous evaluation method that can provide objective, quantifiable differences among the candidate database systems. Performance analysis provides this type of evaluation.

The goals of performance analysis techniques are to model a database system's behavior and gather performance data. This is done to identify the system's strengths and weaknesses [LUCA 71, FERR 81]. Performance analysis has been utilized on database systems for two purposes. The first is to evaluate a single system to determine the best configuration, or running environment, for that system. For example, new system algorithms (e.g., file management [STON

83], query optimization [HEVN 79]) can be tested before actually implementing them in the system. In this way systems can be "tuned" for their most efficient operating condition. The other application of performance evaluation on database systems has been to study two or more database systems, thus providing a comparison of the systems' performance.

Section 2 presents an overview of past research on the performance evaluation of database systems. The purpose is to suggest that benchmark analysis is the most comprehensive technique for analyzing a single database system or comparing multiple database systems. An overview of the complete benchmark methodology is given in Section 3. The remainder of the report discusses in detail the design, execution, and analysis steps required in the benchmark methodology.

2. PERFORMANCE EVALUATION TECHNIQUES

The major methods of performance evaluation are Analytic Modelling, Simulation Modelling, and Benchmarking. A brief description of each method and a survey of previous work using the method for database system analysis is presented. The advantages and disadvantages of using each method are discussed.

2.1 Analytic Modelling

Analytic modelling represents a system by defining equations that relate performance quantities to known system parameters. The use of these equations allows a fast and accurate means to evaluate system performance. Two models that have been used predominantly to evaluate database system performance are queueing models and cost models.

2.1.1 Queueing Models.

The dynamic behavior of a database system can be viewed as a stochastic process and can be represented analytically as a queueing model [COMP 78]. A database system is modelled as a multiple resource system with jobs moving through the system demanding services from the resource stations. Queueing analysis provides the performance measures of system throughput, resource utilization, and job response time [MIYA 75]. The database system workload can be characterized by statistical parameters obtained from the database requests [LEWI 76]. Because database systems are usually quite complex, a queueing model normally can represent only a portion of the system's dynamic behavior. This is demonstrated clearly by the attempt to analytically model the scheduling of an IMS (IBM's Information Management System) database system in [GAVE 76, LAVE 76]. However, certain aspects of database system processing are conducive to queueing model analysis. For example, queueing models have been used to analyze concurrency control algorithms [SMIT 80, POTI 80] and data allocation in distributed database systems [COFF 81].
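As a minimal illustration of the measures named above (a textbook single-server example, not one of the models cited), consider an M/M/1 queue with Poisson transaction arrivals at rate $\lambda$ and exponential service at rate $\mu$, with $\rho = \lambda/\mu < 1$:

$$\text{utilization} = \rho = \frac{\lambda}{\mu}, \qquad \text{throughput} = \lambda, \qquad \text{mean response time} = \frac{1}{\mu - \lambda}.$$

For instance, with $\lambda = 8$ and $\mu = 10$ transactions per second, utilization is 0.8 and mean response time is $1/(10-8) = 0.5$ seconds. Realistic database models add more resources and job classes, but report the same kinds of quantities.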

2.1.2 Cost Models.

Cost analysis has been an effective way of obtaining performance estimates for physical database structures. The performance measures most easily obtained by cost analysis are storage costs and average response time for queries.

Cost equations for inverted file systems have been developed in [CARD 73]. Generalized cost models have been pioneered in [YAO 74, 75, 77a, 77b] and further extended in [TEOR 76] and [BATO 82]. The cost model approach has been used to analyze the performance of query processing in various relational database systems [YAO 78, 79]. These models and cost functions have been useful for performing critical path analysis for database system applications. Hawthorne and DeWitt [HAWT 82] have developed cost models to evaluate query processing among different proposed database machine designs. A performance analysis of hierarchical processing in an IMS database system has been performed using cost models [BANE 80].
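As one concrete example of the kind of cost function such models use (a standard block-access estimate stated here for illustration, not quoted from the report), suppose a file of $n$ records is stored in $m$ blocks of $n/m$ records each; the expected number of blocks read when a query retrieves $k$ randomly chosen records is approximately

$$E[\text{blocks}] \;=\; m\left[\,1-\prod_{i=1}^{k}\frac{n - n/m - i + 1}{n - i + 1}\,\right],$$

which, combined with a per-block access time, yields an average response time estimate for the query.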

Analytic modelling has proven useful in many areas of database modelling. However, analytic models have some major disadvantages. Queueing models are inadequate to model the complete range of functionality found in a database system. Cost modelling fails to account for the dynamic behavior of the database system. For these reasons, analytic modelling has failed to receive wide acceptance as a tool for modelling database systems.

2.2 Simulation Modelling

Most real world systems are too complex to allow realistic models to be evaluated analytically. Simulation is the process of approximating the behavior of a system over a period of time. The simulation model is used to gather data as an estimate of the true performance characteristics of the system. Simulation modelling has been applied to database systems, as illustrated in the following survey of representative work performed in this area.
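Before turning to the survey, the toy program below (written purely for this illustration; it is not one of the tools discussed) shows the basic idea: transactions arrive at a single database server, queue when the server is busy, and the simulation estimates mean response time from the collected data. The arrival and service rates are arbitrary example values.

    import random

    def simulate(arrival_rate, service_rate, num_transactions, seed=1):
        """Toy event-driven simulation of a single-server database queue."""
        rng = random.Random(seed)
        clock = 0.0            # arrival time of the current transaction
        server_free_at = 0.0   # time at which the server finishes its previous job
        total_response = 0.0
        for _ in range(num_transactions):
            clock += rng.expovariate(arrival_rate)          # next arrival
            start = max(clock, server_free_at)              # wait if the server is busy
            finish = start + rng.expovariate(service_rate)  # service completes
            total_response += finish - clock                # response = waiting + service
            server_free_at = finish
        return total_response / num_transactions

    # Example: 8 transactions/second offered to a server that completes 10/second.
    print("estimated mean response time:", simulate(8.0, 10.0, 100_000))

Because the model is stochastic, the printed value is only an estimate of the true mean (about 0.5 seconds for these rates), which is exactly the caveat about simulation results raised later in this section.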

A database system simulator, PHASE II, has been developed for the analysis of hierarchical data structures [OWEN 71]. PHASE II is an effective tool for evaluating hardware configurations, data structures, and search strategies. Another simulation model has been used to model the UNIVAC DMS-1100 database system [GRIP 75]. Hulten and Soderland designed the ART/DB simulation tool to investigate a multiprogrammed database system containing multiple CPUs [HULT 77]. This simulation tool is written in SIMULA and has an interactive interface.

A simulation model containing four modelling components (application program, database system, operating system, and hardware environment) was reported in [NAKA 75]. This simulation tool has two major sections: a definition section and

a procedure section. The definition section describes the system environment being simulated while the procedure section represents the software system using an instruction set prepared by the simulator. This type of programming interface allows a user to modify the system parameters while running a series of simulation programs. In another paper an IDS simulator is described [APPL 73]. A DBMS evaluation methodology through the integrated use of a limited prototype implementation for the DBMS design, a flexible measurement facility, and a predictive model based on the DBMS prototype was developed in [DEUT 79].

Similar to analytic models, simulation models are most often used to study specific types of database system processing. For example, recent simulation studies have analyzed optimal granule size for database locking [RIES 77, RIES 78], centralized versus decentralized concurrency control algorithms [GARC 78], and run-time schema interpretation on a network database system [BARO 82].

Although simulation modelling can be useful in systems which are too complex for analytic modelling methods, there are some disadvantages [LAW 82]. The major concern is the time and expense that are often necessary to develop a simulation model. Stochastic simulation models also produce only estimates of a model's "true" performance, and the large volume of results returned by a simulation often creates a tendency to place more confidence in the results than may actually be warranted. As the simulation grows more complex, the difficulties in program verification increase correspondingly, making the validity of the results more difficult to determine.

2.3 Benchmarking

Benchmarking is used when a few database systems are to be evaluated and compared. Benchmarking requires that the systems be implemented so that experiments can be run under similar system environments. Benchmarks are costly and time consuming but provide the most valid performance results upon which database systems can be evaluated. In database benchmarking, a system configuration, a database, and a workload to be tested are identified and defined. Then tests are performed and results are measured and analyzed. The workload can be either representative of the planned application of the system (an application-specific benchmark) or designed to allow for an overall system evaluation (a general benchmark). Running the workload on several systems or several configurations of the same system will supply information which can be used to compare and evaluate the

separate systems or configurations.

Although simulation and analytic modelling have been useful in modelling aspects of database system behavior, benchmarking can apply to the complete database system functionality. While both simulation and analytic modelling are limited in the scope of their system testing, benchmarking offers the chance to evaluate the actual database system [GOFF 73]. Previous work on benchmarking database systems has been performed for two primary purposes. On a single database system, different implementation algorithms or different system configurations can be tested. For multiple database systems, the performance of the different systems on the same database and workload can be compared [DEAR 78].

Early database benchmark experiments concentrated on the comparison of candidate commercial systems for a particular application. For example, in [SPIT 77] an evaluation of three systems was described. The three systems tested were System 2000, GIM, and DMS-1100. All three systems ran on the UNIVAC 1100 computer system and were evaluated utilizing a specially designed monitoring system, the MITRE Performance Evaluation System, PES-1100. In [GLES 81], the benchmarking of several commercial systems to find the best system for the U.S. Public Health Service is described. One other early article [HILL 77] documented a performance analysis performed to select the best candidate system for NASA's Flight Planning System.

In the academic environment several studies have been performed on single database systems to evaluate performance and test enhancements. The System R access path optimizer is studied in [ASTR 80]. Benchmarks on the INGRES database system were reported in [YOUS 79, KEEN 81]. In [HAWT 79] detailed benchmarks were used to identify some possible enhancements to improve the performance of INGRES. The results of this study showed that dramatically increased performance on INGRES could be achieved by implementing a combination of extended memory, improved paging, and multiple processors. A later article [STON 83] described the effects of four enhancements to the INGRES system. The enhancements were dynamic compilation, microcoded routines, a special purpose file system, and a special purpose operating system. The results showed that while all four enhancements improved performance to some degree, the costs associated with the improvements were significant. While the compilation and file system tactics produced a high return and were relatively easy to accomplish, the microcode and special operating systems resulted in somewhat less of an improvement, and at a high cost. In [STON 82] and [EPST 80] the distributed INGRES database system was analyzed by

studying the performance results of queries.

More recently, comparisons of the performance of two or more database systems have been published. System Development Corporation has performed several comparison studies of benchmarks of database systems on a DEC VAX system [LUND 82, TEMP 82]. The first article compared ORACLE version 2.3.1 to the IDM-500 release 17. A report by Signal Technology, Inc. (STI) [SIGN 82] showed a comparison of STI's OMNIBASE to INGRES version 1.2/09. This article focused on specific test results and did not attempt to make an overall comparison of the two systems.

Bitton, DeWitt, and Turbyfill have described a customized database, a comprehensive set of queries, and a methodology for systematically benchmarking relational databases [BITT 83]. In this study, testing was done on a synthetic database which was specifically designed for benchmarking. The benchmark included selections, projections, joins, aggregates, and updates on both the INGRES system, in two different configurations, and the IDM-500 database machine. The INGRES database systems, the 'university' and 'commercial' versions, were implemented on a VAX 11/750. The IDM-500 database machine was connected to a PDP 11/70 host and was studied both with and without a database accelerator. The purpose of the study was to test and compare the four configurations (two INGRES and two IDM). Although the original study was performed only in the single-user environment, a later paper [BORA 84] extended the project to the multiple user environment by performing similar testing, with multiple users, on the IDM-500 database machine and the ORACLE database system.

In [BOGD 83] an experiment in benchmarking a database machine was reported. The purpose of the paper was to present an approach to benchmarking database machines using a generated database. The paper describes a database generation tool which allows the user to build a synthetic (generated) database through an interactive interface. The description of the testing was quite general. The system configuration of the database machine was not fully described. The testing in this research was limited to the single-user case. The paper provided a summary of the testing results and numerous graphs plotting the results.

While benchmarking can be a useful and important technique for database system evaluation, designing, setting up, and running a benchmark is a difficult and time consuming task. In order to aid in the development and analysis of benchmarks it is essential that a generalized methodology be designed. While some work in this area has been done [TUEL

75, RODR 75, WALT 76, BITT 83, BOGD 83], no one methodology has provided the necessary robustness demanded of a generalized methodology. Most of the methods presented have been either tied to a limited number of systems, or have not rigorously addressed the possible testing variables and design characteristics necessary for a generalized methodology. In order to apply to many types of evaluation (e.g., general vs. specific, single system vs. many systems), a methodology must discuss many possible design and implementation features while providing guidance in the design of any benchmark experiment. In the next section an overview of the methodology is presented.

Page 19: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

3. A BENCHMARK METHODOLOGY FOR DATABASE SYSTEMS

Managing a database requires a large, complex system made up of hardware, software, and data components. A benchmark methodology for database systems must consider a wide variety of system variables in order to fully evaluate performance. Each variable must be isolated as much as possible to allow the effects of that variable, and only that variable, to be evaluated. Because of the complex, interactive nature of database systems it is often very difficult, if not impossible, to do this. The benchmark methodology developed here enables a designer to identify the key variables of a database system to be evaluated. In this section a synopsis of the methodology is presented.

The benchmark methodology for database systems consists of three stages:

1. Benchmark Design - Establishing the system environment for the benchmark; involves designing the system configuration, test data, and workload, and deciding on the fixed and free variables of the benchmark studies.

2. Benchmark Execution - Performing the benchmark and collecting the performance data.

3. Benchmark Analysis - Analyzing the performance results on individual database systems and, if more than one system is benchmarked, comparing performance across several systems.

Figure 3.1 illustrates the methodology as a flow chart. In this section, an overview of each phase is presented. The remainder of the report will discuss each phase in detail.

3.1 Benchmark Design

The design of a benchmark involves establishing the environment of the database system to be tested, and developing the actual tests to be performed. The four steps of the benchmark design phase are described below. For a comparative benchmark of several database systems, the benchmark design must be invariant over all systems.

[Figure 3.1: Database System Benchmark Methodology. The flow chart shows System Configuration, Test Data, and Benchmark Workload feeding into Benchmark Design, followed by Benchmark Execution, Benchmark Analysis, and Comparative Analysis of Systems.]

Page 21: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

3.1.1 System Configuration.

The hardware and software parameters, such as main memory size, the number and speed of the disk drives, operating system support for database system requirements, and load scheduling policies, will be determined in this step. Often the hardware and software configuration is given. This is usually the case when the database system is to be added to an existing computing system. Also, many database systems can be installed on only one or very few types of operating systems. Cost is virtually always a factor and, for many applications, will be the primary determinant of which system configuration is actually chosen.

The parameters related to configuration that can be varied in the testing include maximum record length, the blocking factor of data in the storage system (e.g., the amount of data transferred by one disk access), the number of allowable indexes on relations, the maximum size and number of attribute values in an index, and the other types of direct access retrievals and their costs.

3.1.2 Test Data.

Among the parameters considered here are the test database, the background load, and the type and amount of indexing. The database on which the testing will be performed can be generated using one of two methods. The traditional method has been to use an already existing database, reformatting it for the benchmark needs. Recently, however, the approach of generating a synthetic database has been gaining popularity. Both techniques are discussed in Section 4.

3.1.3 Benchmark Workload.

A transaction load consists of several qualitative and quantitative aspects. Some of the qualitative aspects relating to the transaction load are: the types of queries which occur (e.g., simple retrievals on a single relation or complex queries involving many files), the possible modes used for modification of the database (e.g., batch updates or interactive updates), the level of user-system interaction (e.g., on-line or batch access), and whether or not users commonly access the same data files. Some of the quantitative aspects of the transaction load include: the percentage of time that each type of database action is performed, the average amount of data returned to a user per transaction, the average number of users, and the number of users involved in each type of transaction. Therefore, the transaction load defines not only the number of users present in the system, but also the quality and degree of activity

introduced into the system by each user.
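To make these workload parameters concrete, the sketch below simply records them in a plain data structure; every name and number in it is an illustrative placeholder invented for this example, not a value prescribed by the methodology.

    # Hypothetical benchmark workload specification; all values are illustrative.
    workload = {
        "interaction_mode": "on-line",        # on-line vs. batch access
        "concurrent_users": 8,                # average number of active users
        "transaction_mix": {                  # fraction of activity per action type
            "single_relation_query": 0.50,
            "multi_relation_query": 0.30,
            "batch_update": 0.15,
            "interactive_update": 0.05,
        },
        "avg_rows_returned_per_query": 25,    # average amount of data returned
        "shared_files": True,                 # do users commonly access the same files?
    }

    # The transaction mix should account for all of the activity.
    assert abs(sum(workload["transaction_mix"].values()) - 1.0) < 1e-9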

3.1.4 Experimental Design.

In this important phase of the benchmark design, parameters are selected to be varied in the benchmark testing. Values to be used for the parameters must also be defined. It is very important to choose values that, while within reason for the system being tested, push the system to its performance limitations. Among the parameters to be considered are database size, background load, number of indexes, query complexity, and number of simultaneous users.

It is also in this phase of the benchmark design that the criteria to be used for evaluation are considered. It is important to realize that the planned use of the system to be selected will have a definite relationship to the main measurement criteria on which the systems are evaluated. For example, if the system is expected to be used heavily and is likely to become CPU bound, system utilization or throughput would most likely be the main measurement criterion. On the other hand, if the system is more likely to be run under a light or moderate workload, response time would most likely be the most important criterion. The selection of measurement criteria for the experimental design is discussed in greater detail in Section 4.4.
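One convenient way to enumerate the benchmark runs implied by such a design is a full factorial grid over the selected parameters; the factor names and levels below are placeholders chosen for illustration, not values recommended by the methodology.

    from itertools import product

    # Hypothetical experimental factors and the levels at which each is tested.
    factors = {
        "database_size_rows": [10_000, 100_000, 1_000_000],
        "indexing": ["none", "clustered_key", "full"],
        "concurrent_users": [1, 4, 16],
        "query_complexity": ["single_relation", "two_relation_join"],
    }

    # Every combination of levels is one experimental cell to run and measure.
    runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]
    print(len(runs), "benchmark runs")   # 3 * 3 * 3 * 2 = 54
    print(runs[0])

In practice the grid is usually pruned (some combinations are meaningless or too expensive to run), but enumerating it first makes the fixed and free variables of the study explicit.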

3.2 Benchmark Execution

After the time-consuming and complex task of designing the benchmark is completed, the next step is to execute the experiments. It would make benchmarking a much less complicated task if the benchmark could be implemented exactly as designed on each individual system to be tested. In reality, this is seldom the case. Each system has its particular design and limitations to be considered. The benchmark has to be tailored to each specific system involved in the testing. Benchmark execution involves the steps of benchmark initialization, benchmark verification, and the actual benchmark tests. These steps are explained further in Section 5.

Page 23: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

3.3 Benchmark Analysis

Benchmark experiments normally produce large amounts of output data that are too burdensome to evaluate. The final phase of a good benchmark experiment, therefore, must be a concise summary of the results obtained. This summary should point out the interesting results of the experiment and attempt to explain the reasons behind those results. A good summary will also present graphs relating testing parameters to response measures and matrices comparing results under varying variables. Benchmark analysis involves forming the raw performance data into graphs and tables that clearly illustrate observations and comparisons on the systems benchmarked. Benchmark analysis is fully described in Section 6.
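As a small illustration of turning raw measurements into such a summary table (the timing numbers below are fabricated solely to show the mechanics):

    from statistics import mean

    # Hypothetical raw response times (seconds), keyed by (database size, users).
    raw = {
        (10_000, 1):  [0.41, 0.39, 0.44, 0.40],
        (10_000, 8):  [0.95, 1.10, 1.02, 0.99],
        (100_000, 1): [0.83, 0.80, 0.86, 0.81],
        (100_000, 8): [2.60, 2.75, 2.55, 2.70],
    }

    print(f"{'rows':>8} {'users':>6} {'mean resp (s)':>14} {'max resp (s)':>13}")
    for (rows, users), times in sorted(raw.items()):
        print(f"{rows:>8} {users:>6} {mean(times):>14.2f} {max(times):>13.2f}")

The same aggregated values would normally also be plotted, e.g., response time against database size with one curve per user level.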

Two types of benchmark analysis can be performed, based upon the objectives of the benchmark testing.

1. Individual System Analysis - For each tested system, the data are analyzed to provide observations on the performance of the database system under varying system algorithms and conditions.

2. Comparative System Analysis - When multiple systems are being studied, performance data can be compared. This analysis should provide a basis for making critical comparisons among several database systems.

Page 24: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

4. BENCHMARK DESIGN

The benchmark design is made up of three areas which provide input to the final step of experimental design. These three areas (system configuration, test data, and the benchmark workload), as well as other factors involved in the experimental design, are discussed in this section.

4.1 System Configuration

System configuration consists of a wide variety of parameters which relate to both hardware and software. The hardware parameters include main memory size, the number and speed of disk drives, and data blocking. The software parameters include the operating system support, scheduling policies, and query optimization. Below is a list of some of the parameters considered in this phase and a brief discussion of each. A more detailed list of parameters can be found in [SU 81a, SU 81b].

4.1.1 Hardware Parameters.

1. Main memory size consists of the number of bytes of memory available for programs, user work areas, and input/output buffers.

2. Secondary storage consists of the number and type of disk drives. Parameters include disk capacity, seek time, and data transfer speed.

3. The configuration and speed of the communication lines in the system are important features that affect database system performance.

4. The speed of the CPU has an effect on the response time, since most database systems experience CPU saturation conditions.

Page 25: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

4.1.2 Software Parameters.

1. Memory page size has a direct effect on the locality of data allocation. Large page size enhances the clustering effect but requires larger buffer space. Unfortunately, most systems do not permit the user to specify the page size as a parameter.

2. Indexing directly affects the data retrieval efficiency. Index parameters should include the type of index supported and any restrictions on the number of indexes permitted.

3. Operating system support and scheduling are often functions of the chosen operating system and are therefore difficult to test.

4. The query optimization scheme utilized by the software is not necessarily always the best method available. Comparison with an alternative method can often provide an interesting result. Although the query optimization algorithm is internal to a database system, some alternative algorithms' effects can be simulated through carefully arranged queries.

5. Database system control algorithms, such as concurrency control and recovery algorithms, may also be tested as to their effect on performance. Some database systems allow differing levels of control by setting system parameters. New control algorithms can be tested by adding the programs to the database system.

Many of the hardware and software parameters listed above are given, especially when the database system is to be added to an existing computer system. Often a database system can be installed on only one, or very few, types of operating systems. Therefore, testing is further constrained in regard to the selection of configuration parameters. It is usually difficult to vary database parameters such as buffer size and page size.

4.2 Test Data

A database is represented on a logical level by a conceptual schema that provides an overall view of the data and data relationships in the system. At this level the database is defined in terms of relations, records, attributes, and domains (using relational terminology). Hierarchical and network systems can be described in the appropriate data model terminology. At the physical level of representation the size and storage requirements of the database system

must be considered. In addition to the storage required to hold the data values in the database, access structures such as indexes must be included in the storage costs. Also, certain data may be duplicated for reliability or retrieval efficiency.

4.2.1 Constructing the Database.

One of the major considerations in any benchmark experiment is what test data will be used for the testing. The database used in the testing must be implemented on each of the candidate systems to be tested and, after implementation, must remain constant over all systems. There are basically two methods for obtaining a test database: using an already existing application database or developing a synthetic database.

Application Database

The traditional method has been the use of real data from an application database. By 'real data' is meant data that is being used, or has previously been used, for application purposes. If real data is to be used it must be formatted into the appropriate form for each system to be tested. If several systems are to be tested, the data must be formatted for each of the systems. If the database systems involved in the testing are not all of the same type (e.g., relational, hierarchical, or network), this formatting can become a time consuming exercise in database design. Even when the systems involved are the same type, the loading and setting up of the database can produce unexpected problems. The use of real data, however, demonstrates database system performance on realistic application environments. This is clearly the best method when the evaluation is done to select a system for a known database environment.

Synthetic Database

The second method, the use of synthetic databases, has been gaining popularity in recent studies. When using this method, synthetic data is generated to make up a database which easily lends itself to benchmark testing. Attributes are allowed either integer or character values. Key attributes are assigned unique integer values. For example, for a relation with 10,000 tuples, the key attribute may take the values 0, 1, ..., 9999. The numbers can be scrambled using a random number generator. Other attributes are designed so that they contain non-unique values. The main purpose of these attributes is to provide a systematic way of modelling a wide range of selectivity factors. For example, a relation could be designed containing an attribute

with a uniform distribution of values between 0 and 99. By utilizing the random number generator to create 5000 occurrences of this attribute in a relation, queries can then be easily designed with selectivities of 10%, 20%, ... 90%, or any other percentage that is of interest in testing. Since the attribute has only 100 distinct values in the 5000 occurrences, 10% of the relation could be retrieved simply by running the following queries (using SQL):

    SELECT <all>
    FROM   <relation>
    WHERE  <attribute> < 10

or

    SELECT <all>
    FROM   <relation>
    WHERE  <attribute> > 89

Such a design allows for much greater control over selectivity and can lead to a more precise result.
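A short sketch of how such a synthetic relation might be generated follows (written purely for illustration; the row counts, value ranges, and the 10% predicate are the arbitrary choices used in the example above, not fixed parts of the methodology).

    import random

    rng = random.Random(42)

    # Key attribute: unique integers 0..9999, scrambled with a random number generator.
    keys = list(range(10_000))
    rng.shuffle(keys)

    # Non-key attribute: uniformly distributed over 0..99, so a predicate such as
    # "attribute < 10" selects roughly 10% of the tuples.
    rows = [(key, rng.randint(0, 99)) for key in keys[:5000]]

    selected = [row for row in rows if row[1] < 10]
    print("observed selectivity:", len(selected) / len(rows))   # close to 0.10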

A major concern with the use of synthetic databases lies in the question of independence between attributes within a relation. In order to be certain that the attributes are truly independent, each attribute within the relation must have an occurrence for each and every value of every other attribute in the relation. For example, a relation with two attributes (attr_1 and attr_2) containing values 0 through 1 and 0 through 2, respectively, would have to contain the following records to demonstrate true attribute independence [ULLM 82].

Table 4.1: Independent Attributes

    attr_1   attr_2
      0        0
      0        1
      0        2
      1        0
      1        1
      1        2

Obviously, as the size of the relation grows, maintaining this independence leads to a very large relation.
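Generating such a fully independent relation amounts to taking the cross product of the attribute domains; a one-line sketch with the same toy domain sizes as Table 4.1:

    from itertools import product

    # Cross product of the two domains gives every (attr_1, attr_2) combination,
    # i.e. the fully independent relation shown in Table 4.1.
    independent_relation = list(product(range(2), range(3)))
    print(independent_relation)
    # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

The combinatorial growth is immediate: with d attributes of v values each, the relation needs v^d tuples.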

While the use of a synthetic database does add a certain amount of control, there are some drawbacks to using synthetic data. Synthetic relations fail to incorporate the complex data dependencies inherent to real data. For example, in a real database attributes within a relation, such as years of service and salary, have definite correlations which synthetic databases do not provide. Another factor to consider in choosing application vs. synthetic databases is the purpose of the testing. If the benchmark is being performed in order to select a system for a specific application, it would obviously be preferable to test some data that would be used in the application. The use of a synthetic database is most suitable when designing a general benchmark over several database systems. However, the use of synthetic data takes away a certain measure of real world applicability present when using real data.

4.2.2 Database Size.

Database size is a key parameter and should be tested at various levels. Database sizes, 'small' to 'large', should be identified by studying the application system or the testing environment. In its use here, 'large' means the largest database the application system is likely to use or the largest database available for testing. The term 'small' represents the point where the best performance is expected for the application. These points will often be identified during the benchmark and should be estimated initially.

Benchmark testing should begin on the smallest test database. By stepping to larger sizes, performance changes can be readily noted. When large performance gaps are noticed between database sizes, additional database sizes may be tested in order to discover the point where the database size causes performance deterioration. Although availability of data may often dictate how large a database is used in

the benchmark, it is important to set the upper size at an adequate level to assure that possible applications will not normally exceed that level. By identifying a higher level than would normally be attained, the effects of possible future growth can be evaluated.

The choice of the sizes of the databases to be tested will also be directly related to the system being tested and the resources available. If the testing is to be done on different configurations of one system, or comparing very similar systems, it is quite likely that each system could be tested on all of the sizes of the database to be tested. If, on the other hand, the testing is comparing aspects of systems of different sizes with differing capabilities, the database sizes tested on one system may be limited to a subset of the sizes tested on a larger system (e.g., micros and minis).

4.2.3 Indexing.

It is important to test the effects that indexing has on the performance of the systems being tested. Index levels should be set that will allow the systems to show differences in performance related to using the indexes. The transactions in the benchmark workload should be designed to highlight the potential performance benefits and costs of using indexes.

Some of the possible index levels that could be selected include:

0. No indexing. Studying results with no indexing provides a basis for comparison of the change in performance when utilizing indexes and is therefore essential.

1. Clustered indexes on key attributes. A clustered index is an index on an attribute whose data values are stored in sequential order. Clustered indexes on keys can be used effectively for retrieval and join operations.

2. Nonclustered indexes on secondary key attributes. Secondary key indexes, if used properly, will enhance the performance of queries that contain selection conditions using these keys.

Page 30: DOCUMENT RESUME IR 012 292 .AUTHOR Yao, S. Bing; Hevner, … · 2014. 3. 4. · DOCUMENT RESUME. IR 012 292. Yao, S. Bing; Hevner, Alan R. A Guide to Performance Evaluation of Database.

3. Complete indexing. Indexes are placed upon all at-tributes in the database. The benefits of completeindexing must be weighted against its costs.

Database systems can also be tested for their ability to provide combined indexes. A combined index is one that ranges over two or more fields [STON 74]. Combined indexes can be defined on attribute groups that appear frequently together in queries.
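
As a concrete, hedged illustration of these index levels (not taken from the report), the following sketch uses SQLite syntax; the relation and attribute names are hypothetical, and since SQLite has no true clustered indexes an ordinary index on the key attribute stands in for level 1:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rel1 (key_att INTEGER, att2 INTEGER, att3 TEXT)")

    # Level 0: no indexing -- nothing to create; queries scan the relation.
    # Level 1: index on the key attribute.
    conn.execute("CREATE INDEX idx_rel1_key ON rel1(key_att)")
    # Level 2: nonclustered index on a secondary key attribute.
    conn.execute("CREATE INDEX idx_rel1_att2 ON rel1(att2)")
    # Level 3: complete indexing -- an index on every remaining attribute.
    conn.execute("CREATE INDEX idx_rel1_att3 ON rel1(att3)")
    # Combined index over an attribute group that appears together in queries.
    conn.execute("CREATE INDEX idx_rel1_att2_att3 ON rel1(att2, att3)")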

4.3 Benchmark Workload

The benchmark workload is the heart of the benchmark experiment. While the system configuration and test data define the running environment, the benchmark workload defines the tests that will be run on this environment. By choosing a variety of transactions and then modelling user and system workloads by utilizing a job scripts model, a variety of benchmark parameters can be tested. The job scripts model is defined in Section 4.3.3. Transaction types that should be considered in the testing are discussed in the next section.

4.3.1 Transactions.

Each system, and each user on that system, is involved in a variety of transactions. A transaction is defined here as a well-defined user action on the database. In testing a database system, a variety of transaction types should be run. A benchmark should include the following types of transactions (illustrated in the sketch after this list):

1. Single-Relation Queries - These queries involve only one relation. Testing on single relation queries should include queries on different sizes of relations, queries retrieving all or part of a relation, queries with and without indexes, and queries using aggregates and sorting functions such as "group-by" and "order-by".

2. Multi-Relation Queries - These queries involve more than one relation. Testing should include all of the variables discussed in the single-relation queries. A benchmark should also include testing on different join methods and varying the number of relations that are included in the query. Joins must be tested with and without indexes, and with different sequences of joining the relations.

3. Updates - Updates include functions such as modification, insertion, and deletion. Testing on updates must be performed carefully. These queries change the state of the database. If the effects of the updates are not removed, further testing on the database will not be performed on exactly the same data. Therefore the update effects must be removed from the database before further testing.
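
The sketch below gives one example of each transaction type, written in Python against hypothetical emp and dept tables in SQLite; it is an illustration, not the benchmark workload itself. The update is rolled back so that the database state is unchanged for later tests, as required above:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (emp_no INTEGER, name TEXT, salary INTEGER, dept_no INTEGER);
        CREATE TABLE dept (dept_no INTEGER, dept_name TEXT);
    """)

    # 1. Single-relation query with an aggregate and a sort.
    single = ("SELECT dept_no, COUNT(*) FROM emp WHERE salary > 20000 "
              "GROUP BY dept_no ORDER BY dept_no")

    # 2. Multi-relation query (a two-way join).
    multi = ("SELECT e.name, d.dept_name FROM emp e, dept d "
             "WHERE e.dept_no = d.dept_no AND e.salary > 20000")

    for query in (single, multi):
        conn.execute(query).fetchall()

    # 3. Update whose effect is removed before further testing.
    conn.execute("UPDATE emp SET salary = salary * 1.1 WHERE dept_no = 10")
    conn.rollback()     # remove the update effects, restoring the database state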

4.3.2 User-System Environment.

The next consideration in the design of the benchmark workload is the area of the user-system environment. This environment is a combination of factors including whether the system is on-line or batch, the frequency at which transactions enter the system, utilities offered by the system, and the programming interface.

There are basically two methods of executing transactions: batch and on-line.

1. Batch - A batch transaction is submitted and run with no interaction between a user and his job during processing.

2. On-line - An on-line transaction allows the user to interact with the program.

The frequency at which transactions enter the system is another factor that should be considered in the workload environment. Considerations regarding the amount of think time between transactions, the number and type of transactions, and the frequency of transactions on the system as the number of users grows, all must be taken into account. The job-scripts model, defined in the next section, is a convenient method of modelling these factors.

The utilities offered by the system are of prime concern to the database administrator and their functionality will be important to him. The utilities include creating, loading, and extending a table, creating an index, database dump, recovery, and checkpointing. User utilities may include sort packages, statistical routines, and graphics packages that can be interfaced with the database system.


4.3.3 Job-Scripts Model.

A job scripts model will be utilized to model the transaction load of each user. By defining several distinct transaction loads and running them concurrently, a multi-user environment can be modelled.

A transaction load consists of several qualitative and quantitative aspects. Some of the qualitative aspects relating to transaction load are: the types of queries which occur (e.g., simple retrieval on a single relation or complex queries involving many relations), the possible modes used for modification of the database (e.g., update through menu modification, batch updates, updates in conjunction with queries), the level of user-system interaction (e.g., on-line vs. batch access), and whether or not users commonly access the same data files.

Some of the quantitative aspects of the transaction load include: the percentage of time that each type of database action is performed, the average amount of data returned to a user per transaction, the average number of users, and the number of users involved in each type of transaction. Therefore, the transaction load defines not only the number of users present in the system, but also the quality and degree of activity introduced into the system by each user.

Let a set of defined transactions be represented as T = { t(1), t(2), ..., t(n) }, where t(i) represents the i-th transaction.

A job script represents a typical workload of an application user on the system. Let S be a set of pre-defined job scripts, s(i), of the form:

    s(i) = < t(j(1)), x(1), t(j(2)), x(2), ..., t(j(m)) >

where t(j(k)) is a transaction from the set T and x(i) stands for the average think time found between successive user transactions. The x(i) parameters can also represent the interarrival times of transactions into the system. Job scripts can be designed to characterize a particular type of user in the system. For example, a database user may be described as retrieval intensive or update intensive. In either case a job script can be designed for that user by selecting transactions that best represent the user's typical processing.


The job script model consists of two defined sets: the job scripts, S = { s(1), s(2), ..., s(p) }, and the set of users, U = { u(1), u(2), ..., u(r) }. The workload for the system is defined by the number of users on the system and the assignment of users to job scripts. Each benchmark study is parameterized by the mix of job scripts that the systems execute. The assignment of users to one or more job scripts provides a very effective and clear way to characterize this job mix. An additional advantage of this job scripts model is that it can be easily implemented by a test generator program in the actual benchmark study.
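
One possible concrete rendering of these sets (offered as an illustration; the transaction texts, think times, and user names are hypothetical) is simply a pair of dictionaries and an assignment table in Python:

    # T: the set of defined transactions t(1), ..., t(n).
    T = {
        "t1": "SELECT COUNT(*) FROM emp WHERE salary > 20000",
        "t2": "SELECT e.name, d.dept_name FROM emp e, dept d WHERE e.dept_no = d.dept_no",
        "t3": "UPDATE emp SET salary = salary * 1.1 WHERE dept_no = 10",
    }

    # S: job scripts, each alternating transactions with think times x(i) in seconds.
    S = {
        "s1": [("t1", 2.0), ("t2", 5.0), ("t1", 2.0)],   # retrieval-intensive user
        "s2": [("t3", 1.0), ("t3", 1.0), ("t1", 3.0)],   # update-intensive user
    }

    # U: users, and their assignment to job scripts, defining the job mix.
    U = {"u1": "s1", "u2": "s1", "u3": "s2"}

A test generator program can read such a description and emit one job script file per user for the benchmark runner described next.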

A job script file for each user is read into a benchmark runner program that executes on the host computer system. The runner executes the transactions in job script order on the database system. This program also gathers performance data from the database system by recording statistics from hardware and software monitors. As an example, the following performance data can be collected on a database system for each transaction:

1. Parse time - The time required to parse the transaction and send it for execution.

2. Execution time - The time required to develop an access strategy for a transaction.

3. Time to first record - Time until the first result record is presented to the user.

4. Time to last record - Time until the last result record is presented to the user.

5. Number of records retrieved - The result size of the transaction; the size in bytes of the records should also be collected.

The benchmark runner algorithm can be outlined as follows:

Algorithm Runner:

Begin

Read Job-script-file into trans-array until EOF;

Open database system;

While (trans-array not empty) do

Read next transaction from trans-array;
Parse transaction and send to database system;
Execute transaction;

Record time statistics on:
    time-to-parse
    time-to-execute
    time-to-first
    time-to-last;

Record size statistics on:
    number of records in result
    size of result records;

Print gathered statistics;

End while;

Close database system;

End of Algorithm.
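
A minimal, runnable rendering of Algorithm Runner is sketched below purely as an illustration of the structure above; it is not the tool used in the report. SQLite stands in for the database system under test, the job-script file format (one SQL statement per line) and the file names are assumptions, and parse time is folded into execution time because SQLite does not expose it separately:

    import sqlite3
    import time

    def run_job_script(script_path, db_path):
        # Read job-script-file into trans-array until EOF.
        with open(script_path) as f:
            trans_array = [line.strip() for line in f if line.strip()]

        conn = sqlite3.connect(db_path)          # open database system
        for transaction in trans_array:
            t0 = time.perf_counter()
            cursor = conn.execute(transaction)   # parse, plan, and begin execution
            t_exec = time.perf_counter()

            rows, t_first = [], None
            for row in cursor:                   # fetch the result one record at a time
                if t_first is None:
                    t_first = time.perf_counter()
                rows.append(row)
            t_last = time.perf_counter()

            # Print gathered statistics for this transaction.
            print(transaction)
            print("  time to execute      :", t_exec - t0)
            print("  time to first record :", (t_first - t0) if t_first else None)
            print("  time to last record  :", t_last - t0)
            print("  records retrieved    :", len(rows))
        conn.close()                             # close database system

    if __name__ == "__main__":
        run_job_script("script_u1.sql", "testdb.sqlite")

For multi-user tests, several operating system processes can each run a copy of this program on its own job script file, as described in the next paragraph.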

For each benchmark test a job script is defined and executed on the different database systems to be tested. Statistics for each transaction in the job script are printed in a convenient format. For multi-user tests on the databases, multiple copies of the benchmark runner are run simultaneously on separate job scripts. Statistics are gathered for each job script.

4.3.4 Background Load.

The accurate evaluation of any system must take into account the type and amount of non-database work being performed by the host computer system. Based on typical system usage, as defined by the application, a number of non-database programs should be designed for background execution on the tested systems. These programs should be modelled using a job script in much the same manner as the user transaction loads. In this way measurements can be obtained for the effects of the background load on the database load, while the effects of the database load on the background load can also be measured. By enlisting the job script approach, parameters on the background load such as arrival rate of the programs, type of programs in the background load (e.g., CPU-bound vs. I/O-bound), and priority given to programs in the background load, can be varied. Different background loads can be utilized in identifying system saturation points for the combined database and non-database system loads.
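
The following is a hedged sketch of such a background load: one CPU-bound and one I/O-bound job launched as separate processes alongside the database workload. The durations, block size, and scratch file name are illustrative assumptions:

    import multiprocessing
    import os
    import time

    def cpu_bound_job(seconds=30):
        end = time.time() + seconds
        x = 0
        while time.time() < end:
            x += 1                          # burn CPU cycles

    def io_bound_job(path="bg_scratch.dat", seconds=30):
        end = time.time() + seconds
        block = b"x" * 4096
        with open(path, "wb") as f:
            while time.time() < end:
                f.write(block)              # generate disk traffic
                f.flush()
                os.fsync(f.fileno())
        os.remove(path)

    if __name__ == "__main__":
        jobs = [multiprocessing.Process(target=cpu_bound_job),
                multiprocessing.Process(target=io_bound_job)]
        for j in jobs:
            j.start()                       # run while the database benchmark executes
        for j in jobs:
            j.join()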


4.4 Experimental Design

Now that the running environment and possible alternatives for testing parameters have been defined, the parameters to be used in the testing should be selected. In order to properly evaluate the testing it will be necessary to set fixed values for most parameters, while testing others at various levels.

It is also at this stage of the benchmark design that the performance criteria to be used for evaluation are considered. The evaluation criteria selected are an essential key to understanding and correctly interpreting the benchmark results. In this section the selection of the measurement criteria will be discussed, as well as a review of the possible experimental parameters that can be varied in the testing.

4.4.1 Performance Measurement.

The relevant measures that may be considered for use in the performance evaluation include system throughput, utilization, and response time. Each of these will be discussed in the following paragraphs.

1. System throughput is defined as the average number of transactions (e.g., queries) processed per unit time. It stands to reason that as the number of queries on the system increases, approaching a saturation level, the system throughput will also increase. The throughput is a good indicator of the capacity and productivity of a system.

2. Utilization of a resource is the fraction of the time that the particular resource is busy. The main resources involved with processing database calls are the CPU, secondary storage, and the communication channels between them. Utilization, as with system throughput, is directly related to the number of transactions on the system at one time.

3. Response time can be taken to mean several different things. First, the response time could be considered as time-to-first-record. In other words, from the time the query enters the system until the time the first record in the response is returned. Another definition of response time could be time-to-last-record. This is from the time the query enters the system until the last record of the response is available. Using this measurement would, of course, cause the size of the result to influence the performance measure. If the system is I/O bound the time to retrieve and deliver the entire response could eclipse the actual transaction processing time.

System throughput, utilization, and response time are all related in some sense. All three measurements tend to increase as the load on the system increases, but while a high system throughput and a high utilization rate are perceived as desirable, a large response time carries a negative connotation.

Determining resource utilization and/or system throughput can be a very easy, or very difficult, task depending upon the support offered by the system being tested. Some systems provide tools which offer easily accessible statistics while others require a great deal of user intervention using software tools to acquire the necessary information. Response time is usually the most readily available measure and is also the most user apparent. Because of these facts response time is the measurement utilized in most benchmarks.

The method used to perform the necessary calculations in determining response time is often a function of the support offered by the system being tested. For example, a database system running on an operating system which allows a flexible interface could support a very detailed timing algorithm, while a system with limited interfacing ability may require the use of a much more general timing method (e.g., setting a time at query input and again at query completion and recording the difference).
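
The 'general timing method' mentioned above can be rendered as a small wrapper; this is a sketch only, and run_query stands for whatever hypothetical routine submits a transaction to the system under test:

    import time

    def timed(run_query, transaction):
        start = time.perf_counter()          # time at query input
        result = run_query(transaction)
        stop = time.perf_counter()           # time at query completion
        return result, stop - start          # response time (time to last record)

    def throughput(num_transactions, elapsed_seconds):
        # System throughput over a run: transactions processed per unit time,
        # derived from the same wall-clock timestamps.
        return num_transactions / elapsed_seconds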

4.4.2 Experimental Variables.

A number of important tests can be performed in the benchmark by selecting and varying one or more dependent variables. The possible parameters that could be selected include:

1. Database Size - Several sizes relevant to the system being tested should be selected.

2. Query Complexity - Two factors are considered in determining query complexity. Greater complexity of the query predicate leads to increased parsing time and increases the potential for query optimization.

Within each query set, the predicate complexity is increased by adding additional conditions. A method of complexity classification has been developed by Cardenas [CARD 73]. The complexity of a query predicate increases in the following manner:

a. An atomic condition simply places a simple selection on a relation (e.g., Relation1.Att1 = '10').

b. An item condition is a disjunction (OR) of two atomic conditions on the same attribute (e.g., Relation1.Att1 = '10' OR Relation1.Att1 = '20').

c. A record condition is a conjunction (AND) of two item conditions (e.g., Relation1.Att1 = '10' AND Relation1.Att2 = 'ABC').

d. A query condition is a disjunction (OR) of record conditions (e.g., Relation1.Att1 = '10' OR Relation1.Att2 = 'ABC').

Second, the number of relations involved in a query indicates query complexity. More costly join operations are required when multiple relations are involved.

3. Records Retrieved - The response time of a transaction will depend greatly upon the number of records in the query result.

4. Order of Query Execution - The different database systems use internal memory as buffers for the storage of needed indexes and intermediate results during query execution. To test the effect of the buffer memory on the order of query execution, job scripts should be formed which consist of similar transaction loads executed sequentially. This will identify any efficiencies caused by buffering.

5. Indexing - Indexing should be tested at various levels. The use of at least three levels of indexes is recommended: no indexes, primary key indexes, and complete indexes.

6. Sorting - Sorting costs should be tested. One method of doing this is to add 'order by' clauses to the query sets. By comparing sorted and unsorted queries, the costs of sorting in the different database systems can be determined.


7. Aggregation Functions - Aggregation functions should be tested by adding 'count' or 'max' to the output list.

8. Number of Users - Multiple users contend for database system resources. This tends to increase the response time and increase the throughput. To study contention, tests should be run in which each user runs an identical job script (see the sketch after this list). Other tests would include different combinations of job scripts. Multiple user tests will also test the database system's capabilities to control concurrency. Concurrent updates on the same data item will test the locking protocols of the systems.

9. Background Load - Tests should include runs varying the non-database jobs in the host computer system. The number and type of jobs in the background can be varied. Background jobs can be designed as CPU or I/O intensive jobs. Tests can determine the effect of these jobs on the performance of the database queries. By measuring the performance of the background jobs under different query loads, the effect of database jobs on the background jobs can also be studied. This is called a reverse benchmark.

10. Robustness - System performance should also be measured under controlled failure conditions in the system. This includes simulating the conditions of loss of power, disk failure, and software bugs. The capability to recover from these failures gracefully is an important feature of any database system. This includes the system's ability to recover on-going transactions, and to back out such transactions and request resubmission. Possible tests here include a deadlock test (it may or may not be easy to induce transaction deadlock) and disaster tests including a failed disk (demonstrated by powering down the disk), a failed system (power down the entire system), and an aborted transaction.
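
As indicated in item 8, a multi-user contention test can be mocked up by launching several identical user processes against the same database. The sketch below is illustrative only; it assumes a test database (testdb.sqlite) and an emp table like those in the earlier sketches, and a trivial two-query script:

    import multiprocessing
    import sqlite3
    import time

    def user_worker(db_path, script):
        # One simulated user: execute the job script (a list of SQL strings) once.
        conn = sqlite3.connect(db_path)
        for transaction in script:
            start = time.perf_counter()
            conn.execute(transaction).fetchall()
            print("response time:", time.perf_counter() - start)
        conn.close()

    def multi_user_test(num_users, db_path, script):
        procs = [multiprocessing.Process(target=user_worker, args=(db_path, script))
                 for _ in range(num_users)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

    if __name__ == "__main__":
        script = ["SELECT COUNT(*) FROM emp",
                  "SELECT * FROM emp WHERE salary > 20000"]
        for n in (1, 2, 4, 8):               # increasing numbers of concurrent users
            multi_user_test(n, "testdb.sqlite", script)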

A benchmark may be either application-specific, meaning that it is intended primarily to evaluate the suitability of the candidate database systems for a particular application; or it can be more general, in which case the experiment is intended to perform an overall evaluation. When selecting parameters such as the types of transactions, number of users, and background load, the type of benchmark that is being performed will have a direct relation on the parameters selected. For example, when selecting transactions to be run, if the benchmark is application-specific, the application environment would be studied and transactions modelled to duplicate this environment. In the more general case, a wide variety of transactions would be used with the intention of including the spectrum of actions typically performed in a user environment.


5. BENCHMARK EXECUTION

When the experiment has been formally defined the next step is to implement the design on each of the candidate systems. The actual execution of the benchmark can be broken into three phases: benchmark initialization, benchmark verification, and benchmark testing.

5.1 Benchmark Initialization

Before any testing can be performed the benchmark must first be initialized. During initialization the database is prepared for testing and the benchmark runner program is readied for use.

5.1.1 Loading.

The first step in preparing for the benchmark is to load the database into each of the test systems. While loading may seem like a simple step, this is often not the case. In most benchmark experiments the database is transferred from one system to each of the systems to be tested. When this is the case caution must be used to avoid inconsistencies in the databases. Any transfer of data between two systems requires the use of transfer protocols. When transferring a small amount of data the protocol between the systems may handle the details quite well. Test databases for benchmarking are usually quite large and transferring can be a very lengthy task. While not major factors, power failures, surges, and hardware malfunctions can create inconsistencies in the transfer and result in incorrect data.

If application data is used, the data must be reformatted into a usable form for the test system. The reformatting of data can sometimes lead to unexpected problems which result in either the loading of unusable data, or not being able to load the data. If the data loaded into the test system turns out to be incorrect it may have to be dumped and the loading reinitialized with the necessary corrections. When loading large databases this can be a costly and time-consuming process.

Using a synthetic database does not eliminate the loading problems. If the database is generated in one system and subsequently transferred to the test systems, the same concerns as above apply. If, on the other hand, the database is generated on the test system, the consistency of the test data must still be verified.

5.1.2 Timing.

It is also in this initialization phase that the timing algorithm in the benchmark runner is chosen, designed, and tested. In Section 4.3 a detailed discussion of the possible measurement criteria to be returned by the timing algorithm is presented. In this phase the mechanism to be used is considered and tested. In the analysis of a single system, or compatible systems, the timing algorithm chosen should provide as detailed results as possible. Whether hardware monitors or software monitors are to be used, the initialization phase should allow for the tuning and testing of the tool.

When benchmarking several systems, the measurement criteria chosen must be implemented across all systems. The lack of ability to support a detailed timing algorithm by one system will sometimes limit the criteria that can be evaluated in comparisons. To insure that the timing method chosen is implementable across all test systems, the monitoring should be tested prior to any actual testing.

5.2 Benchmark Verification

Results obtained from benchmark experiments must be verified in order to be of any value. Three types of verification are discussed below.

1. Equivalence Verification. Each transaction is coded in the data manipulation language (e.g., SQL) of the different systems to be tested. It must be verified that the transactions are equivalent across all of the different systems to be tested. This equivalence can be tested by executing the transaction and checking that the results are correct and identical on all systems (a sketch of such a check follows this list). A more rigorous approach would be to prove, via semantic proving techniques, that the transactions are equivalent and will always produce identical results.

2. Optimum Performance Verification. In order to achieve fairness in the benchmarking, it should be verified that the transactions are coded so that the best performance can be realized in a typical configuration of each system. The existence and use of access paths (e.g., index structures) should be as close to identical as possible in the different systems. The systems will be set up with normal amounts of processing power and storage as would be found in a typical configuration. In this way, each system will be shown in its 'best light'. A method of accomplishing this performance verification is to execute the transactions (after equivalence verification) on each system and collect performance results. Each vendor should be asked to evaluate the performance of his system and to suggest ways to tune the system for better performance.

3. Consistency Verification. During the execution of the benchmark experiments, a method for checking the consistency of performance results should be included in the experimental design. A consistency check consists of running a particular benchmark more than once and verifying that the performance results are consistent between the runs. While not all benchmark runs need to be duplicated, performing selected consistency checks will provide some assurance that the performance results are a function of the defined system environment.
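
As referenced in item 1, equivalence and consistency checks can be automated along the following lines. This sketch is illustrative: run_query_a and run_query_b stand for hypothetical routines that execute a transaction on the two systems and return the result rows, and run_benchmark returns a single response-time figure for a repeated run:

    def equivalent(run_query_a, run_query_b, transaction):
        # Equivalence verification: identical results on both systems.
        # Sorting ignores ordering differences when no ORDER BY is imposed.
        return sorted(run_query_a(transaction)) == sorted(run_query_b(transaction))

    def consistent(run_benchmark, tolerance=0.10):
        # Consistency verification: run the same benchmark twice and check
        # that the measured response times agree within the tolerance.
        first = run_benchmark()
        second = run_benchmark()
        return abs(first - second) <= tolerance * first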

The implementation of each job script must be verified both for correctness of its semantics and optimality of performance. When multiple database systems are being tested, the scope for verification measures is increased. In order to verify that the proper transactions are being used, the results of the queries (e.g., record counts, text of the returned fields) can be compared across database systems. Any discrepancies observed will trigger an inspection of the offending scripts. Common errors are: a missing qualification sub-clause, an improper sort order, or a syntactically correct but semantically incorrect operator in a selection clause.

Scripts may be changed during the initial check-out phase. A single benchmark experiment will run one or more scripts simultaneously on a target database system. Each script will be monitored so that the system response time which it experiences can be recorded. Simultaneously, statistics describing overall system throughput will be recorded. The choice of scripts to run together will be determined based upon the behavior in the database systems that are being tested. This is one area in which it is expected that preliminary evaluation of the benchmark results will feed back into the experiment, as new groups of scripts will be suggested and tried out to probe specific performance features.

The details involving the transaction verification, as well as those involved with running the benchmark, will have a direct relation on how well the results from each system can be analyzed and compared to one another.

5.3 Benchmark Testing

Once the validity of the database has been assured, it is time to begin planning the benchmark runs on each of the systems to be tested. Differences between systems will necessitate differences in the approach to each implementation. For example, the limitations of a particular system's architecture may allow only a subset of the experiment to be applied. So, while the methodology flowchart presented earlier shows a clean interface between the design and execution phases, in reality this is rarely, if ever, the case when implementing the benchmark on more than one system. It will often be necessary to return to the design phase in order to restrict or revise the planned testing on each particular system to be tested.

Any general design of a benchmark will encounter problems specific to a given environment. The variations of the hardware and operating system environments, as well as the particular database system, will cause the experiment to vary from its original design. In an application-oriented benchmark any system which lacks the functionality to perform all of the desired operations presumably has already been eliminated during the features analysis phase. However, a general benchmark may include a diverse set of systems, some of which cannot perform all of the tests, or may perform limited versions of them.

The hardware involved can cause departures from what is desired. If, for instance, there is a limited amount of main storage available, the operating system and the database system taken together may preclude the testing of any type of background load altogether. Also, limited amounts of main storage reduce the possible complexity of the database system. Interaction between the various hardware/operating system/database system components can have deleterious effects. For example, the conflicting seeks used by an operating system and the DBMS system software are shown to have degraded performance of benchmark tests run by System Development Corporation [LUND 82].


In the case of a small computer database system, the limited amount of main storage may not allow the database system enough space to have the code to implement all of the functions specified in the scripts. Aggregate functions (e.g., MAX, MIN, AVERAGE) are not present in the query languages of many database systems. More exotic (but useful) operations such as an 'outer join' are present in very few of today's systems.

The benchmark tests must be carefully monitored during their execution, and as knowledge is gained from the experiments, it is expected that the original experiment will be redesigned to take advantage of that knowledge. Hence, running the benchmarks is not a completely "cut-and-dried" task. Most of the benchmark involves running scripts against each of the target systems, while varying individual parameters of the systems. After each run, the results will be scanned to verify that embedded consistency checks have not uncovered any anomalies.

Once the experiments are running smoothly, the effort of interpreting the data will begin, and henceforth, the processes of gathering data and interpreting it can proceed in parallel. The results obtained should suggest new combinations of parameters and scripts, or variations on old ones, which will be run to probe specific aspects of a system's performance.


6. BENCHMARK ANALYSIS

The final phase of benchmarking is the analysis of results. Benchmark testing often produces large amounts of raw data which must be condensed and analyzed. Evaluation of the data generated during benchmarking must begin before the tests have been completed. This provides feedback during the testing by suggesting which types of experiments need to be repeated in more detail, or should be extended in some way. Summarizing the meaningful information from these results and discussing them in a report form is a key step in the benchmark testing. Explanations are provided for any significant findings and graphs and tables showing the comparison of results are included. Analyses are made on both individual systems, comparing results by varying parameters, and between systems, comparing one's results to the other's results. The following section discusses some of the possible comparisons to be drawn when analyzing a single system or when comparing several systems.

Each test parameter should be evaluated in as isolated an environment as possible so that the results can be directly attributable to the current configuration of parameters. A matrix of parameters to be evaluated (e.g., database size, background load, etc.) should be designed and performance benchmarks run for each combination.

When the testing is complete, results of each system are thoroughly analyzed. The parameters defined as those to be varied in the testing are to be monitored, and their behavior summarized. Graphs are utilized to demonstrate these behaviors. Graphs provide a clear, concise synopsis of the relationship between parameters and should be utilized frequently in the analysis phase.

When evaluating a single system, a variety of comparisons should be studied in order to identify interesting results. Below are several possible comparisons and effects that should be included in the analysis.

1. Response Time vs. Query Complexity. Under normal conditions, the relationship between these two parameters should be an increase in response time as the query becomes more complex due to an increase in parsing and execution. This is shown when considering the time-to-first-record response time statistic.

2. Response Time vs. Records Retrieved. This relationship should provide an interesting result regarding the time relationship when retrieving increasing numbers of records.

3. Response Time vs. Indexing. Indexing should have a positive effect (decreasing response time) for most queries but this is not always the case. Some indexing may actually cause the response time to increase if the query accesses a large percentage of the relation (high hit ratio).

4. Buffering. Efficient use of buffer space can sometimes lead to improved performance. However, this requires the use of special buffer management algorithms which are not implemented in most database systems.

5. Sorting. Sorting can be an expensive process. A test showing the difference in a query when it is sorted vs. when no sorting is required can identify the system's strength in this area.

6. Aggregation. The resulting increase in cost as the number of records retrieved increases using an aggregate function should be documented.

7. Multi-user Results. The effects of multiple users on the database are an important and realistic testing parameter. The amount of increase in response time as the number of users on the system increases should be calculated and graphed (see the plotting sketch after this list).

8. Background Load Results. In much the same manner as the multi-user environment, the background load can have a dramatic effect on the response time. As the background load increases the resulting increase in response time on the database system workload should be monitored.

9. Reverse Benchmark. The database workload will have an effect on the performance of applications on host computer systems. An analysis of this effect should be performed.
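
As noted in item 7, these relationships are best shown graphically. The sketch below plots response time against the number of concurrent users using matplotlib; the data points are placeholders standing in for measured benchmark results, not actual measurements:

    import matplotlib.pyplot as plt

    users = [1, 2, 4, 8, 16]
    mean_response_time = [0.4, 0.5, 0.9, 2.1, 5.3]   # placeholder values only

    plt.plot(users, mean_response_time, marker="o")
    plt.xlabel("Number of concurrent users")
    plt.ylabel("Mean response time (seconds)")
    plt.title("Response time vs. number of users")
    plt.savefig("response_vs_users.png")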

Performance saturation points will occur when a system shows a marked decrease in performance based upon a resource becoming overloaded. System saturation points should be identified and plotted for each of the systems tested. The saturation level will be a function of the number of users, types of workload, background load, and other tested parameters. Therefore, an explanation of the saturation points on each configuration and some comments regarding the level are necessary to back up the graphs.

Finally, the end product of any benchmark experiment should be a report which summarizes the interesting findings of the testing, discusses the reasons behind the findings, and draws comparisons between the systems tested. If any possible solutions exist to problems identified in the benchmark they should be recommended in writing. This report should stress general, rather than specific, results and therefore provide an overall evaluation of the systems.


7. SUMMARY AND CONCLUSIONS

This report has presented a general methodology for the benchmarking of database systems. Previous projects on database system benchmarks, surveyed in Section 2, have identified many different factors that influence database performance. The objective in this report has been to describe a framework into which these many database system parameters can be fitted. Three principal phases of a database system benchmark have been identified.

1. Benchmark design includes the design of the system configuration, the test data, and the benchmark workload. These parameters are controlled in the experimental design of the benchmark. Performance measures and the means to gather the performance statistics are selected.

2. Benchmark execution implements the design on one or more database systems. This phase requires strict verification procedures and may involve feedback for improving the benchmark design.

3. Benchmark analysis is the phase in which the raw performance data is studied and observations and conclusions are made. Single system analysis and multiple system comparisons form the result of the benchmark.

The design of a benchmark methodology is a complex task. Previous work on database system benchmarks has been applied to selected cases of interest. In this methodology an attempt has been made to present a framework in an orderly, top-down fashion to assist the designer of a benchmark experiment in the design and implementation of a benchmark.

In attempting to design a generalized, standard approach to benchmarking, the complexity of the actual task of designing a specific benchmark must be taken into account. No generalized methodology can provide a complete list of considerations for the design of an actual experiment. Instead, the methodology can only provide the user with as comprehensive a list of system parameters as possible. Each experiment and each system has its own characteristics and constraints. While the methodology will help the designer by providing a comprehensive framework for the benchmark, it is the designer's task to fit the particular aspects of each database system, application environment, and operating constraints into a viable benchmark study.


REFERENCES

[APPL 73] Applied Computer Research, "IDS Simulator - Functional Description," Internal Report, Jan. 1973.

[ASTR 80] Astrahan, M., Schkolnick, M. and Kim, W. "Performance of the System R Access Path Selection Mechanism," Proceedings IFIP Conference, 1980.

[AUER 81] Auerbach Publishers Inc. Practical Data Base Management, Reston Publishing Company, 1981.

[BANE 80] Banerjee, J., Hsiao, D. and Ng, F. "Database Transformation, Query Translation, and Performance Analysis of a New Database Computer in Supporting Hierarchical Database Management," IEEE Transactions on Software Engineering, Vol. SE-6, No. 1, January 1980.

[BARL 81] Barley, K. and Driscoll, J. "A Survey of Data-Base Management Systems for Microcomputers," BYTE, November 1981.

[BARO 82] Baroody, A. and DeWitt, D. "The Impact of Run-Time Schema Interpretation in a Network Data Model DBMS," IEEE Transactions on Software Engineering, Vol. SE-8, No. 2, March 1982.

[BATO 82] Batory, D. "Optimal File Designs and Reorganization Points," ACM Transactions on Database Systems, Vol. 7, No. 1, March 1982.

[BITT 83] Bitton, D., DeWitt, D., and Turbyfill, C. "Benchmarking Database Systems: A Systematic Approach," Computer Sciences Department Technical Report #526, Computer Sciences Department, University of Wisconsin, January 1983.

[BOGD 83] Bogdanowicz, R., Crocker, M., Hsiao, D., Ryder, C., Stone, V., and Strawser, P. "Experiments in Benchmarking Relational Database Machines," Proceedings of the Third International Workshop on Database Machines, Munich, West Germany, Sept. 1983.

[BORA 84] Boral, H., and DeWitt, D. "A Methodology for Database System Performance Evaluation," Computer Sciences Technical Report #532, Computer Sciences Department, University of Wisconsin, January 1984.

[BROD 82] Brodie, M. and Schmidt, J. "Final Report of the ANSI/X3/SPARC DBS-SG Relational Database Task Group," Document SPARC-81-690, ACM SIGMOD Record, July 1982.

[CARD 73] Cardenas, A. "Evaluation and Selection of File Organization - A Model and System," Communications of the ACM, Vol. 16, No. 9, 1973.

[CODA 76] CODASYL Systems Committee, Selection and Acquisition of Data Base Management Systems, ACM, New York, 1976.

[COFF 81] Coffman, E., Gelenbe, E. and Plateau, B. "Optimization of the Number of Copies in a Distributed Data Base," IEEE Transactions on Software Engineering, Vol. SE-7, No. 1, January 1981.

[COMP 78] Computer Surveys, Special Issue: Queueing Network Models of Computer System Performance, Vol. 10, No. 3, September 1978.

[DATE 81] Date, C. An Introduction to Database Systems, Third Edition, Addison-Wesley Inc., 1981.

[DEAR 78] Dearnley, P. "Monitoring Database System Performance," The Computer Journal, Vol. 21, No. 1, 1978.

[DEUT 79] Deutsch, D. "Modeling and Measurement Techniques for the Evaluation of Design Alternatives in the Implementation of Database Management Software," NBS Special Publication 500-49, U.S. Dept. of Commerce, National Bureau of Standards, July 1979.

[EPST 80] Epstein, R. and Stonebraker, M. "Analysis of Distributed Data Base Processing Strategies," Proceedings of the 6th VLDB, Montreal, Canada, 1980.

[FERR 78] Ferrari, D. Computer Systems Performance Evaluation, Prentice-Hall Inc., 1978.

[GALL 84] Gallagher, L.J., and Draper, J.M. Guide on Data Models in the Selection and Use of Database Management Systems, NBS Special Publication 500-108, January 1984.

[GARC 79] Garcia-Molina, H. "Performance of Update Algorithms for Replicated Data in a Distributed Database," Report STAN-CS-79-744, Stanford University, Dept. of Computer Science, 1979.

[GAVE 76] Gaver, D., Lavenberg, S. and Price, T. "Exploratory Analysis of Access Path Length Data for a Data Base Management System," IBM Journal of Research and Development, Vol. 20, No. 5, Sept. 1976.

[GLES 81] Gleser, M., Bayard, J., and Lang, D. "Benchmarking for the Best," Datamation, May 1981.

[GOFF 73] Goff, N. "The Case for Benchmarking," Computers and Automation, May 1973.

[GRIF 75] Griffith, W. "A Simulation Model for UNIVAC DMS-1100 - More Than Just a Performance Evaluation Tool," Proceedings of the Symposium on the Simulation of Computer Systems, Boulder, Colorado, 1975.

[HAWT 79] Hawthorn, P. and Stonebraker, M. "Performance Analysis of a Relational Data Base Management System," Proceedings of the ACM SIGMOD Conference, Boston, 1979.

[HAWT 82] Hawthorn, P. and DeWitt, D. "Performance Analysis of Alternative Database Machine Architectures," IEEE Transactions on Software Engineering, Vol. SE-8, No. 1, January 1982.

[HEVN 79] Hevner, A. R. "The Optimization of Query Processing on Distributed Database Systems," Ph.D. Thesis, Database Systems Research Center Report DB-80-02, Department of Computer Science, Purdue University, December 1979.

[HILL 77] Hillman, H. "A Performance Analysis of Several Candidate Computers for NASA's Flight Planning System," MTR-4599, The MITRE Corporation, March 1977.

[HULT 77] Hulten, C. and Soderlund, L. "A Simulation Model for Performance Analysis of Large Shared Data Bases," Proceedings Third VLDB Conference, Tokyo, 1977.

[KEEN 81] Keenan, M., "A Comparative Performance Evaluation of Database Management Systems," EECS Dept., University of California, Berkeley, CA, 1981.

[LAVE 76] Lavenberg, S. and Shedler, G. "Stochastic Modelling of Processor Scheduling with Application to Data Base Management Systems," IBM Journal of Research and Development, Vol. 20, No. 5, Sept. 1976.


[LAW 82] Law, A. and Kelton, W. Simulation Modelling and Analysis, McGraw-Hill Book Company, 1982.

[LETM 84] Letmanyi, H. "Assessment of Techniques for Evaluating Computer Systems for Federal Agency Procurement," NBS Special Publication 500-113, U.S. Dept. of Commerce, National Bureau of Standards, March 1984.

[LEWI 76] Lewis, P. and Shedler, G. "Statistical Analysis of Nonstationary Series of Events in a Data Base System," IBM Journal of Research and Development, Vol. 20, No. 5, Sept. 1976.

[LUCA 71] Lucas, H. "Performance Evaluation and Monitoring," Computer Surveys, Vol. 3, No. 3, September 1971.

[LUND 82] Lund, E. and Kameny, I., "Preliminary Comparison of Performance Results of ORACLE Release 2.3.1 on VAX/VMS with IDM-500 Release 17 on VAX/UNIX," SDC document SP-4158/000/00.

[MIYA 75] Miyamoto, I. "Hierarchical Performance Analysis Models for Database Systems," Proceedings First VLDB Conference, Framingham, 1975.

[NARA 75] Nakamura, F., Yoshida, I. and Kondo, H. "A Simulation Model for Data Base System Performance Evaluation," Proceedings NCC, 1975.

[NBS 80] NBS Guideline for Planning and Management of Database Applications, FIPS PUB 77, September 1980.

[OWEN 71] Owen, P. "PHASE II: A Database Management Modeling System," Proceedings of the IFIP Conference, 1971.

[POTI 80] Potier, D. and Leblanc, P. "Analysis of Locking Policies in Database Management Systems," Communications of the ACM, Vol. 23, No. 10, October 1980.

[RIES 77] Ries, D. and Stonebraker, M. "Effects of Locking Granularity in a Database Management System," ACM Transactions on Database Systems, Vol. 2, No. 3, September 1977.

[RIES 79] Ries, D. and Stonebraker, M. "Locking Granularity Revisited," ACM Transactions on Database Systems, Vol. 4, No. 2, June 1979.

[RODR 75] Rodriguez-Rosell, J. and Hilderbrand, D. "A Framework for Evaluation of Data Base Systems," Research Report RJ 1587, IBM San Jose, 1975.

[SIGN 82] Signal Technology, Inc., "OMNIBASE Test Results," Internal Report, 1982.

[SMIT 80] Smith, C. and Browne, J. "Aspects of Software Design Analysis: Concurrency and Blocking," Proceedings of the Performance 80 Symposium, Toronto, 1980.

[SPIT 77] Spitzer, J. and Patton, J. "Benchmark Analysis of JSC's Database Management Systems," Proceedings Spring 1977 ASTUTE Conference.

[STON 74] Stonebraker, M. "The Choice of Partial Inversions and Combined Indices," Journal of Computer Information Sciences, Vol. 3, No. 2, 1974.

[STON 80] Stonebraker, M. "Retrospective on a Database System," ACM Transactions on Database Systems, Vol. 5, No. 2, 1980.

[STON 82] Stonebraker, M. et al. "Performance Analysis of Distributed Data Base Systems," Memorandum No. UCB/ERL M82/85, College of Engineering, University of California, Berkeley, 1982.

[STON 83] Stonebraker, M. et al. "Performance Enhancements to a Relational Database System," ACM Transactions on Database Systems, Vol. 6, No. 2, June 1983.

[SU 81a] Su, S. et al. "A DMS Cost/Benefit Decision Model: Cost and Preference Parameters," Report NBS-GCR-82-373, National Bureau of Standards, January 1981.

[SU 81b] Su, S. et al. "A DMS Cost/Benefit Decision Model: Analysis, Comparison, and Selection of DBMS's," Report NBS-GCR-82-375, National Bureau of Standards, July 1981.

[TEMP 82] Templeton, M., Kameny, I., Kogan, D., Lund, E., Brill, D., "Evaluation of Ten Data Management Systems," SDC document TM-7817/000/00.

[TEOR 76] Teorey, T. and Das, K. "Application of an Analytical Model to Evaluate Storage Structures," Proceedings of ACM SIGMOD Conference, 1976.

[TUEL 75] Tuel, W. and Rodriguez-Rosell, J. "A Methodology for Evaluation of Data Base Systems," IBM Research Report RJ 1668, 1975.

[ULM 82] Ullman, J. Principles of Database Systems, Second Edition, Computer Science Press, 1982.

[WALT 76] Walters, R. "Benchmark Techniques: A Constructive Approach," The Computer Journal, Vol. 19, No. 1, 1976.

[WEIS 81a] Weiss, H. "Down-scaling DBMS to the Microworld," Mini-Micro Systems, April 1981.

[WEIS 81b] Weiss, H. "Which DBMS is Right for You?" Mini-Micro Systems, October 1981.

[YAO 74] Yao, S. "Evaluation and Optimization of File Organizations through Analytic Modeling," PhD Thesis, University of Michigan, 1974.

[YAO 75] Yao, S. and Merten, A. "Selection of File Organizations through Analytic Modeling," Proceedings First VLDB Conference, Framingham, 1975.

[YAO 77a] Yao, S. "An Attribute Based Model for Database Access Cost Analysis," ACM Transactions on Database Systems, Vol. 2, No. 1, 1977.

[YAO 77b] Yao, S. "Approximating Block Accesses in Database Organizations," Communications of the ACM, Vol. 20, No. 4, 1977.

[YAO 78] Yao, S. and DeJong, D. "Evaluation of Database Access Paths," Proceedings of ACM SIGMOD Conference, 1978.

[YAO 79] Yao, S. "Optimization of Query Evaluation Algorithms," ACM Transactions on Database Systems, Vol. 4, No. 2, 1979.

[YAO 84] Yao, S.B., Hevner, A., and Yu, S.T. "Architectural Comparisons of Three Database Systems," Report submitted to the National Bureau of Standards, April 1984.

[YOUS 79] Youssefi, K. and Wong, E. "Query Processing in a Relational Database System," Proceedings Fifth VLDB, 1979.
