Post on 16-Jan-2016
A Framework for Testing Database Applications
Joint work with
Phyllis G. Frankl (Polytechnic)
Saikat Dan (Polytechnic)
Filippos Vokolos (Lucent Technologies)
Elaine J. Weyuker (AT&T Labs - Research)
David Chays
Polytechnic University
Brooklyn, NY
Motivation
• Database systems play an important role in virtually every modern organization
• Faults can be very costly
• Programmers/testers may lack experience and/or time
• Little attention has been paid to DB application program correctness
Outline of Talk
• Background
• Aspects of DB system correctness
• Issues in testing DB application programs
• Architecture of tool set
• Tool for generating database states
• Additional issues and approaches
DBMS and DB application
[Figure: a DB application (e.g., a C program with embedded SQL) accesses the DB through the Database Management System.]
DB schema, e.g.:
Emp(ssn, name, addr, sal)
Dept(id, dept-name)
Relational databases
• Data is viewed as a collection of relations
– relation schema
– relation (relation state)
• Tables, tuples, attributes, constraints, for example:
create table S (ssn char(11) primary key,
                name char(25) not null)
Table S
ssn          name
001-00-0356  Johnson
012-34-5678  Smith
036-54-5555  Jones
051-88-9911  Blake
Aspects of Correctness
• Does the DBMS perform all operations correctly?
• Is concurrent access handled correctly?
• Is the system fault-tolerant?
• ...
Does the application program behave as intended?
Traditional vs. DB programs
Traditional programs:
• function from input to output
• imperative nature
DB programs:
• function from input and input DB state to output and output DB state
• declarative nature
Example of an Informal Specification
• Customer-feature table:
– customerID
– address
– features
– ...
• Billing table:
– customerID
– billing plan
– ...
Input: customer ID and name of the feature to which the customer wishes to subscribe.
• Invalid ID: return code 0
• Feature unavailable in that area: return code 2
• Feature available but incompatible with existing features: return code 3
• Otherwise: update customer's feature record, update billing table, return code 1
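The informal specification can be sketched as a small program. This is only an illustration in Python with SQLite, not the application from the talk: the table layout (customer, customer_feature, available, incompatible, billing), the helper make_db and the function name subscribe are all assumptions made for the example.

```python
import sqlite3

# Return codes from the informal specification.
INVALID_ID, SUBSCRIBED, UNAVAILABLE, INCOMPATIBLE = 0, 1, 2, 3


def make_db():
    """Build a tiny in-memory DB state (hypothetical table layout)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        create table customer(id text primary key, area text);
        create table customer_feature(id text, feature text);
        create table available(feature text, area text);
        create table incompatible(feature text, other text);
        create table billing(id text, feature text);
        insert into customer values ('011', '11235');
        insert into customer values ('012', '11235');
        insert into available values ('F1', '11235');
        insert into available values ('F2', '11235');
        insert into incompatible values ('F1', 'F2');
        insert into customer_feature values ('011', 'F2');
    """)
    return db


def subscribe(db, cust_id, feature):
    """Apply the four rules of the specification in order."""
    row = db.execute(
        "select area from customer where id = ?", (cust_id,)).fetchone()
    if row is None:
        return INVALID_ID
    offered = db.execute(
        "select 1 from available where feature = ? and area = ?",
        (feature, row[0])).fetchone()
    if offered is None:
        return UNAVAILABLE
    # A clash exists if any feature the customer already has is listed
    # as incompatible with the requested one (in either direction).
    clash = db.execute(
        "select 1 from customer_feature cf join incompatible i "
        "on (i.feature = ? and i.other = cf.feature) "
        "or (i.other = ? and i.feature = cf.feature) "
        "where cf.id = ?", (feature, feature, cust_id)).fetchone()
    if clash is not None:
        return INCOMPATIBLE
    db.execute("insert into customer_feature values (?, ?)", (cust_id, feature))
    db.execute("insert into billing values (?, ?)", (cust_id, feature))
    return SUBSCRIBED
```

Note that three of the four outcomes depend on what is already stored in the tables, not only on the two user inputs.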
What are the Input/Output Spaces?
• Naïve approach:
– I = {customer-IDs} × {feature-names}
– O = {0,1,2,3}
• More suitable approach:
– I = {customer-IDs} × {feature-names} × {database-states}
– O = {0,1,2,3} × {database-states}
• Problem:
– must control and observe the DB state
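One way to make the DB state an explicit part of the input and output is a small harness that sets up a state, runs the application, and returns the state before and after. The sketch below uses Python with SQLite; snapshot, run_test, SETUP and the toy add_feature application are hypothetical names introduced only for this example.

```python
import sqlite3


def snapshot(db):
    """Observe the DB state as {table name: sorted list of rows}."""
    tables = [r[0] for r in db.execute(
        "select name from sqlite_master where type = 'table' order by name")]
    # key=repr lets rows containing NULL (None) be sorted without errors.
    return {t: sorted(db.execute(f"select * from {t}").fetchall(), key=repr)
            for t in tables}


def run_test(setup_sql, app, user_input):
    """Control the input DB state, run the app, observe the output DB state."""
    db = sqlite3.connect(":memory:")
    db.executescript(setup_sql)      # input DB state
    before = snapshot(db)
    output = app(db, *user_input)    # output returned to the user
    after = snapshot(db)             # output DB state
    db.close()
    return before, output, after


# Toy application under test (an assumption, not the talk's program).
def add_feature(db, cust_id, feature):
    known = db.execute(
        "select 1 from customer where id = ?", (cust_id,)).fetchone()
    if known is None:
        return 0
    db.execute("insert into customer_feature values (?, ?)", (cust_id, feature))
    return 1


SETUP = """
    create table customer(id text primary key);
    create table customer_feature(id text, feature text);
    insert into customer values ('011');
"""
```

A test case is then a triple (user input, input DB state, expected output plus expected DB state), for example run_test(SETUP, add_feature, ("011", "F1")).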
DB Application Testing Goal
• Select “interesting” DB states along with user inputs that exercise “interesting” behavior
• Cover wide variety of situations that could arise in practice
• Do so in a way that facilitates checking of output to user and resulting DB state
Situations to Explore
• Customer already subscribes to that feature
• Feature not available in customer’s area
• Feature available, but incompatible with other features customer already has
• Feature available and compatible with existing features
• Customer doesn’t yet subscribe to any features
• ...
May involve interplay between several tables
• Table 1: incompatible features
feature  incompatible_feature
F1       F2
...      ...
• Table 2: features available in various areas
feature  area
F1       11235
F2       11235
...      ...
• Table 3: customers and features
ID   area   F1   F2   ...  FN
011  11235  ...  ...
Will Live Data Suffice?
• May not reflect sufficiently wide variety of situations
• May be difficult to find the situations of interest
• May violate privacy or security constraints
Generating Synthetic Data
• DB state is a collection of relation states, each of which is a subset of the Cartesian product of some domains
• Generating domain elements and gluing them together isn’t enough, since constraints must be honored
• We attempt to generate interesting data that obey integrity constraints
• Use the schema and user-supplied info
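Architecture of tool set
[Figure: tool set architecture. Components: State Generator, Input Generator, State Checker, Output Checker. Labels in the diagram: suggestions from tester, DB schema, app source, app exec, user input, output, DB state, results.]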
DB state generator
• Inputs DB schema (in SQL)
• Parses schema to derive info about
– attributes
– tables
– constraints: uniqueness, not-NULL, referential integrity
• Inputs additional info from user
– suggested attribute values, divided into groups, similar to Category-Partition Testing [Ostrand-Balcer]
– additional annotations
Example Schema
create table s (sno char(5), sname char(20),
                status decimal(3), city char(15),
                primary key(sno));
create table p (pno char(6) primary key, pname char(20),
                color char(6), weight decimal(3), city char(15));
create table sp (sno char(5), pno char(6), qty decimal(5),
                 primary key(sno,pno),
                 foreign key(sno) references s,
                 foreign key(pno) references p);
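The kind of information the generator derives from this schema (attributes, primary keys, not-NULL flags, foreign keys) can be illustrated with a short Python sketch. It is not the tool's own parser: as a stand-in, it loads the schema into an in-memory SQLite database and reads SQLite's catalog through the table_info and foreign_key_list pragmas. The function name describe and the shape of the returned dictionary are assumptions.

```python
import sqlite3

SCHEMA = """
create table s (sno char(5), sname char(20), status decimal(3),
                city char(15), primary key(sno));
create table p (pno char(6) primary key, pname char(20), color char(6),
                weight decimal(3), city char(15));
create table sp (sno char(5), pno char(6), qty decimal(5),
                 primary key(sno,pno),
                 foreign key(sno) references s,
                 foreign key(pno) references p);
"""


def describe(schema_sql):
    """Return {table: {column: {type, primary, not_null, references}}}."""
    db = sqlite3.connect(":memory:")
    db.executescript(schema_sql)
    tables = [r[0] for r in db.execute(
        "select name from sqlite_master where type = 'table'")]
    info = {}
    for t in tables:
        # foreign_key_list rows: id, seq, table, from, to, ...
        fks = {row[3]: row[2]
               for row in db.execute(f"pragma foreign_key_list({t})")}
        cols = {}
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        for _cid, name, ctype, notnull, _dflt, pk in db.execute(
                f"pragma table_info({t})"):
            cols[name] = {
                "type": ctype,
                "primary": pk > 0,          # pk is the 1-based key position
                "not_null": bool(notnull),
                "references": fks.get(name),
            }
        info[t] = cols
    db.close()
    return info
```

For example, describe(SCHEMA)["sp"]["sno"] reports a primary-key column that references table s.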
Parse tree for: Create table s( sno char(5), primary key(sno) );
Stmt
  Create Stmt: Nodetag type = T_CreateStmt, relname = “s”
    Column Definition: Nodetag type = T_ColumnDef, colname = “sno”, type name = “bpchar”, Constraints = NIL
    Table Constraint: Nodetag type = T_Constraint, contype = CONSTR_PRIMARY
      keys: T_IDENT, name = “sno”
Parse tree for: Create table s( sno char(5) primary key );
Stmt
  Create Stmt: Nodetag type = T_CreateStmt, relname = “s”
    Column Definition: Nodetag type = T_ColumnDef, colname = “sno”, type name = “bpchar”
      Constraints: contype = CONSTR_PRIMARY
[Figure: internal data structure built from the parsed schema. globalTablePointer leads to a list of table records ending in Null: S | 4, P | 5 and SP | 3, each giving a table name and its number of attributes. Each table record has an indexed array of attribute records, drawn with every field set to F before it is filled in. Each attribute record has a cp pointer to its list of choices.]
Attribute records after parsing (pr = primary key, un = unique, nn = not null, ~ = flag not set):
Table S (4 attributes)
sno    | S  | char | pr  | un  | ~nn
sname  | S  | char | ~pr | ~un | ~nn
status | S  | dec  | ~pr | ~un | ~nn
city   | S  | char | ~pr | ~un | ~nn
Table P (5 attributes)
pno    | P  | char | pr  | un  | ~nn
pname  | P  | char | ~pr | ~un | ~nn
color  | P  | char | ~pr | ~un | ~nn
weight | P  | dec  | ~pr | ~un | ~nn
city   | P  | char | ~pr | ~un | ~nn
Table SP (3 attributes)
sno    | SP | char | pr  | un  | ~nn | foreign
pno    | SP | char | pr  | un  | ~nn | foreign
qty    | SP | dec  | ~pr | ~un | ~nn
Selecting Attribute Values
• Initial prototype queries tester for suggested values and guidance on how to use those values
• Values may be partitioned into data groups (choices)
• Tester may specify probabilities for data groups
Example input for one attribute:
--choice_name: low
10
20
30
----
--choice_name: medium
300
400
----
--choice_name: high
5000
6000
Each category (column) can have a list of choices pointed to by cp.
[Figure: cp → low (10, 20, 30) → medium (300, 400) → high (5000, 6000)]
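Reading such a list of data groups could look like the following Python sketch. The file layout is an assumption reconstructed from the slide: one "--choice_name:" line per group, optional "--choice_prob:" and "--null_prob:" annotation lines, one value per line, and a "----" line between groups. The function name parse_choices and the dictionary keys are hypothetical.

```python
EXAMPLE = """\
--choice_name: low
10
20
30
----
--choice_name: medium
300
400
----
--choice_name: high
5000
6000
"""


def parse_choices(text):
    """Parse one attribute's input into a list of data groups (choices)."""
    groups = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"-"}:
            continue  # blank line or a '----' separator between groups
        if line.startswith("--"):
            key, _, value = line[2:].partition(":")
            key, value = key.strip(), value.strip()
            if key == "choice_name":
                current = {"name": value, "prob": None,
                           "null_prob": None, "values": []}
                groups.append(current)
            elif key in ("choice_prob", "null_prob") and current is not None:
                # Annotations attach to the group being read.
                current["prob" if key == "choice_prob" else "null_prob"] = int(value)
        elif current is not None:
            current["values"].append(line)  # a suggested attribute value
    return groups
```

Calling parse_choices(EXAMPLE) yields three groups named low, medium and high, each with its own list of values, which mirrors the linked list of choices that cp points to.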
DB table generation
• Tester specifies table sizes
• Tool generates tuples for insertion
– select data group or NULL, guided by annotations
– select value from data group, obeying constraints
– keep track of values used
• Outputs sequence of SQL insert statements
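The generation steps above can be sketched in Python as follows. This is a simplified illustration, not the tool's algorithm: it handles only primary-key uniqueness and foreign keys, treats probabilities as percentages, retries on key collisions, and emits every value as a quoted literal. The names pick, generate_rows and to_insert, and the data layout, are assumptions.

```python
import random


def pick(groups, null_prob, rng):
    """Select NULL or a data group (weighted by 'prob'), then a value from it."""
    if rng.randrange(100) < null_prob:
        return None
    weights = [g.get("prob", 1) for g in groups]
    group = rng.choices(groups, weights=weights)[0]
    return rng.choice(group["values"])


def generate_rows(columns, key, size, rng, max_tries=1000):
    """columns: {name: (groups, null_prob)}; key: primary-key column names."""
    rows, used = [], set()          # 'used' keeps track of key values
    for _ in range(max_tries):
        if len(rows) == size:
            break
        row = {c: pick(g, n, rng) for c, (g, n) in columns.items()}
        k = tuple(row[c] for c in key)
        if None in k or k in used:
            continue                # would violate the primary key; retry
        used.add(k)
        rows.append(row)
    return rows


def to_insert(table, row):
    """Render one tuple as an SQL insert statement (quoted literals)."""
    def lit(v):
        return "NULL" if v is None else "'" + str(v).replace("'", "''") + "'"
    cols = ", ".join(row)
    vals = ", ".join(lit(v) for v in row.values())
    return f"insert into {table} ({cols}) values ({vals});"


rng = random.Random(0)
s_cols = {
    "sno": ([{"values": ["S1", "S2", "S3", "S4", "S5"]}], 0),
    "status": ([{"values": ["0", "1", "2", "3"]}], 50),
    "city": ([{"values": ["Brooklyn", "Middletown"], "prob": 90},
              {"values": ["London", "Bombay"], "prob": 10}], 0),
}
s_rows = generate_rows(s_cols, ("sno",), 4, rng)
# Referential integrity: sp.sno draws only from sno values generated for s.
sp_cols = {
    "sno": ([{"values": [r["sno"] for r in s_rows]}], 0),
    "pno": ([{"values": ["P1", "P2", "P3"]}], 0),
    "qty": ([{"values": ["10", "20", "30"]}, {"values": ["5000", "6000"]}], 0),
}
sp_rows = generate_rows(sp_cols, ("sno", "pno"), 10, rng)
statements = ([to_insert("s", r) for r in s_rows]
              + [to_insert("sp", r) for r in sp_rows])
```

Generating the referenced table first and feeding its key values into the referencing table is one simple way to honor foreign keys; a real generator would also need to handle unique and not-NULL columns outside the key.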
Input files for Parts-Supplier database
sno:
--choice_name: sno
S1
S2
S3
S4
S5
sname:
--choice_name: sname
Smith
Jones
Blake
Clark
Adams
pname:
--choice_name: interior
seats
airbags
dashboard
----
--choice_name: exterior
doors
wheels
bumper
city:
--choice_name: domestic
--choice_prob: 90
Brooklyn
Florham-Park
Middletown
----
--choice_name: foreign
--choice_prob: 10
London
Bombay
pno:
--choice_name: pno
P1
P2
P3
P4
P5
status:
--choice_name: status
--null_prob: 50
0
1
2
3
color:
--choice_name: color
blue
green
yellow
weight:
--choice_name: weight
100
300
500
A database state produced by the tool
Table s
sno  sname  status  city
S1   NULL   0       Brooklyn
S2   Smith  1       Florham-Park
S3   Jones  NULL    London
S4   Blake  NULL    Middletown
Table p
pno  pname    color   weight  city
P1   NULL     blue    100     Brooklyn
P2   Seats    green   300     Florham-Park
P3   airbags  yellow  500     Middletown
Table sp
sno  pno  qty
S1   P1   5000
S1   P2   300
S1   P3   10
S2   P1   6000
S2   P2   400
S2   P3   5000
S3   P1   20
S3   P2   300
S3   P3   30
S4   P1   6000
Related work
• Lyons-77, DB-Fill, TestBase
• Like our approach, rely on user to supply attribute values
• Do not handle integrity constraints as completely
• Require tester to describe tables in special-purpose language (rather than SQL)
Testing Techniques in DB literature
• Focus on DB system performance, rather than DB application correctness
• Benchmarks
• Performance of SQL processor
– Generation of large number of DML statements [Slutz]
• Generation of huge tables with given statistical properties [Gray et al.]
Summary
• Issues
• Framework
• Prototype
Future Work
• Refinement based on feedback from DB application developers / testers
• Other DB state generation heuristics
– boundary values
– “missing” constraints
– difficult SQL features
• Interplay between DB state and user inputs
• Checking DB state after test execution
• Checking application outputs