A Population Size Estimation Problem


Eliezer Kantorowitz

Software Engineering Department

Ort Braude College of Engineering

kantor@cs.technion.ac.il

11/7-2006 ©2006 Eliezer Kantorowitz

Table of Contents

• The problem

• Capture Recapture Estimators

• Estimating number of software defects

• Defect injection estimators

• Our experiments

• Our estimator

• Conclusions and Research plans

Estimating Population Size

• Two steps

1. Make an observation

2. Employ an estimator on the number of observed items

• Example: Industrial quality assurance

1. Count the number of defects in a sample

2. Estimate the defect population size from the defects counted in the sample

Partial Observation Methods

• A partial observation method is an observation method that does not produce a count of all the relevant items

• Example: Due to poor lighting, some of the defective items in the sample cannot be seen

This talk is about estimators applicable when using partial observation methods


Counting Wild Animals: Capture Recapture Estimators

• Example: Counting the gazelle population in upper Galilee

• Problem: We can only observe a part of the n members of the gazelle population

• Solution: We capture ntag gazelles in a trap. The gazelles are tagged and freed

• We assume that the freed gazelles are evenly mixed with the remaining n-ntag gazelles

Capture Recapture (CR) - 2

• We put a new trap and capture m gazelles of which mtag are recaptured gazelles

• The gazelle population size n may be estimated as

$$n \approx \frac{n_{tag}\, m}{m_{tag}}, \quad \text{assuming } m_{tag} > 0$$
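The two-round estimate can be sketched in a few lines of Python. This is a minimal illustration, assuming the simple ratio form n ≈ ntag·m/mtag (commonly known as the Lincoln-Petersen estimator). The function name `cr_estimate` and the gazelle counts are hypothetical, not data from the talk.

```python
def cr_estimate(n_tag: int, m: int, m_tag: int) -> float:
    """Estimate the population size n from one capture and one recapture round.

    n_tag: animals tagged and freed after the first trap
    m:     animals captured in the second trap
    m_tag: animals in the second trap that carry a tag
    """
    if m_tag <= 0:
        raise ValueError("estimate is undefined when no tagged animal is recaptured")
    if m_tag > min(n_tag, m):
        raise ValueError("m_tag cannot exceed n_tag or m")
    # Assumes tagged animals are evenly mixed: m_tag / m is close to n_tag / n.
    return n_tag * m / m_tag


# Hypothetical counts: 50 gazelles tagged, 40 caught later, 8 of them tagged.
print(cr_estimate(n_tag=50, m=40, m_tag=8))
```

The guard on `m_tag` mirrors the condition that at least one tagged animal must be recaptured.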

Capture Recapture (CR) - 3

• A number of different CR estimators corresponding to different sets of assumptions have been developed

• The essence of CR is that we enter (inject) a KNOWN number of tagged animals into the unknown number of animals. This known number can be employed in later statistical analysis


Problem discussed in the following:

Estimating the Number of Defects in Software Users Requirements Document (URD)

Users Requirements Document (URD)

• Prepared by software analysts and users

• Part of software ordering contract

• In one case 55% of all defects (“bugs”) were URD defects

• URD validation usually done by inspection

Example: a URD used in our experiments

PURPOSE

Manage a costume shop, which rents and sells costumes. Control the inventory and customer databases. Manage orders and invoices.

CUSTOMER DATABASE - SYSTEM ACTIVITIES

Enter new customers.

Automatic updates of the customer’s database.

List of customers active over the last three years.

List of customers ordered by the age of the children.

List of customers ordered by their purchase and rental transactions.

URD validation

• Usually done by inspection

Inspection Method (Fagan 1986)

• The inspected document is presented by its originator to a team of human inspectors

• Each inspector inspects the entire document and records the found defects

• Meeting of all inspectors, where defects found by different inspectors are checked and combined into one list

Inspection Problem

• Usually an inspector sees only a part of all defects

• Different inspectors usually see different sets of defects

• A team of j+1 inspectors usually detects more defects than a team of j inspectors

• Inspection costs are proportional to j

Fault Detection Ratio (FDR) as Function of Inspector Team Size

[Figure: FDR versus number of inspectors, for teams of two inspectors; series: Experiment 1 and Experiment 2]

FDR = (number of detected faults) / (total number of faults)
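The FDR definition can be illustrated with a short Python sketch. It assumes that a team detects a fault when at least one of its inspectors records it. The function name `team_fdr` and the detection sets are hypothetical, not results from the experiments.

```python
def team_fdr(detections: list[set[int]], total_faults: int) -> float:
    """FDR of a team: faults found by at least one inspector / all faults."""
    if total_faults <= 0:
        raise ValueError("total_faults must be positive")
    found = set().union(*detections)  # merge the individual fault lists
    return len(found) / total_faults


# Hypothetical document with 10 faults, numbered 0..9.
inspector_a = {0, 1, 2, 4}
inspector_b = {0, 2, 3, 7}
inspector_c = {1, 2, 5}

print(team_fdr([inspector_a], 10))
print(team_fdr([inspector_a, inspector_b], 10))
print(team_fdr([inspector_a, inspector_b, inspector_c], 10))
```

In this made-up data each added inspector raises the FDR, but by less than the inspector's individual count, because some faults are found more than once.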

Using Capture Recapture (CR)

• CR was adapted to the inspection problem

• Defects detected by more than one inspector play a similar role to that of recaptured gazelles

• Extensive experiments suggest that CR does not provide sufficiently accurate estimates


Defect Injection

• In CR methods we freed a KNOWN number of tagged animals

• In defect injection methods we enter (inject) a KNOWN number of defects into the document

Defect Injection Method

• ninjected – number of injected defects

• ninjected-detected – number of detected injected defects

• nreal – number of real defects (the unknown)

• ndetected-real – number of detected real defects

• Estimated number of real defects:

nreal = ndetected-real × (ninjected / ninjected-detected)
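The defect injection estimate is a single ratio, sketched below in Python. The function name `estimate_real_defects` and the counts in the example are hypothetical and chosen only to show the arithmetic.

```python
def estimate_real_defects(n_injected: int,
                          n_injected_detected: int,
                          n_detected_real: int) -> float:
    """Scale the detected real defects by the inverse detection rate
    observed on the injected defects."""
    if n_injected_detected <= 0:
        raise ValueError("no injected defect was detected; estimate is undefined")
    if n_injected_detected > n_injected:
        raise ValueError("cannot detect more injected defects than were injected")
    return n_detected_real * n_injected / n_injected_detected


# Hypothetical inspection: 20 defects injected, 15 of them found,
# and 30 real defects found alongside them.
print(estimate_real_defects(n_injected=20, n_injected_detected=15, n_detected_real=30))
```

The estimate is only as good as the assumption that injected defects are detected at the same rate as real ones.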

Problems of Defect Injection

• The injected defects must “represent” the real defects “correctly”

Defect Types Distribution

[Figure: distribution of fault types: Missing Functionality, Inconsistent Information, Missing Information]

Examples of Injected Defects

Inconsistent information:

Lists of customers entered by different techniques that contradict each other (lines 3 and 26).

Cancellation of an order that was reserved is illegal (lines 28 and 32).

The systems do not keep customer data for more than three years (lines 5 and 10).

There is not enough information about the customers in the system (lines 6 and 10).

An article that was reserved cannot be sold (lines 27 and 33).

Missing functionality:

…

Missing information:

Defect Injection Summary

• Common method for software documents

• Sufficiently accurate estimates

• Difficult to produce “representative” defects

• Laborious


The Experimenters

• Eliezer Kantorowitz

• Arie Guttman

• Lior Arzi

• Assaf Harel

Experiments - 1

• Computer Science students at Technion

– 250 freshmen

– 69 seniors

• Industry engineers

– 25 engineers

• Two experiments from literature

– 57 senior Computer Science students

• All together 401 persons involved

Experiments - 2

• Employed requirements documents

– Costume shop information system

– Missile launcher

– Railroad system (in experiments from literature)

• Data of good quality

– 401 persons

– Careful preparation

Typical Results

Y axis is the number of inspectors that detected the different defects. The two “easiest to detect” defects were detected by 6 inspectors each


The Model - 1

[Figure: Pi,1 (probability that fault i is detected by 1 inspector) plotted against i (fault number), with axis marks at 0, nmax and n-1; the linearity assumption]

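The Model - 2

The linearity assumption

$$P_{i,1} = P_{0,1} - \frac{P_{0,1}}{n_{max}}\, i, \quad 0 \le i < n_{max}$$

Pi,1 - probability that one inspector detects defect i.

nmax – defects 0 ≤ i < nmax can be detected

$$FDR_{max} = \frac{n_{max}}{n}$$

$$P_{i,1} = P_{0,1}\left(1 - \frac{1}{FDR_{max}}\,\frac{i}{n}\right)$$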

The Model - 3

• P0,1 - The probability that one inspector detects the “easiest to detect” defect

• P0,1 ∈ [0,1] - A measure of the ease of detection

• FDRmax – The inspectors are able to detect the proportion FDRmax of the n defects, i.e. FDRmax·n defects

• FDRmax ∈ [0,1] – a measure of the domain knowledge of the inspectors

The Model – 4

The probability that j inspectors will detect defect i may be estimated:

$$P_{i,j} = 1 - (1 - P_{i,1})^j$$

j inspectors are expected to detect FDR(j)n defects. For n → ∞, with x = i/n:

$$FDR(j) = \int_0^{FDR_{max}} P_{i,j}(x)\,dx$$
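Under the linearity assumption this integral can be evaluated in closed form. The sketch below writes x = i/n and uses P_1(x) for the single-inspector detection probability at relative position x. This shorthand and the substitution variable u are introduced here for illustration only.

$$
\begin{aligned}
P_1(x) &= P_{0,1}\left(1 - \frac{x}{FDR_{max}}\right), \qquad 0 \le x < FDR_{max} \\
FDR(j) &= \int_0^{FDR_{max}} \left[1 - \bigl(1 - P_1(x)\bigr)^j\right] dx \\
&= FDR_{max} - \frac{FDR_{max}}{P_{0,1}} \int_{1 - P_{0,1}}^{1} u^j \, du, \qquad u = 1 - P_1(x) \\
&= FDR_{max}\left(1 - \frac{1 - (1 - P_{0,1})^{j+1}}{P_{0,1}\,(j+1)}\right)
\end{aligned}
$$

As a sanity check, j = 1 gives FDR(1) = FDRmax·P0,1/2, the area under the linear detection probability, and j → ∞ gives FDR(j) → FDRmax.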

Kantorowitz Estimator

$$FDR(j) = FDR_{max}\left(1 - \frac{1 - (1 - P_{0,1})^{j+1}}{P_{0,1}\,(j+1)}\right)$$

Example of application: A quality assurance manager can employ this estimator to estimate the number of inspectors j required to detect the proportion FDR(j) of all faults. The coefficients FDRmax and P0,1 must somehow be estimated.

This estimator is implicitly the cost function required in Total Quality Management (TQM). The number of inspectors j represents the costs, while FDR(j) represents the quality.
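The manager's question can be sketched in Python. The closed form used here is the one obtained by integrating the linear model, FDR(j) = FDRmax·(1 − (1 − (1 − P0,1)^(j+1)) / (P0,1·(j+1))), and should be read as an assumption about the estimator's exact form. The function names and the coefficient values (P0,1 = 0.5, FDRmax = 0.9) are hypothetical.

```python
from typing import Optional


def fdr_closed_form(j: int, p01: float, fdr_max: float) -> float:
    """Expected fault detection ratio of j inspectors (limit n -> infinity)."""
    if not 0.0 < p01 <= 1.0:
        raise ValueError("p01 must lie in (0, 1]")
    return fdr_max * (1.0 - (1.0 - (1.0 - p01) ** (j + 1)) / (p01 * (j + 1)))


def fdr_finite_sum(j: int, p01: float, fdr_max: float, n: int) -> float:
    """Same quantity for a finite number n of defects, summed defect by defect."""
    n_max = round(fdr_max * n)               # defects that can be detected at all
    total = 0.0
    for i in range(n_max):
        p_i1 = p01 * (1.0 - i / n_max)       # linearity assumption
        total += 1.0 - (1.0 - p_i1) ** j     # detected by at least one of j inspectors
    return total / n


def min_inspectors(target: float, p01: float, fdr_max: float,
                   j_limit: int = 1000) -> Optional[int]:
    """Smallest j with FDR(j) >= target, or None if not reached within j_limit."""
    for j in range(1, j_limit + 1):
        if fdr_closed_form(j, p01, fdr_max) >= target:
            return j
    return None


# Hypothetical coefficients, for illustration only.
print(fdr_closed_form(2, p01=0.5, fdr_max=0.9))
print(fdr_finite_sum(2, p01=0.5, fdr_max=0.9, n=100_000))
print(min_inspectors(0.6, p01=0.5, fdr_max=0.9))
```

A target at or above FDRmax can never be reached under this model, which is why `min_inspectors` returns `None` in that case instead of looping forever.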

Application example: What is the optimal inspector team size?

[Figure: FDR vs. number of inspectors, for teams of 1, teams of 2 and teams of 3]

Teams of 2 detect the largest number of defects per inspector

Application example: Comparing Engineers with Students

Experiment with Missile launcher user requirements document

                      P0,1    FDRmax
1st year Students     0.40    0.99
Industry Engineers    0.74    0.99

Example: 4 student teams achieve FDR=0.53 while only two engineer teams achieve FDR=0.54, i.e. an engineer detected about twice as many defects as a student

Summary of my estimator

• Based on a property of the data observed in a large number of experiments

• The estimator was derived by modeling the observed property of the data

• Sufficiently accurate

• Measuring the two coefficients of the model, P0,1 and FDRmax, is laborious; however, their numerical values may be estimated from similar cases


Surveyed Estimators for the Incomplete Counting Problem

• Capture Recapture

• Defect Injection

• My estimator

My Estimator vs. Capture Recapture Estimators

• Our estimator was sufficiently accurate for estimating the number of defects in user requirements documents, while the CR estimators were not sufficiently accurate

• Were the data in the extensive CR experiments of sufficiently good quality?

Why Did My Estimator Work?

• My estimator exploited a property of the data (the linearity assumption)

• The property was detected through careful extensive experimentation

Looking for Similar Applications

• Can the approach of this research be useful in other areas where the employed observation method counts only part of the relevant items?
