
Statistical software engineering

title: Statistical Software Engineering
author:
publisher: National Academies Press
isbn10 | asin: 0309053447
print isbn13: 9780309053440
ebook isbn13: 9780585002101
language: English
subject: Software engineering--Statistical methods.
publication date: 1996
lcc: QA76.758.N38 1996eb
ddc: 005.1

The National Research Council established the Board on Mathematical Sciences in 1984. The objectives of the Board are to maintain awareness and active concern for the health of the mathematical sciences and to serve as the focal point in the National Research Council for issues connected with the mathematical sciences. The Board holds symposia and workshops and prepares reports on emerging issues and areas of research and education, conducts studies for federal agencies, and maintains liaison with the mathematical sciences communities, academia, professional societies, and industry.

The Board gratefully acknowledges ongoing core support from the Air Force Office of Scientific Research, Army Research Office, Department of Energy, National Science Foundation, National Security Agency, and Office of Naval Research.

Statistical Software Engineering

Panel on Statistical Methods in Software Engineering
Committee on Applied and Theoretical Statistics
Board on Mathematical Sciences
Commission on Physical Sciences, Mathematics, and Applications
National Research Council

National Academy Press
Washington, D.C. 1996

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine.

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce Alberts is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievement of engineers. Dr. Harold Liebowitz is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce Alberts and Dr. Harold Liebowitz are chairman and vice-chairman, respectively, of the National Research Council.

This project was supported by the Advanced Research Projects Agency, Army Research Office, National Science Foundation, and Department of the Navy's Office of the Chief of Naval Research. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsors. Furthermore, the content of the report does not necessarily reflect the position or the policy of the U.S. government, and no official endorsement should be inferred.

Copyright 1996 by the National Academy of Sciences. All rights reserved.

Library of Congress Catalog Card Number 95-71101
International Standard Book Number 0-309-05344-7

Additional copies of this report are available from:
National Academy Press, Box 285
2101 Constitution Avenue, N.W.
Washington, D.C. 20055
800-624-6242
202-334-3313 (in the Washington metropolitan area)

B-676

Printed in the United States of America

PANEL ON STATISTICAL METHODS IN SOFTWARE ENGINEERING

DARYL PREGIBON, AT&T Bell Laboratories, Chair
HERMAN CHERNOFF, Harvard University
BILL CURTIS, Carnegie Mellon University
SIDDHARTHA R. DALAL, Bellcore
GLORIA J. DAVIS, NASA-Ames Research Center
RICHARD A. DEMILLO, Bellcore
STEPHEN G. EICK, AT&T Bell Laboratories
BEV LITTLEWOOD, City University, London, England
CHITOOR V. RAMAMOORTHY, University of California, Berkeley

Staff

JOHN R. TUCKER, Director

COMMITTEE ON APPLIED AND THEORETICAL STATISTICS

JON R. KETTENRING, Bellcore, Chair
RICHARD A. BERK, University of California, Los Angeles
LAWRENCE D. BROWN, University of Pennsylvania
NICHOLAS P. JEWELL, University of California, Berkeley
JAMES D. KUELBS, University of Wisconsin
JOHN LEHOCZKY, Carnegie Mellon University
DARYL PREGIBON, AT&T Bell Laboratories
FRITZ SCHEUREN, George Washington University
J. LAURIE SNELL, Dartmouth College
ELIZABETH THOMPSON, University of Washington

Staff

JACK ALEXANDER, Program Officer

BOARD ON MATHEMATICAL SCIENCES

AVNER FRIEDMAN, University of Minnesota, Chair
LOUIS AUSLANDER, City University of New York
HYMAN BASS, Columbia University
MARY ELLEN BOCK, Purdue University
PETER E. CASTRO, Eastman Kodak Company
FAN R.K. CHUNG, University of Pennsylvania
R. DUNCAN LUCE, University of California, Irvine
SUSAN MONTGOMERY, University of Southern California
GEORGE NEMHAUSER, Georgia Institute of Technology
ANIL NERODE, Cornell University
INGRAM OLKIN, Stanford University
RONALD F. PEIERLS, Brookhaven National Laboratory
DONALD ST. P. RICHARDS, University of Virginia
MARY F. WHEELER, Rice University
WILLIAM P. ZIEMER, Indiana University

Ex Officio Member

JON R. KETTENRING, Bellcore; Chair, Committee on Applied and Theoretical Statistics

Staff

JOHN R. TUCKER, Director
JACK ALEXANDER, Program Officer
RUTH E. O'BRIEN, Staff Associate
BARBARA W. WRIGHT, Administrative Assistant

COMMISSION ON PHYSICAL SCIENCES, MATHEMATICS, AND APPLICATIONS

ROBERT J. HERMANN, United Technologies Corporation, Chair
STEPHEN L. ADLER, Institute for Advanced Study
PETER M. BANKS, Environmental Research Institute of Michigan
SYLVIA T. CEYER, Massachusetts Institute of Technology
L. LOUIS HEGEDUS, W.R. Grace and Company
JOHN E. HOPCROFT, Cornell University
RHONDA J. HUGHES, Bryn Mawr College
SHIRLEY A. JACKSON, U.S. Nuclear Regulatory Commission
KENNETH I. KELLERMANN, National Radio Astronomy Observatory
KEN KENNEDY, Rice University
THOMAS A. PRINCE, California Institute of Technology
JEROME SACKS, National Institute of Statistical Sciences
L.E. SCRIVEN, University of Minnesota
LEON T. SILVER, California Institute of Technology
CHARLES P. SLICHTER, University of Illinois at Urbana-Champaign
ALVIN W. TRIVELPIECE, Oak Ridge National Laboratory
SHMUEL WINOGRAD, IBM T.J. Watson Research Center
CHARLES A. ZRAKET, Mitre Corporation (retired)

NORMAN METZGER, Executive Director

Preface

The development and the production of high-quality, reliable, complex computer software have become critical issues in the enormous worldwide computer technology market. The capability to efficiently engineer computer software development and production processes is central to the future economic strength, competitiveness, and national security of the United States. However, problems related to software quality, reliability, and safety persist, a prominent example being the failure on several occasions of major local and national telecommunications networks. It is now acknowledged that the costs of producing and maintaining software greatly exceed the costs of developing, producing, and maintaining hardware. Thus the development and application of cost-saving tools, along with techniques for ensuring quality and reliability in software engineering, are primary goals in today's software industry. The enormity of this software production and maintenance activity is such that any tools contributing to serious cost savings will yield a tremendous payoff in absolute terms.

At a meeting of the Committee on Applied and Theoretical Statistics (CATS) of the National Research Council (NRC), participants identified software engineering as an area presenting numerous opportunities for fruitful contributions from statistics and offering excellent potential for beneficial interactions between statisticians and software engineers that might promote improved software engineering practice and cost savings. To delineate these opportunities and focus attention on contexts promising useful interactions, CATS convened a study panel to gather information and produce a report that would (1) exhibit improved methods for assessing software productivity, quality, reliability, associated risk, and safety and for managing software development processes, (2) outline a program of research in the statistical sciences and their applications to software engineering with the aim of motivating and attracting new researchers from the mathematical sciences, statistics, and software engineering fields to tackle these important and pressing problem areas, and (3) emphasize the relevance of using rigorous statistical and probabilistic techniques in software engineering contexts and suggest opportunities for further research in this direction.

To help identify important issues and obtain a broad range of perspectives on them, the panel organized an information-gathering forum on October 11-12, 1993, at which 12 invited speakers addressed how statistical methods impinge on the software development process, software metrics, software dependability and testing, and software visualization. The forum also included consideration of nonstandard methods and select case studies (see the forum program in the appendix). The panel hopes that its report, which is based on the panel's expertise as well as information presented at the forum, will contribute to positive advances in software engineering and, as a subsidiary benefit, be a stimulus for other closely related disciplines, e.g., applied mathematics, operations research, computer science, and systems and industrial engineering. The panel is, in fact, very enthusiastic about the opportunities facing the statistical community and hopes to convey this enthusiasm in this report.

The panel gratefully acknowledges the assistance and information provided by a number of individuals, including the 12 forum speakers (T.W. Keller, D. Card, V.R. Basili, J.C. Munson, J.C. Knight, R. Lipton, T. Yamaura, S. Zweben, M.S. Phadke, E.E. Sumner, Jr., W. Hill, and J. Stasko), four anonymous reviewers, the NRC staff of the Board on Mathematical Sciences who supported the various facets of this project, and Susan Maurizi for her work in editing the manuscript.

Contents

EXECUTIVE SUMMARY

1 INTRODUCTION

2 CASE STUDY: NASA SPACE SHUTTLE FLIGHT CONTROL SOFTWARE
    Overview of Requirements
    The Operational Life Cycle
    A Statistical Approach to Managing the Software Production Process
    Fault Detection
    Safety Certification

3 A SOFTWARE PRODUCTION MODEL
    Problem Formulation and Specification of Requirements
    Design
    Implementation
    Testing

4 CRITIQUE OF SOME CURRENT APPLICATIONS OF STATISTICS IN SOFTWARE ENGINEERING
    Cost Estimation
    Statistical Inadequacies in Estimating
    Process Volatility
    Maturity and Data Granularity
    Reliability of Model Inputs
    Managing to Estimates
    Assessment and Reliability
    Reliability Growth Modeling
    Influence of the Development Process on Software Dependability
    Influence of the Operational Environment on Software Dependability
    Safety-Critical Software and the Problem of Assuring Ultrahigh Dependability
    Design Diversity, Fault Tolerance, and General Issues of Dependence
    Judgment and Decision-making Framework
    Structural Modeling Issues
    Experimentation, Data Collection, and General Statistical Techniques
    Software Measurement and Metrics

5 STATISTICAL CHALLENGES
    Software Engineering Experimental Issues
    Combining Information
    Visualization in Software Engineering
    Configuration Management Data
    Function Call Graphs
    Test Code Coverage
    Code Metrics
    Challenges for Visualization
    Opportunities for Visualization
    Orthogonal Defect Classification

6 SUMMARY AND CONCLUSIONS
    Institutional Model for Research
    Model for Data Collection and Analysis
    Issues in Education

REFERENCES

APPENDIX: FORUM PROGRAM

Executive Summary

Software, a critical core industry that is essential to U.S. interests in science, technology, and defense, is ubiquitous in today's society. Software coexists with hardware in our transportation, communication, financial, and medical systems. As these systems grow in size and complexity and our dependence on them increases, the need to ensure software reliability and safety, fault tolerance, and dependability becomes paramount. Building software is now viewed as an engineering discipline, software engineering, which aims to develop methodologies and procedures to control the whole software development process. Besides the issue of controlling and improving software quality, the issue of improving the productivity of the software development process is also becoming important from the industrial perspective.

PURPOSE AND SCOPE OF THIS STUDY

Although statistical methods have a long history of contributing to improved practices in manufacturing and in traditional areas of science, technology, and medicine, they have up to now had little impact on software development processes. This report attempts to bridge the islands of knowledge and experience between statistics and software engineering by enunciating a new interdisciplinary field: statistical software engineering. It is hoped that the report will help seed the field of statistical software engineering by indicating opportunities for statistical thinking to contribute to increased understanding of software and software production, and thereby enhance the quality and productivity of both.

This report is the result of a study by a panel convened by the Committee on Applied and Theoretical Statistics (CATS), a standing committee of the Board on Mathematical Sciences of the National Research Council, to identify challenges and opportunities in the development and implementation of software involving significant statistical content. In addition to pointing out the relevance of rigorous statistical and probabilistic techniques to pressing software engineering concerns, the panel outlines opportunities for further research in the statistical sciences and their applications to software engineering. The aim is to motivate new researchers from statistics and the mathematical sciences to tackle problems with relevance for software development, as well as to suggest a statistical approach to software engineering concerns that the panel hopes software engineers will find refreshing and stimulating. This report also touches on important issues in training and education for software engineers in the statistical sciences and for statisticians with an interest in software engineering.

Central to this report's theme, and essential to statistical software engineering, is the role of data: wherever data are used or can be generated in the software life cycle, statistical methods can be brought to bear for description, estimation, and prediction. Nevertheless, the major obstacle to applying statistical methods to software engineering is the lack of consistent, high-quality data in the resource-allocation, design, review, implementation, and test stages of software development. Statisticians interested in conducting research in software engineering must play a leadership role in justifying the resources needed to acquire and maintain high-quality, relevant data.

The panel conjectures that the use of adequate metrics and data of good quality is the primary differentiator between successful, productive software development organizations and those that are struggling. Although the single largest area of overlap between statistics and software engineering currently concerns software development and production, it is the panel's view that the largest contributions of statistics to software engineering will be those affecting the quality and productivity of front-end processes, that is, processes that precede code generation. One of the biggest impacts that the statistical community can make in software engineering is to combine information across software engineering projects as a means of evaluating effects of technology, language, organization, and process.

CONTENTS OF THIS REPORT

Following an introductory chapter intended to familiarize readers with basic statistical software engineering concepts and concerns, a case study of the National Aeronautics and Space Administration (NASA) space shuttle flight control software is presented in Chapter 2 to illustrate some of the statistical issues in software engineering. Chapter 3 describes a well-known general software production model and associated statistical issues and approaches. A critique of some current applications of statistics in software engineering is presented in Chapter 4. Chapter 5 discusses a number of statistical challenges arising in software engineering, and the panel's closing summary and conclusions appear in Chapter 6.

STATISTICAL CHALLENGES

In comparison with other engineering disciplines, software engineering is still in the definition stage. Characteristics of established disciplines include having defined, tested, credible methodologies for practice, assessment, and predictability. Software engineering combines application domain knowledge, computer science, statistics, behavioral science, and human factors issues. Statistical challenges in software engineering discussed in this report include the following:

Generalizing particular statistical software engineering experimental results to other settings and projects,

Scaling up results obtained in academic studies to industrial settings,

Combining information across software engineering projects and studies,

Adopting exploratory data analysis and visualization techniques,

Educating the software engineering community regarding statistical approaches and data issues,

Developing methods of analysis to cope with qualitative variables,

Providing models with the appropriate error distributions for software engineering applications, and

Enhancing accelerated life testing.

SUMMARY AND CONCLUSIONS

In the 1990s, complex hardware-based functionality is being replaced by more flexible, software-based functionality, and massive software systems containing millions of lines of code are being created by many programmers with different backgrounds, training, and skills. The challenge is to build huge, high-quality systems in a cost-effective manner. The panel expects this challenge to preoccupy the field of software engineering for the rest of the decade. Any set of methodologies that can help in this task will be invaluable. More importantly, the use of such methodologies will likely determine the competitive positions of organizations and nations involved in software production. What is needed is a detailed understanding by statisticians of the software engineering process, as well as an appreciation by software engineers of what statisticians can and cannot do.

Catalysts essential for this productive interaction between statisticians and software engineers, and some of the interdisciplinary research opportunities for software engineers and statisticians, include the following:

A model for statistical research in software engineering that is collaborative in nature. The ideal collaboration partners statisticians, software engineers, and a real software process or product. Barriers to academic reward and recognition, as well as obstacles to the funding of cross-disciplinary research, can be expected to decrease over time; in the interim, industry can play a leadership role in nurturing collaborations between software engineers and statisticians and can reduce its own set of barriers (for instance, those related to proprietary and intellectual property interests).

A model for data collection and analysis that ensures the availability of high-quality data for statistical approaches to issues in software engineering. Careful attention to data issues ranging from definition of metrics to feedback and feed-forward loops, including exploratory data analysis, statistical modeling, defect analysis, and so on, is essential if statistical methods are to have any appreciable impact on a given software project under study. For this reason it is crucial that the software industry take a lead position in research on statistical software engineering.

Attention to relevant issues in education. Enormous opportunities and many potential benefits are possible if the software engineering community learns about relevant statistical methods and if statisticians contribute to and cooperate in the education of future software engineers. Some relevant areas include:

Designed experiments. Software engineering is inherently experimental, yet relatively few designed experiments have been conducted. Software engineering education programs must stress the desirability, where feasible, of validating new techniques using statistically valid designed experiments.

Exploratory data analysis. Exploratory data analysis methods are essentially "model free," whereby the investigator hopes to be surprised by unexpected behavior rather than having thinking constrained to what is expected.

Modeling. Advances in the statistical community in the past decade have effectively relaxed the linearity assumptions of nearly all classical techniques. There should be an emphasis on educational information exchange leading to more and wider use of these recently developed techniques.

Risk analysis. A paradigm for managing risk for the space shuttle program, discussed in Chapter 2 of this report, and the corresponding statistical methods can play a crucial role in identifying risk-prone parts of software systems and of combined hardware and software systems.

Attitude toward assumptions. Software engineers should be aware that violating assumptions is not as important as thoroughly understanding the violation's effects on conclusions. Statistics textbooks, courses, and consulting activities should convey the statistician's level of understanding about and perspective on the importance and implications of assumptions for statistical inference methods.

Visualization. Graphics is important in exploratory stages in helping to ascertain how complex a model the data ought to support; in the analysis stage, in which residuals are displayed to examine what the currently entertained model has failed to account for; and in the presentation stage, in which graphics can provide succinct and convincing summaries of the statistical analysis and the associated uncertainty. Visualization can help software engineers cope with, and understand, the huge quantities of data collected as part of the software development process.

Tools. It is important to identify good statistical computing tools for software engineers. An overview of statistical computing, languages, systems, and packages should be prepared specifically for the benefit of software engineers.

1 Introduction

statistics. The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.¹

software engineering. (1) The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software. (2) The study of approaches as in (1).²

statistical software engineering. The interdisciplinary field of statistics and software engineering specializing in the use of statistical methods for controlling and improving the quality and productivity of the practices used in creating software.

The above definitions describe the islands of knowledge and experience that this report attempts to bridge. Software is a critical core industry that is essential to U.S. national interests in science, technology, and defense. It is ubiquitous in today's society, coexisting with hardware (micro-electronic circuitry) in our transportation, communication, financial, and medical systems. The software in a modern cardiac pacemaker, for example, consists of approximately one-half megabyte of code that helps control the pulse rate of patients with heart disorders. In this and other applications, issues such as reliability and safety, fault tolerance, and dependability are obviously important. From the industrial perspective, so also are issues concerned with improving the quality and productivity of the software development process. Yet statistical methods, despite the long history of their impact in manufacturing as well as in traditional areas of science, technology, and medicine, have so far had little impact on either hardware or software development.

This report is the product of a panel convened by the Board on Mathematical Sciences' Committee on Applied and Theoretical Statistics (CATS) to identify challenges and opportunities in software development and implementation that have a significant statistical component. In attempting to identify interrelated aspects of statistics and software engineering, it enunciates a new interdisciplinary field: statistical software engineering. While emphasizing the relevance of applying rigorous statistical and probabilistic techniques to problems in software engineering, the panel also points out opportunities for further research in the statistical sciences and their applications to software engineering. Its hope is that new researchers from statistics and the mathematical sciences will thus be motivated to address relevant and pressing problems of software development and also that software engineers will find the statistical emphasis refreshing and stimulating. This report also addresses the important issues of training and education of software engineers in the statistical sciences and of statisticians with an interest in software engineering.

¹ See The American Heritage Dictionary of the English Language (1981).
² See Institute of Electrical and Electronics Engineers (1990).

At the panel's information-gathering forum in October 1993, 12 invited speakers described their views on topics that are considered in detail in Chapters 2 through 6 of this report. One of the speakers, John Knight, pointed out that the date of the forum coincided nearly to the day with the 25th anniversary of the Garmisch Conference (Randell and Naur, 1968), a NATO-sponsored workshop at which the term "software engineering" is generally accepted to have originated. The particular irony of this coincidence is that it is also generally accepted that although much more ambitious software systems are now being built, little has changed in the relative ability to produce software with predictable quality, costs, and dependability. One of the original Garmisch participants, A.G. Fraser, now associate vice president in the Information Sciences Research Division at AT&T Bell Laboratories, defends the apparent lack of progress by the reminder that prior to Garmisch, there was no "collective realization" that the problems individual organizations were facing were shared across the industry; thus Garmisch was a critical first step toward addressing issues in software production. It is hoped that this report will play a similar role in seeding the field of statistical software engineering by indicating opportunities for statistical thinking to help increase understanding, as well as the productivity and quality, of software and software production.

In preparing this report, the panel struggled with the problem of providing the "big picture" of the software production process, while simultaneously attempting to highlight opportunities for related research on statistical methods. The problems facing the software engineering field are indeed broad, and nonstatistical approaches (e.g., formal methods for verifying program specifications) are at least as relevant as statistical ones. Thus this report tends to emphasize the larger context in which statistical methods must be developed, based on the understanding that recognition of the scope and the boundaries of problems is essential to characterizing the problems and contributing to their solution. It must be noted at the outset, for example, that software engineering is concerned with more than the end product, namely, code. The production process that results in code is a central concern and thus is described in detail in the report. To a large extent, the presentation of material mirrors the steps in the software development process. Although currently the single largest area of overlap between statistics and software engineering concerns software testing (which implies that the code exists), it is the panel's view that the largest contributions to the software engineering field will be those affecting the quality and productivity of the processes that precede code generation.

The panel also emphasizes that the process and methods described in this report pertain to the case of new software projects, as well as to the more ordinary circumstance of evolving software projects or "legacy systems." For instance, the software that controls the space shuttle flight systems or that runs modern telecommunication networks has been evolving for several decades. These two cases are referred to frequently to illustrate software development concepts and current practice, and although the software systems may be uncharacteristically large, they are arguably forerunners of what lies ahead in many applications. For example, laser printer software is witnessing an order-of-magnitude (base-10) increase in size with each new release. Similar increases in size and complexity are expected in all consumer electronic products as increased functionality is introduced.

Central to this report's theme, and essential to statistical software engineering, is the role of data, the realm where opportunities lie and difficulties begin. The opportunities are clear: whenever data are used or can be generated in the software life cycle, statistical methods can be brought to bear for description, estimation, and prediction. This report highlights such areas and gives examples of how statistical methods have been and can be used.

Nevertheless, the major obstacle to applying statistical methods to software engineering is the lack of consistent, high-quality data in the resource-allocation, design, review, implementation, and test stages of software development. Statisticians interested in conducting research in software engineering must acknowledge this fact and play a leadership role in providing adequate grounds for the resources needed to acquire and maintain high-quality, relevant data. A statement by one of the forum participants, David Card, captures the serious problem that statisticians face in demonstrating the value of good data and good data analysis: "It may not be that effective to be able to rigorously demonstrate a 10% or 15% or 20% improvement (in quality or productivity) when with no data and no analysis, you can claim 50% or even 100%."

The cost of collecting and maintaining high-quality information to support software development is unfortunately high, but arguably essential, as the NASA case study presented in Chapter 2 makes clear. The panel conjectures that use of adequate metrics and data of good quality is, in general, the primary differentiator between successful, productive software development organizations and those that are struggling. Traditional manufacturers have learned the value of investing in an information system to support product development; software development organizations must take heed. All too often, as a release date approaches, all available resources are dedicated to moving a software product out the door, with the result that few or no resources are expended on collecting data during these crucial periods. Subsequent attempts at retrospective analysis to help forecast costs for a new product or identify root causes of faults found during product testing are inconclusive when speculation rather than hard data is all that is available to work with. But even software development organizations that realize the importance of historical data can get caught in a downward spiral: effort is expended on collection of data that initially are insufficient to support inferences. When data are not being used, efforts to maintain their quality decrease. But then when the data are needed, their quality is insufficient to allow drawing conclusions. The spiral has begun.

As one means of capturing valuable historical data, efforts are under way to create repositories of data on software development experiments and projects. There is much apprehension in the software engineering community that such data will not be helpful because the relevant metadata (data about the data) are not likely to be included. The panel shares this concern because the exclusion of metadata not only encourages sometimes thoughtless analyses, but also makes it too easy for statisticians to conduct isolated research in software engineering. The panel believes that truly collaborative research must be undertaken and that it must be done with a keen eye to solving the particular problems faced by the software industry. Nevertheless, the panel recognizes benefits to collecting data or experimentation in software development. As is pointed out in more detail in Chapter 5, one of the largest impacts the statistical community can have in software engineering concerns efforts to combine information (NRC, 1992) across software engineering projects as a means of evaluating the effects of technology, language, organization, and the development process itself. Although difficult issues are posed by the need to adjust appropriately for differences in projects, the inconsistency of metrics, and varying degrees of data quality, the availability of a data repository at least allows for such research to begin.

Although this report serves as a review of the software production process and related research to date, it is necessarily incomplete. Limitations on the scope of the panel's efforts precluded a fuller treatment of some material and topics as well as inclusion of case studies from a wider variety of business and commercial sectors. The panel resisted the temptation to draw on analogies between software development and the converging area of computer hardware development (which for the most part is initially represented in software). The one approach it is confident of not reflecting is over-simplification of the problem domain itself.

2 Case Study: NASA Space Shuttle Flight Control Software

The National Aeronautics and Space Administration leads the world in research in aeronautics and space-related activities. The space shuttle program, begun in the late 1970s, was designed to support exploration of Earth's atmosphere and to lead the nation back into human exploration of space.

IBM's Federal Systems Division (now Loral), which was contracted to support NASA's shuttle program by developing and maintaining the safety-critical software that controls flight activities, has gained much experience and insight in the development and safe operation of critical software. Throughout the program, the prevailing management philosophy has been that quality must be built into software by using software reliability engineering methodologies. These methodologies are necessarily dependent on the ability to manage, control, measure, and analyze the software using descriptive data collected specifically for tracking and statistical analysis. Based on a presentation by Keller (1993) at the panel's information-gathering forum, the following case study describes space shuttle flight software functionality as well as the software development process that has evolved for the space shuttle program over the past 15 years.

OVERVIEW OF REQUIREMENTS

The primary avionics software system (PASS) is the mission-critical on-board data processing system for NASA's space shuttle fleet. In flight, all shuttle control activities including main engine throttling, directing control jets to turn the vehicle in a different orientation, firing the engines, or providing guidance commands for landing are performed manually or automatically with this software. In the event of a PASS failure, there is a backup system. As indicated in the space shuttle flight log history, the backup system has never been invoked.

To ensure high reliability and safety, IBM has designed the space shuttle computer system to have four redundant, synchronized computers, each of which is loaded with an identical version of the PASS. Every 3 to 4 milliseconds, the four computers check with one another to assure that they are in lock step and are doing the same thing, seeing the same input, sending the same output, and so forth. The operating system is designed to instantaneously deselect a failed computer.

The PASS is safety-critical software that must be designed for quality and safety at the outset. It consists of approximately 420,000 lines of source code developed in HAL, an engineering language for real-time systems, and is hosted on flight computers with very limited memory. Software is integrated within the flight control system in the form of overlays: only the small amount of code necessary for a particular phase of the flight (e.g., ascent, on-orbit, or entry activities) is loaded in computer memory at any one time. At quiescent points in the mission, the memory contents are "swapped out" for program applications that are needed for the next phase of the mission.

In support of the development of this safety-critical flight code, there are another 1.4 million lines of code. This additional software is used to build, develop, and test the system as well as to provide simulation capability and perform configuration control. This support software must have the same high quality as the on-board software, given that flawed ground software can mask errors, introduce errors into the flight software, or provide an incorrect configuration of software to be loaded aboard the shuttle.

In short, IBM/Loral maintains approximately 2 million lines of code for NASA's space shuttle flight control system. The continually evolving requirements of NASA's space flight program result in an evolving software system: the software for each shuttle mission flown is a composite of code that has been implemented incrementally over 15 years. At any given time, there is a subset of the original code that has never been changed, code that was sequentially added in each update, and new code pertaining to the current release. Approximately 275 people support the space shuttle software development effort.

THE OPERATIONAL LIFE CYCLE

Originally the PASS was developed to provide a basic flight capability of the space shuttle. The first flown version was developed and supported for flights in 1981 through 1982. However, the requirements of the flight missions evolved to include increased operational capability and maintenance flexibility. Among the shuttle program enhancements that changed the flight control system requirements were changes in payload manifest capabilities and main engine control design, crew enhancements, addition of an experimental autopilot for orbiting, system improvements, abort enhancements, provisions for extended landing sites, and hardware platform changes. Following the Challenger accident, which was not related to software, many new safety features were added and the software was changed accordingly.

For each release of flight software (called an operational increment), a nominal 6- to 9-month period elapses between delivery to NASA and actual flight. During this time, NASA performs system verification (to assure that the delivered system correctly performs as required) and validation (to assure that the operation is correct for the intended domain). This phase of the software life cycle is critical to assuring safety before a safety-critical operation occurs. It is a time for a complete integrated system test (flight software with flight hardware in operational domain scenarios). Crew training for mission practices is also performed at this time.

A STATISTICAL APPROACH TO MANAGING THE SOFTWARE PRODUCTION PROCESS

To manage the software production process for space shuttle flight control, descriptive data are systematically collected, maintained, and analyzed. At the beginning of the space shuttle program, global measurements were taken to track schedules and costs. But as software development commenced, it became necessary to retain much more product-specific information, owing to the critical nature of space shuttle flight as well as the need for complete accountability for the shuttle's operation. The detail and granularity of data dictate not only the type but also the level of analysis that can be done. Data related to failures have been specifically accumulated in a database along with all the other corollary information available, and a procedure has been established for reliability modeling, statistical analysis, and process improvement based on this information.

A composite description of all space shuttle software of various ages is maintained through a configuration management (CM) system. The CM data include not only a change itself, but also the lines of code affected, reasons for the change, and the date and time of change. In addition, the CM system includes data detailing scenarios for possible failures and the probability of their occurrence, user response procedures, the severity of the failures, the explicit software version and specific lines of code involved, the reasons for no previous detection, how long the fault had existed, and the repair or resolution. Although these data seem abundant, it is important to acknowledge their time dependence, because the software system they describe is subject to constant "churn."
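To make the kind of record described above concrete, the following is a minimal sketch of how one fault entry might be represented. The class name, field names, and types are hypothetical and chosen only for illustration; they are not taken from the shuttle CM system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class FaultRecord:
    """One fault entry in a CM-style database (all field names are hypothetical)."""
    software_version: str                    # explicit version in which the fault was found
    lines_involved: List[int]                # specific lines of code involved
    severity: int                            # severity class of the potential failure
    failure_scenario: str                    # how the fault could show up as a failure
    occurrence_probability: Optional[float]  # estimated probability of the scenario, if known
    user_response_procedure: str             # what the user should do if it occurs
    reason_not_detected_earlier: str         # why earlier process steps missed it
    introduced_at: datetime                  # when the faulty change was made
    detected_at: datetime                    # when the fault was recognized
    resolution: str                          # repair or other disposition

    def days_latent(self) -> float:
        """How long the fault existed before detection, in days."""
        return (self.detected_at - self.introduced_at).total_seconds() / 86400.0
```

Keeping the introduction and detection times in the same record is what allows the time dependence noted above to be analyzed, for example by tabulating fault latency per release.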

Over the years, the CM system for the space shuttle program has evolved into a common, minimum set of data that must be retained regarding every fault that is recognized anywhere in the life cycle, including faults found by inspections before software is actually built. This evolutionary development is amenable to evaluation by statistical methods. Trend analysis and predictions regarding testing, allocation of resources, and estimation of probabilities of failure are examples of the many activities that draw on the database. This database also continues to be the basis for defining and developing sophisticated, insightful estimation techniques such as those described by Munson (1993).

Fault Detection

Management philosophy prescribes that process improvement is part of the process. Such proactive process improvement includes inspection at every step of the process, detailed documentation of the process, and analysis of the process itself.

The critical implications of an ill-timed failure in space shuttle flight control software require that remedies be decisive and aggressive. When a fault is identified, a feedback process involving detailed information on the fault enforces a search for similar faults in the existing system and changes the process to guard actively against such faults in flight control software development. The characteristics of a single fault are actively documented in the following four-step reactive process-improvement protocol:

1. Remove the fault,

2. Identify the root cause of the fault,

3. Eliminate the process deficiency that let the fault escape earlier detection, and

4. Analyze the product for other, similar faults.

Further scrutiny of what occurred in the process between introduction and detection of a fault is aimed at determining why downstream process elements failed to detect and remove the fault. Such introspective analysis is designed to improve the process and specific process elements so that if a similar fault is introduced again, these process elements will detect it before it gets too far along in the product life cycle. This four-step process improvement is achievable because of the maturity of the overall IBM/Loral software management process. The complete recording of project events in the CM system (phase of the process, change history of involved line(s) of code, the line of code that included an error, the individuals involved, and so on) allows hindsight so that the development team can approach the occurrence of an error not as a failure but rather as an opportunity to improve the process and to find other, similar errors.

Safety Certification

The dependability of safety-critical software cannot be based merely on testing the software, counting and repairing the faults, and conducting "live tests" on shuttle missions. Testing of software for many, many years, much longer than its life cycle, would be required in order to demonstrate software failure probability levels of 10^-7 or 10^-9 per operational hour. A process must be established, and it must be demonstrated statistically that if that process is followed and maintained under statistical control, then software of known quality will result. One result is the ability to predict a particular level of fault density, in the sense that fault density is proportional to failure intensity, and so provide a confidence level regarding software quality. This approach is designed to ensure that quality is built into the software at a measurable level. IBM's historical data demonstrate a constantly improving process for comfort of space shuttle flight. The use of software engineering methodologies that incorporate statistical analysis methods generally allows the establishment of a benchmark for obtaining a valid measure of how well a product meets a specified level of quality.
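The claim that testing alone would take far longer than the software's life cycle can be illustrated with a short calculation. The sketch below assumes a constant failure rate (exponentially distributed times to failure) and a test campaign in which no failures at all are observed; both are simplifying assumptions made here for illustration, and the function name is hypothetical. Under these assumptions the probability of seeing no failure in t hours is exp(-rate * t), so the test time needed to bound the rate at a given confidence level is -ln(1 - confidence) / rate.

```python
import math


def hours_of_failure_free_testing(target_rate: float, confidence: float = 0.95) -> float:
    """Failure-free test hours needed to claim, at the given confidence level,
    that the failure rate per hour is below target_rate.

    Assumes a constant failure rate and zero observed failures.
    """
    if target_rate <= 0.0 or not 0.0 < confidence < 1.0:
        raise ValueError("target_rate must be positive and confidence in (0, 1)")
    # P(no failure in t hours) = exp(-rate * t); solve exp(-rate * t) = 1 - confidence.
    return -math.log(1.0 - confidence) / target_rate


if __name__ == "__main__":
    for rate in (1e-7, 1e-9):
        hours = hours_of_failure_free_testing(rate)
        print(f"rate {rate:g}/h: {hours:.3g} hours, about {hours / 8766:.3g} years")
```

For a target of 10^-9 per hour at 95% confidence this comes to roughly 3 billion failure-free test hours, which is why the argument above turns to demonstrating control of the process rather than testing the product alone.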

3 A Software Production Model

The software development process spans the life cycle of a given project, from the first idea, to implementation, through completion. Many process models found in the literature describe what is basically a problem-solving effort. The one discussed in detail below, as a convenient way to organize the presentation, is often described as the waterfall model. It is the basis for nearly all the major software products in use today. But as with all great workhorses, it is beginning to show its age. New models in current use include those with design and implementation occurring in parallel (e.g., rapid prototyping environments) and those adopting a more integrated, less linear, view of a process (e.g., the spiral model referred to in Chapter 6). Although the discussion in this chapter is specific to a particular model, that in subsequent chapters cuts across all models and emphasizes the need to incorporate statistical insight into the measurement, data collection, and analysis aspects of software production.

The first step of the software life cycle (Boehm, 1981) is the generation of system requirements whereby functionality, interactions, and performance of the software product are specified in (usually) numerous documents. In the design step, system requirements are refined into a complete product design, an overall hardware and software architecture, and detailed descriptions of the system control, data, and interfaces. The result of the design step is (usually) a set of documents laying out the system's structure in sufficient detail to ensure that the software will meet system requirements. Most often, both requirements and design documents are formally reviewed prior to coding in order to avoid errors caused by incorrectly stated requirements or poor design. The coding stage commences once these reviews are successfully completed. Sometimes scheduling considerations lead to parallel review and coding activities. Normally individuals or small teams are assigned specific modules to code. Code inspections help ensure that module quality, functionality, and schedule are maintained.

Once modules are coded, the testing step begins. (This topic is discussed in some detail in Chapter 3.) Testing is done incrementally on individual modules (unit testing), on sets of modules (integration testing), and finally on all modules (system testing). Inevitably, faults are uncovered in testing and are formally documented as modification requests (MRs). Once all MRs are resolved, or more usually as schedules dictate, the software is released. Field experience is relayed back to the developer as the software is "burned in" in a production environment. Patches or rereleases follow based on customer response. Backward compatibility tests (regression testing) are conducted to ensure that correct functionality is maintained when new versions of the software are produced.

The above overview is noticeably nonquantitative. Indeed, this nonquantitative characteristic is the most striking difference between software engineering and more traditional (hardware) engineering disciplines. Measurement of software is critical for characterizing both the process and the product, and yet such measurement has proven to be elusive and controversial. As argued in Chapter 1, the application of statistical methods is predicated on the existence of relevant data, and the issue of software measurements and metrics is discussed prominently throughout the report. This is not to imply that measurements have never been made or that data are totally lacking. Unfortunately metrics tend to describe properties and conditions for which it is easy to gather data rather than those that are useful for characterizing software content, complexity, and form.

PROBLEM FORMULATION AND SPECIFICATION OF REQUIREMENTS

Within the context of system development, specifications for required software functions are derived from the larger system requirements, which are the primary source for determining what the delivered software product will do and how it will do it. These requirements are translated by the designer or design team into a finished product that delivers all that is explicitly stated and does not include anything explicitly forbidden. Some common references regarding requirements specification are mentioned in IEEE Standard for Software Productivity Metrics (IEEE, 1993).

Requirements, the first formal tangible product obtained in the development of a system, are subjective statements specifying the system's various desired operational characteristics. Errors in requirements arise for a number of reasons, including ambiguous statements, inconsistent information, unclear user requirements, and incomplete requests. Projects that have ill-defined or unstated requirements are subject to constant iteration, and a lack of precise requirements is a key source of subsequent software faults. In general, the longer a fault resides in a system before it is detected, the greater is the cost of removing it or recovering from related failures. This condition is a primary driver of the review process throughout software development.

The formulation of requirements starts with customers requesting a new functionality. Systems engineers collect information describing the new functionality and develop a customer specification description (CSD) describing the customer's view of the feature. The CSD is used internally by software development organizations to formulate cost estimates for bidding. After the feature is committed (sold), systems engineers write a feature specification description (FSD) describing the internal view of the feature. The FSD is commonly referred to as "requirements." Both the CSD and FSD are carefully reviewed and must meet formal criteria for approval.

DESIGN

The heart of the software development cycle is the translation and refinement of the requirements into code. Software architects transform the requirements for each specified feature into a high-level design. As part of this process, they determine which subsystems (e.g., databases) and modules are required and how they interact or communicate. The broad, high-level design is then refined into a detailed low-level design. This transformation involves much information gathering and detective work. The software architects are often the most experienced and knowledgeable of the software engineers.

The sequence of continual refinements ultimately results in a mapping of high-level functions into modules and code. Part of this design process is selecting an appropriate representation, which in most cases is a specific programming language. Selection of a representation involves factors such as operational domain, system performance, and function, among others. When completed, the high-level design is reviewed by all, including those concerned with the affected subsystems and the organization responsible for development.

The human element is a critical issue in the early stages of a software project. Quantitative data are potentially available following document reviews. Specifically, early in the development cycle of software systems, (paper) documents are prepared describing feature requirements or feature design. Prior to a formal document review, the reviewers individually read the document, noting issues that they believe should be resolved before the document is approved and feature development is begun. At the review meeting, a single list of issues is prepared that includes the issues noted by the reviewers as well as the ones discovered during the meeting itself. This process thus generates data consisting of a tabulation of issues found by each reviewer. The degree of overlap provides information regarding the number of remaining issues, that is, those yet to be identified. If this number is acceptably small, the process can proceed to the next step; if not, further document refinement is necessary in order to avoid costly fixes later in the process. The problem as stated bears a certain resemblance to capture-recapture models in wildlife studies, and so appropriate statistical methods can be devised for analyzing the review data, as illustrated in the following example.

Example. Table 1 contains data on issues identified for a particular feature for the AT&T 5ESS switch (Eick et al., 1992a). Six reviewers found a total of 47 distinct issues. A common capture-recapture model assumes that each issue has the same probability of being captured (detected) and that reviewers work independently with their own chance of capturing an issue, or detection probability. Under such a model, likelihood methods yield an estimate of N = 65, implying that approximately 20 issues remain to be identified in the document. An upper 95% confidence bound for N under this model is 94 issues.
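As an illustration of how such a likelihood estimate can be computed, the sketch below uses one standard form of the model just described: every issue is equally detectable, reviewers act independently, and each reviewer has a separate detection probability. For that model the maximum likelihood estimate of N satisfies prod_j (1 - n_j/N) = 1 - D/N, where n_j is the number of issues noted by reviewer j and D is the number of distinct issues noted by at least one reviewer. The function name and the bisection solver are choices made here, and this is not necessarily the exact procedure used by Eick et al. (1992a).

```python
from typing import Sequence


def mle_population_size(found_per_reviewer: Sequence[int], distinct_found: int) -> float:
    """Estimate the total number of issues N by solving
    prod_j (1 - n_j / N) = 1 - D / N with bisection.

    Assumes equally detectable issues and independent reviewers.
    """
    total = sum(found_per_reviewer)
    if distinct_found > total or distinct_found < max(found_per_reviewer):
        raise ValueError("counts are inconsistent")
    if total == distinct_found:
        return float("inf")  # no overlap between reviewers: no finite estimate

    def gap(n: float) -> float:
        prod = 1.0
        for n_j in found_per_reviewer:
            prod *= 1.0 - n_j / n
        return prod - (1.0 - distinct_found / n)

    lo = float(distinct_found)   # gap(lo) >= 0
    hi = 2.0 * lo
    while gap(hi) > 0.0:         # expand until the sign changes
        hi *= 2.0
    for _ in range(200):         # plain bisection on the sign change
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Column sums A..F from Table 1, and the 43 issues with at least one reviewer mark.
    print(mle_population_size([25, 3, 4, 13, 9, 6], 43))
```

With two reviewers this reduces to the familiar n_1 n_2 / m estimate, where m is the number of issues both noted. Applied to the column sums of Table 1 and the 43 issues carrying at least one reviewer mark (47 less the four rows with a sum of zero), it gives a value between 65 and 70; it should not be expected to reproduce the reported estimate exactly, since the details of the likelihood used in the report are not given here.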

Such a model is natural but simplistic. The software development environment is not conducive to independence among reviewers (so that some degree of collusion is unavoidable), and reviewers also are selected to cover certain areas of specialization. In either case, the cornerstone of capture-recapture models, the binomial distribution, is no longer appropriate for the total number of issues. It is possible to develop a likelihood-based test for pairwise collusion of reviewers and reviewer-specific tests of specialization. In the example above, there is no evidence of collusion among reviewers, but reviewer C exhibits a significantly greater degree of specialization than do the other reviewers. When this reviewer is treated as a specialist, the maximum likelihood estimate (MLE) of the number of issues is reduced to 53, implying that only a half dozen issues remain to be discovered in the document.

Other mismatches between the data arising in software review and those in capture-recapture wildlife population studies induce bias in the MLE. Another possible estimator for this problem is the jackknife estimator (Burnham and Overton, 1978). But this estimator seems in fact to be more biased than the MLE (Vander Wiel and Votta, 1993). Both are rescued to a large extent by their categorization of faults into classes (e.g., "easy to find" versus "hard to find"). In any given

Table 1. Issue discovery. The rows of the table represent 47 issues noted by six reviewers prior to review meetings. An entry in cell i,j of the table indicates that issue i (i = 1, ..., 47) was noted by reviewer j (j = A, ..., F). Rows with no entries (i.e., row sums of zero) correspond to issues discovered at the meeting.

Issue A B C D E F Sum Issue A B C D E F Sum

1 1 1 25 1 1 2

2 1 1 2 26 1 1 2

3 1 1 27 1 1

4 1 1 28 1 1

5 0 29 1 1 2

6 1 1 30 1 1

7 1 1 31 1 1

8 1 1 32 1 1

9 0 33 1 1

10 1 1 34 1 1 1 3

11 1 1 35 1 1 2

12 1 1 36 1 1

13 0 37 1 1

14 1 1 2 38 1 1

15 1 1 39 1 1

16 0 40 1 1

17 1 1 2 41 1 1

18 1 1 42 1 1 2

19 1 1 2 43 1 1

20 1 1 2 44 1 1

21 1 1 1 1 1 5 45 1 1

22 1 1 2 46 1 1

23 1 1 47 1 1

24 1 1 SUM 25 3 4 13 9 6 60

application, it is necessary to verify that the "easy to find" and "hard to find" classification is meaningful, or to determine that it is merely partitioning the distribution of difficulty in an arbitrary manner. A relevant point in this and other applications of statistical methods in software engineering is that addressing aspects of the problem that induce study bias is important and valued; theoretical work addressing aspects of statistical bias is not likely to be as highly valued.

IMPLEMENTATION

The phase in the software development process that is often referred to interchangeably as coding, development, or implementation is the actual transformation of the requirements into executable form. "Implementation in the small" refers to coding, and "implementation in the large" refers to designing an entire system in a top-down fashion while maintaining a perspective on the final integrated system.

Low-level designs, or coding units, are created from the high-level design for each subsystem and module that needs to be changed. Each coding unit specifies the changes to be made to the existing files, new or modified entry points, and any file that must be added, as well as other changes. After document reviews and approvals, the coding may begin. Using private copies of the code, developers make the changes and add the files specified in the coding unit. Coding is delicate work, and great care is taken so that unwanted side effects do not break any of the existing code. After completion, the code is tested by the developer and carefully reviewed by other experts. The changes are submitted to a public load (code from all programmers that is merged and loaded simultaneously) using an MR number. The MR is tied back to the feature to establish a correspondence between the code and the functionality that it provides.

MRs are associated with the system version management system, which maintains a complete history of every change to the software and can recreate the code as it existed at any point in time. For production software systems, version management systems are required to ensure code integrity, to support multiple simultaneous releases, and to facilitate maintenance. If there is a problem, it may be necessary to back out changes. Besides a record of the affected lines, other information is kept, such as the name of the programmer making the changes, the associated feature number, whether a change fixes a fault or adds new functionality, the date of a change, and so on.

The configuration management database contains the record of code changes, or change history of the code. Eick et al. (1992b) describe a visualization technique for displaying the change history of source code. The graphical technique represents each file as a vertical column and each line of code as a color-coded row within the column. The row indentation and length track the corresponding text, and the row color is tied to a statistic. If the row tracking is literal as with computer source code, the display looks as if the text had been printed in color and then photo-reduced for viewing as a single figure. The spatial pattern of color shows the distribution of the statistic within the text.

Example. Developing large software systems is a problem of scale. In multimillion-line systems there may be hundreds of thousands of files and tens of thousands of modules, worked on by thousands of programmers for multiyear periods. Just discovering what the existing code does is a major technical problem consuming significant amounts of time. A continuing and significant problem is that of code discovery, whereby programmers try to understand how unfamiliar code works. It may take several weeks of detailed study to change a few lines of code without causing unwanted side effects. Indeed, much of the effort in maintenance involves changing code written by another programmer. Because of variation in programmer staff sizes and inevitable turnover, training new programmers is important. Visualization techniques, described further in Chapter 5, can improve productivity dramatically.

Figure 1 displays a module composed of 20 source code files containing 9,365 lines of code. The height of each column indicates the size of the file. Files longer than one column are continued over to the next. The row color indicates the age of each line of code using a rainbow color scale with the newest lines in red and the oldest in blue. On the left is an interactive color scale showing a color for each of the 324 changes by the 126 programmers modifying this code over the last 10 years. The visual impression is that of a miniature picture of all of the source code, with the indentation showing the usual C language control structure.

The perception of colors is blurred, but there are clear patterns. Files in approximately the same hue were written at about the same time and are related. Rainbow files with many different hues are unstable and are likely to be trouble spots because of all the changes. The biggest file has about 1,300 lines of code and takes a column and a half.

Changes from many coding units are periodically combined together into a so-called common load of the software system. The load is compiled, made available to developers for testing, and installed in the laboratory machines. Bringing the changes together is necessary so that developers working on different coding units of a common feature can ensure that their code works together properly and does not break any other functionality. Developers also use the public load to test their code on laboratory machines.

After all coding units associated with a feature are complete and it has been tested by the developers in the laboratory, the feature is turned over to the integration group for independent testing. The integration group runs tests of the feature according to a feature test plan that was prepared in parallel with the FSD. Eventually the new code is released as part of an upgrade or sent out directly if it fixes a critical fault. At this stage, maintenance on the code begins. If customers have problems, developers will need to submit fault modification requests.

TESTING

Many software systems in use today are very large. For example, the software that supports modern telecommunications networks, or processes banking transactions, or checks individual tax returns for the Internal Revenue Service has millions of lines of code. The development of such large-scale software systems is a complex and expensive process. Because a single simple fault in a system may cripple the whole system and result in a significant loss (e.g., loss of telephone service in an entire city), great care is needed to assure that the system is flawlessly constructed. Because a fault can occur in only a small part of a system, it is necessary to assure that even small programs are working as intended. Such checking for conformance is accomplished by testing the software.

Specifically, the purpose of software testing is to detect errors in a program and, in the absence of errors, gain confidence in the correctness of the program or the system under test. Although testing is no substitute for improving a process, it does play a crucial role in the overall software development process. Testing is important because it is effective, if costly. It is variously estimated that the total cost of testing is approximately 20 to 33% of the total software budget for software development (Humphrey, 1989). This fraction amounts to billions of dollars in the U.S. software industry alone. Further, software testing is very time consuming, because the time for testing is typically greater than that for coding. Thus, efforts to reduce the costs and improve the effectiveness of testing can yield substantial gains in software quality and productivity.

Figure 1. A SeeSoft™ display showing a module with 20 files and 9,365 lines of code. Each file is represented as a column and each line of code as a colored row. The newest rows are in red and the oldest in blue, with a color spectrum in between. This overview highlights the largest files and program control structures, while the color shows relationships between files, as well as unstable, frequently changed code. Eick et al. (1992b).

Much of the difficulty of software testing is in the management of the testing process (producing reports, entering MRs, documenting MRs cleared, and so on), the management of the objects of the testing process (test cases, test drivers, scripts, and so on), and the management of the costs and time of testing.

Typically, software testing refers to the phase of testing carried out after parts of code are written so that individual programs or modules can be compiled. This phase includes unit, integration, system, product, customer, and regression testing. Unit testing occurs when programmers test their own programs, and integration testing is the testing of previously separate parts of the software when they are put together. System testing is the testing of a functional part of the software to determine whether it performs its expected function. Product testing is meant to test the functionality of the final system. Customer testing is often product testing performed by the intended user of the system. Regression testing is meant to assure that a new version of a system faithfully reproduces the desirable behavior of the previous system.

Besides the stages of testing, there are many different testing methods. In white box testing, tests are designed on the basis of detailed architectural knowledge of the software under test. In black box testing, only knowledge of the functionality of the software is used for testing; knowledge of the detailed architectural structure or of the procedures used in coding is not used. White box testing is typically used during unit testing, in which the tester (who is usually the developer who created the code) knows the internal structure and tries to exercise it based on detailed knowledge of the code. Black box testing is used during integration and system testing, which emphasizes the user perspective more than the internal workings of the software. Thus, black box testing tries to test the functionality of software by subjecting the system under test to various user-controlled inputs and by assessing its resulting performance and behavior.

Since the number of possible inputs or test cases is almost limitless, testers need to select a sample, a suite of test cases, based on their effectiveness and adequacy. Herein lie significant opportunities for statistical approaches, especially as applied to black box testing. Ad hoc black box testing can be done when testers, perhaps based on their knowledge of the system under test and its users, decide specific inputs. Another approach, based on statistical sampling ideas, is to generate test cases randomly. The results of this testing can be analyzed by using various types of reliability growth curve models (see "Assessment and Reliability" in Chapter 4). Random generation requires a statistical distribution. Since the purpose of black box testing is to simulate actual usage, a highly recommended technique is to generate test cases randomly from the statistical distribution needed by users, often referred to as the operational profile of a system.
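A minimal sketch of drawing test cases from an operational profile follows. It assumes the profile can be written as a finite set of operation classes with known probabilities; the operation names and probabilities shown are invented for illustration and do not describe any real system.

```python
import random
from typing import Dict, List


def draw_test_cases(profile: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw n operation classes at random, each with its operational-profile probability."""
    rng = random.Random(seed)           # fixed seed so a test suite can be regenerated
    operations = list(profile)
    weights = [profile[op] for op in operations]
    return rng.choices(operations, weights=weights, k=n)


if __name__ == "__main__":
    # Hypothetical profile: routine operations dominate actual usage.
    example_profile = {"place_call": 0.90, "forward_call": 0.09, "conference_call": 0.01}
    suite = draw_test_cases(example_profile, 1000)
    print({op: suite.count(op) for op in example_profile})
```

Even in this toy profile, the rarest operation is expected to appear in only about 1% of the test cases, so a small suite may exercise it only a handful of times or not at all.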

There are several advantages and disadvantages to statistical operational profile testing. A key advantage is that if one takes a large enough sample, then the system under test will be tested in all the ways that a user may need it and thus should experience fewer field faults. Another advantage of this method is the possibility of bringing the full force of statistical techniques to bear on inferential problems; that is, the results obtained during testing can be generalized to make inferences about the field behavior of the system under test, including inferences about the number of faults remaining, the failure rate in the field, and so on.

In spite of all these advantages, statistical operational profile testing in its purest form is rarely used. There are many difficulties; some are operational and others are more basic. For example, one can never be certain about the operational profile in terms of inputs, and especially in terms of their probabilities of occurrence. Also, for large systems, the input space is high-dimensional. Thus, another problem is how to sample from this high-dimensional space. Further, the distribution is not static; it will, in all likelihood, change over time as new users exercise the system in unanticipated ways. Even if this possibility can be discounted, questions remain about the efficiency of statistical operational profile testing, which can be very inefficient, because most often the system under test will be used in routine ways, and thus a randomly drawn sample will be highly weighted by routine operations. This high weighting may be fine if the number of test cases is very large. But then testing would be very expensive, perhaps even prohibitively so. Therefore, testers often adopt some variant of drawing a random sample; for example, testers give more weight to boundary values, those values around which the system is expected to change its behavior and therefore where faults are likely to be found. This and other clever strategies adopted by testers typically result in a testing distribution that is quite different from the operational profile. Of course, in such a case the results of the testing laboratory will not be generalizable unless the relationships between the two distributions are taken into account.
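One simple way to take the relationship between the two distributions into account is to reweight each test outcome by the ratio of its probability under the operational profile to its probability under the testing distribution (importance weighting). This technique is brought in here purely as an illustration and is not attributed to the report; the sketch assumes both distributions are known over a small set of input classes, and all names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def field_failure_rate(
    test_results: Iterable[Tuple[str, bool]],
    profile: Dict[str, float],
    test_dist: Dict[str, float],
) -> float:
    """Estimate the per-use failure probability in the field from tests that were
    drawn from test_dist rather than from the operational profile.

    test_results holds (input_class, failed) pairs. Every class with positive
    profile probability must also have positive probability under test_dist.
    """
    weighted_failures = 0.0
    n = 0
    for input_class, failed in test_results:
        weight = profile[input_class] / test_dist[input_class]  # importance weight
        weighted_failures += weight if failed else 0.0
        n += 1
    if n == 0:
        raise ValueError("no test results")
    return weighted_failures / n


if __name__ == "__main__":
    # Hypothetical numbers: boundary inputs are 5% of field use but 50% of the tests.
    profile = {"routine": 0.95, "boundary": 0.05}
    test_dist = {"routine": 0.50, "boundary": 0.50}
    results = [("routine", False)] * 50 + [("boundary", True)] * 10 + [("boundary", False)] * 40
    print(field_failure_rate(results, profile, test_dist))
```

In this invented example the raw failure fraction in the laboratory is 10%, while the reweighted estimate of the field failure probability is about 1%, because the failing boundary inputs are ten times rarer in use than in the test suite.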

Thus, to take advantage of the attractiveness of operational profile testing, some key problems have to be solved:

1. How to obtain the operational profile,

2. How to sample according to a statistical distribution in high-dimensional space, and

3. How to generalize results obtained in the testing laboratory to the field when the testing distribution is a variant of the operational profile distribution.

All of these questions can be dealt with conceptually using statistical approaches.

For (1), a Bayesian elicitation procedure can be envisioned to derive the operational profile. This elicitation is done routinely in Bayesian applications, but because the space is very high dimensional, techniques are needed for Bayesian elicitation in very high dimensional spaces.

Concerning (2), if the joint distribution corresponding to the operational profile is known, schemes can be used that are more efficient than simple random sampling schemes. Simple random sampling is inefficient because it typically gives higher probability to the middle of a distribution than to its tails, especially in high dimensions. A more efficient scheme would sample the tails quickly. This can be accomplished by stratifying the support of the distribution.

McKay et al. (1979) formalized this idea using Latin hypercube sampling. Suppose we have a K-dimensional random vector $X = (X_1, \ldots, X_K)$ and we want to get a sample of size N from the joint distribution of X. If the components of X are independent, then the scheme is simple, namely:

1. Divide the range of each component random variable into N intervals of equal probability,

2. Randomly sample one observation for each component random variable in each of the corresponding N intervals, and finally

3. Randomly combine the components to create X.
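The three steps for independent components can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of McKay et al.; the function names `latin_hypercube` and `uniform_quantile`, and the idea of passing one inverse distribution function per component, are assumptions made here for the example.

```python
import random

def latin_hypercube(n, inverse_cdfs, rng=None):
    """Draw n points; inverse_cdfs holds one quantile function per component.

    Hypothetical helper: the signature is chosen for this sketch only.
    """
    rng = rng or random.Random()
    columns = []
    for inv_cdf in inverse_cdfs:
        # Steps 1 and 2: one uniform draw inside each of the n
        # equal-probability intervals [i/n, (i+1)/n).
        probs = [(i + rng.random()) / n for i in range(n)]
        # Step 3: shuffling each column independently pairs the
        # components at random.
        rng.shuffle(probs)
        columns.append([inv_cdf(p) for p in probs])
    # Transpose the K columns into n points of dimension K.
    return [tuple(col[j] for col in columns) for j in range(n)]

def uniform_quantile(lo, hi):
    """Quantile function of a uniform distribution on [lo, hi)."""
    return lambda p: lo + p * (hi - lo)

if __name__ == "__main__":
    # Two illustrative inputs with made-up ranges, sample size 6.
    design = latin_hypercube(
        6, [uniform_quantile(0.0, 30.0), uniform_quantile(-45.0, 45.0)]
    )
    for point in design:
        print(point)
```

Each component ends up with exactly one value in each of its N strata, which is the property that distinguishes the scheme from simple random sampling.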

Stein (1987) showed that this sampling scheme can be substantially better than simple random sampling. Iman and Conover (1982) and Stein (1987) both discussed extensions for nonindependent component variables. Of course, if specifying homogeneous strata is possible, it should be done prior to applying the Latin hypercube sampling method to increase the overall effectiveness of the sampling scheme.

Example: Consider a software system controlling the state of an air-to-ground missile. The key inputs for the software are altitude, attack and bank angles, speed, pitch, roll, and yaw. Typically, these variables are independently controlled. To test this software system, combinations of all these inputs must be provided and the output from the software system checked against the corresponding physics. One would like to generate test cases that include inputs over a broad range of permissible values. To test all the valid possibilities, it would be reasonable to try uniform distributions for each input. Suppose we decide upon a sample of size 6. The corresponding Latin hypercube design is easily constructed by dividing each variable into six equal probability intervals and sampling randomly from each interval. Because we have independent random variables here, the final step consists of randomly coupling these samples. The design is difficult to visualize in more than two dimensions, but one such sample for attack and bank angles is depicted in Figure 2. Note that there is exactly one observation in each column and in each row, thus the name "Latin hypercube."

Figure 2. Latin hypercube. N = 6 and K = 2.

Finally, concerning (3), to make inferences about field performance, the issue of the discrepancy between the statistical operational profile and the testing distribution must be addressed. At this point, a distinction can be made between two types of extrapolation to field performance of the system under test. It is clear that even if the true operational profile distribution is not available, to the extent that the testing distribution has the same support as the operational profile distribution, statistical inferences can be made about the number of remaining faults. On the other hand, to extrapolate the failure intensity from the testing laboratory to the field, it is not enough to have the same support; rather, identical distributions are needed. Of course, it is unlikely that after spending much time and money on testing, one would again test with the statistical operational profile. What is needed is a way of reusing the information generated in the testing laboratory, perhaps by a transformation in which some statistical techniques based on reweighting can help. There are two basic ideas, both relying heavily on the assumption that the testing and the field-use distributions have the same support. One idea is to use all the data from the testing laboratory, but with added weights to change the sample to resemble a random sample from the operational profile. The approach is similar to reweighting in importance sampling. Another idea is to accept or reject the inputs used in testing with a probability distribution based on the operational profile. For a description of both of these techniques, see Beckman and McKay (1987).
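The two reuse ideas can be illustrated for the simplest case of a finite set of input classes. The sketch below is an assumption-laden illustration rather than the procedure of Beckman and McKay: the dictionaries `op_profile` and `test_dist`, the class labels, and both function names are hypothetical, and the reweighting shown is the ordinary self-normalized importance-sampling estimate.

```python
import random

def reweighted_failure_rate(results, op_profile, test_dist):
    """Estimate the field failure probability from laboratory results.

    results: list of (input_class, failed) pairs drawn from test_dist.
    Each result is weighted by op_profile / test_dist for its class.
    """
    weights = [op_profile[c] / test_dist[c] for c, _ in results]
    failed_weight = sum(w for w, (_, failed) in zip(weights, results) if failed)
    return failed_weight / sum(weights)

def accept_reject(results, op_profile, test_dist, rng=None):
    """Thin the laboratory sample so the kept cases follow op_profile."""
    rng = rng or random.Random()
    # Largest density ratio; requires both distributions to share support.
    bound = max(op_profile[c] / test_dist[c] for c in op_profile)
    return [
        (c, failed)
        for c, failed in results
        if rng.random() < op_profile[c] / (test_dist[c] * bound)
    ]

if __name__ == "__main__":
    # Made-up numbers: testers oversample boundary cases.
    op_profile = {"routine": 0.9, "boundary": 0.1}
    test_dist = {"routine": 0.5, "boundary": 0.5}
    results = [("routine", False)] * 3 + [("routine", True)] \
        + [("boundary", True)] * 2 + [("boundary", False)] * 2
    print(reweighted_failure_rate(results, op_profile, test_dist))
    print(len(accept_reject(results, op_profile, test_dist)))
```

Reweighting keeps every test case, while acceptance-rejection discards some of them; both break down if some class used in the field has zero probability under the testing distribution, which is the common-support assumption stated above.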

In his presentation at the panel's forum, Phadke (1993) suggested another set of statistical techniques, based on orthogonal arrays, for parsimonious testing of software. The example described above proves useful in an elaboration.

Example. For the software system that determines the state of an attack plane, let us assume that interest centers on testing only two conditions for each input variable. This situation arises, for example, when the primary interest lies in boundary value testing. Let the lower value be input state 0 and the upper value be input state 1 for each of the variables. Then in the language of statistical experimental design, we have seven factors, A, ..., G (altitude, attack angle, bank angle, speed, pitch, roll, and yaw), each at two levels (0, 1). To test all of the possible combinations, one would need a complete factorial experiment, which would have $2^7 = 128$ test cases consisting of all possible sequences of 0's and 1's. For a statistical experiment intended to address only main effects, a highly fractionated factorial design would be sufficient. However, in the case of software testing, there is no statistical variability and little or no interest in estimating various effects. Rather, the interest is in covering the test space as much as possible and checking whether the test cases pass or fail. Even in this case, it is still possible to use statistical design ideas. For example, consider the sequence of test cases given in Table 2a. This design requires 8 test cases instead of 128. In this case, since there is no statistical variation, main effects do not have any practical meaning. However, looking at the pattern in the table, it is clear that all possible combinations of levels for any pair of factors are covered in a balanced way. Thus, testing according to this design will protect against any incorrect implementation of the code involving a pairwise interaction.

Table 2a. Orthogonal array. Test cases in rows. Test factors in columns.

    | A | B | C | D | E | F | G
  1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
  2 | 0 | 0 | 0 | 1 | 1 | 1 | 1
  3 | 0 | 1 | 1 | 0 | 0 | 1 | 1
  4 | 0 | 1 | 1 | 1 | 1 | 0 | 0
  5 | 1 | 0 | 1 | 0 | 1 | 0 | 1
  6 | 1 | 0 | 1 | 1 | 0 | 1 | 0
  7 | 1 | 1 | 0 | 0 | 1 | 1 | 0
  8 | 1 | 1 | 0 | 1 | 0 | 0 | 1

Table 2b. Combinatorial design. Test cases in rows. Test factors in columns.

    | A | B | C | D | E | F | G
  1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
  2 | 1 | 1 | 1 | 1 | 1 | 1 | 0
  3 | 0 | 0 | 1 | 1 | 1 | 0 | 1
  4 | 1 | 0 | 0 | 0 | 1 | 1 | 1
  5 | 1 | 1 | 1 | 0 | 0 | 0 | 1
  6 | 0 | 1 | 0 | 1 | 0 | 1 | 1

In general, following Taguchi, Phadke (1993) suggests orthogonal array designs of strength two. These designs (a specific instance of which is given in the above example) guarantee that all possible pairwise combinations will be tried out in a balanced way. Another approach based on combinatorial designs was proposed by Cohen et al. (1994). Their designs do not consider balance to be an overriding design criterion, and accordingly they produce designs with smaller numbers of required test cases. For example, Table 2b contains a combinatorial design with complete pairwise coverage in six runs instead of the eight required by orthogonal arrays (Table 2a). This notion has been extended to notions of higher-order coverage as well. The efficacy of these and other types of designs has to be evaluated in the testing context.
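The coverage and balance claims can be checked mechanically. In the sketch below the six rows of `TABLE_2B` are transcribed from Table 2b, while `L8` is written as the standard eight-run two-level orthogonal array, which Table 2a is taken to be (an assumption of this example). The function names are hypothetical.

```python
from itertools import combinations

# Standard L8 two-level orthogonal array (assumed to be Table 2a).
L8 = ["0000000", "0001111", "0110011", "0111100",
      "1010101", "1011010", "1100110", "1101001"]

# Six-run combinatorial design transcribed from Table 2b.
TABLE_2B = ["0000000", "1111110", "0011101",
            "1000111", "1110001", "0101011"]

def pair_counts(design, i, j):
    """Count how often each (level_i, level_j) combination occurs."""
    counts = {}
    for row in design:
        key = (row[i], row[j])
        counts[key] = counts.get(key, 0) + 1
    return counts

def covers_all_pairs(design):
    """True if every pair of factors shows all four level combinations."""
    k = len(design[0])
    return all(len(pair_counts(design, i, j)) == 4
               for i, j in combinations(range(k), 2))

def is_balanced(design):
    """True if, in addition, the four combinations occur equally often."""
    k = len(design[0])
    for i, j in combinations(range(k), 2):
        counts = pair_counts(design, i, j)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True

if __name__ == "__main__":
    for name, design in (("L8", L8), ("Table 2b", TABLE_2B)):
        print(name, covers_all_pairs(design), is_balanced(design))
```

Both designs cover every pairwise combination; only the eight-run array does so with each combination appearing equally often, which is the balance that the six-run design gives up.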

Besides the types of testing discussed above, there are other statistical strategies that can be used. For example, DeMillo et al. (1988) have suggested the use of fault insertion techniques. The basic idea is akin to capture-recapture sampling in which sampled units of a population (usually wildlife) are released and inverse sampling is done to estimate the unknown population size. The MOTHRA system built by DeMillo and his colleagues implements such a scheme. While there are many possible sampling schemes (Nayak, 1988), the difficulty with fault insertion is that the faults inserted ought to be subtle enough so that the system can be compiled and tested; no two inserted faults should interact with each other; and while it may be possible at the unit testing level, it is prohibitively expensive for integration testing. It should be pointed out that the use of capture-recapture sampling, outlined in this chapter's subsection titled "Design," for quantifying document reviews does not require fault seeding and, accordingly, is not subject to the above difficulties.
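The capture-recapture analogy leads to a simple ratio estimate, sketched below. This is the classical seeded-fault (Lincoln-Petersen style) estimator, shown only as an illustration; it is not claimed to be what MOTHRA computes, and it assumes that seeded and indigenous faults are equally likely to be found. The function name and the numbers are hypothetical.

```python
def estimate_indigenous_faults(seeded_total, seeded_found, indigenous_found):
    """Estimate the total number of indigenous faults.

    If testing recovers a fraction seeded_found / seeded_total of the
    inserted faults, the same fraction of indigenous faults is assumed
    to have been found.
    """
    if seeded_found == 0:
        raise ValueError("no seeded faults recovered; estimate undefined")
    return indigenous_found * seeded_total / seeded_found

if __name__ == "__main__":
    # Made-up numbers: 20 faults seeded, 10 recovered, 15 real faults found.
    total = estimate_indigenous_faults(20, 10, 15)
    print(total, total - 15)  # estimated total and estimated remaining
```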

Another key problem in testing is determining when there has been enough testing. For unit testing where much of the testing is white box and the modules are small, one can attempt to check whether all the paths have been covered by the test cases, an idea extended substantially by Horgan and London (1992). However, for integration and system testing, this particular approach, coverage testing, is not possible because of the size and the number of possible paths through the system. Here is another opportunity for using statistical approaches to develop a theory of statistical coverage. Coverage testing relates to deriving methods and algorithms for generating test cases so that one can state, with a very high probability, that one has checked most of the important paths of the software. This kind of methodology has been used with probabilistic algorithms in protocol testing, where the structure of the program can be described in great detail. (A protocol is a very precise description of the interface between two diverse systems.) Lee and Yannakakis (1992) have proposed algorithms whereby one is guaranteed, with a high degree of probability, that all the states of the protocols are checked. The difficulty with this approach is that the number of states becomes large very quickly, and except for a small part of the system under test, it is not clear that such a technique would be practical (under current computing technology). These ideas have been mathematically formalized in the vibrant area of theorem checking and proving (Blum et al., 1990). The key idea is to take transforms of programs such that the results are invariant under these transforms if the software is correct. Thus, any variation in the results suggests possible faults in the software. Blum et al. (1989) and Lipton (1989), among others, have developed a number of algorithms to give probabilistic bounds on the correctness of software based on the number of different transformations.
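The invariance idea can be illustrated with a toy case. The sketch below is not one of the cited algorithms; it assumes a program that is supposed to compute a linear function modulo a fixed number, so that its output at x + r must equal the sum of its outputs at x and at r. The names `count_linearity_violations`, `correct_scale`, and `faulty_scale` are invented for the example.

```python
import random

MODULUS = 1_000_003

def count_linearity_violations(program, modulus, trials=100, rng=None):
    """Count random trials where program(x + r) != program(x) + program(r)."""
    rng = rng or random.Random()
    violations = 0
    for _ in range(trials):
        x = rng.randrange(modulus)
        r = rng.randrange(modulus)
        # Transform the input by a random shift and compare the results.
        shifted = program((x + r) % modulus)
        combined = (program(x) + program(r)) % modulus
        if shifted != combined:
            violations += 1
    return violations

def correct_scale(x):
    """Intended behavior: multiply by 7 modulo MODULUS."""
    return (7 * x) % MODULUS

def faulty_scale(x):
    """Same routine with an off-by-one fault."""
    return (7 * x + 1) % MODULUS

if __name__ == "__main__":
    print(count_linearity_violations(correct_scale, MODULUS))
    print(count_linearity_violations(faulty_scale, MODULUS))
```

No expected output is needed from an oracle: the check compares the program only with itself, and any disagreement signals a possible fault.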

In all of the several approaches to testing discussed above, the number of test cases can be extraordinarily large. Because of the cost of testing and the need to supply software in a reasonable period of time, it is necessary to formulate rules about when to stop testing. Herein lies another set of interesting problems in sequential analysis and statistical decision theory. As pointed out by Dalal and Mallows (1988, 1990, 1992), Singpurwalla (1991), and others, the key issue is to explicitly incorporate the economic trade-off between the decision to stop testing (and absorb the cost of fixing subsequent field faults) and the decision to continue testing (and incur ongoing costs to find and fix faults before release of a software product). Since the testing process is not deterministic, the fault-finding process is modeled by a stochastic reliability model (see Chapter 4 for further discussion). The opportune moment for release is decided using sequential decision theory. The rules are simple to implement and have been used in a number of projects. This framework has been extended to the problem of buying software with some sort of probabilistic guarantee on the number of faults remaining (Dalal and Mallows, 1992). Another extension with practical importance (Dalal and McIntosh, 1994) deals with the issue of a system under test not having been completely delivered at the start of testing. This situation is a common occurrence for large systems, where in order to meet scheduling milestones, testing begins immediately on modules and sets of modules as they are completed.

4 Critique of Some Current Applications of Statistics in Software Engineering

COST ESTIMATION

One of software engineering's long-standing problems is the considerable inaccuracy of the cost, resource, and schedule estimates developed for projects. These estimates often differ from the final costs by a factor of two or more. Such inaccuracies have a severe impact on process integrity and ultimately on final software quality. Five factors contribute to this continuing problem:

1. Most cost estimates have little statistical basis and have not been validated;

2. The value of historical data in developing predictive models is limited, since no consistent software development process has been adopted by an organization;

3. The maturity of an organization's process changes the granularity of the data that can be used effectively in project cost estimation;

4. The reliability of inputs to cost estimation models varies widely; and

5. Managers attempt to manage to the estimates, reducing the validity of historical data as a basis for validation.

Certain of the above issues center on the so-called maturity of an organization (Humphrey, 1988). From a purely statistical research perspective, (5) may be the most interesting area, but the major challenge facing the software community is finding the right metrics to measure in the first place.

Example. The data plotted in Figure 3 pertain to the productivity of a conventional COBOL development environment (Kitchenham, 1992). For each of 46 different products, size (number of entities and transactions) and effort (in person-hours) were measured. From Figure 3, it is apparent that despite substantial variability, a strong (log-log) linear relationship exists between program size and program effort.

A simple model relating effort to size is

$$\log_{10}(\text{effort}) = \alpha + \beta \log_{10}(\text{size}) + \text{noise}.$$

Figure 3. Data on the relationship between development effort and product size in a COBOL development organization.

A least squares fit to these data yields

                | Coeff. | SE     | t
  Intercept     | 1.120  | 0.3024 | 3.702
  log10(size)   | 1.049  | 0.1250 | 8.397
  RMS           | 0.194  |        |

These fitted coefficients suggest that development effort is proportional to product size; a formal test of the hypothesis, $H: \beta = 1$, gives a t value at the 0.65 significance level.

The estimated intercept after fixing $\beta = 1$ is 1.24; the resulting fit and a 95% prediction interval are overlaid on the data in Figure 3. This model predicts that it requires approximately 17 hours ($= 10^{1.24}$) to implement each unit of size.

Such models are used for prediction and tool validation. Consider an additional observation made of a product developed using a fourth-generation language and relational databases. Under the experimental development process, it took 710 hours to implement the product of size 183 (this point is denoted by X in Figure 3). The fitted model predicts that this product would have taken approximately 3,000 hours to complete using the conventional development environment. The 95% prediction interval at X = 183 ranges from approximately 1,000 to 9,000 hours; thus, assuming that other factors are not contributing to the apparent short development cycle of this product, the use of those new fourth-generation tools has demonstrably decreased the development effort (and hence the cost).
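The point prediction quoted above follows directly from the fitted line with the slope fixed at 1 and the intercept at 1.24. The short sketch below only reproduces that arithmetic; the function name is hypothetical, and the prediction interval is not recomputed because it requires the underlying data.

```python
import math

def predicted_effort_hours(size, intercept=1.24, slope=1.0):
    """Effort implied by log10(effort) = intercept + slope * log10(size)."""
    return 10 ** (intercept + slope * math.log10(size))

if __name__ == "__main__":
    per_unit = predicted_effort_hours(1)    # hours per unit of size
    baseline = predicted_effort_hours(183)  # conventional environment
    print(round(per_unit, 1), round(baseline), round(710 / baseline, 2))
```

The predicted effort for a product of size 183 comes out a little above 3,000 hours, so the observed 710 hours is between one-fifth and one-quarter of the prediction.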

Statistical Inadequacies in Estimating

Most cost estimation methods develop an initial relationship between the estimated size of a system (in lines of code, for instance) and the resources required to develop it. Such equations are often of the form illustrated in the above example: effort is proportional to size raised to the $\beta$ power. This initial estimate is then adjusted by a number of factors that are thought to affect the productivity of the specific project, such as the experience of the assigned staff, the available tools, the requirements for reliability, and the complexity of the interaction with the customer. Thus the estimating equation assumes the log-linear form:

$$\text{effort} \approx \text{size}^{\beta} \times a_i a_j a_k a_l a_m \cdots a_z,$$

where the a's are the coefficients for the adjustment factors. Unfortunately, these adjustment factors are not treated as variables in a regression equation; rather, each has a set of fixed coefficients (termed "weighting factors") associated with each level of the variable. These are independently applied as if the variables were uncorrelated (an assumption known to be incorrect). These weighting schemes have been developed based on intuition about each variable's potential impact rather than on a statistical model fitting using historical data. Thus, although the relationship between effort and size is often recalibrated for different organizations, the weighting factors are not.
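The structure of such an estimating equation can be made concrete with a short sketch. Everything numeric here is made up for illustration: the `scale` constant, the exponent, and the two multipliers are hypothetical and are not taken from COCOMO or from any calibrated model.

```python
import math

def adjusted_effort(size, scale, beta, multipliers):
    """Nominal effort scale * size**beta, adjusted by fixed weighting factors."""
    return scale * size ** beta * math.prod(multipliers)

def log_adjusted_effort(size, scale, beta, multipliers):
    """The same model on the log scale, where it is additive (log-linear)."""
    return (math.log(scale) + beta * math.log(size)
            + sum(math.log(a) for a in multipliers))

if __name__ == "__main__":
    # Hypothetical weights: inexperienced staff (1.5), good tools (0.8).
    effort = adjusted_effort(100, scale=2.0, beta=1.0, multipliers=[1.5, 0.8])
    print(effort)
    print(math.exp(log_adjusted_effort(100, 2.0, 1.0, [1.5, 0.8])))
```

Because the multipliers are simply multiplied together, the model has no way to represent correlation or interaction between factors, which is the weakness the text describes.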

Exacerbating the problems with existing cost estimation models is the lack of rigorous validation of the equations. For instance, Boehm (1981) has acknowledged that his well-known COCOMO estimating model was not developed using statistical methods. Many individuals marketing cost estimation modeling tools denigrate the value of statistical approaches compared to clever intuition. To the extent that analytical methods are used in the development or validation of these models, they are often performed on data sets that contain as many predictor variables (productivity factors) as projects. Thus determination of the separate or individual contributions of the variables almost certainly depends too much on chance and can be distorted by collinear relationships. These models are rarely subjected to independent validation studies. Further, little research has been done that attempts to restrict these models to including only those productivity factors that really matter (i.e., subset selection).

Because of the lack of statistical rigor in most cost estimation models, software development organizations usually handcraft weighting schemes to fit their historical results. Thus, the specific instantiation of most cost estimation models differs across organizations. Under these conditions, cross-validation of the weighting schemes is very difficult, if not impossible. A new approach to developing cost estimation models would be beneficial, one that invokes sound statistical principles in fitting such equations to historical data and to validating their applicability across organizations. If the instantiation of such models is found to be domain-specific, statistically valid methods should be sought for regenerating accurate models in different domains.

Process Volatility

In immature software development organizations, the processes used differ across projects because they are based on the experiences and preferences of the individuals assigned to each project, rather than on common organizational practice. Thus, in such organizations cost estimation models must attempt to predict the results of a process that varies widely across projects. In poorly run projects the signal-to-noise ratio is low, in that there is little consistent practice that can be used as the basis for dependable prediction. In such projects, neither the size nor the productivity factors provide any consistent insight into the resources required, since they are not systematically related to the processes that will be used.

The historical data collected from projects in immature software development organizations are difficult to interpret because they reflect widely divergent practices. Such data sets do not provide an adequate basis for validation, since process variation can mask underlying relationships. In fact, because the relationships among independent variables may change with variations in the process, different projects may require different values of the parameters in the cost estimation models. As organizations mature and stabilize their processes, the accuracy of the estimating models they use usually increases.

Maturity and Data Granularity

In mature organizations the software development process is well defined and is applied consistently across projects. The more carefully defined the process, the finer the granularity of the processes that can be measured. Thus, as software organizations mature, the entire basis for their cost estimation models can change. Immature organizations have data only at the level of overall project size, number of person-years required, and overall cost. With increasing organizational maturity, it becomes possible to obtain data on process details such as how many reviews must be conducted at each life cycle stage based on the size of the system, how many test cases must be run, and how many defects must be fixed based on the defect removal efficiency of each stage of the verification process. Thus, estimation in fully developed organizations can be based on a bottom-up analysis in which the historical data can be more accurate because the objects of estimation, and the effort they require, are more easily characterized.

As organizations mature, the structure of relevant cost estimation models can change. When process models are not defined in detail, models must take the form of regression equations based on variables that describe the total impact of a predictor variable on a project's development cycle. There is little notion in these models of the detailed practices that make up the totality. In mature organizations such practices are defined and can be analyzed individually and built up into a total estimate. Normally the errors in estimating these smaller components are smaller than the corresponding error at the total project level, and it is assumed that the summary effect of aggregating these smaller errors is still smaller than the error in the estimate at the total project level.
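Whether aggregated component errors stay small depends on how the errors are related. The sketch below contrasts two textbook cases under stated assumptions: independent component errors, whose variances add, and perfectly correlated errors, whose standard deviations add. The function names and the numbers are hypothetical.

```python
import math

def total_sd_independent(component_sds):
    """Standard deviation of a sum of independent estimation errors."""
    return math.sqrt(sum(sd * sd for sd in component_sds))

def total_sd_fully_correlated(component_sds):
    """Standard deviation of the sum when all errors move together."""
    return sum(component_sds)

if __name__ == "__main__":
    # Made-up project: 10 components, each 100 hours with a 30-hour sd.
    sds = [30.0] * 10
    total_hours = 100.0 * 10
    print(total_sd_independent(sds) / total_hours)       # relative error
    print(total_sd_fully_correlated(sds) / total_hours)  # relative error
```

With independent errors the relative error of the total falls from 30% per component to under 10%; with fully correlated errors (for example, a shared tendency to underestimate) it stays at 30%, so the assumption mentioned in the text is not automatic.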

Reliability of Model Inputs

Even if a cost estimation model is statistically sound, the data on which it is based can have low validity. Often, managers do not have sufficient knowledge of crucial variables that must be entered into a model, such as the estimated size of various individual components of a system. In such instances, processes exist for increasing the accuracy of these data. For instance, Delphi techniques can be used by software engineers who have previous experience in developing various system components. The less experience an organization has with a particular component of a system, the less reliable is the size estimate for that component. Typically, component sizes are underestimated, with ruinous effects on the resources and schedule estimated for a project. Sometimes historical "fudge factors" are applied to account for underestimation, although a more rigorous data-based approach is recommended. To aid in identifying the potential risks in a software development project, it would also be beneficial to have reliable confidence bounds for different components of the estimated size or effort.

Statistical methods can be applied to develop prior probabilities (e.g., for Bayesian estimation models) from knowledgeable software engineers and to adjust these using historical data. These methods should be used not only to suggest the confidence that can be placed in an estimate, but also to indicate the components within a system that contribute most to inaccuracies in an estimate.

As projects progress during their life cycle from specifications of requirements to design to generation of code, the information on which estimates can be based grows more reliable: there is thus greater certainty in estimating from the architectural design of a system or the detailed design of each module than in estimating from textual statements. In short, the sources from which estimates can be developed change as the project continues through its development cycle. Each succeeding level of input is a more reliable indicator of the ultimate system size than are the inputs available in earlier stages of development. Thus the overall estimate of size, resources, and schedule potentially becomes more accurate in succeeding phases of a project. Yet it is important to determine the most accurate indicators of crucial parameters such as size, effort, and schedule very early in a project, when the least reliable data are available. As such, there is a need for statistically valid ways of developing model inputs from less reliable forms of data (these inputs must reliably estimate later measures that will be more valid inputs) and of estimating how much error is introduced into an estimate based on the reliability of the inputs.

Managing to Estimates

Complicating the ability to validate cost estimation models from historical data is the fact that project managers try to manage their projects to meet received estimates for cost, effort, schedule, and other such variables. Thus, an estimate affects the subsequent process, and historical data are made artificially more accurate by management decisions and other factors that are often masked in project data. For instance, projects whose required level of effort has been underestimated often survive on large amounts of unreported overtime put in by the development staff. Moreover, many managers are quite skilled at cutting functionality from a system in order to meet a delivery date. In the worst cases, engineers short-cut their ordinary engineering processes to meet an unrealistic schedule, usually with disastrous results. Techniques for modeling systems dynamics provide one way to characterize some of the interactions that occur between an estimate and the subsequent process that is generated by the estimate (Abdel-Hamid, 1991).

The validation of cost estimation models must be conducted with an understanding of such interactions between estimates and a project manager's decisions. Some of these dynamics may be usefully described by statistical models or by techniques developed in psychological decision theory (Kahneman et al., 1982). Thus, it may be possible to develop a statistical dynamic model (e.g., a multistage linear model) that characterizes the reliability of inputs to an estimate, the estimate itself, decisions made based on the estimate, the resulting performance of the project, measures that emerge later in the project, subsequent decision making based on these later measures, and the ultimate performance of the project. Such models would be valuable in helping project managers to understand the ramifications of decisions based on an initial estimate and also on subsequent periodic updates.

ASSESSMENT AND RELIABILITY

Reliability Growth Modeling

Many reliability models of varying degrees of plausibility are available to software engineers. These models are applied at either the testing stage or the field-monitoring stage. Most of the models take as input either failure time or failure count data and fit a stochastic process model to reflect reliability growth. The differences among the models lie principally in assumptions made based on the underlying stochastic process generating the data. A brief survey of some of the well-known models and their assumptions and efficacy is given in Abdel-Ghaly et al. (1986).

Although many software reliability growth models are described in the literature, the evidence suggests that they cannot be trusted to give accurate predictions in all cases and also that it is not possible to identify a priori which model (if any) will be trustworthy in a particular context. No doubt work will continue in refining these models and introducing "improved" ones. Although such work is of some interest, the panel does not believe that it merits extensive research by the statistical community, but thinks rather that statistical research could be directed more fruitfully to providing insight to the users of the models that currently exist.

The problem is validation of such models with respect to a particular data source, to allow users to decide which, if any, prediction scheme is producing accurate results for the actual software failure process under examination. Some work has been done on this problem (Abdel-Ghaly et al., 1986; Brocklehurst and Littlewood, 1992), using a combination of probability forecasting and sequential prediction, the so-called prequential approach developed by Dawid (1984), but this work has so far been rather informal. It would be helpful to have more procedures for assessing the accuracy of competing prediction systems that could then be used routinely by industrial software engineers without advanced statistical training.
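One common way to compare two prediction systems prequentially is through the ratio of the predictive densities each assigned, one step ahead, to the failure times that were later observed. The sketch below is a minimal illustration under strong assumptions: each system is taken to issue an exponential predictive distribution for the next inter-failure time, and the two toy predictors (all past data versus a short moving window) are invented for the example rather than drawn from the cited papers.

```python
import math

def exp_log_density(t, rate):
    """Log density of an exponential distribution with the given rate."""
    return math.log(rate) - rate * t

def log_prequential_likelihood_ratio(times, rates_a, rates_b):
    """Sum of log f_A(t_i) - log f_B(t_i) over the predicted observations.

    rates_a[i] and rates_b[i] must have been issued before times[i]
    was observed.
    """
    return sum(exp_log_density(t, ra) - exp_log_density(t, rb)
               for t, ra, rb in zip(times, rates_a, rates_b))

def one_step_rates(times, window=None):
    """Toy predictor: rate for times[i] from earlier observations only."""
    rates = []
    for i in range(1, len(times)):
        past = times[:i] if window is None else times[max(0, i - window):i]
        rates.append(len(past) / sum(past))
    return rates

if __name__ == "__main__":
    # Made-up inter-failure times showing reliability growth.
    times = [1.0, 2.0, 2.5, 4.0, 7.0, 9.0, 15.0, 22.0]
    recent = one_step_rates(times, window=2)
    overall = one_step_rates(times)
    print(log_prequential_likelihood_ratio(times[1:], recent, overall))
```

A positive value favors the first system on the data seen so far; the comparison uses only predictions made before each observation, which is what makes it a fair test of predictive accuracy.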

Statistical inference in the area of reliability tends almost invariably to be of a classical frequentist kind, even though many of the models originate from a subjective Bayesian probability viewpoint. This unsatisfactory state of affairs arises from the sheer difficulty of performing the computations necessary for a proper Bayesian analysis. It seems likely that there would be profit in trying to overcome these problems, perhaps via the Gibbs sampling approach (see, e.g., Smith and Roberts, 1993).

Another fruitful avenue for research concerns the introduction of explanatory variables, so-called covariates, into software reliability growth models. Most existing models assume that no explanatory variables are available. This assumption is assuredly simplistic concerning testing for all but small systems involving short development and life cycles. For large systems (i.e., those with more than 100,000 lines of code) there are variables, other than time, that are very relevant. For example, it is typically assumed that the number of faults (found and unfound) in a system under test remains stable, i.e., that the code remains frozen during testing. However, this is rarely the case for large systems, since aggressive delivery cycles force the final phases of development to overlap with the initial stages of system testing. Thus, the size of code and, consequently, the number of faults in a large system can vary widely during testing. If these changes in code size are not considered, the result, at best, is likely to be an increase in variability and a loss in predictive performance, and at worst, a poorly fitting model with unstable parameter estimates. Taking this logic one step further suggests the need to distinguish between new lines of code (new faults) and code coming from previous releases (old faults), and possibly the age of different parts of code. Of course, one can carry this logic to an extreme and have unwieldy models with many covariates. In practice, what is required is a compromise between the two extremes of having no covariates and having hundreds of them. This is where opportunities abound for applying state-of-the-art statistical modeling techniques. Described briefly below is a case study reported by Dalal and McIntosh (1994) dealing with reliability modeling when code is changing.

Example. Consider a new release of a large telecommunications system with approximately 7 million noncommentary source lines (NCSLs) and 400,000 noncommentary new or changed source lines (NCNCSLs). For a faster delivery cycle, the source code used for system test was updated every night throughout the test period. At the end of each of 198 calendar days in the test cycle, the number of faults found, NCNCSLs, and the staff time spent on testing were collected. Figure 4 (top) portrays growth of the system as a function of staff time. The data are provided in Table 3.

Figure 4. Plots of module size (NCNCSLs) versus staff time (days) for a large telecommunications software system (top). Observed and fitted cumulative faults versus staff time (bottom). The dotted line (barely visible) represents the fitted model, the solid line represents the observed data, and the dashed line (also difficult to see) is the extrapolation of the fitted model.

Table 3. Data on cumulative size (NCNCSLs), cumulative staff time (days), and cumulative faults for a large telecommunications system on 198 consecutive calendar days (with duplicate lines representing weekends or holidays).

Cum. Staff Days | Cum. Faults | Cum. NCNCSLs || Cum. Staff Days | Cum. Faults | Cum. NCNCSLs || Cum. Staff Days | Cum. Faults | Cum. NCNCSLs

      0 |    0 |       0 ||   334.8 |  231 |  261669 ||   776.5 |  612 |  318476
    4.8 |    0 |   16012 ||   342.7 |  243 |  262889 ||   793.5 |  621 |  320125
      6 |    0 |   16012 ||   350.5 |  252 |  263629 ||   807.2 |  636 |  321774
      6 |    0 |   16012 ||   356.3 |  259 |  264367 ||   811.8 |  639 |  321774
   14.3 |    7 |   32027 ||   360.6 |  271 |  265107 ||   812.5 |  639 |  321774
   22.8 |    7 |   48042 ||   365.7 |  277 |  265845 ||     829 |  648 |  323423
   32.1 |    7 |   58854 ||   365.7 |  277 |  265845 ||   844.4 |  658 |  325072
   41.4 |    7 |   69669 ||   365.7 |  277 |  265845 ||   860.5 |  666 |  326179
   51.2 |   11 |   80483 ||   374.9 |  282 |  266585 ||   876.7 |  674 |  327286
   51.2 |   11 |   80483 ||   386.5 |  290 |  267325 ||     892 |  679 |  328393
   51.2 |   11 |   80483 ||   396.5 |  300 |  268607 ||   895.5 |  686 |  328393
   60.6 |   12 |   91295 ||     408 |  310 |  269891 ||   895.5 |  686 |  328393
     70 |   13 |  102110 ||   417.3 |  312 |  271175 ||   910.8 |  690 |  329500
   79.9 |   15 |  112925 ||   417.3 |  312 |  271175 ||   925.1 |  701 |  330608
   91.3 |   20 |  120367 ||   417.3 |  312 |  271175 ||   938.3 |  710 |  330435
     97 |   21 |  127812 ||   424.9 |  321 |  272457 ||     952 |  720 |  330263
     97 |   21 |  127812 ||   434.2 |  326 |  273741 ||     965 |  729 |  330091
     97 |   21 |  127812 ||   442.7 |  339 |  275025 ||   967.7 |  729 |  330091
     97 |   21 |  127812 ||   451.4 |  346 |  276556 ||   968.6 |  731 |  330091
  107.7 |   22 |  135257 ||   456.1 |  347 |  278087 ||   981.3 |  740 |  329919
  119.1 |   28 |  142702 ||   456.1 |  347 |  278087 ||     997 |  749 |  329747
  127.6 |   40 |  150147 ||   456.1 |  347 |  278087 ||  1013.9 |  759 |  330036
  135.1 |   44 |  152806 ||   460.8 |  351 |  279618 ||  1030.1 |  776 |  330326
  135.1 |   44 |  152806 ||     466 |  356 |  281149 ||    1044 |  781 |  330616
  135.1 |   44 |  152806 ||   472.3 |  359 |  283592 ||    1047 |  782 |  330616
  142.8 |   46 |  155464 ||   476.4 |  362 |  286036 ||    1047 |  782 |  330616
  148.9 |   48 |  158123 ||   480.9 |  367 |  288480 ||  1059.7 |  783 |  330906
  156.6 |   52 |  160781 ||   480.9 |  367 |  288480 ||  1072.6 |  787 |  331196
  163.9 |   52 |  167704 ||   480.9 |  367 |  288480 ||  1085.7 |  793 |  331486
  169.7 |   59 |  174626 ||   486.8 |  374 |  290923 ||  1098.4 |  796 |  331577
  170.1 |   59 |  174626 ||   495.8 |  376 |  293367 ||  1112.4 |  797 |  331669
  170.6 |   59 |  174626 ||   505.7 |  380 |  295811 ||  1113.5 |  798 |  331669
  174.7 |   63 |  181548 ||     516 |  392 |  298254 ||  1114.1 |  798 |  331669
  179.6 |   68 |  188473 ||   526.2 |  399 |  300698 ||    1128 |  802 |  331760
  185.5 |   71 |  194626 ||   527.3 |  401 |  300698 ||  1139.1 |  805 |  331852
    194 |   88 |  200782 ||   527.3 |  401 |  300698 ||  1151.4 |  811 |  331944
  200.3 |   93 |  206937 ||   535.8 |  405 |  303142 ||  1163.2 |  823 |  332167
  200.3 |   93 |  206937 ||   546.3 |  415 |  304063 ||  1174.3 |  827 |  332391
  200.3 |   93 |  206937 ||   556.1 |  425 |  305009 ||  1174.3 |  827 |  332391
  207.2 |   97 |  213093 ||   568.1 |  440 |  305956 ||  1174.3 |  827 |  332391
  211.9 |   98 |  219248 ||   577.2 |  457 |  306902 ||  1184.6 |  832 |  332615
    217 |  105 |  221355 ||   578.3 |  457 |  306902 ||  1198.3 |  834 |  332839
  223.5 |  113 |  223462 ||   578.3 |  457 |  306902 ||  1210.3 |  836 |  333053
    227 |  113 |  225568 ||   587.2 |  467 |  307849 ||  1221.1 |  839 |  333267
    227 |  113 |  225568 ||   595.5 |  473 |  308795 ||  1230.5 |  842 |  333481
    227 |  113 |  225568 ||   605.6 |  480 |  309742 ||  1231.6 |  842 |  333481
  234.1 |  122 |  227675 ||   613.9 |  491 |  310688 ||  1231.6 |  842 |  333481
  241.6 |  129 |  229784 ||   621.6 |  496 |  311635 ||  1240.9 |  844 |  333695
  250.7 |  141 |  233557 ||   621.6 |  496 |  311635 ||  1249.5 |  845 |  333909
  259.8 |  155 |  237330 ||   621.6 |  496 |  311635 ||  1262.2 |  849 |  335920
  268.3 |  166 |  241103 ||   623.4 |  496 |  311635 ||  1271.3 |  851 |  337932
  268.3 |  166 |  241103 ||   636.3 |  502 |  311750 ||  1279.8 |  854 |  339943
  268.3 |  166 |  241103 ||   649.7 |  517 |  311866 ||    1281 |  854 |  339943
  277.2 |  178 |  244879 ||   663.9 |  527 |  312467 ||    1281 |  854 |  339943
  285.5 |  186 |  247946 ||   675.1 |  540 |  313069 ||  1287.4 |  855 |  341955
  294.2 |  190 |  251016 ||   677.4 |  543 |  313069 ||  1295.1 |  859 |  341967
  295.7 |  190 |  251016 ||   677.9 |  544 |  313069 ||  1304.8 |  860 |  341979
    298 |  190 |  254086 ||   688.4 |  553 |  313671 ||  1305.8 |  865 |  342073
    298 |  190 |  254086 ||   698.1 |  561 |  314273 ||  1313.3 |  867 |  342168
    298 |  190 |  254086 ||   710.5 |  573 |  314783 ||  1314.4 |  867 |  342168
  305.2 |  195 |  257155 ||   720.9 |  581 |  315294 ||  1314.4 |  867 |  342168
  312.3 |  201 |  260225 ||   731.6 |  584 |  315805 ||    1320 |  867 |  342262
  318.2 |  209 |  260705 ||   732.7 |  585 |  315805 ||  1325.3 |  867 |  342357
  328.9 |  224 |  261188 ||   733.6 |  585 |  315805 ||  1330.6 |  870 |  342357
  334.8 |  231 |  261669 ||   746.7 |  586 |  316316 ||  1334.2 |  870 |  342358
  334.8 |  231 |  261669 ||     761 |  598 |  316827 ||  1336.7 |  870 |  342358

SOURCE: Dalal and McIntosh (1994).

Assume that the testing process is observed at times $t_i$, $i = 0, \ldots, h$, and that at any given time, the amount of time it takes to find a specific "bug" is exponential with rate $\mu$. At time $t_i$, the total number of faults remaining in the system is Poisson with mean $\lambda_{i+1}$, and NCNCSL is increased by an amount $C_i$. This change adds a Poisson number of faults with mean proportional to $C_i$, say $\theta C_i$. These assumptions lead to the mass balance equation, namely, that the expected number of faults in the system at $t_i$ (after possible modification) is the expected number of faults in the system at $t_{i-1}$ adjusted by the expected number found in the interval $(t_{i-1}, t_i)$ plus the faults introduced by the changes made at $t_i$:

$$\lambda_{i+1} = \lambda_i e^{-\mu (t_i - t_{i-1})} + \theta C_i,$$

for $i = 1, \ldots, h$. Note that $\theta$ represents the number of new faults entering the system per additional NCNCSL, and $\lambda_1$ represents the number of faults in the code at the start of system test. Both of these parameters make it possible to differentiate between the new code added in the current release and the older code. For the data at hand, the estimated parameters are $\theta = 0.025$, $\mu = 0.002$, and $\lambda_1 = 41$. The fitted and the observed data are plotted against staff time in Figure 4 (bottom). The fit is evidently very good. Of course, assessing the model on independent or new data is required for proper validation.
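The mass balance recursion is easy to iterate numerically. The following is a minimal sketch, not the authors' fitting code: the function name `expected_faults` and the observation times and code increments in the example call are hypothetical, and only the three parameter values are taken from the text.

```python
import math

def expected_faults(lam1, mu, theta, times, added_code):
    """Iterate lambda_{i+1} = lambda_i * exp(-mu * (t_i - t_{i-1})) + theta * C_i.

    times      -- observation times t_0, ..., t_h (e.g., cumulative staff days)
    added_code -- C_1, ..., C_h, the NCNCSL added at t_1, ..., t_h
    Returns the list [lambda_1, ..., lambda_{h+1}].
    """
    if len(added_code) != len(times) - 1:
        raise ValueError("need one C_i for each interval (t_{i-1}, t_i)")
    lam = [lam1]
    for i in range(1, len(times)):
        # Faults that survive testing during the interval (t_{i-1}, t_i).
        surviving = lam[-1] * math.exp(-mu * (times[i] - times[i - 1]))
        # Add the faults introduced by the code change made at t_i.
        lam.append(surviving + theta * added_code[i - 1])
    return lam

# Parameter estimates quoted in the text; the times and code increments
# are made up purely for illustration.
trajectory = expected_faults(lam1=41, mu=0.002, theta=0.025,
                             times=[0, 10, 20], added_code=[1000, 0])
print([round(x, 2) for x in trajectory])
```

With no code added in an interval, the expected number of remaining faults can only decay; any increase in the sequence comes from the $\theta C_i$ term.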

The efficacy of creating a statistical model is now examined. The estimate of $\theta$ is highly significant, both statistically and practically, showing the need for incorporating changes in NCNCSLs as a covariate. Its numerical value implies that for every additional 10,000 NCNCSLs added to the system, 25 faults are being added as well. For these data, the predicted number of faults at the end of the test period is Poisson distributed with mean 145. Dividing this quantity by the total NCNCSLs gives 4.2 per 10,000 NCNCSLs as an estimated field fault density. These estimates of the incoming and outgoing quality are very valuable in judging the efficacy of system testing and for deciding where resources should be allocated to improve the quality. Here, for example, system testing was effective in that it removed 21 of every 25 faults. However, it raises another issue: 25 faults per 10,000 NCNCSLs entering system test may be too high, and a plan ought to be considered to improve the incoming quality.
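The arithmetic behind these figures can be checked in a few lines. This sketch assumes that the "total NCNCSLs" used as the divisor is the final cumulative NCNCSL value in the data table (342,358); the text does not state the divisor explicitly, and the variable names are illustrative.

```python
predicted_remaining_faults = 145   # Poisson mean at the end of test (from the text)
total_ncncsl = 342_358             # assumed divisor: last NCNCSL value in the data table
incoming_per_10k = 25              # faults entering per 10,000 NCNCSLs (from the text)

# Estimated field fault density, expressed per 10,000 NCNCSLs.
field_density_per_10k = predicted_remaining_faults / total_ncncsl * 10_000

# Faults removed by system test for every 10,000 NCNCSLs that entered it.
removed_per_10k = incoming_per_10k - field_density_per_10k

print(f"field fault density: {field_density_per_10k:.1f} per 10,000 NCNCSLs")
print(f"removed in system test: about {removed_per_10k:.0f} of every {incoming_per_10k}")
```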

None of the above conclusions could have been made without using a statistical model. These conclusions are valuable for controlling and improving the reliability testing process. Further, for this analysis it was essential to have a covariate other than time.

Influence of the Development Process on Software Dependability

As noted above, surprisingly little use has been made of explanatory variable models, such as proportional hazards regression, in the modeling of software dependability. A major reason, the panel believes, is the difficulty that software engineers have in identifying variables that can play a genuinely explanatory role. Another difficulty is the comparative paucity of data owing to the difficulties of replication. Thus, for example, for purposes of identifying those attributes of the software development process that are drivers of the final product's dependability, it is very difficult to obtain something akin to a "random sample" of "similar" subject programs. Those issues are not unlike the ones faced in other contexts where these techniques are used, for example, in medical trials, but they seem particularly acute for evaluation of software dependability.

A further problem is that the observable in this software development application is a realization of a stochastic process, and not merely of a lifetime random variable. Thus there seems to be an opportunity for research into models that, on the one hand, capture current understanding of the nature of the growth in reliability that takes place as a result of debugging and, on the other hand, allow input about the nature of the development process or the architecture of the product.

Influence of the Operational Environment on Software Dependability

It can be misleading to talk of the reliability of a program: as is the case for the reliability of hardware, the reliability of a program depends on the nature of its use. For software, however, one does not have the simple notions of stress that are sometimes plausible in the hardware context. It is thus not possible to infer the reliability of a program in one environment from evidence of the program's failure behavior in another. This is a serious difficulty for several reasons.

First, one would like to be able to predict the operational reliability of a program from test data. The simplest approach at present is to ensure that the test environment, that is, the type of usage, is exactly similar to, or differs in known proportions for specified strata from, the operational environment. Real software testing regimes are often deliberately made to be different from operational ones, since it is claimed that in this way reliability can be achieved more efficiently: this argument is similar to that for hardware stress testing but is much less convincing in the software context.

A further reason to be interested in this problem of inferring program reliability is that most software gets broadly distributed to diverse locations and is used very differently by different users: there is great disparity in the population of user environments. Vendors would like to be able to predict different users' perceptions of a product's reliability, but it is clearly impractical to replicate in a test every different possible operational environment. Vendors would also like to be able to predict the characteristics of a population of users. Thus it might be expected that a less disparate population of users would be preferable to a more disparate one: in the former case, for example, problems reported at different sites might be similar and thus be less expensive to fix.

Explanatory variable modeling may play a useful role if suitably informative, measurable attributes of operational usage can be identified. There may be other ways of forming stochastic characterizations of operational environments. Markov models of the successive activation of modules, or of functions, have been proposed (Littlewood, 1979; Siegrist, 1988a,b) but have not been widely used. Further work on such approaches, and on the problems of statistical inference associated with them, could be promising.

Safety-Critical Software and the Problem of Assuring Ultrahigh Dependability

It seems clear that computers will play increasingly critical roles in systems upon which human lives depend. Already, systems are being built that require extremely high dependability: a figure of $10^{-9}$ probability of failure per hour of flight has been stated as the requirement for recent fly-by-wire systems in civil aircraft. There are clear limitations to the levels of dependability that can be achieved when we are building systems of a complexity that precludes claims that they are free of design faults. More importantly, even if we were able to build a system to meet a requirement for ultrahigh dependability, we could have only low confidence that we had achieved that goal, because the problem of assessing these levels is such that it would be impractical to acquire sufficient supporting evidence (Littlewood and Strigini, 1993).
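A rough calculation suggests why the supporting evidence is impractical to acquire by testing alone. The sketch below rests on a simplifying assumption that is not in the text: failures occur at a constant rate, so the probability of seeing no failure in $t$ hours is $e^{-\lambda t}$. Under that assumption, ruling out a rate above the target with a given confidence requires $t \ge -\ln(1 - \text{confidence}) / \lambda$ failure-free hours. The function name is illustrative.

```python
import math

def failure_free_hours_needed(target_rate, confidence):
    """Failure-free operating hours needed so that a true rate above
    target_rate would have produced at least one failure with the
    given probability (constant-rate, exponential assumption)."""
    return -math.log(1.0 - confidence) / target_rate

hours = failure_free_hours_needed(target_rate=1e-9, confidence=0.95)
years = hours / (24 * 365.25)
print(f"{hours:.2e} failure-free hours, i.e. roughly {years:,.0f} years on one system")
```

Even with many systems run in parallel, the required exposure is far beyond any feasible test campaign, which is the point made by Littlewood and Strigini (1993).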

Although a complete solution to the problem of assessing ultrahigh dependability is not anticipated, there is certainly room for improving on what can be done currently. Probabilistic and statistical problems abound in this area, and it is necessary to squeeze as much as possible from relatively small amounts of often disparate evidence. The following are some of the areas that could benefit from investigation.

Design Diversity, Fault Tolerance, and General Issues of Dependence

One promising approach to the problem of achieving high dependability (here reliability and/or safety) is design diversity: building two or more versions of the required program and allowing an adjudication mechanism (e.g., a voter) to operate at run-time. Although such systems have been built and are in operation in safety-critical contexts, there is little theoretical understanding of their behavior in operation. In particular, the reliability and safety models are quite poor.

For example, there is ample evidence (Knight and Leveson, 1986) that, in the presence of design faults, one cannot simply assume that different versions will fail independently of one another. Thus the simple hardware reliability models that involve mere redundancy, and assume independence of component failures, cannot be used. It is only quite recently that probability modeling has started to address this problem seriously (Eckhardt and Lee, 1985; Littlewood and Miller, 1989). These models provide a formal conceptual framework within which it is possible to reason about the subtle issues of conditional independence involved in the failure processes of design-diverse systems. However, they provide little quantitative practical assistance to a software designer or evaluator.
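The conditional-independence idea can be illustrated with a toy calculation in the spirit of the Eckhardt and Lee (1985) model. Suppose each input class has a "difficulty", the probability that an independently developed version fails on it. Versions then fail independently given the input, but not unconditionally, because hard inputs are hard for every version. The usage profile and difficulties below are hypothetical numbers chosen only to show the effect.

```python
# Hypothetical usage profile over three input classes and, for each class,
# the probability that a randomly chosen version fails on an input from it.
usage = [0.90, 0.09, 0.01]
difficulty = [0.0001, 0.001, 0.05]

# Probability that a single version fails on a random input: E[d(X)].
p_single = sum(u * d for u, d in zip(usage, difficulty))

# Probability that two independently developed versions both fail on the
# same random input: E[d(X)^2], since failures are independent given X.
p_both = sum(u * d * d for u, d in zip(usage, difficulty))

# What a naive independence assumption would predict: (E[d(X)])^2.
p_both_if_independent = p_single ** 2

print(f"single version:          {p_single:.2e}")
print(f"both fail (model):       {p_both:.2e}")
print(f"both fail (independent): {p_both_if_independent:.2e}")
```

Because $E[d^2] \ge (E[d])^2$, with equality only when difficulty is constant across inputs, the naive independence calculation is optimistic whenever some inputs are harder than others.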

Further probabilistic modeling is needed to elucidate some of the complex issues. For example, little attention has been paid to modeling the full fault-tolerant system, involving diversity and adjudication. In particular, the properties of the stochastic process of failures of such systems are not understood. If, as seems likely, individual versions of a program in a real-time control system exhibit clusters of failures in time, how does the cluster process of the system relate to the cluster processes of the individual versions? Although such issues seem narrowly technical, they are vitally important in the design of real systems, whose physical integrity may be sufficient to survive one or two failed input cycles, but not many.

Another area that has had little work is probabilistic modeling of different possible adjudication mechanisms and their failure processes.

Judgment and Decision-making Framework

Although probability seems to be the most appropriate mechanism for representing uncertainty about system dependability, other candidates such as Shafer-Dempster and possibility theories might be plausible alternatives in safety-critical contexts where quantitative measures are required in the absence of data, for example, when one is forced to rely on the engineering judgment of an expert. Further work is needed to elucidate the relative advantages and disadvantages of the different approaches applicable in the software engineering domain.

There is evidence that human judgment, even in "hard" sciences such as physics, can be seriously in error (Henrion and Fischhoff, 1986): people seem to make consistent errors and tend to be optimistic in their own judgment regarding their likely error. It is likely that software engineering judgments are similarly fallible, and so this area calls for some statistical experimentation. In addition, it would be beneficial to have formal mechanisms for assessing whether judgments are well calibrated and for recalibrating judgment and prediction schemes (of humans or models) that have been shown to be inaccurate. This problem has some similarity to the problems of validating software reliability models, already mentioned, in which prequential likelihood plays a vital role. It also bears on more general applications of Bayesian modeling where elicitation of a priori probability values is required.

It seems inevitable that reasoning and judgment about the fitness of safety-critical systems will depend on evidence that is disparate in nature. Such evidence could include data on failures, as in reliability growth models; human expert judgment; results regarding the efficacy of development processes; information about the architecture of a system; or evidence from formal verification. If the required judgment depends on a numerical assessment of a system's dependability, there are clearly important issues concerning the composition of very different kinds of evidence from different sources. These issues may, indeed, be overriding when it comes to choosing among the different ways of representing uncertainty. Bayes' theorem, for example, may provide an easier way than does possibility theory to combine information from different sources of uncertainty.

A particularly important problem concerns the way in which deterministic reasoning can be incorporated into the final assessment of a system. Formal methods of achieving dependability are becoming increasingly important. Such methods range from formal notations, which assist in the elicitation and expression of requirements, to full mathematical verification of the correspondence between a formal specification and an implementation. One view is that these approaches incorporating deterministic reasoning to system development remove a particular type of uncertainty, leaving others untouched (uncertainty about the completeness of a formal specification, the possibility of incorrect proof, and so on). One should factor into the final assessment of a system's dependability the contribution from such deterministic, logical evidence, nevertheless keeping in mind that there is an irreducible uncertainty in one's possible knowledge of the failure behavior of a system.

Structural Modeling Issues

Concerns about the safety and reliability of software-based systems necessarily arise from their inherent complexity and novelty. Systems now being built are so complex that they cannot be guaranteed to be free from design faults. The extent to which confidence can be carried over from the building of previous systems is much more limited in software engineering than in "real" engineering, because software-based systems tend to be characterized by a great deal of novelty.

Designers need help in making decisions throughout the design process, especially at the very highest level. Real systems are often difficult to assess because of early decisions regarding how much system control will depend on computers, hardware, and humans. For the Airbus A320, for example, the early decision to place a high level of trust in the computerized fly-by-wire system meant that this system (and thus its software) needed to have a better than $10^{-9}$ probability of failure in a typical flight. Stochastic modeling might aid in such high-level design decisions so that designers can make "what if" calculations at an early stage.

Experimentation, Data Collection, and General Statistical Techniques

A dearth of data has been a problem in much of safety-critical software engineering since its inception. Only a handful of published data sets exists even for the software reliability growth problem, which is by far the most extensively developed aspect of software dependability assessment. When the lack of data arises from the need for confidentiality (industrial companies are often reluctant to allow access to data on software failures because of the possibility that people may think less highly of their products), little can be done beyond making efforts to resolve confidentiality problems. However, in some cases the available data are sparse because there is no statistical expertise on hand to advise on ways in which data can be collected cost-effectively. It may be worthwhile to attempt to produce general guidelines for data collection that address the specific difficulties of the software engineering problem domain.

With notable exceptions (Eckhardt et al., 1991; Knight and Leveson, 1986), experimentation has so far played a low-key role in software engineering research. Somewhat surprisingly, in view of its difficulty and cost, the most extensive experimentation has investigated the efficacy of design diversity. Other areas where experimental approaches seem feasible and should be encouraged include the obvious and general question of which software development methods are most cost-effective in producing software products with desirable attributes such as dependability. Statistical advice on the design of such experiments would be essential; it might also be the case that innovation in the design of experiments could make feasible some investigations that currently seem too expensive to contemplate: the main problem arises from the need for replication over many software products.

On the other hand, areas where experiments can be conducted without the replication problem being overwhelming involve the investigation of quite restricted hypotheses about the effectiveness of specific techniques. For example, experimentation could address whether the techniques that are claimed to be effective for achieving reliability (i.e., effectiveness of debugging) are significantly better than those, such as operational testing, that will allow reliability to be measured.

SOFTWARE MEASUREMENT AND METRICS

Measurement is at the foundation of science and engineering. An important goal shared by software engineers and statisticians is to derive reliable, reproducible, and accurate measures of software products and processes. Measurements are important for assessing the effects of proposed "improvements" in software production, whether they be technological or process oriented. Measurements serve an equally important role in scheduling, planning, resource allocation, and cost estimation (see the first section in this chapter).

Early pioneering work by McCabe (1976) and Halstead (1977) seeded the field of software metrics; an overview is provided by Zuse (1991). Much of the attention in this area has focused on static measurements of code. Less attention has been paid to dynamic measurements of software (e.g., measuring the connectivity of software modules under operating conditions) and aspects of the software production process such as software reuse, especially in systems employing object-oriented languages.

The most widely used code metric, the NCSL (noncommentary source line), is often used as a surrogate for functionality. Surprisingly, since software is now nearly 50 years old, standards for counting NCSLs remain elusive in practice. For example, should a single, two-line statement in C language count as one NCSL or two?

Counts of tokens (operators or operands), delimiters, and branching statements are used as other static metrics. Although some of these are clearly measures of software size, others purport to measure more subtle notions of software complexity and structure. It has been observed that all such metrics are highly correlated with size. At the panel's information-gathering forum, Munson (1993) concluded that current software metrics capture approximately three "independent" features of a software module: program control, program size, and data structure. A statistical (principal-components) analysis of 13 metrics on HAL programs in the space shuttle program was the key to this finding. While one might argue that performing a common statistical decomposition of multivariate data is hardly novel, it most certainly is in software engineering. The important implication of that finding is that there are features of software that are not being captured by the existing battery of software metrics (e.g., cohesion and coupling) and if these are key differentiators of potentially high- and low-fault programs, there is no way that an analysis of the available metrics will highlight this condition. On the other side of the ledger, the statistical costs of including "noisy" versions of the same (latent) variable in models and analysis methods that are based on these metrics, such as cost estimation, seem not to have been appreciated. Subset selection methods (e.g., Mallows, 1973) provide one way to assess variable redundancy and the effect on fitted models, but other approaches that use judgment composites, or composites based on other bodies of data (Tukey, 1991), will often be more effective than discarding metrics.
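The redundancy among size-related metrics can be seen in a small principal-components calculation. The sketch below uses synthetic data, not the space shuttle data discussed above: three hypothetical metrics are generated as noisy multiples of module size and a fourth is generated independently. It then finds the share of total variance carried by the first principal component of the correlation matrix, using plain power iteration so that only the standard library is needed.

```python
import math
import random
import statistics

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length metric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean, sd = statistics.mean(col), statistics.pstdev(col)
        standardized.append([(x - mean) / sd for x in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardized]
            for zi in standardized]

def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix, found by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(m * x for m, x in zip(row, v)) for row in matrix]
    return sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient

# Synthetic modules: three metrics driven by size, one unrelated metric.
random.seed(0)
size = [random.uniform(50, 1000) for _ in range(200)]
tokens = [6 * s + random.gauss(0, 300) for s in size]
branches = [0.15 * s + random.gauss(0, 15) for s in size]
data_items = [random.gauss(20, 5) for _ in size]

corr = correlation_matrix([size, tokens, branches, data_items])
# The eigenvalues of a correlation matrix sum to the number of metrics.
share = leading_eigenvalue(corr) / len(corr)
print(f"first principal component explains {share:.0%} of the variance")
```

When one component carries most of the variance of several metrics, those metrics are largely noisy versions of the same latent variable, which is the situation the text warns about.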

Metrics typically involve processes or products, are subjective or objective, and involve different types of measurement scales, for example, nominal, ordinal, interval, or ratio. An objective metric is a measurement taken on a product or process, usually on an interval or ratio scale. Some examples include the number of lines of code, development time, number of software faults, or number of changes. A subjective metric may involve a classification or qualification based on experience. Examples include the quality of use of a method or the experience of the programmers in the application or process.

One standard for software measurement is the Basili and Weiss (1984) Goal/Question/Metric paradigm, which has five parameters:

1. An object of the study: a process, product, or any other experience model;

2. A focus: what information is of interest;

3. A point of view: the perspective of the person needing the information;

4. A purpose: how the information will be used; and

5. A determination of what measurements will provide the information that is needed.

The results are studied relative to a particular environment.

5 Statistical Challenges

In comparison with other engineering disciplines, software engineering is still in the definition stage. Characteristics of established disciplines include having defined, time-tested, credible methodologies for disciplinary practice, assessment, and predictability. Software engineering combines application domain knowledge, computer science, statistics, behavioral science, and human factors issues. Statistical research and education challenges in software engineering involve the following:

Generalizing particular experimental results to other settings and projects,

Scaling up results obtained in academic studies to industrial settings,

Combining information across software engineering projects and studies,

Adopting exploratory data analysis and visualization techniques,

Educating the software engineering community as to statistical approaches and data issues,

Developing analysis methods to cope with qualitative variables,

Providing models with the appropriate error distributions for software engineering applications, and

Improving accelerated life testing.

The following sections elaborate on certain of these challenges.

SOFTWARE ENGINEERING EXPERIMENTAL ISSUES

Software engineering is an evolutionary and experimental discipline. As argued forcefully by Basili (1993), it is a laboratory or experimental science. The term "experimental science" has different meanings for engineers and statisticians. For engineers, software is experimental because systems are built, studied, and evaluated based on theory. Each system investigates new ideas and advances the state of the art. For statisticians, the purpose of experiments is to gather statistically valid evidence about the effects of some factor, perhaps involving the process, methodology, or code in a system.

There are three classes of experiments in software engineering:

Case studies,

Academic experiments, and

Industrial experiments.

Case studies are perhaps the most common and involve an "experiment" on a single large-scale project. Academic experiments usually involve a small-scale experiment, often on a program or methodology, typically using students as the experimental subjects. Industrial experiments fall somewhere between case studies and academic experiments. Because of the expense and difficulty of performing extensive controlled experiments on software, case studies are often resorted to. The ideal situation is to be able to take advantage of real-world industrial operations while having as much control as is feasible. Much of the present work in this area is at best anecdotal and would benefit greatly from more rigorous statistical advice and control. The panel foresees an opportunity for innovative work on combining information (see below) from relatively disparate experiences.

Conducting statistically valid software experiments is challenging for several reasons:

The software production process is often chaotic and uncontrolled (i.e., immature);

Human variability is a complicating factor; and

Industrial experiments are very costly and therefore must produce something useful.

Many variables in the software production process are not well understood and are difficult to control for. For software engineering experiments, the factors of interest include the following:

"People" factors: number, level, organization, process experience;

Problem factors: application domain, constraints, susceptibility to change;

Process factors: life cycle model, methods, tools, programming language;

Product factors: deliverables, system size, system reliability, portability; and

Resource factors: target and development machines, calendar time, budget, existing software, and so on.

Each of these characteristics must be modeled or controlled for if the experiment is to be valid.

Human variability is particularly challenging, given that the difference in quality and productivity between the best and worst programmers may be 20 to 1. For example, in an experiment comparing batch versus interactive computing, Sackman (1970) observed differences in ability of up to 28 to 1 in programmers performing the same task. This variation can overwhelm the effects of a change in methodology that may account for a 10% to 15% difference in quality or productivity.

The human factor is so strongly integrated with every aspect of the subjective discipline of software engineering that it alone is the prime driver of issues to be addressed. The human factor creates issues in the process, the product, and the user environment. Measurements of the objects (the product and the process) are obscured when qualified by the attributes (ambiguous requirements and productivity issues are key examples). Recognizing and characterizing the human attributes within the context of the software process are key to understanding how to include them in system and statistical models.

The capabilities of individuals strongly influence the metrics collected throughout the software production process. Capabilities include experience, intelligence, familiarity with the application domain, ability to communicate with others, ability to envision the problem spatially, and ability to verbally describe that spatial understanding. Although not scientifically founded, anecdotal information supports the incidence of these capabilities (Curtis, 1988).

For software engineering experiments, the key problems involve small sample sizes, high variability, many uncontrolled factors, and extreme difficulty in collecting experimental data. Traditional statistical experimental designs, originally developed for agricultural experiments, are not well suited for software engineering. At the panel's forum, Zweben (1993) discussed an interesting example of an experiment from object-oriented programming, involving a fairly complex design and analysis. Object-oriented programming is an approach that is sweeping the software industry, but for which much of the supporting evidence is anecdotal.

Example. The purpose of the software design and analysis experiment was to gather statistically valid evidence about the effect on effort and quality of using the principles of abstraction, encapsulation, and layering to enhance components of software systems. The experiment was divided into two types of tasks:

1. Enhancing an existing component to provide additional functionality, and

2. Modifying a component to provide different functionality.

The experimental subjects were students in graduate classes on software component design and development. The two approaches for this maintenance problem are "white box," which involves modifying the old code to get the new functionality, and "black box," which involves layering on the new functionality. The experiments were designed to detect, for each task, differences between the two approaches in the time required to make the modification and in the number of associated faults uncovered.

Three experiments were conducted. Experiment A involved an unbounded queue component. The subjects were given a basic Ada package implementing enque, deque, and isempty, and the task was to implement the operators add, copy, clear, append, and reverse. The subject was instructed to keep track of the time spent in designing, coding, testing, and debugging each operator, and also the associated number of bugs uncovered in each task. The tasks were completed in two ways: by directly implementing new operations using the representation of the queue, and by layering on the new operators as capabilities. Experiment B involved a partial map component, and experiment C involved an almost constant map component. Given that in experiments involving students, the results may be invalidated by problems with data integrity, for this experiment the student participants were told that the results of the experiment would have no effect on course grades. The code was validated by an instructor to ensure that there were no lingering defects.

The experimental plan was conducted using a crossover design. Each subject implemented the enhancements twice, using both the white box and the black box methods. This particular experimental design could test for the treatment (layering or not) effect and treatment by sequence interaction. The subject differences were nested within the sequences, and the sequences were counterbalanced based on experience level. The carryover effect of the first treatment influences the choice regarding the correct way of testing for treatment effects.
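The appeal of a crossover design when subjects differ greatly can be seen in a small simulation. This is an illustrative sketch only: the data are synthetic log task times with a large between-subject spread, the 15% effect is hypothetical, and period and carryover effects, which a real crossover analysis must address, are deliberately ignored.

```python
import math
import random
import statistics

random.seed(1)
n_subjects = 20
true_log_effect = math.log(0.85)   # hypothetical: method B takes 15% less time

# Synthetic log task times: large spread in ability, small task-level noise.
ability = [random.gauss(0.0, 1.0) for _ in range(n_subjects)]
time_a = [a + random.gauss(0.0, 0.1) for a in ability]
time_b = [a + true_log_effect + random.gauss(0.0, 0.1) for a in ability]

# Crossover analysis: each subject uses both methods, so ability cancels
# out of the within-subject difference.
diffs = [b - a for a, b in zip(time_a, time_b)]
se_crossover = statistics.stdev(diffs) / math.sqrt(n_subjects)

# Parallel-groups analysis: half the subjects use A, the other half use B,
# so ability differences stay in the comparison.
half = n_subjects // 2
group_a, group_b = time_a[:half], time_b[half:]
se_parallel = math.sqrt(statistics.variance(group_a) / half
                        + statistics.variance(group_b) / half)

print(f"standard error, crossover:       {se_crossover:.3f}")
print(f"standard error, parallel groups: {se_parallel:.3f}")
```

With subject variability of this size, a 15% effect (about 0.16 on the log scale) is lost in the noise of a between-subject comparison but stands out clearly in the within-subject differences.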

The statistical model used to represent the behavior in the number of bugs was sophisticated as well: an overdispersed log-linear model. The use of this model allowed for an analysis of nonnormal response data while also preventing invalid inferences that would have occurred had overdispersion not been taken into account. Indeed, only experiment B displayed a significant treatment effect after adjustment for overdispersion.
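The effect of an overdispersion adjustment can be sketched with a two-group log-linear (Poisson) comparison of bug counts. The counts below are hypothetical, and the analysis is a simplified quasi-Poisson calculation that ignores the crossover structure of the actual experiment: the Poisson standard error of the log rate ratio is inflated by the square root of the Pearson dispersion estimate.

```python
import math

def log_rate_ratio(counts_a, counts_b):
    """Compare mean counts of two groups under a log-linear model.

    Returns (log ratio, Poisson SE, overdispersion-adjusted SE, dispersion).
    """
    mean_a = sum(counts_a) / len(counts_a)
    mean_b = sum(counts_b) / len(counts_b)
    log_ratio = math.log(mean_b / mean_a)
    # Standard error of the log rate ratio if counts were truly Poisson.
    se_poisson = math.sqrt(1 / sum(counts_a) + 1 / sum(counts_b))
    # Pearson chi-square divided by residual degrees of freedom.
    pearson = (sum((y - mean_a) ** 2 / mean_a for y in counts_a)
               + sum((y - mean_b) ** 2 / mean_b for y in counts_b))
    dispersion = pearson / (len(counts_a) + len(counts_b) - 2)
    se_adjusted = se_poisson * math.sqrt(max(dispersion, 1.0))
    return log_ratio, se_poisson, se_adjusted, dispersion

# Hypothetical bug counts per subject under each approach.
white_box = [0, 1, 7, 2, 12, 0, 3, 9]
black_box = [0, 0, 4, 1, 6, 0, 2, 3]

ratio, se_p, se_adj, phi = log_rate_ratio(white_box, black_box)
print(f"dispersion estimate: {phi:.2f}")
print(f"z ignoring overdispersion: {ratio / se_p:.2f}")
print(f"z after adjustment:        {ratio / se_adj:.2f}")
```

In this made-up example the unadjusted test would declare a significant difference at the 5% level, while the adjusted test would not, which mirrors the kind of invalid inference the text describes.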

COMBINING INFORMATION

The results of many diverse software projects and studies tend to lead to more confusion than insight. The software engineering community would benefit if more value were gained from the work that is being done. To the extent that projects and studies focus on the same end point, statistics can help to fuse the independent results into a consistent and analytically justifiable story.

The statistical methodology that addresses the topic of how to fuse such independent results is relatively new and is termed "combining information"; a related set of tools is provided by meta-analysis. An excellent overview of this methodology was produced by a CATS panel and documented in an NRC report (NRC, 1992) that is now available as an American Statistical Association publication (ASA, 1993). The report documents various approaches to the problem of how to combine information and describes numerous specific applications. One of the recommendations made in it (p. 182) is crucial to achieving advances in software engineering:

The panel urges that authors and journal editors attempt to raise the level of quantitative explicitness in the reporting of research findings, by publishing summaries of appropriate quantitative measures on which the research conclusions are based (e.g., at a minimum: sample sizes, means, and standard deviations for all variables, and relevant correlation matrices).

It is not sensible to merely combine p-values from independent studies. It is clearly better to take weighted averages of effects when the weights account for differences in size and sensitivity across the studies to be combined.

Example. Kitchenham (1991) discusses an issue in cost estimation that involves looking across 10 different sources consisting of 17 different software projects. The issue is whether the exponent $\beta$ in the basic cost estimation model, $\text{effort} \propto \text{size}^{\beta}$, is significantly different from 1. The usual interpretation of $\beta$ is the "overhead introduced by product size," so that a value greater than 1 implies that relatively more effort is required to produce large software systems than to produce smaller ones. Many cite such "diseconomies of scale" in software production as evidence in support of their models and tools.

The 17 software projects are listed in Table 4. Fortunately, the cited sources contain both point estimates ($b$) of the exponent and its estimated standard error. These summary statistics can be used to estimate a common exponent and ultimately test the hypothesis that it is different from 1.

Table 4. Reported and derived data on 17 projects concerned with cost estimation.

Study          b      SE(b)  Var(b)    w
Bai-Bas        0.951  0.068  0.004624  21.240
Bel-Leh        1.062  0.101  0.010200  18.990
Your           0.716  0.230  0.052900  10.490
Wing           1.059  0.294  0.086440   7.758
Kemr           0.856  0.177  0.031330  13.550
Boehm.Org      0.833  0.184  0.033860  13.100
Boehm.semi     0.976  0.133  0.017690  16.630
Boehm.Emb      1.070  0.104  0.010820  18.770
Kit-Tay.ICL    0.472  0.323  0.104300   6.813
Kit-Tay.BTSX   1.202  0.300  0.090000   7.550
Kit-Tay.BTSW   0.495  0.185  0.034220  13.040
DS1.1          1.049  0.125  0.015630  17.220
DS1.2          1.078  0.105  0.011020  18.700
DS1.3          1.086  0.289  0.083520   7.938
DS2.New        0.178  0.134  0.017960  16.550
DS2.Ext        1.025  0.158  0.024960  14.830
DS3            1.141  0.077  0.005929  20.670

SOURCE: Reprinted, with permission, from Kitchenham (1992). (c) 1992 by National Computing Centre, Ltd.

Following the NRC recommendations on combining information across studies (NRC, 1992), the appropriate model (the so-called random effects model in meta-analysis) allows for a systematic difference between projects (e.g., bias in data reporting, management style, and so on) that averages to zero. Under this model, the overall exponent is estimated as a weighted average of the individual exponents where the weights have the form $w_i = 1/(\mathrm{var}(b_i) + \tau^2)$ and the common between-project component of variance is estimated by

$$\hat{\tau}^2 = \max\left\{0,\ \frac{Q - (k - 1)}{\sum_i v_i - \sum_i v_i^2 / \sum_i v_i}\right\},$$

where $v_i = 1/\mathrm{var}(b_i)$ are the weights evaluated at $\tau^2 = 0$, $Q = \sum_i v_i (b_i - \bar{b})^2$ with $\bar{b} = \sum_i v_i b_i / \sum_i v_i$, and $k = 17$ is the number of projects. The statistic $Q$ is itself a test of the homogeneity of projects and under a normality assumption is distributed as $\chi^2_{k-1}$. For these data one obtains $Q = 55.19$, which strongly indicates heterogeneity across projects. Although the random effects model anticipates such heterogeneity, other approaches that model the differences between projects (e.g., regression models) may be more informative. Since no explanatory variables are available, this discussion proceeds using the simpler model.

The estimated between-project component of variance is $\hat{\tau}^2 = 0.0425$, which is surprisingly large and is perhaps highly influenced by two projects with $b$'s less than 0.5. Combining this estimate with the individual within-project variances leads to the weights given in the final column of Table 4. Thus the overall estimated exponent is $\hat{\beta} = 0.911$ with estimated standard error $s = 0.0640$ $(= \sqrt{1/\sum_i w_i})$. Combining these two estimates leads readily to a 95% confidence interval for $\beta$ of (0.78, 1.04). Thus the data in these studies do not support the diseconomies-of-scale argument.
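The random effects calculation can be reproduced from the b and Var(b) columns of Table 4. The sketch below assumes the usual moment (DerSimonian-Laird type) estimator of the between-project variance, in which $Q$ and the denominator use the inverse within-project variances as weights; that assumption is consistent with the values of $Q$ and the variance component reported in the text. The function name `random_effects` is illustrative.

```python
import math

# (study, b, Var(b)) as listed in Table 4.
TABLE4 = [
    ("Bai-Bas", 0.951, 0.004624), ("Bel-Leh", 1.062, 0.010200),
    ("Your", 0.716, 0.052900), ("Wing", 1.059, 0.086440),
    ("Kemr", 0.856, 0.031330), ("Boehm.Org", 0.833, 0.033860),
    ("Boehm.semi", 0.976, 0.017690), ("Boehm.Emb", 1.070, 0.010820),
    ("Kit-Tay.ICL", 0.472, 0.104300), ("Kit-Tay.BTSX", 1.202, 0.090000),
    ("Kit-Tay.BTSW", 0.495, 0.034220), ("DS1.1", 1.049, 0.015630),
    ("DS1.2", 1.078, 0.011020), ("DS1.3", 1.086, 0.083520),
    ("DS2.New", 0.178, 0.017960), ("DS2.Ext", 1.025, 0.024960),
    ("DS3", 1.141, 0.005929),
]

def random_effects(estimates, variances):
    """Return (Q, tau^2, pooled estimate, standard error)."""
    k = len(estimates)
    # Weights with no between-project variance (tau^2 = 0).
    v = [1.0 / s2 for s2 in variances]
    b_fixed = sum(vi * bi for vi, bi in zip(v, estimates)) / sum(v)
    # Homogeneity statistic, approximately chi-square with k - 1 d.f.
    q = sum(vi * (bi - b_fixed) ** 2 for vi, bi in zip(v, estimates))
    # Moment estimate of the between-project variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / (sum(v) - sum(vi * vi for vi in v) / sum(v)))
    # Random effects weights and the pooled estimate.
    w = [1.0 / (s2 + tau2) for s2 in variances]
    pooled = sum(wi * bi for wi, bi in zip(w, estimates)) / sum(w)
    return q, tau2, pooled, math.sqrt(1.0 / sum(w))

b = [row[1] for row in TABLE4]
var_b = [row[2] for row in TABLE4]
q, tau2, beta_hat, se = random_effects(b, var_b)
print(f"Q = {q:.2f}, tau^2 = {tau2:.4f}, beta = {beta_hat:.3f}, SE = {se:.4f}")
print(f"beta +/- 2 SE: ({beta_hat - 2 * se:.2f}, {beta_hat + 2 * se:.2f})")
```

The weights computed inside the function correspond to the final column of Table 4, and the interval printed on the last line uses plus or minus two standard errors.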

Evenbetterthanpublishedsummarieswouldbeacentralrepositoryofthedataarisingfromastudy.Thisinformationwouldallowassessmentofvariousdeterminationsofsimilaritiesbetweenstudies,aswellaspotentialbiases.Thepanelisawareofseveralinitiativestobuildsuchdatarepositories.TheproposedNationalSoftwareCouncilhasasoneofitsprimaryresponsibilitiestheconstructionandmaintenanceofanationalsoftwaremeasurementsdatabase.Atthepanel'sforum,aspecializeddatabaseonsoftwareprojectsintheaeronauticsindustrywasalsodiscussed(Keller,1993).

An issue related to combining information from diverse sources concerns the translation to industry of small experimental studies and/or published case studies done in an academic environment. Serious doubts exist in industry as to the upward scalability of most of these studies because populations, project sizes, and environments are all different. Expectations differ regarding quality, and it is unclear whether variables measured in a small study are the variables in which industry has an interest. The statistical community should develop stochastic models to propagate uncertainty (including variability assessment) on different control factors so that adjustments and predictions applicable to industry-level environments can be made.

VISUALIZATION IN SOFTWARE ENGINEERING

Scientific visualization is an emerging technology that is driven by ever-decreasing hardware prices and the associated increasing sophistication of visualization software. Visualization involves the interactive pictorial display of data using graphics, animation, and sound. Much of the recent progress in visualization has come from the application of computer graphics to three-dimensional image analysis and rendering. Data visualization, a subset of scientific visualization, focuses on the display and analysis of abstract data. Some of the earliest and best-known examples of data visualization involve statistical data displays.

The motivation for applying visualization to software engineering is to understand the complexity, multidimensionality, and structure embodied in software systems. Much of the original research in software visualization (the use of typography, graphic design, animation, and cinematography to facilitate the understanding and enhancement of software systems) was performed by computer scientists interested in understanding algorithms, particularly in the context of education. Applying the quantitative focus of statistical graphics methods to currently popular scientific visualization techniques is a fertile area for research.

Visualizing software engineering data is challenging because of the diversity of data sets associated with software projects. For data sets involving software faults, times to failure, cost and effort predictions, and so on, there is a clear statistical relationship of interest. Software fault density may be related to code complexity and to other software metrics. Traditional techniques for visualizing statistical data are designed to extract quantitative relationships between variables. Other software engineering data sets such as the execution trace of a program (the sequence of statements executed during a test run) or the change history of a file are not easily visualized using conventional data visualization techniques. The need for relevant techniques has led to the development of specialized domain-specific visualization capabilities peculiar to software systems. Applications include the following:

Configuration management data (Eick et al., 1992b),

Function call graphs (Ganser et al., 1993),

Code coverage,

Code metrics,

Algorithm animation (Brown and Hershberger, 1992; Stasko, 1993),

Sophisticated typesetting of computer programs (Baecker and Marcus, 1988),

Software development process,

Software metrics (Ebert, 1992), and

Software reliability models and data.

Some of these applications are discussed below.

Configuration Management Data

A rich software database suitable for visualization involves the code itself. In production systems, the source code is stored in configuration management databases. These databases contain a complete history of the code with every source code change recorded as a modification request. Along with the affected lines, the source code database usually contains other information such as the identity of the programmer making the changes, date the changes were submitted, reason for the change, and whether the change was meant to add functionality or fix a bug. The variables associated with source code may be continuous, categorical, or binary. For a line in a computer program, when it was written is (essentially) continuous, who wrote it is categorical, and whether or not the line was executed during a regression test is binary.

Example. Figure 1 (see "Implementation" in Chapter 3) shows production code written in the C language from a module in AT&T's 5ESS switch (Eick, 1994). In the display, row color is tied to the code's age: the most recently added lines are in red and the oldest in blue, with a color spectrum in between. Dynamic graphics techniques are employed for increasing the effectiveness of the display. There are five interactive views of data in Figure 1:

1. The rows corresponding to the text lines,

2. The values on the color scale,

3. The file names above the columns,

4. The browser windows, and

5. The bar chart beneath the color scale.

Each of the views is linked, united through the use of color, and activated by using a mouse pointer. This mode of manipulating the display, called brushing by Becker and Cleveland (1987) and by Becker et al. (1987), is particularly effective for exploring software development data.

Function Call Graphs

Perhaps the most common visualization of software is a function call graph as shown in Figure 5. Function call graphs are a widely used, visual, tree-like display of the function calls in a piece of code. They show calling relationships between modules in a system and are one representation of software structure. A problem with function call graphs is that they become overloaded with too much information for all but the smallest systems. One approach to improving the usefulness of function call graphs might involve the use of dynamic graphics techniques to focus the display on the visually informative regions.

Test Code Coverage

Another interesting example of source code visualization involves showing test suite code coverage. Figure 6 shows the statement coverage and execution "hot spots" for a program that has been run through its regression test. The row indentation and line length have been turned off so that each line receives the same amount of visual space. The most frequently executed lines are shown in red and the least frequently in blue, with a color spectrum in between. There are two special colors: the black lines correspond to nonexecutable lines of C code such as comments, variable declarations, and functions, and the gray lines correspond to the executable lines of code that were not executed. These are the lines that the regression test missed.
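The color rule just described can be written down as a small function. The sketch below is a rough illustration, not the display's actual implementation: the function name `coverage_color`, the RGB values, the linear blue-green-red ramp, and the logarithmic scaling of execution counts are all assumptions made for the example.

```python
import math

def coverage_color(count, max_count, executable=True):
    """Return an (r, g, b) tuple in 0..255 for one source line."""
    if not executable:
        return (0, 0, 0)        # black: comments, declarations, and the like
    if count == 0:
        return (128, 128, 128)  # gray: executable but missed by the tests
    # Position in [0, 1]. ASSUMPTION: a log scale, so that lines executed
    # only a few times remain distinguishable from each other.
    t = 0.0 if max_count <= 1 else math.log(count) / math.log(max_count)
    # Simple linear ramp: blue (cold) -> green -> red (hot).
    if t < 0.5:
        u = t / 0.5
        return (0, round(255 * u), round(255 * (1 - u)))
    u = (t - 0.5) / 0.5
    return (round(255 * u), round(255 * (1 - u)), 0)
```

A line executed once maps to pure blue and the most frequently executed line maps to pure red, matching the ends of the spectrum described in the text.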

Code Metrics

As discussed in Chapter 4 (in the section "Software Measurement and Metrics"), static code metrics attempt to quantify and measure the complexity of code. These metrics are used to identify portions of programs that are particularly difficult and are likely to be subject to defects. One visualization method for displaying code complexity metrics uses a space-filling representation (Baker and Eick, 1995). Taking advantage of the hierarchical structure of code, each subsystem, module, and file is tiled on the display, which shows them as nested, space-filling rectangles with area, color, and fill encoding software metrics. This technique can display the relative sizes of a system's components, the relative stability of the components, the location of new functionality, the location of error-prone code with many fixes to identified faults, and, using animation, the historical evolution of the code.

Example. Figure 7 displays the AT&T 5ESS switching code using the SeeSys™ system, a dynamic graphics metrics visualization system. Interactive controls enable the user to manipulate the display, reset the colors, and zoom in on particular modules and files, providing an interactive software data analysis environment. The space-filling representation:

Shows modules, files, and subsystems in context;

Provides an overview of a complete software system; and

Applies statistical dynamic graphics techniques to the problem of visualizing metrics.
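The idea of nested rectangles whose areas encode a metric can be illustrated with a simple "slice-and-dice" layout. The sketch below is only one possible tiling scheme and is not claimed to be the algorithm used by the system shown in Figure 7; the node format, the `ncsl` field, and the sample hierarchy are hypothetical.

```python
def size(node):
    """Total NCSL of a node: its own count, or the sum over its children."""
    children = node.get("children")
    if children:
        return sum(size(c) for c in children)
    return node["ncsl"]

def layout(node, x, y, w, h, horizontal=True, out=None):
    """Tile `node` into the rectangle (x, y, w, h); return {name: rect}."""
    if out is None:
        out = {}
    out[node["name"]] = (x, y, w, h)
    children = node.get("children", [])
    total = sum(size(c) for c in children)
    offset = 0.0
    for child in children:
        frac = size(child) / total if total else 0.0
        # Alternate the split direction at each level of the hierarchy.
        if horizontal:
            layout(child, x + offset * w, y, frac * w, h, False, out)
        else:
            layout(child, x, y + offset * h, w, frac * h, True, out)
        offset += frac
    return out

# Hypothetical system: two subsystems, one of them with two files.
system = {
    "name": "system",
    "children": [
        {"name": "A", "children": [
            {"name": "A/io.c", "ncsl": 300},
            {"name": "A/parse.c", "ncsl": 100},
        ]},
        {"name": "B", "ncsl": 600},
    ],
}
rects = layout(system, 0.0, 0.0, 100.0, 50.0)
```

Each rectangle's area is proportional to the component's share of the total size, so large components dominate the display while every component stays visible in context.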

A major difference in the use of graphics in scientific visualization and statistics is that for the former, graphs are the end, whereas for the latter, they are more often the means to an end. Thus visualizations of software are crucial to statistical software engineering to the extent that they facilitate description and modeling of software engineering data. Discussed below are some possibilities related to the examples described in this chapter.

The rainbow files in Figure 1 suggest that certain code is changed frequently. Frequently changed code is often error-prone, difficult to maintain, and problematic. Software engineers often claim that code, or people's understanding of it, decays with age. Eventually the code becomes unmaintainable and must be rewritten (re-engineered).

Statistical models are needed to characterize the normal rate of change and therefore determine whether the current files are unusual. Such models need to take account of the number of changes, locations of faults, type of functionality, past development patterns, and future trends. For example, a common software design involves having a simple main routine that calls on several other procedures to invoke needed functionality. The main routine may be changed frequently as programmers modify small snippets of code to access large chunks of new code that is put into other files. For this code, many simple, small changes are normal and do not indicate maintenance problems. If models existed, then it would be possible to make quantitative comparisons between files rather than the qualitative comparisons that are currently made.

Figure 6 suggests some natural covariates and models for improving the efficiency of software testing. Current compiler technology can easily analyze code to obtain the functions, lines, and even the paths executed by code in test suites. For certain classes of programming errors such as typographical errors, the incremental code coverage is an ideal covariate for estimating the probability of detecting an error. The execution frequency of blocks of code or functions is clearly related to the probability of error detection. Figure 6 shows clearly that small portions of the program are heavily exercised but that most of the code is not touched. In an indirect way operational profile testing attempts to capture this idea by testing the features, and therefore the code, in relation to how often they will be used. This notion suggests that statistical techniques involving covariates can improve the efficiency of software testing.
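The operational-profile idea, testing features in proportion to how often they are used, can be made concrete with a short allocation routine. This is a minimal sketch under assumptions: the function name `allocate_tests`, the feature names, and the usage weights are hypothetical, and largest-remainder rounding is just one reasonable way to turn proportions into whole test counts.

```python
def allocate_tests(profile, total_tests):
    """profile: {feature: relative usage}; returns {feature: number of tests}."""
    total_use = sum(profile.values())
    raw = {f: total_tests * u / total_use for f, u in profile.items()}
    counts = {f: int(r) for f, r in raw.items()}
    # Hand out the tests lost to rounding down, largest remainder first.
    leftover = total_tests - sum(counts.values())
    by_remainder = sorted(raw, key=lambda f: raw[f] - counts[f], reverse=True)
    for f in by_remainder[:leftover]:
        counts[f] += 1
    return counts

# Hypothetical usage weights for three features of a switching system.
usage = {"call setup": 70, "billing": 20, "maintenance": 10}
plan = allocate_tests(usage, 10)
```

A coverage-based covariate, as suggested in the text, could then be used to check how much previously unexecuted code each allocated test actually reaches.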

Figure 7 suggests novel ways of displaying software metrics. The current practice is to identify overly complex files for special care and management attention. The procedures for identifying complex code are often based on very clever and sophisticated arguments, but not on data. A statistical approach might attempt to correlate the complexity of code with the locations of past faults and investigate their predictive power. Statistical models that can relate complexity metrics to actual faults will increase the models' practical efficiency for real-life systems. These models should not be developed in the absence of data about the code. Simple ways of presenting such data, such as an ordered list of fault density, file by file, can be very effective in guiding the selection of an appropriate model. In other cases, microanalysis, often driven by graphical browsers, might suggest a richer class of models that the data could support. For example, software fault rates are often quoted in terms of the number of faults per 1,000 noncommentary source lines (NCSL). The lines in Figure 1 can be color-coded to show the historical locations of past faults. In other representations (not shown), clear spatial patterns appear, with faults concentrated in particular files and in particular regions of the files, suggesting that spatial models of fault density might work very well in helping to identify fault-prone code.
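The "ordered list of fault density, file by file" mentioned above is easy to produce once fault counts and file sizes are available. The following sketch assumes a hypothetical input of (file name, fault count, NCSL) triples; the file names and numbers are invented for illustration.

```python
def fault_density_ranking(files):
    """files: iterable of (name, faults, ncsl).

    Returns (name, faults per 1,000 NCSL) pairs, densest first.
    """
    rows = []
    for name, faults, ncsl in files:
        if ncsl <= 0:
            continue  # skip empty files rather than divide by zero
        rows.append((name, 1000.0 * faults / ncsl))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows

# Hypothetical fault counts and sizes.
ranking = fault_density_ranking([
    ("a.c", 3, 1500),
    ("b.c", 5, 1000),
    ("c.c", 0, 200),
])
```

Such a list is only a starting point; as the text notes, it guides the choice of a model rather than replacing one.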

Challenges for Visualization

The research opportunities and challenges in visualizing software data are similar to those for visualizing other large abstract databases:

1. Software data are abstract; there is no natural two-dimensional or three-dimensional representation of the data. A research challenge is to discover meaningful representations of the data that enable an analyst to understand the data in context.

2. Much of the software data are nontraditional statistical data such as the change history of source code, duplication in manuals, or the structure of a relational database. New metaphors must be discovered for harmonious transfer of information.

3. The database associated with large software systems may be huge, potentially containing millions of observations. Effective statistical graphics techniques must be able to cope with the volume of data found in modern software systems.

4. The lack of easy-to-use software tools makes the development of high-quality custom visualizations particularly difficult. Currently, visualizations must be hand-coded in low-level languages such as C or C++. This is a time-consuming task that can be carried out only by the most sophisticated programmers.

Opportunities for Visualization

Visualizations associated with software involve the code itself, data associated with the system, the execution of the program, and the process for creating the system. Opportunities include the following:

1. Objects/Patterns. Object-oriented programming is rapidly becoming standard for development of new systems and is being retrofitted into existing systems. Effective displays need to be developed for understanding the inheritance (or dependency) structure, semantic relationships among objects, and the run-time life cycle of objects.

2. Performance. Software systems inevitably run too slowly, making run-time performance an important consideration. Host systems often collect large volumes of fine-grain (that is, low-level) performance data including function calling patterns, line execution counts, operating system page faults, heap usage, and stack space, as well as disk usage. Novel techniques to understand and digest dynamic program execution data would be immediately useful.

3. Parallelism. Recently, massively parallel computers with tens to thousands of cooperating processors have started to become widely available. Programming these computers involves developing new distributed algorithms that divide important computations among the processors. Most often an essential aspect of the computation involves communicating interim results between processors and synchronizing the computations. Visualization techniques are a crucial tool for enabling programmers to model and debug subtle computations.

4. Three-dimensional. Workstations capable of rendering realistic three-dimensional displays are rapidly becoming widely available at reasonable prices. New visualization techniques leveraging three-dimensional capabilities should be developed to enable software engineers to cope with the ever-increasing complexity of modern software systems.

Figure 5. Function call graphs showing the calling pattern between procedures. The top panel shows an interpretable, easy-to-comprehend display, whereas the bottom panel is overly busy and visually confusing.

Figure 6. A SeeSoft™ display showing code coverage for a program executing its regression test. The color of each line is determined by the number of times that it executed. The colors range from red (the "hot spots") to deep blue (for code executed only once) using a red-green-blue color spectrum. There are two special colors: the black lines are non-executable lines of code such as variable declarations and comments, and the gray lines are the non-executed (not covered) lines. The figure shows that generating regression tests with high coverage is quite difficult. Source: Eick (1994).

Figure 7. A display of software metrics for a million-line system. The rectangle forming the outermost boundary represents the entire system. The rectangles contained within the boundary represent the size (in NCSLs) of individual subsystems (each labeled with a single character A-Z, a-t), and modules within the subsystems. Color is used here to redundantly encode size according to the color scheme in the slider at the bottom of the screen.

ORTHOGONAL DEFECT CLASSIFICATION

The primary focus of software engineering is to monitor a software development process with a view toward improving quality and productivity. For improving quality, there have been two distinct approaches. The first considers each defect as unique and tries to identify a cause. The second considers a defect as a sample from an ensemble to which a formal statistical reliability model is fitted. Chillarege et al. (1992) proposed a new methodology that strikes a balance between these two ends of the spectrum. This method, called orthogonal defect classification, is based on exploratory data analysis techniques and has been found to be quite useful at IBM. It recognizes that the key to improving a process is to quantify various cause-and-effect relationships involving defects.

The basic approach is as follows. First, classify defects into various types. Then, obtain a distribution of the types across different development phases. Finally, having created these reference distributions and the relationships among them, compare them with the distributions observed in a new product or release. If there are discrepancies, take corrective action.

Operationally, the defects are classified according to eight "orthogonal" (mutually exclusive) defect types: functional, assignment, interface, checking, timing, build/package/merge, data structures and algorithms, and documentation. Further, development phases are divided into four basic stages (where defects can be observed): design, unit test, function test, and system test. For each stage and each defect type, a range of acceptable baseline defect rates is defined by experience. This information is used to improve the quality of a new product or release. Toward this end, for a given defect type, defect distributions across development stages are compared with the baseline rates. For each chain of results (say, too high early on, lower later, and high at the end) an implication is derived. For example, the implication may be that function testing should be revamped.
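The comparison of observed defect rates with baseline ranges, stage by stage, can be sketched as follows. The stage names come from the text; everything else (the function name `compare_to_baseline`, the rate values, and the baseline ranges) is hypothetical and serves only to show the shape of the computation.

```python
STAGES = ["design", "unit test", "function test", "system test"]

def compare_to_baseline(observed, baseline):
    """observed: {stage: rate}; baseline: {stage: (low, high)}.

    Returns {stage: 'low' | 'ok' | 'high'} for a single defect type.
    """
    result = {}
    for stage in STAGES:
        low, high = baseline[stage]
        rate = observed[stage]
        if rate < low:
            result[stage] = "low"
        elif rate > high:
            result[stage] = "high"
        else:
            result[stage] = "ok"
    return result

# Hypothetical rates for one defect type (say, "functional").
baseline = {
    "design": (0.30, 0.40),
    "unit test": (0.20, 0.30),
    "function test": (0.15, 0.25),
    "system test": (0.05, 0.15),
}
observed = {
    "design": 0.45,
    "unit test": 0.15,
    "function test": 0.10,
    "system test": 0.30,
}
chain = compare_to_baseline(observed, baseline)
```

The resulting chain of flags (here high, then low, then high at the end) is what an analyst would interpret; deriving the implication itself remains a matter of experience and is not automated by this sketch.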

This methodology has been extended to a study of the distribution of triggers, that is, the conditions that allow a defect to surface. First, it is implicit in this approach that there is no substitute for a good data analysis. Second, assumptions clearly are being made about the stationarity of reference distributions, an approach that may be appropriate for a stable environment with similar projects. Thus, it may be necessary to create classes of reference distributions and classes of similar projects. Perhaps some clustering techniques may be valuable in this context. Third, although the defect types are mutually exclusive, it is possible that a fault may result in many defects, and vice versa. This multiple-spawning may cause serious implementation difficulties. Proper measurement protocols may diminish such multipropagation. Finally, given good-quality data, it may be possible to extend orthogonal defect classification to efforts to identify risks in the production of software, perhaps using data to provide early indicators of product quality and potential problems concerning scheduling. The potential of this line of inquiry should be carefully investigated, since it could open up an exciting new area in software engineering.


6 Summary and Conclusions

In the 1950s, as the production line was becoming the standard for hardware manufacturing, Deming showed that statistical process control techniques, invented originally by Shewhart, were essential to controlling and improving the production process. Deming's crusade has had a lasting impact in Japan and has changed its worldwide competitive position. It has also had a global impact on the use of statistical methods, the training of statisticians, and so forth.

In the 1990s the emphasis is on software, as complex hardware-based functionality is being replaced by more flexible, software-based functionality. Small programs created by a few programmers are being superseded by massive software systems containing millions of lines of code created by many programmers with different backgrounds, training, and skills. This is the world of so-called software factories. These factories at present do not fit the traditional model of (hardware) factories and more closely resemble the development effort that goes into designing new products. However, with the spread of software reuse, the increasing availability of tools for automatically capturing requirements, generating code and test cases, and providing user documentation, and the growing reliance on standardized tuning and installation processes and standardized procedures for analysis, the model is moving closer to that of a traditional factory. The economy of scale that is achievable by considering software development as a manufacturing process, a factory, rather than a handcrafting process, is essential for preserving U.S. competitive leadership. The challenge is to build these huge systems in a cost-effective manner. The panel expects this challenge to concern the field of software engineering for the rest of the decade. Hence, any set of methodologies that can help in meeting this challenge will be invaluable. More importantly, the use of such methodologies will likely determine the competitive positions of organizations and nations involved in software production.

With the amount of variability involved in the software production process and its many subprocesses, as well as the diversity of developers, users, and uses, it is unlikely that a deterministic control system will help improve the software production process. As in statistical physics, only a technology based on statistical modeling, something akin to statistical control, will work. The panel believes that the juncture at hand is not very different from the one reached by Deming in the 1950s when he began to popularize the concept of statistical process control. What is needed now is a detailed understanding by statisticians of the software engineering process, as well as an appreciation by software engineers of what statisticians can and cannot do. If collaborative interactions and the building of this mutual understanding can be cultivated, then there likely will occur a major impact of the same order of magnitude as Deming's introduction of statistical process control techniques in hardware manufacturing.

Of course, this is not to say that all software problems are going to be solved by statistical means, just as not all automobile manufacturing problems can be solved by statistical means. On the contrary, the software industry has been technology driven, and the bulk of future gains in productivity will come from new, creative ideas. For example, much of the gain in productivity between 1950 and 1970 occurred because of the replacement of assembler coding by high-level languages.

Nevertheless, as the panel attempts to point out in this report, increased collaboration between software engineers and statisticians holds much promise for resolving problems in software development. Some of the catalysts that are essential for this interaction to be productive, as well as some of the related research opportunities for software engineers and statisticians, are discussed below.

INSTITUTIONAL MODEL FOR RESEARCH

The panel strongly believes that the right model for statistical research in software development is collaborative in nature. It is essential to avoid solving the "wrong" problems. It is equally important that the problems identified in this report not be "solved" by statisticians in isolation. Statisticians need to attain a degree of credibility in software engineering, and such credibility will not be achieved by developing N new reliability models with high-power asymptotics. The ideal collaboration partners statisticians and software engineers in work aimed at improving a real software process or product.

This conclusion assumes not only that statisticians and software engineers have a mutual desire to work together to solve software engineering problems, but also that funding and reward mechanisms are in place to stimulate the technical collaboration. Up to now, such incentives have not been the norm in academic institutions, given that, for example, coauthored papers have been generally discounted by promotion evaluation committees. Moreover, at funding agencies, proposals for collaborative work have tended to "fall through the cracks" because of a lack of interdisciplinary expertise to evaluate their merits. The panel expects such barriers to be reduced in the coming years, but in the interim, industry can play a leadership role in nurturing collaborations between software engineers and statisticians and can reduce its own set of barriers (for instance, those related to proprietary and intellectual property interests).

MODEL FOR DATA COLLECTION AND ANALYSIS

As discussed above in this report, for statistical approaches to be useful, it is essential that high-quality data be available. Quality includes measuring the right things at the right time; specifically, adopted software metrics must be relevant for each of the important stages of the development life cycle, and the protocol of metrics for collecting data must be well defined and well executed. Without careful preparation that takes account of all of these data issues, it is unlikely that statistical methods will have any impact on a given software project under study. For this reason, it is crucial to have the software industry take a lead position in research on statistical software engineering.

Figure 8, a model for the interaction between researchers and the software development process, displays a high-level spiral view of the software development process offered by Dalal et al. (1994). Figure 9 gives a more detailed view of the statistical software engineering module (SSEM) at the center of Figure 8.

Figure 8. Spiral software development process model. SSEM, statistical software engineering module.

Figure 9. Statistical software engineering module at stage n.

The SSEM has several components. One of its major functions is to act as the central repository for all relevant project data (statistical or nonstatistical). Thus this module serves as a resource for the entire project, interfacing with every stage, typically at its review or conclusion. For example, the SSEM would be used at the requirement review stage, when data on inspection, faults, times, effort, and coverage are available. For testing, information would be gathered at the end of each stage of testing (unit, integration, system, alpha, beta, ...) about the number of open faults, closed faults, types of problems, severity, changes, and effort. Such data would come from test case management systems, change management systems, and configuration management systems.


Additional elements of the SSEM include collection protocols, metrics, exploratory data analysis (EDA), modeling, confirmatory analysis, and conclusions. A critical part of the SSEM would be related to root-cause analysis. Analysis could be as simple as Ishikawa's fishbone diagram (Ishikawa, 1976), or more complex, such as orthogonal defect classification (described in Chapter 5). This capability accords with the belief that a careful analysis of root cause is essential to improving the software development process. Central placement of the SSEM ensures that the results of various analyses will be communicated at all relevant stages. For example, at the code review stage, the SSEM can suggest ways of improving the requirement process as well as point out potentially error-prone parts of the software for testing.

ISSUES IN EDUCATION

Enormous opportunities and many potential benefits are possible if the software engineering community learns about relevant statistical methods and if statisticians contribute to and cooperate in the education of future software engineers. The areas outlined below are those that are relevant today. As the community matures in its statistical sophistication, the areas themselves should evolve to reflect the maturation process.

Designed experiments. Software engineering is inherently experimental, yet relatively few designed experiments have been conducted. Software engineering education programs must stress the desirability, wherever feasible, of validating new techniques through the use of statistically valid, designed experiments. Part of the reason for the lack of experimentation in software engineering may involve the large variability in human programming capabilities. As pointed out in Chapter 5, the most talented programmer may be 20 times more productive than the least talented. This disparity makes it difficult to conduct experiments because the between-subject variability tends to overwhelm the treatment effects. Experimental designs that address broad variability in subjects should be emphasized in the software engineering curriculum. A similar emphasis should be given to random- and fixed-effects models with hierarchical structure and to distinguishing within- and between-experiment variability.

There is also a role for the statistics profession in the development of guidelines for experiments in software engineering akin to those mandated by the Food and Drug Administration for clinical trials. These guidelines will require reformulation in the software engineering context with the possible involvement of various industry and academic forums, including the Institute of Electrical and Electronics Engineers, the American Statistical Association, and the Software Engineering Institute.

Exploratory data analysis. It is important to appreciate the strengths and the limitations of available data by challenging the data with a battery of numerical, tabular, and graphical methods. Exploratory data analysis methods (e.g., Tukey, 1977; Mosteller and Tukey, 1977) are essentially "model free," so that investigators can be surprised by unexpected behavior rather than have their thinking constrained by what is expected. One of the attitudes toward statistical analysis that is important to convey is that of

data = fit + residual.

The iterative nature of improving the model fit by removing structure from the residuals must be stressed in discussions of statistical modeling.
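The decomposition data = fit + residual can be seen in a tiny numerical example. The sketch below fits a straight line by ordinary least squares to hypothetical data that actually follow a curve; the function name `fit_line` and the numbers are assumptions chosen for illustration.

```python
def fit_line(x, y):
    """Least-squares straight line; returns (fit, residual) lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fit = [intercept + slope * xi for xi in x]
    residual = [yi - fi for yi, fi in zip(y, fit)]
    return fit, residual

# Hypothetical data with curvature (y = x squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
fit, residual = fit_line(x, y)
```

For these data the residuals are positive at both ends and negative in the middle. That systematic pattern is structure the straight line has failed to capture, and it suggests the next iteration of the model (for instance, adding a quadratic term).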

Modeling. The models used by statisticians differ dramatically from those used by nonstatisticians. The differences stem from advances in the statistical community in the past decade that effectively relax assumptions of linearity for nearly all classical techniques. This relaxation is obtained by assuming only local linearity and using smoothing techniques (e.g., splines) to regularize the solutions (Hastie and Tibshirani, 1990). The result is quite flexible but interpretable models that are relatively unknown outside the statistics community. Arguably these more recent methods lack the well-studied inferential properties of classical techniques, but that drawback is expected to be remedied in coming years. Educational information exchanges should be conducted to stimulate more frequent and wider use of such comparatively recent techniques.

Risk analysis. Software systems are often used in conjunction with other software and hardware systems. For example, in telecommunications, an originating call is connected by switching software; however, the actual connection is made by physical cables, transmission cells, and other components. The mega systems thus created run our nation's telephone systems, stock markets, and nuclear power plants. Failures can be very expensive, if not catastrophic. Thus, it is essential to have software and hardware systems built in such a way that they can tolerate faults and provide minimal functionality, while precluding a catastrophic failure. This type of system robustness is related to so-called fault-tolerant design of software (Leveson, 1986).

Risk analysis has played a key role in identifying fault-prone components of hardware systems and has helped in managing the risks associated with very complex hardware-software systems. A paradigm suggested by Dalal et al. (1989) for risk management for the space shuttle program and corresponding statistical methods are important in this context. For software systems, risk analysis typically begins with identifying programming styles, characteristics of the modules responsible for most software faults, and so on. Statistical analysis of root-cause data leads to a risk profile for a system and can be useful in risk reduction. Risk management also involves consideration of the probability of occurrence of various failure scenarios. Such probabilities are obtained either by using the Delphi method (e.g., Dalkey, 1972; Pill, 1971) or by analyzing historical data. One of the key requirements in failure-scenario analysis is to dynamically update information about the scenarios as new data on system behavior become available, such as a changing user profile.
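One simple way to picture the dynamic updating of a failure-scenario probability is a conjugate Beta-binomial update, in which an initial judgment (for example, one elicited from experts) is revised as operational data arrive. This is an illustrative assumption, not the method of the cited references; the function names and all numbers are hypothetical.

```python
def update_beta(alpha, beta, failures, trials):
    """Posterior Beta parameters after observing `failures` in `trials` runs."""
    if not 0 <= failures <= trials:
        raise ValueError("failures must lie between 0 and trials")
    return alpha + failures, beta + (trials - failures)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical prior: experts judge the scenario to occur about 1% of the time.
prior = (1.0, 99.0)
# Hypothetical new data: 2 occurrences in 100 observed operations.
posterior = update_beta(*prior, failures=2, trials=100)
```

The posterior mean moves from the prior judgment toward the observed frequency, and the update can be repeated each time a new batch of data, such as a changed user profile, becomes available.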


Attitude toward assumptions. As software engineers are aware, a major difference between statistics and mathematics is that for the latter, it matters only that assumptions be correctly stated, whereas for the former, it is essential that the prevailing assumptions be supported by the data. This distinction is important, but unfortunately it is often taken too literally by many who use statistical techniques. Tukey has long argued that what is important is not so much that assumptions are violated but rather that their effect on conclusions is well understood. Thus for a linear model, where the standard assumptions include normality, homoscedasticity, and independence, their importance to statements of inference is exactly in the opposite order. Statistics textbooks, courses, and consulting activities should convey the statistician's level of understanding of and perspective on the importance of assumptions for statistical inference methods.

Visualization. The importance of plotting data in all aspects of statistical work cannot be overemphasized. Graphics is important in exploratory stages to ascertain how complex a model the data can support; in the analysis stage for display of residuals to examine what a currently entertained model has failed to account for; and in the presentation stage where graphics can provide succinct and convincing summaries of the statistical analysis and associated uncertainty. Visualization can also help software engineers cope with, and understand, the huge quantities of data collected in the software development process.

Tools. Software engineers tend to think of statisticians as people who know how to run a regression software package. Although statisticians prefer to think of themselves more as problem solvers, it is still important that they point out good statistical computing tools (for instance, S, SAS, GLIM, RS1, and so on) to software engineers. A CATS report (NRC, 1991) attempts to provide an overview of statistical computing languages, systems, and packages, but for such material to be useful to software engineers, a more focused overview will be required.


References

Abdel-Ghaly, A.A., P.Y. Chan, and B. Littlewood. 1986. Evaluation of competing software reliability predictions. IEEE Trans. Software Eng. SE-12(9):950-967.

Abdel-Hamid, T. 1991. Software Project Dynamics: An Integrated Approach. Englewood Cliffs, N.J.: Prentice-Hall.

American Heritage Dictionary of the English Language, The. 1981. Boston: Houghton Mifflin.

American Statistical Association (ASA). 1993. Combining Information: Statistical Issues and Opportunities for Research, Contemporary Statistics Series, No. 1. Alexandria, Va.: American Statistical Association.

Baecker, R.M. and A. Marcus. 1988. Human Factors and Typography for More Readable Programs. Reading, Mass.: Addison Wesley.

Baker, M.J. and S.G. Eick. 1995. Space-filling software displays. J. Visual Languages Comput. 6(2). In press.

Basili, V. 1993. Measurement, analysis and modeling, and experimentation in software engineering. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

Basili, V. and D. Weiss. 1984. A methodology for collecting valid software engineering data. IEEE Trans. Software Eng. SE-10:6.

Becker, R.A. and W.S. Cleveland. 1987. Brushing scatterplots. Technometrics 29:127-142.

Becker, R.A., W.S. Cleveland, and A.R. Wilks. 1987. Dynamic graphics for data analysis. Statistical Science 2:355-383.

Beckman, R.J. and M.D. McKay. 1987. Monte Carlo estimation under different distributions using the same simulation. Technometrics 29:153-160.

Blum, M., M. Luby, and R. Rubinfeld. 1989. Program result checking against adaptive programs and in cryptographic settings. Pp. 107-118 in Distributed Computing and Cryptography, J. Feigenbaum and M. Merritt, eds. DIMACS: Series in Discrete Mathematics and Theoretical Computer Science, Vol. 2. Providence, R.I.: American Mathematical Society.

Blum, M., M. Luby, and R. Rubinfeld. 1990. Self-testing/correcting with applications to numerical problems. STOC 22:73-83.

Boehm, B.W. 1981. Software Engineering Economics. Englewood Cliffs, N.J.: Prentice Hall.

Brocklehurst, S. and B. Littlewood. 1992. New ways to get accurate reliability measures. IEEE Software 9(4):34-42.

Brown, M.H. and J. Hershberger. 1992. Color and sound in algorithm animation. IEEE Computer 25(12):52-63.

Burnham, K.P. and W.S. Overton. 1978. Estimation of the size of a closed population when capture probabilities vary among animals. Biometrika 45:343-359.

Chillarege, R., I. Bhandari, J. Chaar, M. Halliday, D. Moebus, B. Ray, and M. Wong. 1992. Orthogonal defect classification: A concept for in-process measurements. IEEE Trans. Software Eng. SE-18:943-955.

Cohen, D.M., S.R. Dalal, A. Kaija, and G. Patton. 1994. The automatic efficient test generator (AETG) system. Pp. 303-309 in Proceedings of the 5th International Symposium on Software Reliability Engineering. Los Alamitos, Calif.: IEEE Computer Society Press.

Curtis, W. 1988. The impact of individual differences in programmers. Pp. 279-294 in Working with Computers: Theory versus Outcome, G.C. van der Veer et al., eds. San Diego, Calif.: Academic Press.

Dalal, S.R. and C.L. Mallows. 1988. When should one stop software testing? J. Am. Stat. Assoc. 83:872-879.

Dalal, S.R. and C.L. Mallows. 1990. Some graphical aids for deciding when to stop testing software. IEEE J. Selected Areas in Communications 8:169-175. (Special issue on software quality and productivity.)

Dalal, S.R. and C.L. Mallows. 1992. Buying with exact confidence. Ann. Appl. Probab. 2:752-765.

Dalal, S.R. and A.M. McIntosh. 1994. When to stop testing for large software systems with changing code. IEEE Trans. Software Eng. SE-20:318-323.

Dalal, S.R., E.B. Fowlkes, and A.B. Hoadley. 1989. Risk analysis of the space shuttle: Pre-Challenger prediction of failure. J. Am. Stat. Assoc. 84:945-957.

Dalal, S.R., J.R. Horgan, and J.R. Kettenring. 1994. Reliable software and communication II: Controlling the software development process. IEEE J. Selected Areas in Communications 12:33-39.

Dalkey, N.C. 1972. Studies in the Quality of Life: Delphi and Decision-Making. Lexington, Mass.: D.C. Heath & Co.

Dawid, A.P. 1984. Statistical theory: The prequential approach. J. R. Stat. Soc. London A 147:278-292.

DeMillo, R.A., D.S. Guindi, K.S. King, W.M. McCracken, and A.J. Offutt. 1988. An extended overview of the MOTHRA mutation system. Pp. 142-151 in Proceedings of the Second Workshop on Software Testing, Verification and Analysis. Alberta, Canada: Banff.

Ebert, C. 1992. Visualization techniques for analyzing and evaluating software measures. IEEE Trans. Software Eng. SE-18(11):1029-1034.

Eckhardt, D.E. and L.D. Lee. 1985. A theoretical basis of multiversion software subject to coincident errors. IEEE Trans. Software Eng. SE-11:1511-1517.

Eckhardt, D.E., A.K. Caglayan, J.C. Knight, L.D. Lee, D.F. McAllister, M.A. Vouk, and J.P. Kelly. 1991. An experimental evaluation of software redundancy as a strategy for improving reliability. IEEE Trans. Software Eng. SE-17(7):692-702.

Eick, S.G. 1994. Graphically displaying text. J. Comput. Graphical Stat. 3(2):127-142.

Eick, S.G., C.R. Loader, M.D. Long, S.A. Vander Wiel, and L.G. Votta. 1992a. Estimating software fault content before coding. Pp. 59-65 in Proceedings of the 14th International Conference on Software Engineering (Melbourne, Australia). Los Alamitos, Calif.: IEEE Computer Society Press.

Eick, S.G., J.L. Steffen, and E.E. Sumner. 1992b. A tool for visualizing line oriented software. IEEE Trans. Software Eng. SE-18(11):957-968.

Gansner, E.R., E.E. Koutsofios, S.C. North, and K.-P. Vo. 1993. A technique for drawing directed graphs. IEEE Trans. Software Eng. SE-19(3):214-230.

Halstead, M.H. 1977. Elements of Software Science. New York: Elsevier.

Hastie, T.J. and R.J. Tibshirani. 1990. Generalized Additive Models. London: Chapman & Hall.

Henrion, M. and B. Fischhoff. 1986. Assessing uncertainty in physical constants. Am. J. Phys. 54(9):791-798.

Horgan, J.R. and S. London. 1992. ATAC: A data flow testing tool for C. Pp. 2-10 in Proceedings of the Second Symposium on Assessment of Quality Software Development Tools (May 27-29, 1992, New Orleans, La.), E. Nahouraii, ed. Los Alamitos, Calif.: IEEE Computer Society Press.

Humphrey, W.S. 1988. Characterizing the software process: A maturity framework. IEEE Software 5:73-79.

Humphrey, W.S. 1989. Managing the Software Process. Reading, Mass.: Addison Wesley.

Iman, R.L. and W.J. Conover. 1982. A distribution free approach to inducing rank correlations among input variables. Commun. Stat., Part B 11:311-334.

Institute of Electrical and Electronics Engineers (IEEE). 1990. IEEE Standard Glossary of Software Engineering Terminology. IEEE Std. 610.12-1990. New York: IEEE, Inc.

Institute of Electrical and Electronics Engineers (IEEE). 1993. IEEE Standard for Software Productivity Metrics. IEEE Computer Society, IEEE Std. 1045-1992, January 11, 1993. New York: IEEE, Inc.

Ishikawa, K. 1976. Guide to Quality Control. Tokyo, Japan: Asian Productivity Organization.

Kahneman, D., P. Slovic, and A. Tversky, eds. 1982. Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.

Keller, T.W. 1993. Maintenance process metrics for space shuttle flight software. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

Kitchenham, B. 1991. Never mind the metrics; what about the numbers! Pp. 28-37 in Formal Aspects of Measurement, T. Denvir, R. Herman, and R.W. Whitty, eds. Proceedings of the BCS-FACS Workshop, May 5, 1991, South Bank University, London. New York: Springer-Verlag.

Kitchenham, B. 1992. Analyzing Software Data. Metrics Club Report. Manchester, England: National Computing Centre, Ltd.

Knight, J.C. and N.G. Leveson. 1986. Experimental evaluation of the assumption of independence in multiversion software. IEEE Trans. Software Eng. SE-12(1):96-109.

Lee, D. and M. Yannakakis. 1992. On-line minimization of transition systems. Pp. 264-274 in Proceedings of the 24th Annual ACM Symposium on Theory of Computing. New York: Association for Computing Machinery.

Leveson, N.G. 1986. Software safety: why, what, and how. ACM Comput. Surveys 18:125-163.

Lipton, R. 1989. New directions in testing. Pp. 191-202 in Distributed Computing and Cryptography, J. Feigenbaum and M. Merritt, eds. DIMACS: Series in Discrete Mathematics and Theoretical Computer Science, Vol. 2. Providence, R.I.: American Mathematical Society.

Littlewood, B. 1979. Software reliability model for modular program structure. IEEE Trans. Reliability R-28(3):241-246.

Littlewood, B. and D.R. Miller. 1989. Conceptual modeling of coincident failures in multiversion software. IEEE Trans. Software Eng. SE-15(12):1596-1614.

Littlewood, B. and L. Strigini. 1993. Validation of ultra-high dependability for software-based systems. Communications of the Association for Computing Machinery 36(11):69-80.

Mallows, C.L. 1973. Some comments on Cp. Technometrics 15:661-667.

McCabe, T.J. 1976. A complexity measure. IEEE Trans. Software Eng. SE-1(3):312-327.

McKay, M.D., W.J. Conover, and R.J. Beckman. 1979. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21:239-245.

Mosteller, F. and J.W. Tukey. 1977. Data Analysis and Regression: A Second Course in Statistics. Reading, Mass.: Addison Wesley.

Munson, J.C. 1993. The relationship between software metrics and quality metrics. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

National Research Council (NRC). 1991. The Future of Statistical Software. Committee on Applied and Theoretical Statistics, Board on Mathematical Sciences. Washington, D.C.: National Academy Press.

National Research Council (NRC). 1992. Combining Information: Statistical Issues and Opportunities for Research. Committee on Applied and Theoretical Statistics, Board on Mathematical Sciences. Washington, D.C.: National Academy Press. (Reprinted in 1993 by the American Statistical Association as Volume 1 in the ASA Contemporary Statistics series.)

Nayak, T.K. 1988. Estimating population size by recapture sampling. Biometrika 75:113-120.

Phadke, M.S. 1993. Robust design method for software engineering. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

Pill, J. 1971. The Delphi method: Substance, context, a critique and an annotated bibliography. Socio-Economic Planning Science 5:57-71.

Randell, B. and P. Naur, eds. 1968. Software Engineering Concepts and Techniques. NATO Science Committee, Proceedings of the NATO Conferences, October 7-11, 1968, Garmisch, Germany. New York: Petrocelli/Charter.

Sackman, H. 1970. Man-Computer Problem-Solving: Experimental Evaluation of Time-Sharing and Batch Processing. New York: Auerbach.

Siegrist, K. 1988a. Reliability of systems with Markov transfers of control. IEEE Trans. Software Eng. SE-14(7):1049-1053.

Siegrist, K. 1988b. Reliability of systems with Markov transfers of control, II. IEEE Trans. Software Eng. SE-14(10):1478-1480.

Singpurwalla, N.D. 1991. Determining an optimal time interval for testing and debugging software. IEEE Trans. Software Eng. SE-17(4):313-319.

Smith, A.F.M. and G.O. Roberts. 1993. Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J. R. Stat. Soc. London B 55(1):3-23.

Stasko, J. 1993. Software visualization. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

Stein, M. 1987. Large sample properties of simulations using Latin hypercube sampling. Technometrics 29:143-151.

Tukey, J.W. 1977. Exploratory Data Analysis. Reading, Mass.: Addison Wesley.

Tukey, J.W. 1991. Use of many covariates in clinical trials. Int. Stat. Rev. 59(2):123-128.

Vander Wiel, S.A. and L.G. Votta. 1993. Assessing software designs using capture-recapture methods. IEEE Trans. Software Eng. SE-19(11):1045-1054.

Zuse, H. 1991. Software Complexity: Measures and Methods. Berlin: de Gruyter.

Zweben, S. 1993. Statistical methods in a study of software re-use principles. Unpublished paper presented at Forum on Statistical Methods in Software Engineering, October 11-12, 1993, National Research Council, Washington, D.C.

Appendix: Forum Program

MONDAY, OCTOBER 11, 1993

8:00 AM Welcome and Introductions

8:05 AM Session on Software Process

Session Chair: Gloria J. Davis (NASA-Ames Research Center)

Invited Speakers: Ted W. Keller (IBM Corporation), David Card (Computer Sciences Corporation)

9:45 AM Break

10:15 AM Session on Software Metrics

Session Chair: Bill Curtis (Carnegie Mellon University)

Invited Speakers: Victor R. Basili (University of Maryland), John C. Munson (University of Florida)

NOON Break

1:00 PM Session on Software Dependability and Testing

Session Chair: Richard A. DeMillo (Purdue University)

Invited Speakers: John C. Knight (University of Virginia), Richard Lipton (Princeton University)

2:25 PM Break

3:15 PM Session on Case Studies

Session Chair: Daryl Pregibon (AT&T Bell Laboratories, Murray Hill)

Invited Speakers: Tsuneo Yamaura (Hitachi Computer Products-America, Inc.), Stuart Zweben (Ohio State University)

5:00 PM Adjourn

