Date post: | 22-Dec-2015 |
Lecture 7 Estimation
• Estimate size, then
• Estimate effort, schedule, and cost from size
• Bound estimates
CS 540 – Quantitative Software Engineering
Project Metrics: Why Estimate?
• Cost and schedule estimation
• Measure progress
• Calibrate models for future estimation
• Manage project scope
• Make Bid/No Bid decisions
• Make Buy/Build decisions
QSE Lambda Protocol
• Prospectus
» Measurable Operational Value
» Prototyping or Modeling
» sQFD (Quality Function Deployment)
» Schedule, Staffing, Quality Estimates
» ICED-T (Intuitive, Consistent, Efficient, Durable-Thoughtful)
» Trade-off Analysis
• Specification for Development Plan
» Project Feature List
» Development Process
» Size Estimates
» Staff Estimates
» Schedule Estimates
» Organization
» Gantt Chart
Approaches to Cost Estimation
• By expert
• By analogies
• Decomposition
• Parkinson’s Law; work expands to fill time available
• Price to win / customer willingness to pay
• Lines of Code
• Function Points
• Mathematical Models: Function Points & COCOMO
Heuristics to do Better Estimates
• Decompose the Work Breakdown Structure to the lowest possible level and type of software
• Review assumptions with all stakeholders
• Do your homework: past organizational experience
• Retain contact with developers
• Update estimates and track new projections (and warn)
• Use multiple methods
• Reuse makes it easier (and more difficult)
• Use a ‘current estimate’ scheme
Heuristics to Cope with Estimates
• Add and train developers early
• Use gurus for tough tasks
• Provide manufacturing and admin support
• Sharpen tools
• Eliminate unrelated work and red tape (50% issue)
• Devote a full-time end user to the project
• Increase level of exec sponsorship to break new ground (new tools, techniques, training)
• Set a schedule goal date but commit only after detailed design
• Use broad estimation ranges rather than single-point estimates
Popular Methods for Effort Estimation
• Parametric Estimation
• Wideband Delphi
• COCOMO
• SLIM (Software Lifecycle Management)
• SEER-SEM
• Function Point Analysis
• PROBE (Proxy-Based Estimation, SEI CMM)
• Planning Game (XP) Explore-Commit
• Program Evaluation and Review Technique (PERT)
SEER-SEM: System Evaluation and Estimation of Resources
Sizing. How large is the software project being estimated (Lines of Code, Function Points, Use Cases, etc.)
Technology. What is the possible productivity of the developers (capabilities, tools, practices, etc.)
Effort and Schedule Calculation. What amount of effort and time are required to complete the project?
Constrained Effort/Schedule Calculation. How does the expected project outcome change when schedule and staffing constraints are applied?
Activity and Labor Allocation. How should activities and labor be allocated into the estimate?
Cost Calculation. Given expected effort, duration, and the labor allocation, how much will the project cost?
Defect Calculation. Given product type, project duration, and other information, what is the expected, objective quality of the delivered software?
Maintenance Effort Calculation. How much effort will be required to adequately maintain and upgrade a fielded software system?
Progress. How is the project progressing and where will it end up? Also, how to replan.
Validity. Is this development achievable based on the technology involved?
Wide Band Delphi
• Convene a group of experts
• Coordinator provides each expert with the specification
• Experts make private estimates in interval format: most likely value and an upper and lower bound
• Coordinator prepares a summary report indicating group and individual estimates
• Experts discuss and defend estimates
• Group iterates until consensus is reached
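The coordinator's summary step can be sketched in Python. This is only an illustration: the function names, the sample estimates, and the 10% consensus tolerance are assumptions, since the slide does not say how consensus is judged.

```python
from statistics import mean


def summarize_round(estimates):
    """Summarize one Delphi round.

    estimates maps an expert's name to (low, likely, high) in staff-months.
    """
    for name, (low, likely, high) in estimates.items():
        if not low <= likely <= high:
            raise ValueError(f"{name}: expected low <= likely <= high")
    likely_values = [likely for _, likely, _ in estimates.values()]
    return {
        "group_likely": mean(likely_values),
        "group_low": min(low for low, _, _ in estimates.values()),
        "group_high": max(high for _, _, high in estimates.values()),
        "spread": max(likely_values) - min(likely_values),
    }


def has_consensus(summary, tolerance=0.10):
    # Hypothetical stopping rule: most-likely values lie within
    # `tolerance` of the group mean. The slide gives no numeric rule.
    return summary["spread"] <= tolerance * summary["group_likely"]


# Invented sample data for two rounds.
round_1 = {"expert_a": (8, 12, 20), "expert_b": (10, 15, 30), "expert_c": (9, 18, 25)}
round_2 = {"expert_a": (10, 14, 20), "expert_b": (11, 15, 22), "expert_c": (11, 15, 21)}

for label, data in (("round 1", round_1), ("round 2", round_2)):
    summary = summarize_round(data)
    print(label, summary, "consensus:", has_consensus(summary))
```

In this invented data the first round is too widely spread to stop, and the second round converges after discussion.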
Minimum Time: PERT and GANTT
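[Figure: staff-months plotted against time, marking $T_{\text{theoretical}}$, 75% of $T_{\text{theoretical}}$, a linear-increase region, and an "impossible design" region.]
Boehm: “A project cannot be done in less than 75% of theoretical time”
$T_{\text{theoretical}} = 2.5 \times \sqrt[3]{\text{staff-months}}$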
But, how can I estimate staff months?
Sizing Software Projects
$\text{Effort} = (\text{productivity})^{-1} \times (\text{size})^{c}$
productivity ≡ KLOC per staff-month
size ≡ KLOC
[Figure: effort in staff-months versus size in lines of code or function points, with an axis mark at 500.]
Understanding the equations
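Consider a transaction project of 38,000 lines of code. What is the shortest time it will take to develop? Module development is about 400 SLOC/staff-month (0.400 KSLOC/SM).
$\text{Effort} = (\text{productivity})^{-1} \times (\text{size})^{c}$
$= (1 / 0.400\ \text{KSLOC/SM}) \times (38\ \text{KSLOC})^{1.02}$
$= 2.5 \times 38^{1.02} \approx 100\ \text{SM}$
$\text{Min time} = 0.75 \times T = 0.75 \times 2.5 \times \text{SM}^{1/3}$
$\approx 1.875 \times 100^{1/3}$
$\approx 1.875 \times 4.64 \approx 9\ \text{months}$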
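The same arithmetic can be checked with a short Python sketch. The function names are illustrative; the constants (2.5, 0.75, exponent 1.02, 0.400 KSLOC/SM) come from the example above.

```python
def effort_staff_months(size_ksloc, productivity_ksloc_per_sm, exponent):
    """Effort = (productivity)^-1 * (size)^c, in staff-months."""
    return (1.0 / productivity_ksloc_per_sm) * size_ksloc ** exponent


def theoretical_time_months(effort_sm):
    """T_theoretical = 2.5 * cube root of staff-months."""
    return 2.5 * effort_sm ** (1.0 / 3.0)


def minimum_time_months(effort_sm):
    """Boehm's bound: no less than 75% of the theoretical time."""
    return 0.75 * theoretical_time_months(effort_sm)


effort = effort_staff_months(size_ksloc=38, productivity_ksloc_per_sm=0.400, exponent=1.02)
min_time = minimum_time_months(effort)
print(f"effort = {effort:.1f} SM, minimum time = {min_time:.1f} months")
```

Without intermediate rounding the result is a little over 100 staff-months and a little under 9 months, in line with the rounded figures on the slide.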
Productivity = f(size)
[Figure: productivity (function points per staff-month, axis 0 to 12) versus size in function points (axis 20 to 40,960, doubling at each tick), with curves for Bell Laboratories data and Capers Jones data.]
Lines of Code
LOC ≡ Lines of Code
KLOC ≡ Thousands of LOC
KSLOC ≡ Thousands of Source LOC
NCSLOC ≡ New or Changed SLOC
Productivity per staff-month:
» 50 NCSLOC for OS code (or real-time system)
» 250-500 NCSLOC for intermediary applications (high risk, on-line)
» 500-1000 NCSLOC for normal applications (low risk, on-line)
» 10,000 – 20,000 NCSLOC for reused code
Reuse note: Sometimes, reusing code that does not provide the exact functionality needed can be achieved by reformatting input/output. This decreases performance but dramatically shortens development time.
Bernstein’s rule of thumb
Productivity: Measured in 2000
Classical rates: 130 – 195 NCSLOC
Evolutionary approaches: 244 – 325 NCSLOC
New embedded flight software: 17 – 105 NCSLOC
Heuristics for requirements engineering
• Move some of the desired functionality into version 2
• Deliver product in stages 0.2, 0.4…
• Eliminate features
• Simplify features
• Reduce gold plating
• Relax the specific feature specifications
Function Point (FP) Analysis
• Useful during requirement phase
• Substantial data supports the methodology
• Software skills and project characteristics are accounted for in the Adjusted Function Points
• FP is technology and project process dependent, so technology changes require recalibration of project models
• Convert Unadjusted FPs (UFP) to LOC for a specific language (technology), then use a model such as COCOMO
Function Point Calculations
Unadjusted Function Points
$\text{UFP} = 4I + 5O + 4E + 10L + 7F$, where
I ≡ Count of input types that are user inputs and change data structures.
O ≡ Count of output types.
E ≡ Count of inquiry types or inputs controlling execution [think menu selections].
L ≡ Count of logical internal files, internal data used by the system [think index files; they are groups of logically related data entirely within the application's boundary and maintained by external inputs].
F ≡ Count of interfaces: data output or shared with another application.
Note that the constants in the nominal equation can be calibrated to a specific software product line.
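A minimal Python sketch of the nominal equation, with the constants passed in so that they can be recalibrated. The counts and the "calibrated" weights below are invented for illustration; only the nominal weights 4, 5, 4, 10, 7 come from the slide.

```python
# Nominal weights from UFP = 4I + 5O + 4E + 10L + 7F.
NOMINAL_WEIGHTS = {"I": 4, "O": 5, "E": 4, "L": 10, "F": 7}


def unadjusted_function_points(counts, weights=NOMINAL_WEIGHTS):
    """Weighted sum of the five counts (I, O, E, L, F)."""
    missing = set(weights) - set(counts)
    if missing:
        raise ValueError(f"missing counts for: {sorted(missing)}")
    return sum(weights[key] * counts[key] for key in weights)


# Hypothetical counts for a small application.
counts = {"I": 10, "O": 6, "E": 4, "L": 3, "F": 2}

# Hypothetical recalibration for one product line (assumed values).
calibrated = dict(NOMINAL_WEIGHTS, I=3, L=12)

print(unadjusted_function_points(counts))
print(unadjusted_function_points(counts, calibrated))
```

Keeping the weights in a dictionary makes a product-line calibration a data change rather than a code change.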
External Inputs – One updates two files
External Inputs (EI) - when data crosses the boundary from outside to inside. This data may come from a data input screen or another application.
External Interface Table
For example, an EI that references or updates 2 File Types Referenced (FTRs) and has 7 data elements would be assigned a ranking of average and an associated rating of 4.
File Type References (FTR’s) are the sum of Internal Logical Files referenced or updated and External Interface Files referenced.
External Output from 2 Internal Files
External Outputs (EO) – when data passes across the boundary from inside to outside.
External Inquiry drawing from 2 ILFs
External Inquiry (EQ) - an elementary process with both input and output components that result in data retrieval from one or more internal logical files and external interface files. The input process does not update Internal Logical File, and there is no derived data.
EO and EQ Table mapped to Values
Adjusted Function Points
Accounting for Physical System Characteristics
Characteristic Rated by System User
• 0-5 based on “degree of influence”
• 3 is average
Unadjusted Function Points (UFP) × General System Characteristics (GSC) = Adjusted Function Points (AFP)
$\text{AFP} = \text{UFP} \times (0.65 + 0.01 \times \text{GSC})$; note GSC = VAF = TDI
1. Data Communications
2. Distributed Data/Processing
3. Performance Objectives
4. Heavily Used Configuration
5. Transaction Rate
6. On-Line Data Entry
7. End-User Efficiency
8. On-Line Update
9. Complex Processing
10. Reusability
11. Conversion/Installation Ease
12. Operational Ease
13. Multiple Site Use
14. Facilitate Change
Complexity Table
TYPE             SIMPLE   AVERAGE   COMPLEX
INPUT (I)           3        4         6
OUTPUT (O)          4        5         7
INQUIRY (E)         3        4         6
LOG INT (L)         7       10        15
INTERFACES (F)      5        7        10
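The complexity table can be applied as a lookup, sketched below in Python. The weights are those in the table; the function name and the sample counts are illustrative assumptions.

```python
# Weights from the complexity table, keyed by type code and complexity level.
WEIGHTS = {
    "I": {"simple": 3, "average": 4, "complex": 6},
    "O": {"simple": 4, "average": 5, "complex": 7},
    "E": {"simple": 3, "average": 4, "complex": 6},
    "L": {"simple": 7, "average": 10, "complex": 15},
    "F": {"simple": 5, "average": 7, "complex": 10},
}


def weighted_ufp(counts):
    """counts maps (type, level) to the number of items of that kind."""
    return sum(WEIGHTS[kind][level] * n for (kind, level), n in counts.items())


# Hypothetical tally from a requirements review.
sample = {
    ("I", "simple"): 2,
    ("I", "complex"): 1,
    ("O", "average"): 3,
    ("L", "complex"): 1,
    ("F", "simple"): 2,
}
print(weighted_ufp(sample))
```

Rating every item as average reproduces the constants of the nominal equation UFP = 4I + 5O + 4E + 10L + 7F.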
Complexity Factors
1. Problem Domain ___
2. Architecture Complexity ___
3. Logic Design - Data ___
4. Logic Design - Code ___
Total ___
Complexity = Total/4 = _________
Problem Domain Measure of Complexity (1 is simple and 5 is complex)
1. All algorithms and calculations are simple.
2. Most algorithms and calculations are simple.
3. Most algorithms and calculations are moderately complex.
4. Some algorithms and calculations are difficult.
5. Many algorithms and calculations are difficult.
Score ____
Architecture Complexity
Measure of Complexity (1 is simple and 5 is complex)
1. Code ported from one known environment to another. Application does not change more than 5%.
2. Architecture follows an existing pattern. Process design is straightforward. No complex hardware/software interfaces.
3. Architecture created from scratch. Process design is straightforward. No complex hardware/software interfaces.
4. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces exist but they are well defined and unchanging.
5. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces are ill defined and changing.
Score ____
Logic Design -Data
1. Simple well defined and unchanging data structures. Shallow inheritance in class structures. No object classes have inheritance greater than 3.
2. Several data element types with straightforward relationships. No object classes have inheritance greater than
3. Multiple data files, complex data relationships, many libraries, large object library. No more than ten percent of the object classes have inheritance greater than three. The number of object classes is less than 1% of the function points
4. Complex data elements, parameter passing module-to-module, complex data relationships, and many object classes have inheritance greater than three. A large but stable number of object classes.
5. Complex data elements, parameter passing module-to-module, complex data relationships, and many object classes have inheritance greater than three. A large and growing number of object classes. No attempt to normalize data between modules.
Score ____
Logic Design- Code
1. Nonprocedural code (4GL, generated code, screen skeletons). High cohesion. Programs inspected. Module size constrained between 50 and 500 Source Lines of Code (SLOCs).
2. Program skeletons or patterns used. High cohesion. Programs inspected. Module size constrained between 50 and 500 SLOCs. Reused modules. Commercial object libraries relied on.
3. Well-structured, small modules with low coupling. Object class methods well focused and generalized. Modules with single entry and exit points. Programs reviewed.
4. Complex but known structure, randomly sized modules. Some complex object classes. Error paths unknown. High coupling.
5. Code structure unknown, randomly sized modules, complex object classes and error paths unknown. High coupling.
Score __
Computing Function Points
See http://www.engin.umd.umich.edu/CIS/course.des/cis525/js/f00/artan/functionpoints.htm
Adjusted Function Points
• Now account for 14 characteristics on a 6-point scale (0-5)
• Total Degree of Influence (DI) is the sum of scores
• DI is converted to a technical complexity factor (TCF): $\text{TCF} = 0.65 + 0.01 \times \text{DI}$
• Adjusted Function Points are computed by $\text{FP} = \text{UFP} \times \text{TCF}$
• For any language there is a direct mapping from Function Points to LOC
Beware: function point counting is hard and needs special skills
Function Points Qualifiers
• Based on counting data structures
• Focus is on-line database systems
• Less accurate for Web applications
• Even less accurate for games, finite state machines, and algorithm software
• Not useful for extended machine software and compilers
An alternative to NCKSLOC because estimates can be based on requirements and design data.
SLOC Defined
• Single statement, not two separated by semicolon
• Line feed
• All written statements (OA&M)
• No comments
• Count all instances of calls, subroutines, …
There are no industry standards and SLOC can be fudged
Initial Conversion
Language        Median SLOC/function point
C               104
C++             53
HTML            42
Java            59
Perl            60
J2EE            50
Visual Basic    42
http://www.qsm.com/FPGearing.html
Function Points = UFP × TCF = 78 × 0.96 = 74.88 ≈ 75 function points
78 UFP * 53 (C++) SLOC / UFP = 4,134 SLOC
≈ 4.2 KSLOC
(Reference for SLOC per function point: http://www.qsm.com/FPGearing.html)
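The conversion from function points to SLOC is a table lookup and a multiplication, sketched here in Python. The gearing factors are the median values listed in the Initial Conversion table; the function name is an illustrative assumption.

```python
# Median SLOC per function point, as listed in the Initial Conversion table.
SLOC_PER_FP = {
    "C": 104,
    "C++": 53,
    "HTML": 42,
    "Java": 59,
    "Perl": 60,
    "J2EE": 50,
    "Visual Basic": 42,
}


def estimate_sloc(function_points, language):
    """Convert a function point count to SLOC for one language."""
    if language not in SLOC_PER_FP:
        raise KeyError(f"no gearing factor for {language!r}")
    return function_points * SLOC_PER_FP[language]


sloc = estimate_sloc(78, "C++")
print(sloc, "SLOC =", sloc / 1000, "KSLOC")
```

The same 78 function points would come out almost twice as large in C, which is why the size estimate has to be redone whenever the implementation language changes.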
Understanding the equations
For 4,200 lines of code, what is the shortest time it will take to develop? Module development is about 400 SLOC/staff-month.
From COCOMO (by Barry Boehm): $\text{Effort} = 2.4 \times (\text{size})^{c}$
What is ‘2.4’?
$\text{Effort} = 2.4 \times (\text{size})^{c} = (1 / 0.416) \times (\text{size})^{c}$
$\text{Effort} = (\text{productivity})^{-1} \times (\text{size})^{c}$
where productivity = 0.400 KSLOC/SM (400 SLOC/SM) from the statement of the problem
$= (1 / 0.400\ \text{KSLOC/SM}) \times (4.2\ \text{KSLOC})^{1.16}$
$= 2.5 \times 4.2^{1.16} \approx 13\ \text{SM}$
Minimum Time
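$\text{Theoretical time} = 2.5 \times \sqrt[3]{\text{staff-months}}$
$\text{Min time} = 0.75 \times \text{Theoretical time} = 0.75 \times 2.5 \times \text{SM}^{1/3}$
$\approx 1.875 \times 13^{1/3}$
$\approx 1.875 \times 2.35 \approx 4.4\ \text{months}$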
Function Point pros and cons
Pros:
• Language independent
• Understandable by client
• Simple modeling
• Hard to fudge
• Visible feature creep
Cons:
• Labor intensive
• Extensive training
• Inexperience results in inconsistent results
• Weighted to file manipulation and transactions
• Systematic error introduced by single person; multiple raters advised
Easy?
“When performance does not meet the estimate, there are two possible causes:
poor performance or poor estimates.
In the software world, we have ample evidence that our estimates stink, but virtually no evidence that people in general don’t work hard enough or intelligently enough.” -- Tom DeMarco
Capers Jones Expansion Table
Bernstein’s Trends in Software Expansion
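[Figure: expansion factor (log scale, 1 to 1000) versus technology change, 1960 to 2000. Technologies shown: machine instructions (1960), macro assembler (1965), high-level language (1970), database manager (1975), on-line (1980), prototyping (1985), subsec time sharing (1990), object-oriented programming (1995), large-scale reuse (2000), plus regression testing, 4GL, and small-scale reuse. Plotted expansion factors: 3, 15, 30, 37.5, 47, 75, 81, 113, 142, 475, 638. Annotation: order of magnitude every twenty years.]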
Sizing Software Projects
$\text{Effort} = (\text{productivity})^{-1} \times (\text{size})^{c}$
[Figure: effort in staff-months versus size in lines of code or function points, with axis marks at 500 and 1000.]
Regression Models
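• Effort:
» Walston-Felix: $\text{Effort} = 5.2 \times \text{KLOC}^{0.91}$
» COCOMO: $\text{Effort} = 2.4 \times \text{KLOC}^{1.05}$
» Halstead: $\text{Effort} = 0.7 \times \text{KLOC}^{1.50}$
• Schedule:
» Walston-Felix: $\text{Time} = 2.5 \times E^{0.35}$
» COCOMO: $\text{Time} = 2.5 \times E^{0.38}$
» Putnam: $\text{Time} = 2.4 \times E^{0.33}$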
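To see how far these regression models disagree, the sketch below evaluates all of them for one size. The coefficients are the ones listed above; the 10 KLOC sample size and the function names are assumptions made for illustration.

```python
# (coefficient, exponent) pairs for Effort = a * KLOC^b, in staff-months.
EFFORT_MODELS = {
    "Walston-Felix": (5.2, 0.91),
    "COCOMO": (2.4, 1.05),
    "Halstead": (0.7, 1.50),
}

# (coefficient, exponent) pairs for Time = a * E^b, in months.
SCHEDULE_MODELS = {
    "Walston-Felix": (2.5, 0.35),
    "COCOMO": (2.5, 0.38),
    "Putnam": (2.4, 0.33),
}


def power_law(coefficient, exponent, x):
    return coefficient * x ** exponent


def compare_effort(kloc):
    return {name: power_law(a, b, kloc) for name, (a, b) in EFFORT_MODELS.items()}


def compare_schedule(effort_sm):
    return {name: power_law(a, b, effort_sm) for name, (a, b) in SCHEDULE_MODELS.items()}


efforts = compare_effort(10)  # hypothetical 10 KLOC project
for name, effort in efforts.items():
    print(f"{name}: {effort:.1f} staff-months")
for name, months in compare_schedule(efforts["COCOMO"]).items():
    print(f"{name}: {months:.1f} months")
```

For this sample size the effort estimates differ by roughly a factor of two, which is a reason to use multiple methods and to report a range rather than a single point.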
COCOMO
• COnstructive COst MOdel
• Based on Boehm’s analysis of a database of 63 projects; models based on regression analysis of these systems
• Linked to classic waterfall model
• Size is the number of Source Lines of Code (SLOC), expressed in thousands of delivered source instructions (NCKSLOC); excludes comments and unmodified software
• Original model has 3 versions and considers 3 types of systems:
» Organic, e.g., simple business systems
» Embedded, e.g., avionics
» Semi-detached, e.g., management inventory systems
COCOMO Model
$\text{Effort (staff-months)} = b \times \text{NCKSLOC}^{c}$
                 b      c
organic         2.4    1.05
semi-detached   3.0    1.12
embedded        3.6    1.20
COCOMO System Types
                SIZE     INNOVATION   DEADLINE    CONSTRAINTS
Organic         Small    Little       Not tight   Stable
Semi-Detached   Medium   Medium       Medium      Medium
Embedded        Large    Greater      Tight       Complex hdw/customer interfaces
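A minimal sketch of the basic COCOMO effort equation for the three system types, using the b and c values from the COCOMO Model table. The 30 KSLOC sample size and the function name are assumptions for illustration.

```python
# (b, c) per system type for Effort = b * NCKSLOC^c, in staff-months.
COCOMO_MODES = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def cocomo_effort(ncksloc, mode):
    """Basic COCOMO effort in staff-months for a given system type."""
    if ncksloc <= 0:
        raise ValueError("size must be positive")
    b, c = COCOMO_MODES[mode]
    return b * ncksloc ** c


size = 30  # hypothetical size in thousands of new or changed SLOC
for mode in COCOMO_MODES:
    print(f"{mode}: {cocomo_effort(size, mode):.0f} staff-months")
```

The same code size costs noticeably more effort as the system type moves from organic to embedded, so choosing the system type matters as much as the size estimate.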
Proposed System
[Figure: diagram of the proposed order-processing system. Labels: Customer, Users, Catalog, Orders, Inventory, Dispatch, Accounting; Order Creation, Credit Check, Inventory Assignment, Held Order Processing, Completion, Dispatch Support, Problem Resolution, Management Reporting, OA&M; Check Status, Create Order, Shipment Notice, Assign Inventory to Order, Inventory Assigned, New Inventory for Held Orders, Assign Order to Truck, Truckload Report, Shipping Invoices, Order Update, Order Display, Check Credit & Completion, Management Reports.]
Case Study:
Use Cases                          Transactions   Type   Complexity   UFP
End Users
  Logon                                  1          I         3          3
  View Last Bill                         1          Q         6          6
  Create Account                         1          I         6          6
  View Current Services                  1          Q         4          4
  Establish Analog CATV Service          1          I         6          6
  Add Data Service                       1          I         6          6
  Add/Delete a Premium Channel           1          I         4          4
  Add/Delete a Digital Package           1          I         6          6
  View Trouble Status                    1          Q         4          4
  View Order Status                      1          Q         3          3
  View Information                       5          Q         3         15
BackEnd
  Get Account & Service Info             1          N        10         10
  Get Last Bill                          1          N        10         10
  Create Account                         1          N        10         10
  Create Order                           1          N        10         10
  Account Validation                     3          N         7         21
  Order Validation                       3          N         7         21
  Get Trouble Status                     1          N         7          7
  Get Order Status                       1          N         7          7
Management
  View Customer Use Statistics           5          Q         4         20
  Troubleshoot Customer Scenario         5          Q         6         30
OA&M
  User Administration                    2          F         7         14
  Table Administration                  15          F         7        105
  Usage DB Administration                1          F        15         15
  Temp DB Admin                          1          F        15         15
  Schedule Reports                       1          I         4         15
  Control Application                    1          I         4         15
  Create Reports                         1          I         6         15
  Application Alarms                     1          O         7         15
Total Unadjusted Function Points                                       418
GSC
General System Characteristic         Rating
1  Data Communications                   5
2  Distributed Data/Processing           4
3  Performance Objectives                5
4  Heavily Used Configuration            5
5  Transaction Rate                      5
6  On-Line Data Entry                    3
7  End-User Efficiency                   5
8  On-Line Update                        3
9  Complex Processing                    3
10 Reusability                           2
11 Conversion/Installation Ease          3
12 Operational Ease                      4
13 Multiple Site Use                     4
14 Facilitate Change                     4
Total Degree of Influence               55
$\text{AFP} = \text{UFP} \times (0.65 + 0.01 \times \text{GSC}) = 418 \times 1.2$
Adjusted Function Points 502
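The adjustment for this case study can be reproduced with a few lines of Python. The ratings and the UFP of 418 are taken from the tables above; the function names are illustrative assumptions.

```python
# Ratings (0-5) for the 14 general system characteristics in the case study.
GSC_RATINGS = {
    "Data Communications": 5,
    "Distributed Data/Processing": 4,
    "Performance Objectives": 5,
    "Heavily Used Configuration": 5,
    "Transaction Rate": 5,
    "On-Line Data Entry": 3,
    "End-User Efficiency": 5,
    "On-Line Update": 3,
    "Complex Processing": 3,
    "Reusability": 2,
    "Conversion/Installation Ease": 3,
    "Operational Ease": 4,
    "Multiple Site Use": 4,
    "Facilitate Change": 4,
}


def total_degree_of_influence(ratings):
    """Sum the 14 ratings after checking count and range."""
    if len(ratings) != 14:
        raise ValueError("expected exactly 14 characteristics")
    if any(not 0 <= value <= 5 for value in ratings.values()):
        raise ValueError("each rating must be between 0 and 5")
    return sum(ratings.values())


def adjusted_function_points(ufp, ratings):
    """AFP = UFP * (0.65 + 0.01 * total degree of influence)."""
    return ufp * (0.65 + 0.01 * total_degree_of_influence(ratings))


tdi = total_degree_of_influence(GSC_RATINGS)
afp = adjusted_function_points(418, GSC_RATINGS)
print(f"TDI = {tdi}, AFP = {afp:.1f}")
```

Because each rating is between 0 and 5, the multiplier can range only from 0.65 to 1.35, so the adjustment moves the count by at most 35% in either direction.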
Applying the equations
For 418 UFP x 63 (Java) SLOC/FP = 26334 SLOC
≈ 30 KSLOC
How long will it take to develop?
Module development is about 330 SLOC/staff month
Summary: Popular Methods for Effort Estimation
• Parametric Estimation
• Wideband Delphi
• COCOMO
• SLIM (Software Lifecycle Management)
• SEER-SEM
• Function Point Analysis
• PROBE (Proxy-Based Estimation, SEI CMM)
• Planning Game (XP) Explore-Commit
• Program Evaluation and Review Technique (PERT)
Business Realities
• Customer Affordability/Willingness to pay: design the system to win the business
• Conflict of Interests: Project Manager (affordability and profit) vs. Development/Test (pad budgets)
• Estimation with uncertainty: the more you know (better understood), the higher the estimate
• Personality Traits: risk aversion, tolerance for ambiguity
• Staffing Issues: sometimes any business is better than no business
• Opportunity Cost: winning a bid prevents you from working on other deals
• Strategic Interests: losing $ is sometimes ok