Software Efforts at the NRO Cost Group
21st International Forum on COCOMO and Software Cost Modeling
November 8, 2006
Purpose
• Explain how the NCG uses USC’s code counter data
• Introduce NCG’s Software Database
• Provide insight into Difference Application results and software trends
Background
• NCG maps the USC output files to a CSCI/CSC and Work Breakdown Structure (WBS)
  – Mapping is most meaningful when done at the lowest functional level possible within the WBS
• Mapping is very labor intensive if done through Excel spreadsheets or other manual methods
CSCI/CSC: DP/DPAP
CSCI/CSC: DP/DPCC
CSCI/CSC: DP/DPCU
Software Database
• NCG created a software database which automates the mapping process
• The software database is the primary tool for storing ALL NCG software-related data. The database will provide:
  – Low-level functional breakout
  – Traceability to past programs
  – Historical representation of the development process
• The database will help us better understand trends for:
  – Code counts
  – Staffing profiles
  – Discrepancy Reports (DRs)
  – Schedules
Database Functionality
• Database allows for the importation of:
  – Code counter output files for any language
  – Difference Application output files
  – CSCI/CSC listing
  – Staffing data
  – DR data
  – Cost and Hours
  – Programmatic/Technical data
• Database allows the mapping of output file folder paths to a WBS and CSCI/CSC
Walkthrough of the Software Database's Key Functionalities
[Chart: Test Program. SLOC by CSCI (CC, DP, MP, PA) for Baselines 1–3; y-axis: SLOC, 0–250,000]
[Chart: Test Program (SLOC by CSCI). SLOC by baseline (Baselines 1–3), broken out by CSCI (CC, DP, MP, PA); y-axis: SLOC, 0–250,000]
[Chart: Test Program SW Heads. Equivalent heads over time (1/1/2005 to 7/1/2014, semiannual), broken out by CSCI (C&C, PA, MP, DP); y-axis: Equivalent Heads, -10 to 70]
[Chart: Test Program / MP CSCI. # DRs over time (1/22/2006 to 5/14/2006, biweekly), by priority 1–3; y-axis: # DRs, 0–45]
[Chart: DRs By Priority. # of DRs over time (5/11/1999 to 9/11/2004, bimonthly), by priority 1–5; y-axis: # of DRs, 0–70]
Insight into Difference Application Results and Software Trends
Introduction
• NCG collects datasheet information regarding program “Complexity Attributes”
  – This can be defined as “Program Development Environment” data
    • e.g. number of years of experience of programmers …
• We also collect USC output files for LOC and Reuse trends
• We collect Staffing Profiles
  – Staffing Profiles are broken out by CSCI
• We collect Discrepancy Reports (DRs)
  – DRs are broken out by Priorities and ranked in some manner
• Are there any useful trends if we analyze all the data collectively?
CSCI #1 Example
[Chart: CSCI 1 Diff Results. SLOC by sample (1–6), broken out as New, Deleted, Modified, Unmodified; y-axis: SLOC, 0–100,000]
[Chart: CSCI 1. Operator density ratio by baseline (1–8) for Logical, Trig, Log, Preproc, Math, Assign, Ptr, Cond; y-axis: Density Ratio, 0–30%]
[Chart: CSCI 1. Loop density ratio by baseline (1–8) for Loop1–Loop4; y-axis: Density Ratio, 0–0.8%]
[Chart: CSCI 1. Heads (0–18) and New SLOC (0–60,000) by sample (1–6)]
Summary of CSCI #1:
• Code reaches a stable point
• Increase in staff to fix DRs
• New code looks proportional to staffing!
• Heritage code written in C and the remainder written in C++
CSCI #2 Example
Summary of CSCI #2:
[Chart: CSCI 2 Diff Results. SLOC by sample (1–6), broken out as New, Deleted, Modified, Unmodified; y-axis: SLOC, 0–250,000]
[Chart: CSCI 2. Heads (0–20) and New SLOC (0–120,000) by sample (1–6)]
[Chart: CSCI 2. Loop density ratio by baseline (1–8) for Loop1–Loop4; y-axis: Density Ratio, 0–0.9%]
[Chart: CSCI 2. Operator density ratio by baseline (1–8) for Logical, Trig, Log, Preproc, Math, Assign, Ptr, Cond; y-axis: Density Ratio, 0–18%]
• Keep an eye on the peak staffing levels
• Heritage code written in C and the remainder written in C++
CSCI #3 Example
Summary of CSCI #3:
[Chart: CSCI 3 Diff Results. SLOC by sample (1–6), broken out as New, Deleted, Modified, Unmodified; y-axis: SLOC, 0–250,000]
[Chart: CSCI 3. Operator density ratio by baseline (1–8) for Logical, Trig, Log, Preproc, Math, Assign, Ptr, Cond; y-axis: Density Ratio, 0–25%]
[Chart: CSCI 3. Loop density ratio by baseline (1–8) for Loop1–Loop4; y-axis: Density Ratio, 0–2.0%]
[Chart: CSCI 3. Heads (0–30) and New SLOC (0–160,000) by sample (1–6)]
• Peak staffing trend?
• Heritage code written in C and the remainder written in C++
CSCI #4 Example
Summary of CSCI #4:
[Chart: CSCI 4 Diff Results. SLOC by sample (1–6), broken out as New, Deleted, Modified, Unmodified; y-axis: SLOC, 0–140,000]
[Chart: CSCI 4. Heads (0–25) and New SLOC (0–60,000) by sample (1–6)]
[Chart: CSCI 4. Loop density ratio by baseline (1–8) for Loop1–Loop4; y-axis: Density Ratio, 0–2.0%]
[Chart: CSCI 4. Operator density ratio by baseline (1–8) for Logical, Trig, Log, Preproc, Math, Assign, Ptr, Cond; y-axis: Density Ratio, 0–35%]
• Low Modified code trend continues
• Heritage code written in C and the remainder written in C++
CSCI #5 Example
Summary of CSCI #5:
[Chart: CSCI 5 Diff Results. SLOC by sample (1–6), broken out as New, Deleted, Modified, Unmodified; y-axis: SLOC, 0–800,000]
[Chart: CSCI 5. Operator density ratio by baseline (1–8) for Logical, Trig, Log, Preproc, Math, Assign, Ptr, Cond; y-axis: Density Ratio, 0–30%]
[Chart: CSCI 5. Loop density ratio by baseline (1–8) for Loop1–Loop4; y-axis: Density Ratio, 0–0.6%]
[Chart: CSCI 5. Heads (0–4) and New SLOC (0–450,000) by sample (1–6)]
Any guess on why this looks different from the previous trends?
Reuse
• This CSCI shows the usefulness of breaking out New and Deleted code from the Total code counts
• What assessment can be made of this development?
  – Unstable requirements?
  – Re-writing the same code?
    • Code cleanup occurring after each New code delivery?
[Chart: SLOC by baseline (1–11) for Total, Unmodified, Modified, Deleted, New; y-axis: SLOC, 0–30,000]
• Total counts show change, but without “Diff”, you can’t see why
• Notice that there is nearly 5,000 New SLOC and 5,000 Deleted SLOC; this doesn’t look STABLE
• Deleted code size is proportional to previous New code size
• Code looks stable at this point, but in reality it was dynamic!
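The churn pattern described above, where the Deleted count at one baseline roughly matches the New count at the previous baseline, can be flagged mechanically. This is a minimal sketch under assumed data: the SLOC arrays and the 25% tolerance are hypothetical, not values from the briefing.

```python
# Sketch of the churn check described above: flag baselines where the Deleted
# count roughly tracks the previous baseline's New count (hypothetical data).

def churn_pairs(new, deleted, tolerance=0.25):
    """Return baseline indices where deleted[i+1] is within `tolerance`
    (relative) of new[i], suggesting re-written rather than stable code."""
    flagged = []
    for i in range(len(new) - 1):
        if new[i] > 0 and abs(deleted[i + 1] - new[i]) / new[i] <= tolerance:
            flagged.append(i + 1)
    return flagged

new     = [5000, 4800, 5200, 300]   # New SLOC per baseline (hypothetical)
deleted = [0,    4900, 4700, 5100]  # Deleted SLOC per baseline (hypothetical)
print(churn_pairs(new, deleted))    # → [1, 2, 3]: churn suspected at each of these
```

A run of flagged baselines like this is the “stable Total, dynamic code” signature the slide warns about; Total SLOC barely moves while the same volume of code is written and thrown away.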
More on Reuse
• USC “Diff” results can provide better insight into how much Reuse came from heritage programs
• Example:
  – Program B uses Program A software as a starting point
  – Program A metrics:
    • 50,841 Total Logical Code Counts
  – Program B is completed and returns the following metrics:
    • 1,937,167 Total Logical Code Counts
More on Reuse (cont.)
• Run the “Diff” counter on the initial baseline (in this case, Program A) and the final baseline (Program B)
  – “Diff” results show:
    • New Logical code: 1,918,011
    • Deleted Logical code: 31,685
    • Modified Logical code: 5,701
    • Unmodified Logical code: 13,455
  – Compute Reuse from Program A:
    • Unmodified (at Program B completion) / Total (at Program B start)
    • 13,455 / 50,841 = 26%
      » 26% of Program A was “DIRECT” reuse into Program B
      » This is not 26% of Program B!
  – This is one way to simplify the Reuse problem
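The direct-reuse arithmetic above is a one-line computation; this sketch just restates it with the Program A / Program B numbers quoted in the briefing, so the two possible denominators are easy to compare.

```python
# Direct-reuse calculation from the "Diff" counter results quoted above.

total_program_a = 50_841     # Total logical SLOC at Program B start (all of Program A)
total_program_b = 1_937_167  # Total logical SLOC at Program B completion
unmodified_at_b = 13_455     # Program A code surviving unmodified in Program B

direct_reuse = unmodified_at_b / total_program_a
print(f"Direct reuse of Program A into Program B: {direct_reuse:.0%}")   # → 26%

# Note the denominator matters: the same 13,455 SLOC is well under 1% of Program B
fraction_of_b = unmodified_at_b / total_program_b
print(f"Fraction of Program B that is unmodified Program A: {fraction_of_b:.1%}")
```

The choice of denominator is the whole point of the slide: 26% answers “how much of Program A survived,” not “how much of Program B came for free.”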
Operators / Tokens
• Here are 8 types of operators that could be counted:
  – Logical: &&, ||
  – Trigonometric: Cos(), Sin(), Tan(), Cot(), Csc(), Sec()
  – Log: Log(), Ln()
  – Preprocessor: #, Sizeof()
  – Math: +, -, *, /, sqrt(), %, ^
  – Assignment: =
  – Pointers
  – Conditional: if, else, switch, case, ?
• As well as nesting of the loops:
  – Level 1 loop
  – Level 2 loop
  – Level 3 loop
  – Level 4 loop
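A few of the categories above can be counted with simple pattern matching, as in the sketch below. This is only an illustration: the USC counters tokenize source properly, whereas a regex pass like this will miscount operators inside strings or comments, and the category patterns shown (including the pointer pattern) are assumptions, not the tool's actual rules.

```python
import re

# Illustrative counter for a few of the operator categories listed above,
# applied to C-like source text. Patterns are simplified assumptions.
CATEGORIES = {
    "Logical":     r"&&|\|\|",
    # '=' not preceded by another operator char and not followed by '='
    "Assignment":  r"(?<![=!<>+\-*/%&|^])=(?!=)",
    "Conditional": r"\b(?:if|else|switch|case)\b|\?",
    "Pointer":     r"->",   # dereference via arrow only, for simplicity
}

def operator_counts(source):
    """Count occurrences of each operator category in the source text."""
    return {name: len(re.findall(pattern, source))
            for name, pattern in CATEGORIES.items()}

sample = "if (a && b) { c = a->next; } else { c = 0; }"
print(operator_counts(sample))
# → {'Logical': 1, 'Assignment': 2, 'Conditional': 2, 'Pointer': 1}
```

Dividing counts like these by the logical SLOC of the same baseline yields the density ratios plotted throughout this briefing.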
Complexity Density Ratios
• Here are snapshots at various baselines of a CSCI
  – If we only looked at the final numbers, we would assume nearly 450,000 Logical lines of code
  – What do you make of this?
[Chart: SLOC by baseline (1–8); y-axis: SLOC, 0–500,000]
A growth of 371,868 Logical SLOC. What productive developers!
Complexity Density Ratios (cont)
• Looking at the Complexity Density Ratio for different operators, we can understand more about the development
[Charts: Complexity Density Ratio by baseline (1–8) for Cond (0–9%), Preproc (0–35%), Loop Level 1 (0–1.0%), and Ptr (0–25%)]
The code looks different.
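The ratios plotted above follow from a simple definition: each operator category's count divided by the baseline's total logical SLOC. The sketch below assumes that definition (consistent with the percentages shown, though the briefing does not state the formula), and the counts used are hypothetical values chosen to land in the plotted ranges.

```python
# Sketch of the Complexity Density Ratio computation, assuming
# ratio = operator occurrences / total logical SLOC for a baseline.

def density_ratios(operator_counts, logical_sloc):
    """Return each operator category's count as a fraction of logical SLOC."""
    return {cat: count / logical_sloc for cat, count in operator_counts.items()}

# Hypothetical counts for one baseline of ~45,000 logical SLOC
baseline = {"Cond": 3600, "Preproc": 13500, "Ptr": 9000, "Loop1": 400}
for cat, ratio in density_ratios(baseline, logical_sloc=45_000).items():
    print(f"{cat}: {ratio:.1%}")
# → Cond: 8.0%, Preproc: 30.0%, Ptr: 20.0%, Loop1: 0.9%
```

Because the ratio is normalized by size, two baselines of very different SLOC totals can be compared directly, which is what makes it usable as a signature.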
Complexity Density Ratios (cont)
• Looking at the SW as an entity composed of many different elements, the Complexity Density Ratio allows us to see the makeup of the SW
  – This can be characterized as a signature of the SW development!
• In the previous example, the program received SW from another project (not directly associated with the current project)
  – The Complexity Density Ratio validates that the SW is different for the ongoing development
    • This should be taken into account when trying to come up with any productivities
      – ANALYST BEWARE!
        » DON’T BLINDLY USE DATA
Summary
• The NCG continues to standardize our code counting efforts
  – Essential for normalizing our data across multiple programs and multiple contractors
• The NCG works closely with USC to develop a complete USC Code Counting Tool Suite
  – Addressing necessities such as new ways of looking at reuse, complexities, trends, etc.
• The NCG has invested extensive resources in applying the USC code counters and parsing the USC output files
• Our goal is to establish consistency across the Intelligence Community
  – Primarily involves our industry contractors