
Estimating Technology Readiness Level Coefficients

Dr. Edmund H. Conrow, CMC, CPCM, CRM, PMP*
Management and Technology Associates, Redondo Beach, California 90278

NASA and Department of Defense Technology Readiness Level (TRL) scales are in widespread use and provide inputs for a variety of systems engineering and project management functions. The common nine-level TRL scales have ordinal coefficients which both limit their usefulness and introduce errors if mathematical operations are performed on the TRL scale values (e.g., averaging). The Analytic Hierarchy Process (AHP) was used to estimate cardinal TRL scale values. The average of the deviations between the ordinal coefficients (1 through 9) and the AHP estimated cardinal coefficients was 166 percent. A high quality curve fit of the AHP estimated coefficients was also developed (degrees of freedom adjusted statistical coefficient of determination equal to 0.996) which permits generating non-integer TRL values for use in mathematical operations.

Nomenclature

R^2 = statistical coefficient of determination
RSS = relative schedule slippage
α = first coefficient in the regression of RSS and TRL
β = second coefficient in the regression of RSS and TRL
t = t value; tests the hypothesis that the specified regression coefficient is zero

Introduction

The purpose of this paper is to: 1) present the Technology Readiness Level (TRL) scale commonly used to evaluate hardware maturity on a variety of aerospace and non-aerospace applications, 2) discuss the fact that the TRL scale is ordinal and that common mathematical operations cannot be performed on the scale values, 3) provide a methodology to calibrate the scale coefficients to yield cardinal estimates for each TRL value, 4) provide the estimated TRL cardinal coefficients, 5) present the differences between the ordinal and resulting cardinal coefficients, 6) provide a curve fit of the cardinal coefficients to permit TRL estimates for fractional values (between 1 ≤ TRL ≤ 9), 7) discuss the limitations of the cardinal estimation methodology and the resulting coefficients, and 8) discuss in the Appendix a methodology that attempts to relate TRL values to schedule slippage and mention some potential limitations associated with this approach. Of these eight items, all but the first represent new work that advances the state of the art and can be applied to a wide variety of both aerospace and non-aerospace programs.

“Technology Readiness Levels are a systematic metric/measurement system that supports assessments of the maturity of a particular technology and the consistent comparison of maturity between different types of technology.” [John C. Mankins, “Technology Readiness Levels,” White Paper, NASA Office of Space Access and Technology, 6 April 1995; available online at www.hq.nasa.gov/office/codeq/trl/trl.pdf.] TRLs have a wide variety of uses in aerospace systems engineering and project management, ranging from technology maturation tracking, to helping balance cost, performance, and schedule, to risk management (via the use of a condensed, reversed scale to represent the technology maturity component of probability of occurrence).

Scales involving TRLs or similar proxies for hardware maturity were developed and used by the Department of Defense (DoD) and the National Aeronautics and Space Administration (NASA) in the 1980s, but became more widely used following dissemination of John C. Mankins’ 1995 NASA White Paper. Mankins’ TRL definitions are given in Table 1.

* Principal, P. O. Box 1125, Redondo Beach, CA 90278, www.risk-services.com, Associate Fellow and Life Member.

AIAA SPACE 2009 Conference & Exposition, 14-17 September 2009, Pasadena, California

AIAA 2009-6727

Copyright © 2009 by Edmund H. Conrow. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.


TRL Value  Definition
1          Basic principles observed and reported
2          Technology concept and/or application formulated
3          Analytical and experimental critical function and/or characteristic proof-of-concept
4          Component and/or breadboard validation in laboratory environment
5          Component and/or breadboard validation in relevant environment
6          System/subsystem model or prototype demonstration in a relevant environment (ground or space)
7          System prototype demonstration in a space environment
8          Actual system completed and “flight qualified” through test and demonstration (ground or space)
9          Actual system “flight proven” through successful mission operations

Table 1. NASA TRL Values and Corresponding Definitions

TRL definitions have been applied and tailored to a variety of industries beyond NASA and uses other than hardware technology [1], [2], [3]. However, the basic idea associated with these other applications and uses remains the same as in Mankins’ TRL scale. For example, the DoD TRL scale definitions [3] are essentially the same as the Mankins NASA definitions, with identical definitions for Levels 1 through 6 and only minor wording changes for Levels 7 through 9. The DoD TRL scale definitions are given in Table 2.

A variety of different scale types exist, including nominal, interval, ordinal, calibrated ordinal, estimative probability, and ratio [4], [5], [6], [7]. Ordinal scales, as the name suggests, have levels that are monotonic and rank ordered, and coefficient values that are only placeholders for the true cardinal values. Consider a three-level maturity scale with coefficients 1, 2, and 3. Here the maturity of level 2 > level 1, the maturity of level 3 > level 1, and the maturity of level 3 > level 2. In this case the scale is entirely non-decreasing. However, because the coefficients 1, 2, and 3 are only placeholders we can just as easily substitute coefficients A, B, and C, where C > B and B > A. (In fact, this representation is often preferable to using numbers to preclude people from performing mathematics on the resulting values.)

The TRL scale (Tables 1 and 2) is a nine-level ordinal maturity scale related to hardware technology. (The scale level values are inverted vs. probability of occurrence and consequence of occurrence ordinal scales commonly used in performing a risk analysis. For example, with a five-level ordinal “probability” maturity scale the least mature level = 5 and the most mature level = 1. This is the opposite ordering vs. a TRL scale.) As previously mentioned, the ordinal TRL coefficients are only placeholders for the true cardinal coefficients, the scale levels are only rank ordered, and the interval values between scale levels are both potentially different and unknown. It is interesting to note that Mankins never claimed that the TRL scale level values were cardinal, nor that mathematical operations could be performed on the results. For example, should you really expect that an item rated as TRL = 8 [actual system completed and “flight qualified” through test and demonstration (ground or space)] is only twice as mature as another item rated as TRL = 4 (component and/or breadboard validation in laboratory environment)? My decades of experience on “real world” aerospace programs suggest the answer is a resounding no! In addition, mathematical operations should never be performed on results obtained from ordinal scales, because very large errors can result [7]. (Irrefutable examples with errors of 600% or more between ordinal and cardinal scale coefficients can readily be generated [8].) While the practice of performing mathematical operations on results from ordinal scales was widespread in the Department of Defense and its contractors in the 1980s and 1990s, this erroneous practice was greatly curtailed by the late 1990s as a result of language included in the Risk Management Guide for DoD Acquisition [9] and other sources [6], [10], [11].

TRL Value  Definition
1          Basic principles observed and reported
2          Technology concept and/or application formulated
3          Analytical and experimental critical function and/or characteristic proof-of-concept
4          Component and/or breadboard validation in laboratory environment
5          Component and/or breadboard validation in relevant environment
6          System/subsystem model or prototype demonstration in a relevant environment
7          System prototype demonstration in an operational environment
8          Actual system completed and qualified through test and demonstration
9          Actual system proven through successful mission operations

Table 2. DoD TRL Values and Corresponding Definitions [3]

Attempts have been made to relate ordinal TRL values to potential risk, specifically cost growth and/or schedule slippage, in aerospace programs [12], [13], [14], [15], [16], [17], [18]. However, most of these discussions are subjective, with only anecdotal information provided. Two examples were located that attempted to relate TRL to system-level cost growth and/or schedule slippage. In the first case, subsystem TRL levels for four programs were compared to system-level cost growth and schedule slippage [19]. One of the four programs was an automobile, clearly different from the other three aerospace programs (Army helicopter, Army submunition, and commercial satellite solar array). It is interesting to note that both the Army helicopter and submunition programs were canceled more than five years ago. The GAO stated in July 1999 that of the five helicopter subsystems it tracked, two were TRL = 5 and three were TRL = 3. To that point in time, the projected schedule slippage was 120 percent [19]. The GAO said in July 1999 that of the five subsystems it tracked for the submunition, one was rated at TRL = 2 and four were rated at TRL = 3. To that point in time, the projected schedule slippage was 62 percent [19]. Hence, it would appear that the submunition, with lower TRL values overall for key subsystems, had about half the projected schedule slippage of the helicopter program, which had higher overall TRL values for key subsystems. While only a sample of two, this would suggest that schedule slippage is proportional to TRL level, meaning that more mature systems have larger schedule slippage. This is of course not true, holding all else constant. But as the GAO said in its July 1997 report: “The Comanche (helicopter), in particular, has experienced a great deal of cost growth and schedule slippage for many reasons, of which technology immaturity is only one. Other factors, such as changing the scope, funding, and pace of the program for affordability reasons, have also contributed.” [19] This GAO statement effectively says that TRL values only measure technical maturity (or immaturity), and other factors besides technical maturity affect cost growth and schedule slippage. Thus, TRL values may not be highly correlated with risk, including schedule slippage. As will be discussed in the Appendix of this paper, assertions that TRL and schedule slippage are highly correlated are both false and omit a wide variety of factors (some mentioned above in the GAO quote) that affect schedule slippage besides TRL.

For the commercial satellite solar array example, the GAO stated in July 1999 that the array was at TRL = 6 when launched and that zero cost growth and schedule slippage resulted [19]. However, following satellite launches in late 1999, 2000, and 2001, as many as six satellites that included this solar array design rather rapidly and unexpectedly experienced reduced on-orbit power. An inherent solar array design flaw led to higher than expected outgassing of the solar cells, fogging of the solar array concentrator mirrors, reduced available power, and diminished satellite operational capability. While ground tests were performed on the arrays prior to launch, the tests did not identify the potential outgassing problem. The resulting cost impact of this design flaw was considerable; clearly the technology was not as mature as had originally been estimated. (Even if the TRL value was correctly estimated, the resulting risk level was effectively much higher than anticipated.) Incredibly, in a May 2007 presentation [20], the GAO still touted that the same commercial satellite solar array with TRL = 6 had zero cost growth and schedule slippage, perhaps five years after on-orbit problems had been observed and redesigns had been performed.

In the second example, Dubos et al. [21] attempted to regress cost weighted TRL values against variations in system-level actual vs. planned schedule durations for 28 spacecraft development programs. However, TRL values are not highly correlated with risk, since they are related to only a portion of the risk probability of occurrence term and are unrelated to the consequence of occurrence term. Thus, TRL values should not be used to estimate any form of risk. For example, the NASA developed cost weighted ordinal TRL coefficients used by Dubos et al. are only weakly correlated with spacecraft development program schedule slip (degrees of freedom adjusted statistical coefficient of determination equal to 0.20). See the Appendix to this paper for a detailed discussion of the relationship between cost weighted TRL values and schedule slip, as well as why TRL values are not strongly related to technical and other types of risk.

Obviously, holding all else constant, it is desirable to have mature technologies available early in a program’s development phase. In speaking of two DoD programs [Space Based Radar (SBR) and Transformational Satellite (TSAT)], the GAO stated [22]: “...If these programs adhere to the TRL 6 criteria, they will greatly reduce the risk of encountering costly technical delays, though not completely. There are still significant inherent risks associated with integrating critical technologies and with developing the software needed to realize the benefits of the technologies. Moreover, the best practice programs we have studied strive for a TRL 7, where the technology has been tested in an operational environment, that is, space.”

It is interesting to note that both SBR and TSAT had troubled acquisition histories and both had been re-baselined and/or canceled one or more times. While useful technology work had been completed on each program, only limited systems development had been performed compared to what would be needed for a deployed system. Also, note the second sentence in the above quote—even when relatively high TRL values exist there are typically other items unrelated to (hardware) technology maturity that pose “significant inherent risks” which can contribute to non-technical (e.g., cost and schedule) and/or technical risks#.

The remainder of this paper provides six unique and previously unpublished results. First, I present an approach that calibrates the TRL ordinal scale coefficients given in Tables 1 and 2 using the Analytic Hierarchy Process, which uses an additive utility function [23]. The calibrated TRL scale coefficients are cardinal and limited mathematical operations can be performed on them. Second, I present the resulting cardinal calibration coefficients for TRL values 1 through 9—the results obtained are identical for both the DoD and NASA TRL scales. Third, I provide percent deviations between the ordinal and cardinal coefficients associated with TRL scale values 1 through 9. These deviations are substantial. For example, using the calibrated coefficients, TRL = 8 is approximately six times more mature than TRL = 4, rather than a factor of two from simply and incorrectly taking the ratio of the ordinal (uncalibrated) scale coefficients (8/4 = 2). Fourth, a high quality curve fit is presented for the nine calibrated TRL scale coefficients. This curve fit permits estimates to be made for non-integer TRL values so long as the estimates are performed in the obvious, valid range (between 1 ≤ TRL ≤ 9). Fifth, I discuss the limitations of the resulting methodology used to calibrate the TRL coefficients and the coefficients themselves. I also mention how the methodology used to generate results 1) through 4) above can be applied to “TRL-like” maturity scales other than the hardware technology scales evaluated in Tables 1 and 2. Sixth, I critique an existing methodology that relates ordinal TRL values to schedule slippage and discuss a number of the shortcomings that are present.

Analytic Hierarchy Process Methodology and Implementation

TRL scales are ordinal with increasing maturity relative to the level number. For example, TRL = 9 is more mature than TRL = 8, TRL = 4 is more mature than TRL = 3, TRL = 2 is more mature than TRL = 1, and so on. The analytic hierarchy process (AHP) was used to estimate the cardinal coefficient value for each ordinal TRL scale value.

“The AHP derives ratio scales of relative magnitudes of a set of elements by making paired comparisons. It proceeds from judgments on comparisons with respect to dominance, which is the generic term for expressing importance, preference or likelihood, of a property which they have in common, to their numerical representation according to the strength of that dominance and then derives a ratio scale.” [23] (Note: in the application of AHP to the TRL scale, maturity is the attribute of dominance that is being evaluated. Also, the relative ratio scale derived by AHP in this case does not have a meaningful, non-arbitrary zero point.) “The comparisons are made using judgments based on knowledge and experience to interpret data according to their contribution to the parent node in the level immediately above. Once all the pair-wise comparisons in a group are completed a scale of relative priorities is derived from them.” [24]

“In this particular (TRL) case, a square matrix with as many rows (and columns) as there are scale levels is created. The numbers in this matrix express the intensity of maturity dominance of the criterion in the column heading (individual scale levels) over the criterion in the row heading (again, individual scale levels) [25].” For example, in the TRL case the analyst estimates how much more mature TRL = 9 is than TRL = 8, and so on. “Because a relative ratio scale is used, the matrix is reciprocal which means that the numbers, which are symmetric with respect to the diagonal, are inverses of one another, a_ij = 1/a_ji [25].” (For example, if a TRL value is deemed to be 4 times more important than another, then the other is 1/4 as important when compared with the first.)

Of the 81 (9x9) entries in the TRL case, nine are ones, representing the nine self-comparisons on the diagonal. “Half of the remainder are reciprocals by virtue of the inverted comparison. In general, n(n - 1)/2 comparisons (or n items taken two at a time, nC2) are needed if n is the number of elements being compared in the triangle above the diagonal of ones [26].” (Thus, with nine TRL scale levels, 9*8/2 = 36 pairwise comparisons are needed.)
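A quick arithmetic check of the comparison count (a minimal Python sketch, standard library only):

```python
from math import comb

# Pairwise comparisons needed above the diagonal of an n x n reciprocal
# judgment matrix: n(n - 1)/2, equivalently C(n, 2).
n = 9  # nine TRL scale levels
print(n * (n - 1) // 2, comb(n, 2))  # both print 36
```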

# When cited by Dubos et al. [21], p. 841, this second sentence was omitted.


“The pairwise comparisons are converted to a relative ratio scale by estimating the eigenvector of the matrix” described above [27]. With the Expert Choice® software package the “eigenvector computation is based upon the normalized row sums of the limiting power of a primitive matrix (and hence also of a positive matrix). To obtain this vector the matrix is raised to powers. Fast convergence is obtained by successively squaring the matrix. The row sums are calculated and normalized. The computation is stopped when the difference between these sums in two consecutive calculations of the power is smaller than a prescribed value.” [28] [Numerous AHP estimation software packages exist, both commercial (e.g., Expert Choice®, Decision Lens Suite™, SuperDecisions™) and non-commercial.] For the TRL case, the eigenvector consolidates the 81 relative maturity ratios of the matrix into nine measures of maturity. “This new scale is called the derived scale. It is an important property of this scale that the sum of the numbers is always 1.” [29]
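To make the computation concrete, the following Python sketch implements the normalized row sum, successive squaring procedure described above on a hypothetical 3x3 reciprocal maturity matrix. (The actual 9x9 TRL judgment matrix used in this paper is not reproduced here; the entries below are illustrative only.)

```python
import numpy as np

# Hypothetical 3x3 pairwise maturity matrix; reciprocal by construction
# (a[j, i] = 1/a[i, j], with ones on the diagonal).
A = np.array([
    [1.0, 3.0, 5.0],   # level 3 judged 3x more mature than level 2, 5x more than level 1
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

def ahp_priorities(M, tol=1e-9, max_iter=50):
    """Normalized row sums of the limiting power of M, via successive squaring."""
    prev = np.zeros(M.shape[0])
    for _ in range(max_iter):
        M = M @ M                # successive squaring for fast convergence
        M = M / M.max()          # rescale to avoid numeric overflow
        w = M.sum(axis=1)
        w = w / w.sum()          # the derived scale always sums to 1
        if np.abs(w - prev).max() < tol:
            break
        prev = w
    return w

print(ahp_priorities(A).round(3))  # ~[0.648 0.23 0.122], a relative ratio scale
```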

While the above process may appear somewhat complicated, the TRL evaluation in this paper is a very simple hierarchy with a single criterion (and by extension a single subcriterion). A more complex (but still simple) evaluation, for example, would be the calibration of cost, performance, and schedule consequence of occurrence ordinal scales, with a single criterion (consequence of occurrence) and three subcriteria (corresponding to cost, performance, and schedule consequence). Here, each subcriterion includes an “n” level ordinal scale. An even more complex risk analysis evaluation would be the calibration of various hardware, software, and integration ordinal “probability” of occurrence scales, where each of these items (e.g., hardware) corresponds to an individual criterion, and each scale within the criterion (e.g., hardware: design/engineering, manufacturing, technology, threat) is a separate subcriterion. And as in the last example, each subcriterion includes one or more “n” level ordinal scales§.

Results

Using the Expert Choice® software package and the methodology described above, the estimated calibrated coefficients for the ordinal TRL scales (Tables 1 and 2) are given in Table 3. (There was no difference in the pairwise comparison scores between the NASA and DoD TRL scales, hence the resulting AHP estimated TRL values are the same for these two scales.) Both coefficients calculated directly from Expert Choice® (“raw”) and those adjusted to 9.0 for the TRL 9 coefficient are provided. (In the latter case each calibrated coefficient was multiplied by 9/0.33 = 27.27 to obtain coefficients adjusted to 9.0 for TRL 9.)

Ordinal TRL Value   AHP Estimated TRL Value   AHP Estimated TRL Value Adjusted to 9.0 (TRL 9)
1                   0.01                      0.26
2                   0.02                      0.53
3                   0.03                      0.71
4                   0.04                      1.14
5                   0.07                      1.97
6                   0.10                      2.74
7                   0.16                      4.26
8                   0.25                      6.81
9                   0.33                      9.00

Table 3. Estimated Calibrated TRL Coefficients (“Raw” and Adjusted)

The calibrated TRL coefficients presented in Table 3 have a small random noise component (as estimated from Expert Choice® AHP calculations). Random noise is akin to inconsistency, ranging from zero to 1.0, and is a “number closely related to the principal eigenvalue of the (square) matrix” [25] previously discussed. “Inconsistency may be thought of as an adjustment needed to improve the consistency of the comparisons. But the adjustment should not be as large as the judgment itself, nor so small that using it is of no consequence. Thus inconsistency should be just one order of magnitude smaller. On a scale from zero to one, the overall inconsistency should be around 10%” [30]. Note that 0.10 is an estimated upper limit, hence the desired level of inconsistency should be less than 0.10 and as close to zero as reasonably possible without “gaming” the pairwise comparisons to drive the inconsistency down. The inconsistency index estimated by Expert Choice® for the pairwise comparisons associated with the results in Table 3 was 0.01, which corresponds to a very small random noise term‡.

§ David Graham (U.S. Air Force) likely first developed the approach of applying AHP to ordinal probability of occurrence scales in 1994; the first AHP application to a series of probability scales was likely performed that year by David Graham (USAF), Jason Dechoretz (MCRI), and Edmund Conrow (consultant).

Percent deviations between the ordinal and cardinal coefficients associated with TRL scale levels 1 through 9 are given in Table 4. The average percent deviation was 166 percent, and in each case the AHP estimated and adjusted cardinal coefficient was less than or equal to the ordinal TRL value. In fact, in five of the nine cases (TRL = 1 to 5), the deviation between the estimated and adjusted cardinal and ordinal coefficients was greater than 150 percent.

Ordinal TRL Value   AHP Estimated, Adjusted TRL Value   Percent Deviation, Ordinal vs. AHP Estimated
1                   0.26                                284.6
2                   0.53                                277.4
3                   0.71                                322.5
4                   1.14                                250.9
5                   1.97                                153.8
6                   2.74                                119.0
7                   4.26                                64.3
8                   6.81                                17.5
9                   9.00                                0.0

Table 4. Percent Deviation between Ordinal and AHP Estimated TRL Coefficients

Also, the ratio of the AHP estimated, adjusted TRL values was substantially different from that for the ordinal TRL values. For example, using the AHP estimated, adjusted coefficients, TRL = 8 is approximately six times (5.97) more mature than TRL = 4, rather than a factor of two from simply and incorrectly taking the ratio of the ordinal (uncalibrated) scale coefficients (8/4 = 2). (The ratio of several other TRL values also showed substantial variations from equivalent ratios of the AHP estimated, adjusted TRL values versus ordinal values. But again, the raw TRL values are only ordinal placeholders and performing mathematical operations on them will lead to erroneous results.)
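The adjustment and deviation arithmetic can be verified directly from the published values; a minimal Python sketch using the Table 3 adjusted coefficients (re-deriving the adjusted column from the two-decimal “raw” coefficients reproduces Table 3 only to within rounding):

```python
import numpy as np

# Ordinal placeholders and the AHP estimated, adjusted coefficients (Table 3).
ordinal = np.arange(1, 10)
adjusted = np.array([0.26, 0.53, 0.71, 1.14, 1.97, 2.74, 4.26, 6.81, 9.00])

# Percent deviation of each ordinal value from its cardinal estimate (Table 4).
deviation = (ordinal - adjusted) / adjusted * 100.0
print(deviation.round(1))                  # 284.6, 277.4, ..., 17.5, 0.0
print(round(deviation.mean()))             # 166 percent average deviation
print(round(adjusted[7] / adjusted[3], 2)) # TRL 8 vs. TRL 4: 5.97, not 8/4 = 2
```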

The difference and percent difference in adjacent AHP estimated TRL values are given in Table 5. The largest differences in adjacent AHP estimated TRL values occur between TRL = 6 and 7, TRL = 7 and 8, and TRL = 8 and 9. However, the largest percent differences in adjacent AHP estimated TRL values occur between TRL = 3 and 4, TRL = 4 and 5, and TRL = 6 and 7, and range from 93 percent to 139 percent. While it may appear that the highest payoff in improving maturity occurs for the TRL = 3 and 4, TRL = 4 and 5, and TRL = 6 and 7 cases, this is perhaps an overly simplistic perspective, given that the financial investments needed, the resulting timeline, and additional performance-related criteria will vary on a case-by-case basis.

TRL Value, X Observed   AHP Estimated TRL Value   Difference in AHP Estimated TRL Values   Percent Difference in AHP Estimated TRL Values
1                       0.26                      N/A                                      N/A
2                       0.53                      0.27                                     N/A
3                       0.71                      0.18                                     -33.3
4                       1.14                      0.43                                     138.9
5                       1.97                      0.83                                     93.0
6                       2.74                      0.77                                     -7.2
7                       4.26                      1.52                                     97.4
8                       6.81                      2.55                                     67.8
9                       9.00                      2.19                                     -14.1

Table 5. Difference and Percent Difference in Adjacent TRL Values

‡ An inconsistency index of zero corresponds to no measurable inconsistency in the pairwise comparisons.


A curve fit was then performed on the AHP estimated, adjusted TRL values given in Tables 3 and 4. The purpose of this curve fit was to provide an approach to estimate non-integer TRL values (e.g., TRL = 5.6) so long as the estimates are performed in the obvious, valid range between 1 ≤ TRL ≤ 9. Several criteria for accepting the curve fit were established before the process began. While these criteria are not inviolate, each is desirable to ensure that the resulting curve fit is of high quality. First, the selected equation with derived coefficients had to be smooth, continuous, and without changes in the sign of the first and second derivatives over the range of 1 ≤ TRL ≤ 9. Second, the resulting equation should have as few coefficients as possible to retain a relatively high number of degrees of freedom for the nine values used in the curve fit. Third, the “t” statistic of each curve fit coefficient should be much greater than 0, and the corresponding probability level associated with the “t” statistic for each coefficient should be much less than 0.05. Fourth, the 95% confidence interval lower and upper bounds for each estimated coefficient should not cross the zero value. Fifth, the resulting equation coefficient of determination (R^2) should be fairly high (e.g., greater than 0.90 if possible), as should the R^2 adjusted for degrees of freedom.

More than two thousand equation forms were evaluated against the nine AHP estimated, adjusted coefficient values. The selected equation form is:

TRL Predicted = α + β*(TRL Observed)^3    (1)

The selected equation (in terms of X and Y) and summary statistics are given in Table 6.

Equation Form: Y = α + β*X^3
R^2                               0.997
Degrees of Freedom Adjusted R^2   0.996
Fit Standard Error                0.180
F-Value                           2330.0

Table 6. Selected TRL Curve Fit Equation Form

Curve fit statistics for the individual coefficients of this equation are given in Table 7.

Parameter Statistic   Coefficient α   Coefficient β
Value                 0.346           0.012
Standard Error        0.082           0.0002
t-value               4.22            48.27
95% Lower Bound       0.152           0.011
95% Upper Bound       0.540           0.013
Probability > |t|     0.004           0

Table 7. Coefficient Curve Fit Statistics for Selected Equation Form

Predicted values of the AHP estimated, adjusted TRL values (Y Predicted), the difference between the AHP estimated, adjusted TRL values and Y Predicted (Y Residual), and the corresponding residual percentage (Y Residual %) are given in Table 8. [Note: Y Predicted is identical to TRL Predicted in eq. (1), and X Observed is the same as TRL Observed in eq. (1).] Note that the average Y Residual % from Table 8 is -1.7 percent, a relatively small value.
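The fit itself is a straightforward least-squares regression on X^3; the short Python sketch below reproduces the Table 7 coefficients to the stated precision and the Table 8 residuals:

```python
import numpy as np

# Ordinary least squares of the adjusted AHP coefficients (Y) on TRL^3,
# i.e. the selected form Y = alpha + beta * X**3.
x = np.arange(1, 10, dtype=float)
y = np.array([0.26, 0.53, 0.71, 1.14, 1.97, 2.74, 4.26, 6.81, 9.00])

X = np.column_stack([np.ones_like(x), x**3])
(alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(alpha, 3), round(beta, 3))   # 0.346, 0.012 (Table 7)

y_hat = alpha + beta * x**3
print((y - y_hat).round(2))              # Table 8 residuals (AHP estimated - predicted)
print(round(alpha + beta * 5.6**3, 2))   # non-integer example: TRL = 5.6 -> ~2.45
```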

A graphical representation of the ordinal and estimated TRL coefficients is given in Figure 1. The dashed line with diamond points in Figure 1 represents the ordinal TRL coefficients. The slope of the line as drawn is 45 degrees, but the Y axis values are an ordinal representation and simply placeholders for true cardinal coefficient values. The upward sloping curve (with positive first and second derivatives) in Figure 1 is the fitted curve of the form Y Predicted = 0.346 + 0.012*X^3 [eq. (1) with coefficients from Table 7]. The square points about this curve are the AHP estimated, adjusted TRL values given in Tables 3, 4, and 8. The only time the selected equation form (Table 6) or corresponding curve (Figure 1) should be considered is when non-integer estimates of TRL are mandatory. In such cases, note the residual magnitude and percent error given in Table 8 for each estimated TRL value. For example, for TRL = 5.6, the Y Predicted value using the selected equation is 2.45, and a linear interpolation of residual values in Table 8 yields an estimated Y Residual of -0.072.

TRL Value, X Observed   AHP Estimated, Adjusted TRL Value   TRL Value, Y Predicted   Y Residual (AHP Estimated - Y Predicted)   Y Residual %
1                       0.26                                0.36                     -0.10                                      -37.8
2                       0.53                                0.44                     0.09                                       16.5
3                       0.71                                0.67                     0.04                                       5.6
4                       1.14                                1.11                     0.03                                       2.2
5                       1.97                                1.85                     0.12                                       6.3
6                       2.74                                2.94                     -0.20                                      -7.3
7                       4.26                                4.46                     -0.20                                      -4.8
8                       6.81                                6.49                     0.32                                       4.7
9                       9.00                                9.10                     -0.10                                      -1.1

Table 8. Selected Equation Predicted TRL Values and Residuals

[Figure 1 (not reproduced here): coefficient value (0 to 9) plotted against TRL (1 to 9). A dashed line with diamond points marks the ordinal coefficients; an upward-sloping curve with square points marks the estimated cardinal coefficients.]

Figure 1. Graphical Representation of Ordinal and Estimated TRL Coefficients

From Tables 4 and 8 the average of the nine ordinal coefficients is 5.0. Note, this is an invalid result since the average of ordinal numbers is meaningless. However, this result is presented to contrast it with the average of the nine AHP estimated TRL values, which is 3.1, or about a 60 percent difference. Hence, using ordinal TRL coefficients may overestimate the technical maturity of an item when mathematical operations are performed on values corresponding to TRL levels one through eight. This is graphically illustrated in Figure 1, where the AHP estimated TRL coefficient values are noticeably less than the corresponding ordinal values for all but TRL = 9 (which is the normalization point used to fix the AHP estimated values to the ordinal values). An example and discussion of a more elaborate set of mathematical operations performed on ordinal TRL values by Dubos et al. [21] is given in the Appendix.
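The contrast is easy to check numerically (using the Table 3 adjusted values):

```python
import numpy as np

# Average of the ordinal placeholders vs. average of the AHP estimated,
# adjusted coefficients (Table 3); the former is meaningless for an ordinal
# scale and overstates average maturity considerably.
ordinal_mean = np.arange(1, 10).mean()                    # 5.0
ahp_mean = np.array([0.26, 0.53, 0.71, 1.14, 1.97,
                     2.74, 4.26, 6.81, 9.00]).mean()      # ~3.1
print(ordinal_mean, round(ahp_mean, 1))
print(round((ordinal_mean - ahp_mean) / ahp_mean * 100))  # about 60 percent (prints 64)
```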


Based upon my several decades of performing risk management on actual programs, any item that is a medium or high risk should always be carried independently of results generated by mathematical operations on a series of risks, or roll-ups (for example, to higher Work Breakdown Structure levels). Otherwise risks can be overlooked that can “re-appear” later in the project as problems that are very difficult to handle [31]. Similarly, when using ordinal or AHP estimated TRLs, unadjusted TRL values should be carried for each item, and values below a particular pre-determined threshold should be reported for such items no matter what mathematical operations or roll-ups are performed. (For example, if the average TRL is estimated for “n” different hardware components in a given subsystem using the AHP estimated TRL values, the lowest component TRL value present in the subsystem should be separately reported.) In addition, the lowest TRL value should be carried forward without modification for each successive level of integration. For example, if a system is composed of components and subsystems, TRL values should first be estimated for appropriate components (e.g., those that are developmental items and not proven, off-the-shelf items). The subsequent subsystem TRL is thus the lowest TRL value of any component contained in the subsystem, plus the equivalent “TRL” for integrating the components into the subsystem and other appropriate considerations (e.g., design/engineering, manufacturing, support, threat). Similarly, the system-level TRL is the lowest of the subsystem-level TRL scores coupled with the equivalent “TRL” for integrating the subsystems into the system, plus other appropriate considerations. All components, subsystems, and systems that have TRL values below the minimum threshold required by the program should be identified and documented and appropriate action taken to evaluate and alleviate the shortfalls [32].
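A minimal sketch of this reporting rule (the component names, TRL values, and threshold below are hypothetical):

```python
# Subsystem roll-up: carry the lowest component TRL forward, and separately
# report any item below the program's minimum TRL threshold so that low
# maturity items are never hidden by averaging or other roll-up operations.
THRESHOLD = 6  # hypothetical program minimum TRL

components = {"receiver": 7, "antenna": 4, "power_supply": 8}

subsystem_trl = min(components.values())
flagged = {name: trl for name, trl in components.items() if trl < THRESHOLD}

print(subsystem_trl)  # 4 -- the antenna, not the average (~6.3), drives the roll-up
print(flagged)        # {'antenna': 4}, reported no matter what roll-ups are performed
```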

Conclusions

TRL and similar maturity based hardware technology development scales have been in use since the 1980s. The scales most commonly used by government and industry are those given in Tables 1 and 2. The TRL and related scales do not estimate risk and are only weakly correlated with risk because they: 1) do not consider, include, or estimate consequence of occurrence (which is one of the two terms that compose risk); and 2) only capture a small component (technology maturity) of the probability of occurrence term while excluding numerous other probability-related terms. Furthermore, the TRL and related scales are ordinal: the coefficient values are not cardinal and mathematical operations cannot be performed on the results without introducing potentially large errors.

AHP was used to provide cardinal estimates of the ordinal TRL scale values. While this approach has been used to “calibrate” risk analysis probability of occurrence and consequence of occurrence scales for the past 15 years, the application and results presented in this paper are the first known use of AHP (or related methodologies) to calibrate a TRL scale. Comparing ordinal and AHP estimated TRL levels, the average difference for the nine scale levels is 166 percent, as given in Table 4, and deviations of more than 150 percent exist for TRL = 1 to TRL = 5. The level of random noise (inconsistency index) estimated by Expert Choice® for the calibrated coefficients was exceedingly small.

The same methodology used to generate the results given in Tables 3-8 and Figure 1 can be applied to “TRL-like” maturity scales other than the hardware scales evaluated in Tables 1 and 2. In fact, the methodology application is direct and easy for simulation, software, biomedical, manufacturing, and related maturity-based applications that currently exist [1], [2], [3]. In these particular cases the nine-level scales will also have 36 pairwise comparisons that must be evaluated. (The number of comparisons is nC2, where n is the number of scale levels.) It is strongly recommended that an independent analysis be performed in each case rather than simply using the coefficient and curve fit results for the hardware maturity TRL scale presented here and applying these results to other scales.

A very high quality, yet relatively simple, curve fit was developed for the AHP estimated TRL coefficients (Y) vs. the TRL values (X): Y = α + β*X^3. This regression is also the first known curve fit of “cardinal” TRL coefficients and permits estimates of non-integer (X) TRL values for use in a variety of computations. While the selected equation produces a “high quality” curve fit to the X observed values (Table 8), this procedure should only be used in cases where non-integer TRL values must be estimated. In all other cases (including for integer TRL values) the AHP estimated, adjusted TRL values given in Tables 3 and 4 should be used. It is particularly important that extensive mathematical operations not be performed on TRL values and that the values not be blindly “rolled up.” In both cases there is a tendency to overlook, for example, items having low maturity levels through aggregation, roll-up, or other operations.

Despite the fact that TRLs and related maturity scales are not measures of risk, attempts have been made to develop relationships between TRL values and specific types of risk (e.g., schedule slippage) [21]. However, it is not surprising that the resulting relationships have only a poor statistical fit. For example, as shown in Table A1, using the NASA 28 program data set developed by Lee and Thomas [33], it is evident that only a weak relationship, with R^2 ≈ 0.26, exists for dissimilar types of equations that are curve fit between cost weighted TRL values (WTRL) and development schedule slip (RSS). Only about ¼ of the variance in the dependent variable (RSS) can be explained by the regression equation; the other ¾ is unexplained. (This falls to only about 1/5 of the variance being explained in the dependent variable when the results are adjusted for degrees of freedom.) These results mean that schedule slippage is primarily related (74+% of the total variance) to factors other than the hardware development TRL value, and likely related to a number of other factors. (See the Appendix for a discussion of nine different considerations which may contribute to the relatively low R^2 value when regressing WTRL against RSS.)

Attempts to truncate WTRL values to integer TRL levels and then perform the regression against the RSS mean or maximum value for each truncated TRL level have little value both theoretically and practically. This is particularly the case when only five (TRL = 4 to 8) or six data points (TRL = 3 to 8) are available for regression [let alone when data is incorrectly moved from one bin (TRL = 3) to another (TRL = 4)]. When the correct TRL = 3 to 8 range is used, the resulting R^2 of the RSS mean and maximum curve fits is 0.71 and 0.41, respectively, which drops to 0.52 and 0.02 when the degrees of freedom adjusted R^2 is estimated (as shown in Table A3). These results, as when WTRL was regressed against RSS, strongly suggest that TRL is only weakly correlated with schedule slippage.

Appendix

An example of mathematical operations performed on ordinal TRL scale values is now given. Lee and Thomas derived a 31 program data set from the NASA Resource Data Storage and Retrieval (REDSTAR) database [33]. Data aggregated to the total program level was given for five variables: initial cost estimate (ICE, $million), final total cost (FTC, $million), initial schedule duration estimate (IDE, year), final total schedule duration (FTD, year), and TRL. (Note: program names were not provided and additional data for each program was not released.) Of specific interest here is the cost weighted average of the TRL (WTRL) that Lee and Thomas estimated for each program. This was computed by taking the TRL of each available component and multiplying it by the component percent cost (component cost vs. total program cost). For the 31 programs, no component was rated as TRL = 1, and only two components were rated as TRL = 2 and 9 [33]. This was judged by Lee and Thomas to be insufficient and the three corresponding programs were eliminated, thus leaving 28 programs and the above mentioned five variables [33].
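For concreteness, a sketch of the cost weighted TRL computation just described, using hypothetical component data (the underlying REDSTAR program data are not public):

```python
# WTRL: each component's TRL weighted by its fraction of total program cost.
components = [
    {"trl": 4, "cost": 120.0},  # $million, hypothetical
    {"trl": 6, "cost": 60.0},
    {"trl": 8, "cost": 20.0},
]
total_cost = sum(c["cost"] for c in components)
wtrl = sum(c["trl"] * c["cost"] / total_cost for c in components)
print(round(wtrl, 2))  # 5.0 for these numbers
```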

Using the 28 program Lee and Thomas database [33], the degree of schedule slippage can be readily calculated. While the conventional method of portraying acquisition program schedule change is equivalent to FTD/IDE [34], another sufficient representation was used by Dubos et al. as the degree of relative schedule slippage (RSS) [21]:

RSS = ((FTD - IDE)/IDE)*100    (1A)

The two different schedule slippage portrayals are statistically identical given a translation between the schedule change delta percent and schedule change ratio.
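A small numerical illustration of eq. (1A) and the translation (the durations are hypothetical):

```python
# Eq. (1A) vs. the conventional ratio portrayal: FTD/IDE = 1 + RSS/100,
# so the two carry the same information.
ide, ftd = 4.0, 6.5                  # hypothetical durations, years
rss = (ftd - ide) / ide * 100.0      # eq. (1A): 62.5 percent slippage
ratio = ftd / ide                    # conventional portrayal: 1.625
assert abs(ratio - (1.0 + rss / 100.0)) < 1e-12
print(rss, ratio)
```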

Curve fits were performed on the WTRL (X) and RSS (Y) data for the 28 program sample. More than 2,000 equation forms were used to fit the data, but three common forms are representative of the results and are presented here. These three forms, linear, modified exponential, and decay, are given in eq. (2A), (3A), and (4A), respectively:

RSS = α + β*WTRL    (2A)

RSS = α*e^(-β*WTRL)    (3A)

RSS = α + 0.25*β^2*WTRL^2 - (α^0.5)*β*WTRL    (4A)
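The three fits can be reproduced in form, though not in the reported coefficients, since the 28 program data set is not published; the (WTRL, RSS) pairs below are hypothetical, and the decay form is reparameterized as (s - 0.5*β*x)^2, which equals eq. (4A) with α = s^2 and keeps α^0.5 real during optimization:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical (WTRL, RSS) pairs standing in for the 28-program sample.
wtrl = np.array([3.9, 4.3, 4.9, 5.4, 6.1, 6.8, 7.3, 8.2])
rss  = np.array([94.0, 81.0, 60.0, 57.0, 25.0, 20.0, 19.0, 7.0])

def linear(x, a, b):   return a + b * x               # eq. (2A)
def mod_exp(x, a, b):  return a * np.exp(-b * x)      # eq. (3A)
def decay(x, s, b):    return (s - 0.5 * b * x)**2    # eq. (4A), with alpha = s**2

for name, f, p0 in [("linear", linear, (170.0, -20.0)),
                    ("modified exponential", mod_exp, (600.0, 0.5)),
                    ("decay", decay, (16.0, 3.0))]:
    (p1, p2), _ = curve_fit(f, wtrl, rss, p0=p0)
    resid = rss - f(wtrl, p1, p2)
    r2 = 1.0 - (resid @ resid) / np.sum((rss - rss.mean())**2)
    print(f"{name}: coefficients = ({p1:.2f}, {p2:.2f}), R^2 = {r2:.3f}")
```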

The results from these curve fits are given in Table A1. Perhaps the most interesting result presented in Table A1 is that the R^2 value for the three evaluated forms is about 0.26 and the degrees of freedom adjusted R^2 is similarly about 0.20. Hence, only about ¼ of the variance in the dependent variable (RSS) can be explained by the regression equation; the other ¾ is unexplained. In effect, this means that schedule slippage is primarily related (74+% of the total variance) to factors other than the hardware development TRL value. This is discussed in greater detail at the end of this Appendix.

From Table A1, both the linear [eq. (2A)] and decay forms [eq. (4A)] had α and β coefficients that were statistically significant at the 0.05 or better level, while the modified exponential form [eq. (3A)] had a β coefficient that was statistically significant at the 0.05 or better level, but not the α coefficient. Similarly, the 95% confidence interval (CI) only included zero for the modified exponential form α coefficient, meaning that only in this case could the coefficient not be rejected as being zero at the 0.05 statistical significance level.

Statistical Characteristic        Linear Eq. (2A)   Modified Exponential Eq. (3A)   Decay Eq. (4A)
Coefficient α                     171.17            606.69                          268.60
“t” Value                         4.15              1.05                            2.45
Probability of “t”                <0.001            0.305                           0.021
95% CI Zero Crossed?              No                Yes                             No
Coefficient β                     -20.98            0.46                            3.35
“t” Value                         -2.96             2.28                            2.56
Probability of “t”                0.007             0.031                           0.017
95% CI Zero Crossed?              No                No                              No
R^2                               0.25              0.26                            0.26
Degrees of Freedom Adjusted R^2   0.19              0.20                            0.20

Table A1. Curve Fit Results for WTRL (X) and RSS (Y) Using Eq. (2A), (3A), and (4A)

Dubos et al. estimated and reported the RSS mean and maximum (max) values at selected TRL values [21], as given in Table A2. (This approach may appear to be comprehensive but is not, because the TRL scale values are ordinal. As previously shown, substantial differences exist between averaging calculations performed using TRL ordinal vs. estimated cardinal coefficients.) Conrow also estimated the same mean and maximum values from the Lee and Thomas data, as given in Table A2, and reported the number of programs associated with each truncated WTRL (TRL) level. (Note that the sum of the number of programs is 28, which is the number of programs in the Lee and Thomas database [33].) The Conrow values are reported in Table A2 as integers to correspond to the format used by Dubos et al. [21]. From Table A2, only three differences are present. First, for TRL = 4, Conrow estimated the RSS mean = 81 while Dubos et al. estimated this mean = 78 [21]. This is about a four percent difference, and may point to another issue. Of greater concern is the fact that Dubos et al. did not report mean and maximum values for TRL = 3 [21], yet using their methodology, integer truncated WTRL values from two programs existed (WTRL = 3.95 and 3.85, RSS = 30.19 and 94.45), yielding an RSS mean of 62 and an RSS maximum of 95. The impact of this difference is considerable and will be subsequently discussed. If the two TRL = 3 programs (WTRL = 3.95 and 3.85) are instead moved to the TRL = 4 programs, the resulting mean TRL = 4 value changes from 81 to 78, while the maximum value remains the same. These values match the Dubos et al. results [21]. If this is what occurred, then the movement of the two data points from TRL = 3 to TRL = 4 violated the Dubos et al. methodology, and was not discussed anywhere in their publication [21]. (Also, if these two programs were moved from TRL = 3 to TRL = 4, then why weren't three TRL = 4 programs similarly moved to TRL = 5 since their WTRL values were 4.94, 4.92, and 4.92?)

I will now proceed with analyzing the data in Table A2. Given these data, Dubos et al. performed a curve fit of RSS mean vs. truncated WTRL (TRL) and RSS maximum vs. truncated WTRL (TRL). The equation they used for the curve fit was eq. (3A).

Truncated WTRL (TRL)   Conrow RSS Mean   Dubos et al. RSS Mean   Conrow RSS Maximum   Dubos et al. RSS Maximum   Number of Programs
3                      62                N/A                     95                   N/A                        2
4                      81                78                      214                  214                        10
5                      57                57                      128                  128                        6
6                      20                20                      30                   30                         3
7                      19                19                      55                   55                         6
8                      7                 7                       7                    7                          1

Table A2. Conrow and Dubos et al. Estimated RSS Mean and Maximum Values vs. Truncated WTRL (TRL)

Four different regression analyses were performed, corresponding to the columns marked Conrow and Dubos et al. in Table A2, using eq. (3A). The results from these regressions are presented in Table A3. Notice that there is a considerable difference between the quality of the regression fits (R^2 and degrees of freedom adjusted R^2) depending upon whether the two truncated WTRL values were included as part of TRL = 3 (TRL range = 3 to 8) or shifted to TRL = 4 (TRL range = 4 to 8). Also, the “t” value for coefficient α was somewhat low in all four cases (1.16 to 2.46), corresponding to a probability range associated with the “t” score of 0.09 to 0.31. The “t” value for coefficient β ranged from 1.31 to 6.12, corresponding to a probability range associated with the “t” score of 0.009 to 0.26. Similarly, the 95% confidence interval (CI) included zero for the exponential form α coefficient in each case, and for the β coefficient in the TRL range = 3 to 8 mean and maximum cases. Hence, only in the TRL range = 4 to 8 mean and maximum cases could the hypothesis that the β coefficient is zero be rejected at the 0.05 statistical significance level. Finally, the regression coefficients generated by Dubos et al. [21] using eq. (3A) could not be verified when regressing the data in Table A2. While the β coefficient was effectively identical for the RSS mean, TRL range = 4 to 8 case (0.55 Conrow vs. 0.56 Dubos et al. [21]), and similar for the RSS maximum, TRL range = 4 to 8 case (0.65 Conrow vs. 0.57 Dubos et al. [21]), the Dubos et al. α coefficient for these two cases was considerably smaller, by a factor of 90 and 143, respectively. These results are summarized in Table A4. The RSS estimates were verified in the four Conrow cases given in Tables A3 and A4 using eq. (3A) and the supplied α and β coefficients, but could not be verified using eq. (3A) and the Dubos et al. coefficients [21].

Statistical Characteristic        RSS Mean   RSS Mean   RSS Maximum   RSS Maximum
TRL Range                         3 to 8     4 to 8     3 to 8        4 to 8
Coefficient α                     192.07     745.17     335.65        2936.43
“t” Value                         1.96       2.46       1.16          1.63
Probability of “t”                0.12       0.09       0.31          0.20
95% CI Zero Crossed?              Yes        Yes        Yes           Yes
Coefficient β                     0.30       0.55       0.25          0.65
“t” Value                         2.49       6.12       1.31          4.69
Probability of “t”                0.07       0.009      0.26          0.018
95% CI Zero Crossed?              Yes        No         Yes           No
R^2                               0.71       0.96       0.41          0.94
Degrees of Freedom Adjusted R^2   0.52       0.92       0.02          0.88

Table A3. Truncated WTRL Regression Results Using Eq. (3A)

There are numerous limitations associated with the truncated WTRL approach regressed against schedule slippage in Tables A2 and A3, and more generally with attempting to relate TRL values to schedule slippage. I will briefly discuss ten different considerations, all of which may contribute to the relatively low R^2 value when regressing WTRL against RSS.

First, the WTRL data is weighted by a cost fraction—this is unrelated to the technical and schedule dimensions, and thus introduces another level of uncertainty into the data subsequently used in regressions between WTRL and schedule slippage.

Second, binning WTRL values into integer (truncated) TRL levels introduces an unknown degree of aggregation, hence error, among TRL scores for different components within a program. This aggregation can mask true TRL-related development drivers. [For example, recent examples exist of one or more components at a very low Work Breakdown Structure level (e.g., Level 6 or 7) that had moderate technical maturity and a negligible component cost percentage yet adversely impacted spacecraft development to a considerable extent.]

Third, there is no strong argument for taking the mean or maximum of RSS data in integer (truncated) TRL bins and then regressing these mean or maximum values against integer TRL levels (Tables A2 and A3). Taking the mean and maximum of RSS values within integer TRL bins results in values being selected that have reduced variability versus the RSS and WTRL information available. If anything, the “raw” WTRL data should be regressed against RSS. When this is done, the results given in Table A1 show that only a weak relationship exists between WTRL and schedule slippage.

Statistical Characteristic   RSS Mean   RSS Mean   RSS Maximum   RSS Maximum
TRL Range                    3 to 8     4 to 8     3 to 8        4 to 8
Conrow Coefficient α         192.07     745.17     335.65        2936.43
Dubos et al. Coefficient α   N/A        8.29       N/A           20.47
Conrow Coefficient β         0.30       0.55       0.25          0.65
Dubos et al. Coefficient β   N/A        0.56       N/A           0.57
Conrow R^2                   0.71       0.96       0.41          0.94
Dubos et al. R^2             N/A        0.94       N/A           0.83

Table A4. Comparison of Conrow and Dubos et al. Exponential Equation Coefficients and R^2 Values Using Eq. (3A)

Fourth, the WTRL approach used by Lee and Thomas and by Dubos et al. also does not account for the fact that the TRL scale values are ordinal, not cardinal, which may contribute to erroneous results when the data are aggregated to the total program level [as likely existed in the cost weighted TRL (WTRL) data].

Fifth, Dubos et al. assert that “components with a low TRL (e.g., TRL = 4) have a bigger impact on schedule slippage than do components with TRL = 5”... [21]. The data and results in Tables A1, A2, and A3 do not fully support this assertion. For example, when the two truncated WTRL values (WTRL = 3.95 and 3.85, RSS = 30.19 and 94.45) are correctly allocated to TRL = 3, there is effectively little or no difference between the observed schedule slippage against the RSS mean for the 18 programs represented by TRL = 3 to TRL = 5 (Table A2). The same is also true for the difference between the observed schedule slippage versus the RSS maximum for TRL = 3 to TRL = 5. If anything, the observed schedule slippage increased for TRL = 4 and 5 versus TRL = 3 for RSS max. Also note that the RSS mean was effectively the same for TRL = 6 and TRL = 7, and the RSS maximum increased between TRL = 6 and TRL = 7. Finally, since there was only a single data point for TRL = 8 (WTRL = 8.16 and RSS = 6.59), the mean (7) and maximum (7) values reported in Table A2 are trivial. (For example, how consequential is it to take the mean or maximum of a single value?)

Sixth, at best, there should only be a weak relationship between TRL and schedule slippage. The TRL scale given in Tables 1 and 2 corresponds to hardware technology maturity. It does not directly capture other program technical characteristics, such as some aspects of design/engineering, plus manufacturing, support, threat, and other considerations. (The 2005 DoD Technology Readiness Assessment Deskbook similarly states that “elements of technical risk also include design, architectures, interoperability, cost, schedule, manufacturability and producibility, and so forth [35].” The meaning here is that the TRL scale does not capture these other elements, but they must be included in a comprehensive risk assessment.) The TRL scale is also primarily focused on the development phase, rather than the production and support phases. The TRL scale also does not directly capture software development; integration, assembly, and test; project management; systems engineering; and other key program processes. For example, there is no mention of equipment, facilities, personnel availability, and tools and techniques needed for design, development, fabrication, and test of the spacecraft hardware in the TRL definitions. (For additional information, see the discussion of the Army helicopter and Army submunition programs given in the Introduction to this paper [19].)

Seventh, WTRL is estimated at Authority To Proceed (ATP), which is when the project development contract was let†. However, RSS measures the schedule change between the final, realized development schedule and the initial estimated schedule (which corresponds to IDE). Hence RSS spans two points in time, while WTRL is measured only at development initiation. The TRL for an item will change during the course of the program, both from improvements in maturity and from refinements in the estimate of the TRL value. While the former will tend to increase the TRL value, the latter may increase or decrease it, depending upon the information and the state of the world. For example, in the case of the commercial satellite solar array, the actual TRL value (not the one reported by the GAO) effectively decreased, and decreased considerably, after the affected satellites were on-orbit. Here, the state of the world did not change, but the on-orbit degradations revealed weaknesses in how the technology was implemented into the design that were not previously understood. In hindsight, the solar array should never have been rated at TRL = 6 at the time of launch, but at a lower TRL value.

† Dale Thomas, personal communication, July 2009.

Eighth, attempting to relate TRL values to schedule slippage does not take into account that programs are often started with an insufficient budget and/or schedule in order to meet the required level of performance. (This contributes to a skewed and potentially negative cost and schedule margin at program initiation.) At best, both the buyer (government) and seller (contractor) may recognize that budget and schedule estimates are aggressive or optimistic but can do, or choose to do, little or nothing about it. At worst, both the buyer and seller engage in a tacit dual “buy-in.” Regardless of the intention, starting a program under such circumstances injects an oscillation into the acquisition process that propagates throughout the remainder of the program and may lead to non-optimal outcomes. Having a downwardly biased cost and/or schedule starting point may translate to higher program cost, schedule, and risk than would otherwise exist, because of trade options foreclosed and subsequent concerns (e.g., integration and manufacturing) that materialize as problems later in the development phase. (For example, trade study options that require a larger budget and/or a longer schedule than is available may not be selected.) If realized, these concerns will contribute to increased cost growth and schedule slippage for reasons unrelated to technical maturity. The situation is made all the more serious when the performance requirements stated at program initiation are excessive, unverified, immature, or unstable, particularly when existing requirements grow with time or new requirements are added and no additional budget or schedule is available to address them.

Ninth, attempting to relate TRL values to schedule slippage does not take into account potential changes in cost, performance, and schedule (and the reasons for these changes) during the course of the development program. Such a relationship also does not account for the acquisition dynamics between cost, performance, and schedule (C,P,S), which typically favor performance over cost and schedule in DoD and NASA programs [36] [37]. By not accounting for performance change, cost change, and acquisition dynamics during the development phase, the inherent assumption is that these variables and dynamics do not affect program outcomes, which is clearly not the case. Historically, program managers have typically been graded far more stringently on meeting performance requirements than on meeting cost and/or schedule requirements. (This has occurred from the 1950s [34] to the present time.) Hence, when unanticipated performance concerns arise, often mid-to-late in the development phase, cost and/or schedule are increased in order to meet performance requirements [36] [37].

Tenth, risk is composed of two terms: probability of occurrence and consequence of occurrence. TRL and other related ordinal maturity scales (e.g., [1], [2], [3]) are solely associated with the probability of occurrence component of risk; they do not measure or represent the commonly used cost, performance, or schedule components of the consequence of occurrence term. (Note: maturity-based ordinal scales are not cardinal measures of the probability of occurrence, but rather proxies for this term.) As mentioned above, the TRL scale given in Tables 1 and 2 is related to only one portion of hardware development maturity and not to a variety of other important characteristics that also contribute to the probability of occurrence term. Hence, any relationship between TRL results, particularly those obtained from the ordinal TRL scale, and schedule slippage is statistically weak at best (as shown in Table A1, and in Table A3 for TRL range = 3 to 8) and should not be used.
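As a hedged illustration of this distinction, the sketch below treats a reversed, condensed TRL level as a proxy (not a cardinal measure) for the probability-of-occurrence term and keeps consequence as a separate input; the level boundaries and the matrix thresholds are illustrative assumptions, not scales taken from this paper or its references.

```python
# Hypothetical sketch: a reversed, condensed TRL level used only as a *proxy*
# for the probability-of-occurrence term of risk. The level boundaries and
# the thresholds below are illustrative assumptions.
def maturity_level(trl: int) -> int:
    """Map TRL 1-9 to a reversed, condensed 5-level probability proxy (1 = low)."""
    if trl >= 9:
        return 1
    if trl >= 7:
        return 2
    if trl >= 5:
        return 3
    if trl >= 3:
        return 4
    return 5

def risk_rating(trl: int, consequence: int) -> str:
    """Combine the probability proxy with a separate 1-5 consequence score.
    Because the proxy is ordinal, this rating is a screening aid, not a
    cardinal risk value."""
    score = maturity_level(trl) * consequence  # illustrative matrix stand-in
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"

print(risk_rating(trl=4, consequence=4))  # -> high
```

The point of the sketch is that the TRL-derived level supplies only one of the two inputs; a consequence score must come from elsewhere, so TRL alone cannot stand in for risk or for schedule slippage.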

Acknowledgments

I gratefully acknowledge the support of Rozann Saaty of RWS Publications for granting permission to include excerpts from “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process” in this paper. I also gratefully acknowledge the support of Dale Thomas of NASA Marshall Space Flight Center, who provided the NASA schedule and TRL related data used for calculations in the Appendix of this paper, and of Systat Technical Support for verifying my programming of Eq. (3A) and helping me establish the starting conditions and constraints needed to perform regressions using this equation.

References

[1] Robert L. Clay, Scot J. Marburger, Max S. Shneider, and Timothy G. Trucano, “Modeling and Simulation Technology Readiness Levels,” SANDIA Report SAND2007-0570, Unlimited Release, January 2007. Available to Department of Energy and Department of Energy contractors through the U.S. Department of Energy. Available to the public through the U.S. Department of Commerce, National Technical Information Service, Springfield, VA.


[2] Peter Hantos, “Software Technology Readiness Assessments-Managing Technology Risks in Space System Acquisitions,” The Aerospace Corporation, Aerospace Report No. TOR-2008(8550)-8033, Unlimited Release, 16 June 2008. Available to Department of Defense and Department of Defense contractors through the Defense Technical Information Center (ADA484614). Available to the public through the U.S. Department of Commerce, National Technical Information Service, Springfield, VA.

[3] _____, “Technology Readiness Assessment (TRA) Deskbook,” Department of Defense, Deputy Under Secretary of Defense for Science and Technology, May 2005. Approved for public release, distribution unlimited. This document also contains TRL scales for biomedical, manufacturing, and software applications. Available to Department of Defense and Department of Defense contractors through the Defense Technical Information Center (ADA438933). Available to the public through the U.S. Department of Commerce, National Technical Information Service, Springfield, VA.

[4] Stevens, Stanley S., “On the Theory of Scales of Measurement,” Science, Vol. 103, No. 2684, 7 June 1946, pp. 677-680. This paper describes nominal, interval, ordinal, and ratio scales, but does not discuss calibrated ordinal or estimative probability scales. The classes of statistical analyses that can permissibly be performed with different scale types are a somewhat contentious issue, addressed (along with other concerns) in the Velleman and Wilkinson paper cited below [5].

[5] Velleman, Paul F., and Wilkinson, Leland, “Nominal, Ordinal, Interval, and Ratio Typologies Are Misleading,” The American Statistician, Vol. 47, No. 1, Feb. 1993, pp. 65-72. This paper is in part a response to the Stanley S. Stevens paper cited above [4].

[6] Edmund H. Conrow, “The Use of Ordinal Risk Scales in Defense Systems Engineering,” 1995 Acquisition Research Symposium Proceedings, Defense Systems Management College, Ft. Belvoir, VA, June 1995.

[7] Edmund H. Conrow, “Effective Risk Management: Some Keys to Success,” Second Edition, American Institute of Aeronautics and Astronautics, Reston, VA, 2003. This book describes nominal, interval, ordinal, calibrated ordinal, estimative probability, and ratio scales.

[8] Edmund H. Conrow, “Effective Risk Management: Some Keys to Success,” op. cit., pg. 262.

[9] _____, “Risk Management Guide for DoD Acquisition,” Defense Acquisition University, Defense Systems Management College, published by the Defense Systems Management College Press, Second Edition, March 1998, pp. 16-17. The final edition of this document series was the Fifth Edition, Version 2, June 2003; the equivalent text is contained on pp. 19-20.

[10] Pariseau, R., and Oswalt, I., “Using Data Types and Scales for Analysis and Decision Making,” Acquisition Review Quarterly, Vol. 1, No. 2, Spring 1994, pp. 146-152. This paper describes nominal, interval, ordinal, and ratio scales, but does not discuss calibrated ordinal or estimative probability scales.

[11] Harold Kerzner, “Project Management: A Systems Approach to Planning, Scheduling, and Controlling,” Seventh Edition, John Wiley & Sons, New York, 2001, pp. 924-927.

[12] “Defense Acquisitions: Improvements Needed in Space Systems Acquisition Management Policy,” U.S. Government Accountability Office, Report GAO-03-1073, Sept. 2003.

[13] “Defense Acquisitions: Assessments of Selected Weapon Programs,” U.S. Government Accountability Office, Report GAO-07-406SP, Mar. 2007.
[14] “NASA: Implementing a Knowledge-Based Acquisition Framework Could Lead to Better Investment Decisions and Project Outcomes,” U.S. Government Accountability Office, Report GAO-06-218, Dec. 2005.

[15] “Defense Acquisitions: Assessments of Selected Major Weapon Programs,” U.S. Government Accountability Office, Report GAO-05-301, Mar. 2005.

[16] “Defense Acquisitions: Assessments of Selected Major Weapon Programs,” U.S. Government Accountability Office, Report GAO-06-391, Mar. 2006.

[17] “Defense Acquisitions: Assessments of Selected Weapon Programs,” U.S. Government Accountability Office, Report GAO-08-467SP, Mar. 2008.

[18] “Defense Acquisitions: Assessments of Selected Weapon Programs,” U.S. Government Accountability Office, Report GAO-09-326SP, Mar. 2009.

[19] “Best Practices: Better Management of Technology Development Can Improve Weapon System Outcomes,” U.S. Government Accountability Office, Report GAO/NSIAD-99-162, July 1999, pg. 27.

[20] Michael J. Sullivan, “GAO Review of Technology Transition Practices,” presentation to the Fourth Annual Acquisition Research Symposium, 17 May 2007, pg. 22.

[21] Gregory F. Dubos, Joseph H. Saleh, and Robert Braun, “Technology Readiness Level, Schedule Risk, and Slippage in Spacecraft Design,” Journal of Spacecraft and Rockets, American Institute of Aeronautics and Astronautics, Vol. 45, No. 4, July–August 2008, pp. 836-842.


[22] “Space Acquisitions: Actions Needed to Expand and Sustain Use of Best Practices,” U.S. Government Accountability Office, Report GAO-07-730T, Apr. 2007, pp. 12-13.

[23] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” Vol. VI, AHP Series, RWS Publications, 2000 (revised), pg. 8. Special permission was granted by Rozann Saaty, RWS Publications, to include excerpts from “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process” in this paper.

[24] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 11.

[25] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 13.

[26] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 14.

[27] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 16.

[28] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 79.

[29] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 15.

[30] Thomas L. Saaty, “The Fundamentals of Decision Making and Priority Theory with the Analytic Hierarchy Process,” op. cit., pg. 84.

[31] Edmund H. Conrow, “Effective Risk Management: Some Keys to Success,” op. cit., pg. 360.

[32] _____, “NASA Systems Engineering Handbook,” NASA Headquarters, NASA-SP-6105, Rev. 1, December 2007, pg. 295.

[33] Lee, T. S., and Thomas, L. D., “Cost Growth Models for NASA’s Programs,” Journal of Probability and Statistical Science, Vol. 1, No. 2, Aug. 2003, pp. 265-279. The underlying database was provided by Dale Thomas for this paper.

[34] A. W. Marshall and W. H. Meckling, “The Predictability of the Costs, Time, and Success of Development,” in “The Rate and Direction of Inventive Activity: Economic and Social Factors,” National Bureau of Economic Research, Princeton University Press, 1962, pp. 461-476. This chapter was originally published as P-1821 by the Rand Corporation, 11 December 1959.

[35] _____, “Technology Readiness Assessment (TRA) Deskbook,” Department of Defense, op. cit., pg. 1-3.

[36] Edmund H. Conrow, “Some Long-Term Issues and Impediments Affecting Military Systems Acquisition Reform,” Acquisition Review Quarterly, Defense Acquisition University, Vol. 2, No. 3, Summer 1995, pp. 199-212.

[37] Edmund H. Conrow, “Effective Risk Management: Some Keys to Success,” op. cit., pp. 6-12, 427-430.

