The Cycling of a Decision Threshold: A System Dynamics Model of the Taylor Russell Diagram
Elise Axelrad Weaver, Ph.D. and
George Richardson, Ph.D.
Center for Policy Research
April 20, 2001
The Cycling of Decision Thresholds
In his book, Hammond (1996) presents the following ideas:
• Any decision threshold based on a statistically uncertain measure will inevitably yield some error and injustice in policy outcomes (a duality of error: false positives and false negatives)
• Oscillations in public and professional attitudes (with implicit policy thresholds) exist:
• Schlesinger’s (1986) proposal of “regular oscillations” in the dominance of political parties
• Oscillations between cautious conservatism and risky innovation in bridge design, as “the accumulation of successful experience” makes designers bold until a “colossal failure” takes everyone by surprise
• Cycles tend to be around 30 years long, across decision domains
Hammond suggests that oscillations may be a result of the duality of error
Hammond’s exhortation: “For if such oscillations can be shown to exist, and if they can be shown to have a definite period...then we have at hand not only a means for predicting our future political climate far in advance, but an important phenomenon that strongly invites, indeed, demands, analysis and interpretation.”
We propose a system dynamics model to represent and explore Hammond’s idea in a rigorous way
Another Example of a Reversal in Policy Formation
Use of SAT testing in admissions
1967: University of California began using SAT scores in admissions decisions
1999: University of California faculty voted their preference to exclude SAT scores from admissions decisions
2001: Richard Atkinson, University of California President, publicly advocated removing SATs from admissions
Example of a Decision Threshold for Policy Formation
Illustration:
Students are admitted to an academic program partially according to SAT score set at a given threshold
• The SAT score is used to predict academic success
• The GPA at graduation is the measure of true academic success
Taylor Russell Diagram: r = .5, cutoff = 50
[Figure: plot of Judgment (0 to 100) against "Truth" (0 to 100). The Decision Threshold separates "Negative on test" from "Positive on test"; the "Truth" axis separates Graduates from Non-Graduates. The False + and False - regions are labeled.]
Taylor Russell Diagram: r = .5, cutoff = 80
[Figure: same axes and labels, with the cutoff at 80.]
Taylor Russell Diagram: r = .5, cutoff = 20
[Figure: same axes and labels, with the cutoff at 20.]
Taylor Russell Diagram: Highly Certain Test (r = .99)
[Figure: same axes and labels, with r = .99.]
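The four cells of a Taylor Russell diagram can be computed numerically. The sketch below is an illustration and not part of the original slides. It assumes judgment and "truth" are standardized bivariate normal scores with correlation r, and it maps the 0 to 100 scale of the diagrams to z-scores using a hypothetical mean of 50 and standard deviation of 20. The success cutoff (placed at the median) and the cutoff of 50 used for the r = .99 case are also assumptions.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def quadrants(r, judgment_cut, truth_cut=0.0, steps=4000, upper=8.0):
    """Population fraction in each Taylor Russell cell (cutoffs in z-units).

    Integrates P(truth > truth_cut | judgment = x) over x > judgment_cut
    with the midpoint rule; requires abs(r) < 1.
    """
    s = math.sqrt(1.0 - r * r)
    h = (upper - judgment_cut) / steps
    joint = 0.0  # P(judgment > cut and truth > cut)
    for i in range(steps):
        x = judgment_cut + (i + 0.5) * h
        joint += phi(x) * (1.0 - Phi((truth_cut - r * x) / s)) * h
    positive = 1.0 - Phi(judgment_cut)  # fraction positive on the test
    success = 1.0 - Phi(truth_cut)      # fraction who truly succeed
    return {
        "true_pos": joint,
        "false_pos": positive - joint,
        "false_neg": success - joint,
        "true_neg": 1.0 - positive - success + joint,
    }


# Settings from the four diagrams; scale mean 50 and SD 20 are hypothetical.
for r, cutoff in [(0.5, 50), (0.5, 80), (0.5, 20), (0.99, 50)]:
    q = quadrants(r, (cutoff - 50) / 20.0)
    print(r, cutoff, {k: round(v, 3) for k, v in q.items()})
```

Under these assumptions, raising the cutoff trades false positives for false negatives, and a higher correlation shrinks both error cells at once.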
Stakeholders in the Duality of Error
Constituency Concerned with Unfair Disadvantage:
They want to reduce the number of individuals for whom the admission decision was a false negative: high potential for success but unacceptable SAT scores
Constituency Concerned with Maintaining Standards:
They want to reduce the number of individuals for whom the admission decision was a false positive: low potential for success but acceptable SAT scores
Duality of Error Due to the Decision Threshold
Decision Policy: Threshold SAT for Admission
True + (Acceptable SAT, High Potential)
True - (Unacceptable SAT, Low Potential)
False + (Acceptable SAT, Low Potential)
False - (Unacceptable SAT, High Potential)
[Figure: Taylor Russell diagram, Judgment (0 to 100) against "Truth" (0 to 100), with the Decision Threshold and the False + and False - regions labeled.]
Decision Threshold as a Stock (accumulating increases and decreases)
[Diagram: stock-and-flow structure with + and - link polarities.
Stock: Decision Policy: Threshold SAT for Admission
Flows: Threshold Increases, Threshold Decreases
Variables: False - (Unacceptable SAT, High Potential), False + (Acceptable SAT, Low Potential), True + (Acceptable SAT, High Potential), True - (Unacceptable SAT, Low Potential)]
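Treating the threshold as a stock means its level changes only through its flows. The sketch below is an illustration, not the authors' model: a plain Euler integration with a hypothetical starting threshold and made-up monthly flows in SAT points per month.

```python
def integrate_stock(initial, inflows, outflows, dt=1.0):
    """Euler integration of a stock: level += dt * (inflow - outflow)."""
    level = initial
    history = [level]
    for increase, decrease in zip(inflows, outflows):
        level += dt * (increase - decrease)
        history.append(level)
    return history


# Hypothetical flows: threshold increases and decreases over four months.
threshold = integrate_stock(
    1200.0,
    inflows=[5.0, 5.0, 0.0, 0.0],
    outflows=[0.0, 2.0, 2.0, 8.0],
)
print(threshold)
```

The stock remembers every past increase and decrease, so the threshold stays where it is whenever the two flows cancel.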
Decision Threshold Responding to Stakeholder Pressure
[Diagram: the threshold stock responding to pressure from two constituencies.
Stock: Decision Policy: Threshold SAT for Admission
Flows: Increase in Threshold, Decrease in Threshold
High Standards Constituency (HSC): False + (Acceptable SAT, Low Potential), Current Dissatisfaction (HSC), Pressure to Increase Threshold (HSC)
Disadvantaged Constituency (DC): False - (Unacceptable SAT, High Potential), Current Dissatisfaction (DC), Pressure to Decrease Threshold (DC)]
Cycling of Policy Threshold: Historic Discontent
[Diagram: the model with historic discontent added, detailed for the Disadvantaged Constituency (DC); the High Standards Constituency (HSC) is also labeled.
Stock: Decision Policy: Threshold SAT for Admission
Flows: Increase in Threshold, Decrease in Threshold
DC variables: False - (Unacceptable SAT, High Potential), Current Dissatisfaction (DC), Cumulative Dissatisfaction (DC), Accumulation Rate of Historic Dissatisfaction (DC), Forgetting Rate (DC), Pressure to Decrease Threshold (DC)]
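Cumulative dissatisfaction behaves like a second stock: it fills with current dissatisfaction and drains as errors are forgotten. The sketch below is one possible reading and not the authors' equations. It assumes the accumulation rate equals current dissatisfaction and the forgetting rate is the stock divided by a forgetting time constant. All numbers are hypothetical.

```python
def cumulative_dissatisfaction(current, forgetting_time, dt=1.0, initial=0.0):
    """First-order memory stock: fills with current dissatisfaction, drains by forgetting."""
    level = initial
    levels = []
    for d in current:
        accumulation_rate = d                     # assumed inflow
        forgetting_rate = level / forgetting_time  # assumed outflow
        level += dt * (accumulation_rate - forgetting_rate)
        levels.append(level)
    return levels


# 600 months of steady dissatisfaction, then 24 months with none.
levels = cumulative_dissatisfaction([10.0] * 600 + [0.0] * 24, forgetting_time=24.0)
print(round(levels[599], 2), round(levels[-1], 2))
```

Under these assumptions, discontent lingers after the errors stop, which lets a constituency keep pushing on the threshold after its current grievance has faded.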
Cycling of Policy Threshold: Key Parameters
For both constituencies:
• Tolerated Number of False Cases
• Relative Weight on History
• Time to Respond to Pressure
• Threshold Change per Unit Pressure
[Diagram: the previous structure, detailed for the Disadvantaged Constituency (DC), with parameters added; the High Standards Constituency (HSC) is also labeled.
Stock: Decision Policy: Threshold SAT for Admission
Flows: Increase in Threshold, Decrease in Threshold
DC variables: False - (Unacceptable SAT, High Potential), Current Dissatisfaction (DC), Cumulative Dissatisfaction (DC), Accumulation Rate of Historic Dissatisfaction (DC), Forgetting Rate (DC), Pressure to Decrease Threshold (DC)
DC parameters: Tolerated Number of False - (DC), Relative Weight on History (DC), Time to Respond to Pressure (DC), Threshold Change per Unit Pressure (DC)]
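One way the four parameters above could combine is sketched below. The functional forms are assumptions made for illustration; the slides name the parameters but do not give equations. Current dissatisfaction is taken as the number of false cases above the tolerated number, and pressure as a weighted blend of current and cumulative dissatisfaction.

```python
def pressure(false_cases, tolerated, cumulative, weight_on_history):
    """Blend current and remembered dissatisfaction (assumed form)."""
    current = max(0.0, false_cases - tolerated)  # only errors above tolerance count
    return (1.0 - weight_on_history) * current + weight_on_history * cumulative


def threshold_flow(p, change_per_unit_pressure, time_to_respond):
    """SAT points per month moved by a given pressure (assumed form)."""
    return p * change_per_unit_pressure / time_to_respond


# Hypothetical values for the Disadvantaged Constituency (DC).
p_dc = pressure(false_cases=300.0, tolerated=50.0, cumulative=100.0, weight_on_history=0.5)
decrease = threshold_flow(p_dc, change_per_unit_pressure=0.5, time_to_respond=12.0)
print(p_dc, round(decrease, 3))
```

In this form, a weight on history of 0 makes the constituency react only to current errors, and a weight of 1 makes it react only to remembered ones.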
Cycling of Policy Threshold: More Key Parameters
• Time Constants (Forgetting, Retention)
• Initial value of Threshold
• Lookup functions (correlation)
• Dissatisfaction per Error
[Diagram: the previous structure for the Disadvantaged Constituency (DC) with further elements added.
Stock: Decision Policy: Threshold SAT for Admission
Flows: Increase in Threshold, Decrease in Threshold
DC variables: False - (Unacceptable SAT, High Potential), Current Dissatisfaction (DC), Cumulative Dissatisfaction (DC), Accumulation Rate of Historic Dissatisfaction (DC), Forgetting Rate (DC), Pressure to Decrease Threshold (DC)
DC parameters: False Negative Lookup f, Dissatisfaction per Error (DC), Time Constant for Forgetting (DC), Time Constant for Retention (DC), Relative Weight on History (DC), Tolerated Number of False - (DC), Time to Respond to Pressure (DC), Threshold Change per Unit Pressure (DC)]
Unexplored Parameters
Lookup Function
– translation of threshold choice to number of errors
– involves correlation, thresholds for judgment and success
– currently set at r = 0.7, variable threshold for judgment, and threshold for success fixed.
Pace of Error Generation and Error Detection
– Baseline model assumes that in every month, there is an assessment of number of errors due to threshold choice
Threshold Associated with “Success”
– Baseline model assumes a fixed threshold of “success”
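A lookup function of the kind described above can be tabulated and then interpolated. The sketch below is an illustration, not the authors' lookup. It uses r = 0.7 from the slides and assumes bivariate normal scores, a hypothetical SAT mean of 1000 and standard deviation of 200, a success threshold fixed at the median, and a cohort of 1000 applicants. The names `errors_at` and `lookup` are hypothetical.

```python
import math

R = 0.7                             # correlation stated in the slides
SAT_MEAN, SAT_SD = 1000.0, 200.0    # hypothetical score distribution
COHORT = 1000.0                     # hypothetical number of applicants


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def errors_at(threshold_sat, steps=2000):
    """(false negatives, false positives) for one threshold, success cutoff at z = 0."""
    z = (threshold_sat - SAT_MEAN) / SAT_SD
    s = math.sqrt(1.0 - R * R)
    h = (8.0 - z) / steps
    joint = 0.0  # P(SAT above threshold and success)
    for i in range(steps):
        x = z + (i + 0.5) * h
        joint += phi(x) * Phi(R * x / s) * h
    return COHORT * (0.5 - joint), COHORT * (1.0 - Phi(z) - joint)


GRID = list(range(400, 1601, 100))
TABLE = [errors_at(t) for t in GRID]


def lookup(threshold_sat):
    """Piecewise-linear interpolation in the table, clamped to the grid ends."""
    t = min(max(threshold_sat, GRID[0]), GRID[-1])
    i = min(int((t - GRID[0]) // 100), len(GRID) - 2)
    frac = (t - GRID[i]) / 100.0
    (fn0, fp0), (fn1, fp1) = TABLE[i], TABLE[i + 1]
    return fn0 + frac * (fn1 - fn0), fp0 + frac * (fp1 - fp0)


for t in (900, 1000, 1270, 1500):
    fn, fp = lookup(t)
    print(t, round(fn, 1), round(fp, 1))
```

Changing `R` or the success cutoff changes the whole table, which is why the correlation and the threshold for success are listed as parameters still to be explored.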
Baseline Case
[Figure: Graph of Cycling of Policy Threshold. Vertical axis: SAT Score, 800 to 1,600. Horizontal axis: Time, 0 to 360. Series: "Decision Policy: Threshold SAT for Admission", baseline run.]
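The feedback structure can be put together in a small simulation. The sketch below is a simplified stand-in, not the authors' model, and its equations and parameter values are assumptions: bivariate normal scores with r = 0.7, SAT mean 1000 and SD 200, success cutoff at the median, a cohort of 1000, 50 tolerated false cases per constituency, a 60-month forgetting time, and a memory stock that smooths current dissatisfaction. The starting threshold of 1270 is taken from one of the run names in the slides.

```python
import math


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def false_cases(threshold, r=0.7, mean=1000.0, sd=200.0, cohort=1000.0, steps=400):
    """(false negatives, false positives) at a threshold; success cutoff at z = 0."""
    z = (threshold - mean) / sd
    s = math.sqrt(1.0 - r * r)
    h = (8.0 - z) / steps
    joint = 0.0  # P(SAT above threshold and success)
    for i in range(steps):
        x = z + (i + 0.5) * h
        joint += phi(x) * Phi(r * x / s) * h
    return cohort * (0.5 - joint), cohort * (1.0 - Phi(z) - joint)


def simulate(months=360, threshold=1270.0, weight_on_history=0.9, tolerated=50.0,
             forgetting_time=60.0, change_per_unit_pressure=1.2, time_to_respond=12.0):
    """Monthly Euler simulation of the threshold under two opposing pressures."""
    w = weight_on_history
    memory_dc = memory_hsc = 0.0  # remembered dissatisfaction, starts empty
    path = [threshold]
    for _ in range(months):
        false_neg, false_pos = false_cases(threshold)
        current_dc = max(0.0, false_neg - tolerated)   # Disadvantaged Constituency
        current_hsc = max(0.0, false_pos - tolerated)  # High Standards Constituency
        pressure_down = (1.0 - w) * current_dc + w * memory_dc
        pressure_up = (1.0 - w) * current_hsc + w * memory_hsc
        threshold += change_per_unit_pressure * (pressure_up - pressure_down) / time_to_respond
        threshold = min(max(threshold, 400.0), 1600.0)  # keep within the SAT range
        memory_dc += (current_dc - memory_dc) / forgetting_time
        memory_hsc += (current_hsc - memory_hsc) / forgetting_time
        path.append(threshold)
    return path


with_history = simulate()
no_history = simulate(weight_on_history=0.0)
print(round(min(with_history)), round(min(no_history)))
```

With weight on history, the threshold in this sketch overshoots its balance point of 1000 and swings back; with no weight on history it settles without overshooting. Near its balance point this sketch is a damped system, so it should be read as an illustration of overshoot from memory, not as a reproduction of the sustained cycles shown in the slides.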
Threshold Change per Unit Pressure
Time to Respond to Pressure
- represents policy makers’ responsiveness to constituent pressure: more responsiveness means wider amplitude and very slightly lower frequency
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, about 761 to 1,765. Horizontal axis: Time (Month), 0 to 360. Runs: half, baseline, double.]
Initial Threshold Value
- represents policy makers’ initial setting for decision threshold: no long term effect
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, about 876 to 1,601. Horizontal axis: Time (Month), 0 to 360. Runs: thresh900, thresh1270, thresh1500.]
Tolerated Number of False Cases
- represents constituents’ sensitivity to errors: not a key difference in the long run for either frequency or amplitude
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, 1,000 to 1,512. Horizontal axis: Time (Month), 0 to 360. Runs: half, baseline, double.]
Time Constant for Forgetting
- represents the time it takes for an error to dissipate from constituent memory
- more time to forget lowers frequency and raises amplitude
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, 800 to 2,000. Horizontal axis: Time (Month), 0 to 360. Runs: half, double, baseline.]
Relative Weight of History
- represents constituents’ memory or weighting for accumulated past errors relative to current error
- more weight on history means lower frequency and higher amplitude of cycling; no weight on history means no cycling at all
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, 800 to 2,000. Horizontal axis: Time (Month), 0 to 360. Runs: more, less, baseline.]
Different Weight on History for Each Constituency
- represents differential weight on history by constituents
- if one group puts more weight on history, it doesn’t change the center of the cycling, but it does reduce the amplitude
[Figure: Graph for Decision Policy: Threshold SAT for Admission. Vertical axis: score, 800 to 1,704. Horizontal axis: Time (Month), 0 to 360. Runs: morewt(both), morewt(hsc), baseline, lesswt(hsc), lesswt(both).]
Key Parameter Summary
Parameters not affecting frequency:
• initial decision threshold
• policy maker sensitivity: increases amplitude, but doesn’t affect frequency much in the long run
• tolerated number of errors by constituents: does not have much effect on frequency or amplitude
Parameters affecting frequency:
• relative weight on history & time to forget: longer memories lower frequency and raise amplitude
Next Steps
• Ground the conceptual model in data from one or more institutions (especially look for cases where false negatives could be detected)
• Incorporate the effects of correlations between the judgment and success measures, time to generate and detect errors, and the threshold for success
• Explore and operationalize the conversion variables (from errors to stakeholder pressure to changes in threshold)
• Consider whether there is any impact of true negatives and true positives on cycling
• Explore limit cycle and open loop structure of model
Summary
A system dynamics model is under development.
• According to Hammond (1996), any uncertain test where a threshold is used as a policy decision tool leads to unavoidable injustice to some constituency.
• The pressures on the decision threshold from stakeholders representing the false positives and the false negatives will oppose each other.
• These opposing pressures will cause a cycling of the decision threshold over time.
• A key parameter affecting frequency and amplitude of cycling is constituent weighting of past errors.