Bayesian networks for decision making under uncertainty
How to combine data, evidence, opinion and guesstimates to make decisions
Information Technology
Professor Ann Nicholson
Deputy Dean, Faculty of Information Technology, Monash University
Sources of Uncertainty
• Ignorance
• Physical randomness
• Complexity
• Solution: Probability Theory
The reasoning process
1. Start with a belief in a proposition
“The imported mango is infested with a pest”
“The patient has lung cancer”
“The applicant will pay back the loan”
“Sales of a product will increase”
“Beliefs”
• Very unlikely
• 1% chance
• Odds of 100 to 1
• 0.01 probability
From
• Gut feeling
• Expert opinion
• Data
2. New information becomes available
“It is from a country where the pest is endemic”
“The patient is a smoker and has a cough”
“The applicant defaulted on a previous loan”
“Some models are recalled for safety reasons”
3. Update your beliefs. But how?
Probability theory for representing uncertainty
• Propositions are either true or false: “Patient has cancer”
• Assigns a numerical degree of belief between 0 and 1 to propositions: P(“Patient has cancer”) = 0.001, the prior probability (unconditional)
• We can now represent the impact of evidence on belief: P(“Patient has cancer | positive mammogram”) = 0.8
– Conditional probability
– Or, posterior probability (by way of Bayes’ theorem)
The Bayesian approach
The Rev. Thomas Bayes, 1702?-1761
• Represent uncertainty by probabilities
• Use Bayes’ theorem, where h = hypothesis and e = evidence:
P(h|e) = P(e|h) × P(h) / P(e)
Here P(h) is the starting belief (the ‘prior’) and P(h|e) is the new belief.
Estimating Risk
Scenario: An athlete is tested for steroid use and the test comes back positive. You know that:
• One in 100 competitors are thought to take steroids
• The test isn’t always accurate
– False positive rate: 10%
– False negative rate: 20%
Q. What is the probability the athlete is a drug cheat?
Suppose:
• h = “the athlete is taking steroids”
• e = “test result is positive”
And:
• P(h) = 0.01 (one in 100 people)
• P(e|h) = 0.8 (true positive rate)
• P(e|not h) = 0.1 (false positive rate)
What is P(h|e)?
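Bayes’ Theorem for Estimating Risk
P(h|e) = P(e|h) × P(h) / [P(e|h) × P(h) + P(e|not h) × P(not h)]
= (0.8 × 0.01) / (0.8 × 0.01 + 0.1 × 0.99)
= 0.008 / 0.107
≈ 0.075 (7.5%)
In general, people can’t do Bayes’ Theorem (well) offhand!
And how do we scale up to X1, X2, …, X100, …, X1000?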
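The steroid-test calculation can be checked in a few lines of Python. This is a minimal sketch: the function name `posterior` and its parameter names are illustrative and do not come from the slides; only the three input probabilities do.

```python
def posterior(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Return P(h | e) for a positive test result e, via Bayes' theorem."""
    # P(e) by the law of total probability over h and not-h.
    p_evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / p_evidence


# Numbers from the steroid-testing scenario: P(h), P(e|h), P(e|not h).
p_cheat = posterior(prior=0.01, true_positive_rate=0.8, false_positive_rate=0.1)
print(f"P(h | e) = {p_cheat:.3f}")
```

Changing `prior` shows how strongly the answer depends on the base rate: the same positive test is far more convincing when steroid use is common.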
Bayesian Networks
• Developed by graphical modeling & AI communities in 1980s for probabilistic reasoning under uncertainty
• Many synonyms
– Bayes nets, Bayesian belief networks, directed acyclic graphs, probabilistic networks
Judea Pearl received the 2011 ACM Turing Award (announced in 2012)
Probabilistic Graphical Models
For a target system…
…capture the process structure (not black box)
…and quantify the uncertainty
Why use models?
• Increases understanding
• Supports decision making
• Use new data and evaluation to improve over time
Native Fish Example
A local river with tree-lined banks is known to contain native fish populations, which need to be conserved. Parts of the river pass through croplands, and parts are susceptible to drought conditions. Pesticides are known to be used on the crops. Rainfall helps native fish populations by maintaining water flow, which increases habitat suitability as well as connectivity between different habitat areas. However, rain can also wash pesticides that are dangerous to fish from the croplands into the river. There is concern that the trees and native fish will be affected by drought conditions and crop pesticides.
See http://bayesianintelligence.com/publications/TR2010_3_NativeFish.pdf
Bayesian Networks - Definition
A graph in which the following holds:
1. A set of random variables = nodes in network
2. A set of directed arcs connects pairs of nodes
3. Each node has a conditional probability table (CPT) that quantifies the effects the parent nodes have on the child node
4. It is a directed acyclic graph (DAG), i.e. no directed cycles
Structure represents the causal process.
Each row of a CPT sums to 1.
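The four parts of the definition can be written down directly as data plus two checks. The sketch below is an illustration only: the three-node fragment is loosely inspired by the Native Fish story, and every node name, state and probability in it is an assumption, not a value from the published model.

```python
from itertools import product

# Hypothetical three-node fragment; names, states and numbers are illustrative.
STATES = {
    "Rainfall": ["low", "high"],
    "PesticideUse": ["low", "high"],
    "PesticideInRiver": ["low", "high"],
}

# Directed arcs are stored as a parent list per node.
PARENTS = {
    "Rainfall": [],
    "PesticideUse": [],
    "PesticideInRiver": ["PesticideUse", "Rainfall"],
}

# CPTs: one row per combination of parent states, one entry per child state.
CPT = {
    "Rainfall": {(): [0.7, 0.3]},
    "PesticideUse": {(): [0.6, 0.4]},
    "PesticideInRiver": {
        ("low", "low"): [0.95, 0.05],
        ("low", "high"): [0.80, 0.20],
        ("high", "low"): [0.70, 0.30],
        ("high", "high"): [0.20, 0.80],
    },
}


def is_acyclic(parents):
    """Return True if the parent lists describe a DAG (no directed cycles)."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in visiting:  # reached a node already on the current path
            return False
        visiting.add(node)
        ok = all(visit(p) for p in parents[node])
        visiting.discard(node)
        if ok:
            done.add(node)
        return ok

    return all(visit(n) for n in parents)


def cpts_are_valid(states, parents, cpt, tol=1e-9):
    """Check each CPT has a row per parent combination and each row sums to 1."""
    for node, table in cpt.items():
        expected_rows = set(product(*(states[p] for p in parents[node])))
        if set(table) != expected_rows:
            return False
        for row in table.values():
            if len(row) != len(states[node]) or abs(sum(row) - 1.0) > tol:
                return False
    return True


print(is_acyclic(PARENTS), cpts_are_valid(STATES, PARENTS, CPT))
```

Storing parents rather than children keeps each CPT row key aligned with the order of that node's parent list, which is the one convention the validity check relies on.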
[Figures: screen shots from the Netica BN software. Panel labels: worst case; best case; before you know anything (no evidence).]
Diagnosis: reasoning from effects to causes
Prediction: reasoning from causes to effects
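Both directions of reasoning use the same model and differ only in which variables are observed. The sketch below shows this with inference by enumeration on a hypothetical chain Smoker -> Cancer -> Cough; the structure and all probabilities are made up for illustration, and BN software such as Netica uses more efficient algorithms than brute-force enumeration.

```python
from itertools import product

# Hypothetical chain Smoker -> Cancer -> Cough; all numbers are made up.
P_SMOKER = {True: 0.3, False: 0.7}
P_CANCER_GIVEN_SMOKER = {True: 0.05, False: 0.01}  # P(Cancer=True | Smoker)
P_COUGH_GIVEN_CANCER = {True: 0.8, False: 0.2}     # P(Cough=True | Cancer)


def joint(smoker, cancer, cough):
    """Joint probability as the product of each node's CPT entry."""
    p_cancer = P_CANCER_GIVEN_SMOKER[smoker]
    p_cough = P_COUGH_GIVEN_CANCER[cancer]
    return (
        P_SMOKER[smoker]
        * (p_cancer if cancer else 1.0 - p_cancer)
        * (p_cough if cough else 1.0 - p_cough)
    )


def query(target, evidence):
    """Return P(target=True | evidence) by summing the joint distribution."""
    names = ("smoker", "cancer", "cough")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=3):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # this assignment contradicts the evidence
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


prediction = query("cough", {"smoker": True})  # from causes to effects
diagnosis = query("cancer", {"cough": True})   # from effects to causes
print(f"P(cough | smoker) = {prediction:.3f}")
print(f"P(cancer | cough) = {diagnosis:.3f}")
```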
What next?
• Have model
• Have estimates of posterior probabilities
Q. How do we use these probabilities to inform decisions (about action or interventions)?
Risk Assessment – the decision-theoretic view
Risk = Likelihood × Consequence
• Likelihood: P(Outcome | Action, Evidence)
• Consequence: Utility(Outcome | Action)
Decision making is about reducing risk or “maximising expected utility”
Decision network = BN + decisions + utilities
• Decision nodes
• Utility nodes (cost/benefits)
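Maximising expected utility means weighting each outcome's utility by its probability under an action and picking the action with the largest total. The sketch below assumes a hypothetical two-action decision; the action names, outcome probabilities and utility values are invented for illustration, and in practice the probabilities would come from the BN's posterior.

```python
def expected_utility(outcome_probs, utilities):
    """Sum P(outcome | action, evidence) * Utility(outcome | action)."""
    return sum(p * utilities[outcome] for outcome, p in outcome_probs.items())


# Hypothetical decision: treat the river for pesticide, or do nothing.
# All probabilities and utilities below are made-up illustrative values.
outcome_probs = {
    "treat": {"fish_survive": 0.9, "fish_decline": 0.1},
    "do_nothing": {"fish_survive": 0.6, "fish_decline": 0.4},
}
utilities = {
    "treat": {"fish_survive": 80.0, "fish_decline": -120.0},  # includes treatment cost
    "do_nothing": {"fish_survive": 100.0, "fish_decline": -100.0},
}

eu = {a: expected_utility(outcome_probs[a], utilities[a]) for a in outcome_probs}
best_action = max(eu, key=eu.get)
print(eu, best_action)
```

Keeping utilities conditional on the action lets the cost of an intervention be folded into the outcome values, as in Utility(Outcome | Action).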
Our methodology
1: Build a model
• E.g. Variables: patient’s details, diseases, symptoms, interventions
• Costs/benefits: e.g. $, QALY
[Diagram labels: Structure, Parameters, Test, Review; sources: Experts, Data, Literature]
2: Embed model in decision support tool
• Diagnosis
• Prognosis
• Treatment
• Risk assessment
• Prevention
[Diagram labels: Design, Build, Review, Revise]
Bayesian networks for fog forecasting (collaboration with the Australian Bureau of Meteorology)
Will there be a fog before 9am tomorrow morning?
Phase 1: (Boneh et al., 2015) In use by weather forecasters in Melbourne since 2006
Stage 2: 2013-15 research project (prototype) (Boneh et al., in preparation)
• Explicit temporal modelling, predicting time of onset and clearance
Case Study: Modelling Willows (in St. John’s River Basin, Florida)
Experimental research program 2009-2011
[Photo captions: plant and seed collection; artificial islands; transplanting; changes in water depth; May 2010; artificial ponds; greenhouse experiments; grazing; evaluating germination]
DBN Example: Modelling Willows (in St. John’s River Basin, Florida) (Wilkinson et al., 2013)
[Diagram labels: Scenarios; Current state; State transitions; Next state]
Willow life-history stage transitions: Unoccupied, Yearling, Sapling, Adult, Burnt Adult
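The step from current state to next state can be sketched as a table of P(next stage | current stage) applied to a belief over stages. This is a deliberately simplified stand-in: the probabilities below are invented, and a single fixed table ignores the scenario, management and environmental variables that condition the transitions in a dynamic BN.

```python
STAGES = ["Unoccupied", "Yearling", "Sapling", "Adult", "Burnt Adult"]

# Hypothetical P(next stage | current stage) for one scenario; rows sum to 1.
TRANSITIONS = {
    "Unoccupied":  [0.8, 0.2, 0.0, 0.0, 0.0],
    "Yearling":    [0.3, 0.0, 0.7, 0.0, 0.0],
    "Sapling":     [0.1, 0.0, 0.5, 0.4, 0.0],
    "Adult":       [0.0, 0.0, 0.0, 0.9, 0.1],
    "Burnt Adult": [0.2, 0.0, 0.0, 0.8, 0.0],
}


def step(belief):
    """Propagate a distribution over stages one time step forward."""
    nxt = [0.0] * len(STAGES)
    for i, stage in enumerate(STAGES):
        for j, p in enumerate(TRANSITIONS[stage]):
            nxt[j] += belief[i] * p
    return nxt


belief = [1.0, 0.0, 0.0, 0.0, 0.0]  # a cell known to be unoccupied at time T
for _ in range(3):
    belief = step(belief)
print(dict(zip(STAGES, (round(b, 3) for b in belief))))
```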
[Map legend, cell size = 100 m x 100 m (1 ha): Canals; Cattail; Grass Sedge Marsh; Sawgrass; Herbaceous Marsh; Willow Swamp; Mixed Shrub; Tree Island; Hardwood Swamp; Open Water; Levees]
Architecture of the Integrated Management Tool (Chee et al., 2016)
• GIS data
• ST-DOOBN: State-Transition Dynamic OO Bayesian Network
• Seed Production & Dispersal Spatial OO Bayesian Network
Managing Complexity: Object-oriented Modeling
[Diagram node labels:]
• Management and environment: Enough_Bare_Ground, Burn_Decision, Burn_Intensity, Vegetation, Canal_or_Center, Soil_Type, BurnEffect_on_Willow, Mech_Clearing
• Precipitation: Summer_PPT, Spring_PPT, GrowingSeason_PPT
• Water: Available_Water_Spring, Available_Water_Survival, Available_Water_GrowingSeas, Available_Water_Germination
• State at T and T+1: Level of Cover (T), Level of Cover (T+1), Stage (T), Stage (T+1), Rooted_Basal_Stem_Diameter (T), Rooted_Basal_Stem_Diameter (T+1)
• Seeds and seedlings: Seed_Production, Seed_Availability, Proportion_Germinating, NumberGerminating, Seedling_Survival_Proportion, NumberSurviving
• Transitions: UnOccupied_Transition, Yearling_Transition, NonInterv_Yearling_Transition, Sapling_Transition, Adult_Transition, Burnt_Adult_Transition
Modelling Seed Production
SeedsPerStem = I * F * S
where
I is the number of inflorescences
F is the number of fruits per inflorescence
S is the number of seeds per fruit
SeedsPerHA = SeedsPerStem * NumberOfStemsPerHA
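The two seed production formulas translate directly into code. A minimal sketch: the function and parameter names are illustrative, and the input numbers are placeholders chosen only to exercise the formulas, not measured values.

```python
def seeds_per_stem(inflorescences: float, fruits_per_inflorescence: float,
                   seeds_per_fruit: float) -> float:
    """SeedsPerStem = I * F * S."""
    return inflorescences * fruits_per_inflorescence * seeds_per_fruit


def seeds_per_ha(per_stem: float, stems_per_ha: float) -> float:
    """SeedsPerHA = SeedsPerStem * NumberOfStemsPerHA."""
    return per_stem * stems_per_ha


# Placeholder inputs, not field data.
per_stem = seeds_per_stem(inflorescences=10, fruits_per_inflorescence=20, seeds_per_fruit=5)
per_ha = seeds_per_ha(per_stem, stems_per_ha=100)
print(per_stem, per_ha)
```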
Application of BNs for complex environmental management: Western Grasslands Reserves
(DSE Project 2012-2013) (Sinclair et al., in preparation)
• 10,000 ha to be restored to native grasslands over 10-20 years
• Task: build a dynamic BN to evaluate “what-if” scenarios over 20 years
– a range of management strategies
– for a variety of land types
– explicitly representing costs and environmental values
BNs for Risk assessment for Log Exports in NZ
Collaboration with SCION (NZ timber research)
[Workflow diagram components, with component type and ID:]
• Potential sink population prevalence (script/BN 2.2.5)
• Activity (BN 2.2.4)
• Flight conditions (BN 2.2.2)
• Deadwood (simulation 2.1.3)
• Dispersal kernel (simulation 2.1.2)
• Meteorology (GIS 1.1.1)
• Forestry history (GIS 1.1.3)
• Topography (GIS 1.1.2)
• Pest distribution (GIS 1.1.5)
• Source population prevalence (GIS 1.3.3)
• Source population prevalence (BN 2.2.3)
• Potential sink population prevalence (GIS 1.3.5)
• Temperature dependent development (simulation 2.1.1)
• Temperature dependent development (GIS 1.2.1)
• Local population prevalence (BN 2.2.6)
• Local population prevalence (GIS 1.3.6)
• Flight conditions (GIS 1.3.2)
• Activity (GIS 1.3.4)
• Deadwood (GIS 1.2.3)
• Dispersal kernel (GIS 1.2.2)
• Log pest prevalence (Script 2.3.2)
• Log pest prevalence (DB 1.4.2)
• Log batch pest prevalence (Script 2.3.3)
• Log batch pest prevalence (DB 1.4.3)
• Forestry history (log tag DB 1.1.6)
• Forest productivity (GIS 1.1.4)
• Mature adults (BN 2.2.1)
• Mature adults (GIS 1.3.1)
• Pest infestation rate (BN 2.3.1)
• Pest infestation rate (DB 1.4.1)
BayesWatch: BNs for anomaly detection in tracking (collaboration with DSTO) (Mascaro et al., 2014)
• Task: Detect anomalous behaviour of vessels, cars, pedestrians
• Originally used AIS data from vessels in Sydney Harbour
• Apply BN learning to build models of behaviour
• Combines Time series BN + Track Summary BN
• Use metrics to assess anomalous tracks
Modelling bushfire prevention & suppression (Penman et al., 2015)
(Screen shot from the GeNIe BN software)
Medical Risk Assessment: Heart Disease (Nicholson et al., 2008; Flores et al., 2011)
Simple BN model: NZ apples Import Risk Assessment (Wintle et al., 2014). Example of ‘what-if’ reasoning
[Panel labels: Australia’s assumptions; NZ assumptions]
NZ Apples BN (Wintle et al., 2014). Explicit handling of quantities (of product)
(Screen shot from the AgenaRisk BN software)
Bayesian modelling overview
• BN Models
• Data Mining (CaMML, Chordalysis, Snob)
• Expert Elicitation
• Existing models
• Evaluation
• Decision support tools
• Advanced modelling (dynamic/time series, object oriented)
Current research
• Full OOBN framework and methodology
• Learning models with unobserved variables
• Online Delphi-based expert elicitation
• Visualisation of probabilistic outputs