1
Verifying Autonomous Planning Systems
Even the best-laid plans need to be verified
Prepared for the 2005 Software Assurance Symposium (SAS)
DS1, MSL, EO1
Rajeev Joshi
Gordon Cucullu
Gerard Holzmann
Benjamin Smith
Margaret Smith (PI)
Affiliation: Jet Propulsion Laboratory
2
Overview
• Goal: Demonstrate the use of model checking, specifically the SPIN model checker, to retire a significant class of risks associated with the use of Autonomous Planners (AP) on missions.
  – Provide tangible results to a mission using AP technology.
  – Leverage the work throughout NASA.
• Back in FY04 …
  – Selected a specific risk from among a set of candidates: develop a better and more thorough method for testing input models to planners, to reduce the likelihood that an input model produces undesirable plans.
  – Progress in FY04:
    • On a toy problem, demonstrated how abstracting the timeline can increase the size of problems Spin can check thoroughly.
    • On a larger problem (the DS4/ST4 Champollion models), demonstrated that Spin can decide whether an input model contains undesirable plans.
3
How to get from A to B?
Definition of an undesirable plan, Part 1: A plan that compromises mission goals by wasting resources.
4
How to get from A to B?
Definition of an undesirable plan, Part 2: Loss of mission.
5
Testing

Empirical testing (current approach):
  input model → plans (a sample of ~100 plans) → manually inspect plans to identify undesirable plans.
  If an undesirable plan is found, adjust the model to exclude it and repeat; when all inspected plans are desirable, end testing.
  This approach is limited by the time required to inspect sample plans.

Testing with the SPIN model checker (our work):
  input model + requirements → Promela model + properties of desirable plans → Spin analyzes billions of plans.
  If Spin reports an undesirable plan (as an error trace), adjust the model to exclude it and repeat; when there are no errors, end testing.
  This approach is limited only by memory size and processor speed.
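The contrast on this slide can be made concrete with a toy example (ours, not from the original work): randomly sampling around a hundred plans can easily miss a rare undesirable plan, while exhaustive enumeration, which is in effect what Spin does over the model's state space, cannot.

```python
import itertools
import random

# Toy illustration (our own, not from the slides): a "plan" is a sequence of
# 10 binary scheduling choices, and the single undesirable plan is the
# all-ones sequence (1 plan out of 1024).

def undesirable(plan):
    return all(plan)

ALL_PLANS = list(itertools.product([0, 1], repeat=10))  # 1024 candidate plans

def empirical_testing(sample_size, seed=0):
    """Inspect a random sample of plans, as in the current approach."""
    rng = random.Random(seed)
    return [p for p in rng.sample(ALL_PLANS, sample_size) if undesirable(p)]

def exhaustive_checking():
    """Examine every plan, as a model checker effectively does."""
    return [p for p in ALL_PLANS if undesirable(p)]

# Sampling ~100 of 1024 plans finds the bad plan only about 10% of the time;
# exhaustive checking always finds it.
print(len(empirical_testing(100)), len(exhaustive_checking()))
```

The gap widens dramatically as the plan space grows: the fraction of plans a fixed-size sample covers shrinks, while an exhaustive search still visits every state it can fit in memory.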
SAS_05_Verifying_Autonomous_Planners_Smith
6
Spin Model Checker
• Logic model checker used to formally verify the correctness of distributed software systems.
• Development began in 1980 at Bell Labs.
  – publicly distributed with source code since 1991
• The most widely used logic model checker, with over 10,000 users worldwide.
• Recipient of the ACM Software System Award for 2001 (presented in 2002).
• Verifies software using a meta-language called Promela (Process Meta Language).
  – requires that the system being verified be expressed in Promela
• SPIN can flag deadlocks, unspecified receptions, logical incompleteness, race conditions, and unwarranted assumptions about the relative speeds of processes.
  – the tool can also check any temporal logic property, including both liveness and safety properties
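Spin's core mechanism is an exhaustive search of a model's reachable state space for states that violate a property. As a purely illustrative sketch (ours, written in Python rather than Promela, and vastly simpler than Spin itself), here is a toy explicit-state search that finds the classic two-lock deadlock:

```python
from collections import deque

# Toy explicit-state model checker. Model: two processes each acquire locks
# A and B, but in opposite order -- the classic recipe for deadlock.
# A state is (pc1, pc2, lockA, lockB): two program counters plus the holder
# of each lock (None if free).

def successors(state):
    pc1, pc2, lockA, lockB = state
    succs = []
    # process p1: acquire A, then B, then release both
    if pc1 == 0 and lockA is None:
        succs.append((1, pc2, "p1", lockB))
    elif pc1 == 1 and lockB is None:
        succs.append((2, pc2, lockA, "p1"))
    elif pc1 == 2:                      # p1 holds both locks here
        succs.append((0, pc2, None, None))
    # process p2: acquire B, then A, then release both
    if pc2 == 0 and lockB is None:
        succs.append((pc1, 1, lockA, "p2"))
    elif pc2 == 1 and lockA is None:
        succs.append((pc1, 2, "p2", lockB))
    elif pc2 == 2:                      # p2 holds both locks here
        succs.append((pc1, 0, None, None))
    return succs

def find_deadlocks(initial):
    """Breadth-first search of all reachable states; a deadlock is a state
    with no successors."""
    seen = {initial}
    frontier = deque([initial])
    deadlocks = []
    while frontier:
        state = frontier.popleft()
        succs = successors(state)
        if not succs:
            deadlocks.append(state)
        for nxt in succs:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return deadlocks

print(find_deadlocks((0, 0, None, None)))  # -> [(1, 1, 'p1', 'p2')]
```

Spin applies the same idea to Promela models, adding techniques such as partial-order reduction and compact state storage to cope with state spaces of billions of states.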
7
CASPER / ASPEN
We chose to focus on the CASPER / ASPEN planner.Why? Proximity – software is developed at JPL Applications – Earth Observer 1 (EO1), 3 Corner Sat (3CS)
launched in late 2004
Facts: ASPEN: Automated Scheduling and Planning ENvironment:
• an expressive modeling language• a resource management system• a temporal reasoning system• and a graphical interface
CASPER: Continuous Activity Scheduling Planning Execution and Re-planning
• Supports continuous modification and updating of a current working plan in light of changing operating context
8
EO1 / ASE Architecture
diagram reference: Rob Sherwood, et al., "The ST6 Autonomous Sciencecraft Experiment," IEEE Aerospace Conference, Big Sky, Montana, March 5–12, 2005.
SCL converts CASPER-scheduled activities to low-level spacecraft commands, at the same level as ground-generated commands.
9
Focus of work in FY05
• Apply this work to a mission (the Earth Observer 1 mission)
  – Adaptation of CASPER for EO1 was performed at JPL
  – Testing of the EO1 adaptation was performed at JPL
• Build the tools necessary to make our technique scale
  – The Spin models constructed in FY04 were built by hand
  – We need tools to automate the generation of Spin models from AP input models, in order to:
    • improve the fidelity of the Spin models (avoid human translation errors)
    • make our technique available to testers who may not be experts in both ASPEN's input language and Spin's Promela language
    • speed up the whole process – testing of each EO1 build was on the order of months
10
EO1 Mission
First satellite in NASA's New Millennium Earth Observing series.
• Technology demonstration: the Hyperion hyper-spectral instrument.
• In extended mission since 2001; the CASPER Autonomous Sciencecraft Experiment has been in test operation since April 2004.
• Onboard science algorithms analyze images to identify static features and detect changes relative to previous observations.
• CASPER then plans downlink and imaging activities
  – enables retargeting of imaging on a subsequent orbit cycle to identify and capture the full extent of a changed land feature due to a flood, tornado, volcano, freeze/thaw, etc.
(Figure: onboard autonomy loop. Science algorithms produce science goals; CASPER, using its input model, produces a plan; the execution manager carries it out, yielding new science observations. Inset images: La Plata, MD before a tornado (04/24/02) and after (05/01/02).)
11
EO1 model elements
• The planning window is 5 hours in length.
• Goals are satisfied by performing Activities.
  – we will need to analyze many goal sets over the input model
  – each goal set will have its own correctness properties
• Activities (and model complexity) are constrained by Resource availability and State variables.
• Activities can change the values of State variables only if no other activity holds the lock and the state transition is legal.
Model size, DS4 vs. EO1:

                  DS4   EO1
Goals               5   ~20
Activities          4   127
State Variables     4    48
Resources           7    12
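The lock-and-legal-transition rule for state variables can be sketched as follows (a hypothetical Python rendering of ours; ASPEN's actual implementation and API differ):

```python
# Sketch (not ASPEN's actual API): an activity's write to a state variable
# is guarded by both a lock check and a legal-transition check, as the
# bullet above describes.

class StateVariable:
    def __init__(self, name, initial, legal_transitions):
        self.name = name
        self.value = initial
        self.legal = legal_transitions   # set of (from_state, to_state) pairs
        self.holder = None               # activity currently holding the lock

    def acquire(self, activity):
        if self.holder is None:
            self.holder = activity
        return self.holder == activity

    def release(self, activity):
        if self.holder == activity:
            self.holder = None

    def set(self, activity, new_value):
        """The write succeeds only if this activity holds the lock
        and the state transition is legal."""
        if self.holder != activity:
            return False
        if (self.value, new_value) not in self.legal:
            return False
        self.value = new_value
        return True

# hypothetical example: a recorder state variable
recorder = StateVariable("recorder", "standby",
                         {("standby", "record"), ("record", "standby"),
                          ("standby", "play"), ("play", "standby")})
recorder.acquire("imaging_activity")
recorder.set("imaging_activity", "record")  # succeeds: lock held, transition legal
recorder.set("other_activity", "standby")   # fails: lock held by another activity
```

Both guards matter: the lock prevents concurrent activities from interleaving writes, and the legal-transition set prevents any single activity from driving the variable through a physically impossible state change.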
12
Process, Tools, and Status (tool-chain diagram)

Main flow: an ASPEN model (.mdl) is read by the ASPEN Language Parser (ALP); the parsed ASPEN model feeds the State Machine Generator, whose state machine model feeds the Promela Generator, producing a Promela model for the Spin model checker. In parallel, an informal property is formalized with Timeedit (a tool for formalizing properties), and the formalized property is also given to Spin. If Spin finds an error, the Error Translator shows the error in terms of the input model or plan; otherwise the result is no error.

The State Machine Generator can also accept other AP input-model languages (e.g. IDEA), and the state machine model can feed other analysis tools.

Key:
  – completed tools, non-OSMA funds
  – completed tools, OSMA funds
  – tool in development, OSMA funds
  – future tool, OSMA funds
13
Preparing for EO1 (Correctness Properties)
Background:
  – EO1 completes a low-Earth orbit every 90 minutes, passing over the same location every 16 days.
  – EO1 has two imaging instruments (Hyperion and the Advanced Land Imager, ALI) that can write to the solid-state recorder (WARP).

Property: Two simultaneous 'start write' commands to the WARP are not permitted.

Timeline:

• Expressing this property as a constraint in the input model would over-constrain the model.
• Assurance that this property is not violated is currently left to testing.
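Over a concrete schedule, this property amounts to checking that no two write intervals to the WARP overlap in time. A small sketch (our own simplified activity representation, not ASPEN's):

```python
from itertools import combinations

# Sketch of the 'no simultaneous start write' property as an interval-overlap
# check, treating each write as a (instrument, start, end) interval -- a
# simplified stand-in for ASPEN's activity representation.

def overlapping_writes(write_activities):
    """Return every pair of WARP write intervals that overlap in time."""
    acts = sorted(write_activities, key=lambda a: a[1])  # sort by start time
    violations = []
    for a, b in combinations(acts, 2):
        if b[1] < a[2]:       # b starts before a ends (a starts no later than b)
            violations.append((a, b))
    return violations

# hypothetical schedule: Hyperion and ALI both writing to the WARP
schedule = [("hyperion", 0, 40), ("ali", 30, 60), ("ali", 70, 90)]
print(overlapping_writes(schedule))
# -> [(('hyperion', 0, 40), ('ali', 30, 60))]
```

Spin performs the analogous check not on one schedule but on every schedule the input model can produce, which is why it can catch violations that a test sample misses.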
14
Preparing for EO1 (Correctness Properties)
Background:
  – The solid-state recorder (WARP) has three states: "record," "standby," and "play."
  – To get to the "play" state, the WARP must pass from "record" through "standby" to "play."

Property: Transitions from "record" to "play" must always pass through "standby."

Timeline:
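Over an explicit trace of WARP states, this property reduces to checking each consecutive transition against the legal set; Spin checks the same kind of safety property over every reachable trace of the model. A minimal sketch (state names from the slide, trace format ours):

```python
# Check that a trace of WARP states never transitions directly from
# "record" to "play": every move between them must pass through "standby".

LEGAL = {("standby", "record"), ("record", "standby"),
         ("standby", "play"), ("play", "standby")}

def first_illegal_transition(trace):
    """Return the first illegal (from, to) transition, or None if the trace is OK."""
    for frm, to in zip(trace, trace[1:]):
        if frm != to and (frm, to) not in LEGAL:
            return (frm, to)
    return None

good = ["standby", "record", "standby", "play", "standby"]
bad = ["standby", "record", "play"]           # skips "standby"
print(first_illegal_transition(good))  # -> None
print(first_illegal_transition(bad))   # -> ('record', 'play')
```

In Spin, the counterexample to such a property comes back as an error trace: exactly the offending sequence of states, which the Error Translator in our tool chain maps back to the input model.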
15
Preparing for EO1 (niks)
• Coverage of the SPIN model checker on a large problem depends on:
  – the processor speed of the CPU on which the model checking is performed,
  – the amount of main memory available on the computer,
  – the SPIN algorithms selected by the user,
  – the skill of the modeler in expressing the problem in Promela,
  – and the skill of the modeler in deriving appropriate and sound abstractions.
• For large problems (e.g. EO1) a high-performance computer is required (enter the machine we named "niks").
• Niks specifications: Dell Precision 470n, dual processors (3.2 GHz Intel Xeon with 64-bit extension), 8 GB of main memory.
• Niks software:
  – OS: Red Hat Linux Fedora Core 2, 64-bit version
  – SPIN model-checking software, extended for 64 bits
  – Xspin (www.spinroot.com) graphical user interface for Spin

Note: the 64-bit version of Spin was developed for us on niks (though not with OSMA SARP funds); it can address all 8 GB of main memory, whereas a 32-bit version can only address 4 GB.
16
Preparing for EO1 (Test cases)
• Test cases are needed to verify that our tools generate high-fidelity Promela models.
• We have developed a set of Promela models and properties by hand from the CASPER input-model test suite.
• This lets us compare model-checking results between the hand-built models and our auto-generated Promela models.
17
Challenges
• The ASPEN language structure is not rigorously defined in the documentation.
  – parser debugging may have to be revisited as we obtain new ASPEN models
• Automated abstraction
  – resource abstraction can help to reduce the search space and thus further optimize the model checking process
• Schedule
  – the EO1 extended mission may end this summer, possibly before we complete our tools
18
Conclusions
• We have made significant progress in defining and implementing the tools necessary to auto-generate Spin models from ASPEN models.
• These tools will be applied to model check the EO1 input models as well as other ASPEN models (e.g. 3CS and future applications).
• Next steps:
  – complete and test our tool suite
  – work with the EO1 project to develop a suite of EO1 properties to check
  – perform logic model checking and report results

Our goal: Develop a better and more thorough method for testing input models to planners, to reduce the likelihood that an input model produces undesirable plans.