
This article was downloaded by: [David B. Kaber]
On: 22 October 2012, At: 11:26
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

The International Journal of Aviation Psychology
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hiap20

An Accessible Cognitive Modeling Tool for Evaluation of Pilot–Automation Interaction
Guk-Ho Gil a & David B. Kaber a
a Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina

Version of record first published: 11 Oct 2012.

To cite this article: Guk-Ho Gil & David B. Kaber (2012): An Accessible Cognitive Modeling Tool for Evaluation of Pilot–Automation Interaction, The International Journal of Aviation Psychology, 22:4, 319-342

To link to this article: http://dx.doi.org/10.1080/10508414.2012.718236

Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.

The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

THE INTERNATIONAL JOURNAL OF AVIATION PSYCHOLOGY, 22(4), 319–342
Copyright © 2012 Taylor & Francis Group, LLC
ISSN: 1050-8414 print / 1532-7108 online
DOI: 10.1080/10508414.2012.718236

An Accessible Cognitive Modeling Tool for Evaluation of Pilot–Automation Interaction

Guk-Ho Gil and David B. Kaber
Edwards P. Fitts Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina

Various cognitive modeling techniques and tools have been developed to support description and prediction of human behavior in complex systems. GOMS (Goals, Operators, Methods and Selection rules) modeling methods have been used in human–computer interaction (HCI) analysis for many years and are considered easy to learn. GOMS has several limitations, including representing only expert behavior in tasks and not supporting detailed modeling of visual and motor operations or parallel processing. Another limitation is that operation time estimates are deterministic. This research developed an enhanced GOMS language and computational cognitive modeling tool to address the existing GOMS limitations to aid cockpit automation designers in assessing the potential for automation-induced pilot performance problems. Output of the tool for a specific flight and automation use scenario was compared with experiment data for validation purposes. Results demonstrated significant correlations of model-based pilot performance and cognitive workload predictions with observations on pilots using a flight simulator. The new enhanced cognitive modeling approach is expected to provide accurate explanations and predictions of user behaviors during the design of complex systems and interfaces in various domains involving interactive task performance.

Correspondence should be sent to David B. Kaber, Edwards P. Fitts Department of Industrial and Systems Engineering, North Carolina State University, 111 Lampe Drive, 400 Daniels Hall, Raleigh, NC 27695–7906. E-mail: [email protected]

In a recent report on causes of fatal aviation accidents, more than 88% of accidents were attributed to pilot error (Boeing, 2009). Major types of errors included loss of control (35%) and runway excursion (12%). Such errors have previously been attributed to improper equipment setup by pilots, automation states biasing pilot decision making, failure to monitor systems, failures to heed automatic alarms, pilot responses to false alarms, and loss of skill proficiency (cf. Wiener & Curry, 1980). In general, errors in complex system control are of two types: (a) person (active), and (b) system (latent) errors (Foyle & Hooey, 2008; Reason, 2000). Active errors are related to the actions of operators (the pilot). For example, in direct control of a cockpit interface, pilots can flip the wrong switch or ignore a warning tone. On the other hand, latent errors are due to actions that are separated in space and time from the actions of an operator directly interfacing with the system. Underlying causes of latent errors include poor system design, faulty maintenance, incorrect installation, and bad management decisions (Foyle & Hooey, 2008).

Operator training is the most common approach to address active errors; however, latent errors cannot be resolved in this way. Because causes of latent errors include poor design or interface interaction defects, they must be addressed during the design process. Design functionality and usability must be ensured prior to releasing a system to a user to support performance and information processing. One approach to effective design is to use human factors research methods in the conceptual design process (Sanders & McCormick, 1993). Human factors methods for evaluating design concepts can be generally categorized as two types: (a) human-in-the-loop (HITL) simulation and testing, and (b) human performance modeling (HPM). HITL simulation uses real operators to collect empirical data that can be used to predict system performance under various conditions. This method is particularly useful for design validation and demonstrations of system applicability to real-world operations. However, the method also has several weaknesses, including high cost in prototype development, difficulty in recruiting experts for design input and evaluation, slow progress of experimentation, and the high cost of testing. In response to these limitations, Card, Moran, and Newell (1983) suggested the concept of engineering human information processing (HIP) models for design usability assessment. Today, this concept is known as HPM, and is a major part of human factors research and practice. The advantages of HPM include description of human performance capacities and limitations without the use of elaborate designed experiments. Consequently, the HPM approach might reduce design cycle time and increase the possibility of selecting a better design concept before production deadlines are reached. HPM can also be used in a complementary manner with experimental analysis or HITL simulations for usability assessment when only small participant samples are available.

Early forms of HPM techniques included the Goals, Operators, Methods, and Selection Rules (GOMS) cognitive modeling technique developed by Card et al. (1983). GOMS describes user behavior with interactive systems and has been used to evaluate system design from usability and performance perspectives. However, the modeling approach is limited to representing expert behavior in tasks that can be decomposed into sets of procedures. GOMS models also do not support representation of basic visual (e.g., foveal, peripheral) and motor behaviors, as well as parallel processing of these operations. Another major limitation is that time estimates for operations coded in models for predicting task completion times are deterministic. Therefore, model output cannot accurately represent individual differences in performance or the stochastic nature of human behavior. Card et al. suggested that variances in operation times could be determined by observation of target tasks and application in model coding.

To address these limitations and to extend the use of GOMS for accurate HPM and supporting effective system design, one of the objectives of this research was to develop an enhanced GOMS language (E-GOMSL). The primary motivation for focusing on GOMS versus other cognitive modeling languages, such as Adaptive Control of Thought–Rational (ACT–R; Anderson et al., 2004) or the Executive-Process/Interactive Control (EPIC) modeling language (Kieras & Meyer, 1997), is the learnability and usability of the GOMS method by novices or designers who might not have time to learn a complex cognitive modeling language. Furthermore, GOMS, especially Natural GOMS Language (NGOMSL; Kieras, 1997), provides a flexible format for modeling task methods and allows for extension of its capabilities.

The method we developed (E-GOMSL) supports detailed description of low-level visual and motor operations in cognitive models, coding of parallel execution of such operations, the use of stochastic operation time variables, and flexibility in coding different kinds of operations (commonly referred to as “operators” in GOMS models) that occur in various work domains, such as aircraft piloting. We also developed a computational cognitive modeling tool, the E-GOMSL Tool, to aid designers in developing E-GOMSL cognitive models and assessing the potential for pilot performance problems with automated systems or computer interfaces in the cockpit. Previous research (Foyle & Hooey, 2008), including some studies through NASA Ames, has used existing cognitive modeling techniques for describing pilot performance in various aviation domains. We reviewed these studies and techniques and developed our own modeling method to represent specific pilot behaviors and variations among pilots. This is a novel approach relative to the prior work. The E-GOMSL tool was developed to support system interface prototyping, operator activity flowcharting, cognitive model coding and execution, simulation of user behaviors in interaction with automated systems or computer interfaces, and outcomes reporting. It was expected that the tool would allow for accurate explanation and prediction of user behaviors during the design of complex systems and interfaces.

To assess the validity of the modeling language and tool, we applied the new technologies to an aircraft automation design evaluation. A flight simulator experiment was conducted with a contemporary form of cockpit automation, a continuous descent approach (CDA) tool for flight route replanning, to generate a real human–automation performance data set. A cognitive task analysis (CTA) was conducted to identify expert pilot behaviors in interacting with the automation and to generate information for model development purposes. An E-GOMSL model of pilot behavior with the CDA tool was developed and simulation outputs were compared with experiment data for demonstration of the applicability of the E-GOMSL tool to the design processes.

DEFINITION OF E-GOMS LANGUAGE

The GOMSL computational cognitive modeling method (Kieras, 2006) and NGOMSL task analysis technique (Kieras & Polson, 1999) were selected as bases for development of the E-GOMS Language. To incorporate novel low-level visual and motor operators in E-GOMSL, two approaches were considered. NGOMSL provides flexibility for users to create and apply their own operators for naturalistic description of human cognitive behavior. In contrast, detailed, elaborate cognitive architectures can serve as frameworks for developing explicit cognitive modeling languages. With this in mind, we adopted the flexibility of NGOMSL for creating new operators and used the HIP model developed by Wickens and Hollands (1999) as a cognitive architecture to support and constrain E-GOMSL model coding. Each new operator in E-GOMSL was defined with four properties based on Wickens’s HIP model: the cognitive processing channel used, control objects, operator syntax, and operator times. Processing channels identified in the HIP model included auditory, visual, cognitive, and motor (hand, foot, and vocal). Each channel has control objects. For example, the cognitive channel includes flow of control, parallel processing, working memory (WM) use, transactions between WM and long-term memory (LTM), and decision making as control objects. E-GOMSL operator syntax is similar to GOMSL and NGOMSL operator syntax. The new operator set is primarily based on NGOMSL operators, as originally defined by Kieras (1997); however, the control structure of E-GOMSL models follows GOMSL models, to support compilation and model execution with a simulation engine.
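The four operator properties can be pictured as a simple record. The following is only an illustrative sketch in Python, not the E-GOMSL tool's own representation. The class name `Operator`, its field names, and the dictionary `OPERATORS` are assumptions made for illustration; the example entries use the channel, control object, syntax, and log-normal parameter values listed in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical record holding the four properties of an E-GOMSL operator."""
    name: str
    channel: str         # processing channel (auditory, visual, cognitive, motor)
    control_object: str  # control object within that channel
    syntax: str          # operator syntax as written in model code
    threshold: float     # log-normal operator time parameters (Table 1)
    scale: float
    shape: float


# Example entries copied from Table 1; the data structure itself is an assumption.
OPERATORS = {
    op.name: op
    for op in (
        Operator("Look_at", "visual", "visual", "Look_at <object>", 0.0, 7.30, 0.50),
        Operator("Store", "cognitive", "WM", "Store <value> under <tag>", 0.0, 7.26, 0.38),
        Operator("Press", "motor (hand)", "key", "Press <object>", 0.0, 5.84, 0.52),
    )
}

print(OPERATORS["Store"].syntax)
```

Keeping the time parameters on the same record as the channel and syntax means a simulator could look up everything it needs about an operator from its name alone.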

As the second step in E-GOMSL development, stochastic variables were defined to represent operator times in behavior models. Because computational cognitive modeling is conceptually similar to discrete event simulation of human task performance, methods used in systems simulation for representing or quantifying event processing times have been extended to cognitive modeling (Liu, Feyen, & Tsimhoni, 2006). Although statistical parameters, such as mean time and variance, for operators in GOMS can be determined based on existing data from the previous psychological and HCI literature (e.g., Anderson & Lebiere, 1998; Card et al., 1983; Laird, Newell, & Rosenbloom, 1987; Meyer & Kieras, 1997; Olson & Olson, 1990), studies have not developed operators and times describing pilot behaviors with aircraft cockpit automation interfaces.

Five operators, including Confirm, Look_at, Think_of, Store, and Recall, were defined for the aircraft cockpit, based on the HIP model properties identified earlier. Times for all these operators, as well as statistical parameters, were determined from video analysis of simulation experiment trials (described later). Video analysis was used because some of the operators for which we developed time distributions were only accessible via the video recordings (e.g., use of gaze tracking for estimation of Look_at times). Numerous observations were made on each operator as performed by multiple certified pilots in repeated tests. Among the complete set of E-GOMSL operators, only a limited number of observations could be collected on the (button) Press operator during the experiment trials; consequently, times were inferred from previous relevant literature (Olson & Olson, 1990).

The Think_of operator was adopted from the NGOMSL technique (Kieras, 1997). This operator is used to represent the process of thinking of a value for some flight parameter, based on the route plan, which is designated by a <description>. This information must also be put into working memory. The Confirm operator was created to account for pilot supervisory control behaviors. For example, in an automated aircraft cockpit, pilots often monitor automated functions during flight and check to ensure system states reflect targets. Pilots need to confirm the current status of the aircraft after each automation cycle update and intervene in control if necessary. For this reason, the Confirm operator was used for modeling supervisory behaviors in the cockpit. (The operator does not represent a combination of Look_at and Think_of in the modeling approach.)

In earlier HPM-related experiments, time distributions on simple operators, like reading dials, looking up values, performing simple arithmetic, and entering data by keystroke, have appeared to be skewed (Card et al., 1983, p. 85) and nonnormal. For this reason, in generating the stochastic variables to represent operators in E-GOMSL model code, we used the log-normal distribution. The log-normal is a skewed distribution with a low mean and large variance based on all positive data (Limpert, Stahel, & Abbt, 2001). The distribution has also been found to be representative of human psychomotor task performance times (e.g., Fitts, 1954). The inverse cumulative distribution function (CDF) for the log-normal distribution is shown in Equation 1 (Cohen, 1988).

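$$
\left[
\begin{array}{ll}
\exp\left(\Phi^{-1}(x)\,\sigma + \theta\right) + \gamma & 0 \le x \le 1 \\
0 & x < 0 \\
1 & x > 1
\end{array}
\right]
\tag{1}
$$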
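Equation 1 lends itself to inverse-transform sampling: a uniform random number is mapped through the standard normal quantile function, scaled, exponentiated, and shifted. The sketch below is a minimal Python illustration of that idea, not the tool's own code. It names the parameters by role rather than by symbol (shape multiplies the normal quantile, scale is added inside the exponent, threshold is added outside it), and the function names are assumptions. The example values are those listed for Think_of in Table 1; reading the resulting times as milliseconds is also an assumption.

```python
import math
import random
from statistics import NormalDist


def lognormal_inverse_cdf(p, threshold, scale, shape):
    """Operator time at cumulative probability p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return math.exp(z * shape + scale) + threshold


def sample_operator_time(rng, threshold, scale, shape):
    """Draw one stochastic operator time by inverse-transform sampling."""
    p = rng.random()
    while p == 0.0:  # random() can return exactly 0.0, which has no quantile
        p = rng.random()
    return lognormal_inverse_cdf(p, threshold, scale, shape)


# Think_of parameters from Table 1 (threshold 0, scale 7.17, shape 0.50).
# At p = 0.5 the normal quantile is 0, so the median is exp(scale) + threshold.
median = lognormal_inverse_cdf(0.5, threshold=0.0, scale=7.17, shape=0.50)
draw = sample_operator_time(random.Random(1), 0.0, 7.17, 0.50)
print(round(median), round(draw))
```

Because the shape parameter only stretches the distribution around the median, a larger shape value widens the spread of simulated times without moving the median.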

Table 1 shows examples of E-GOMSL operators, time estimates, and the operator distribution parameters determined based on our experimental data and that of previous studies.

TABLE 1
Examples of Enhanced Goals, Operators, Methods, and Selection Rules (E-GOMS) Language Operators, Time Estimates, and Log-Normal Distribution Parameters

| HIP Model Construct | Channel | Control Object | Operator Syntax | θ Threshold | γ Scale | σ Shape |
|---|---|---|---|---|---|---|
| Cognitive | Cognitive | | Think_of <description> | 0 | 7.17 | 0.50 |
| | | | Confirm <description> | 0 | 7.35 | 0.43 |
| Perception | Visual | Visual | Look_at <object> | 0 | 7.30 | 0.50 |
| WM | Cognitive | WM | Store <value> under <tag> | 0 | 7.26 | 0.38 |
| LTM | Cognitive | WM from LTM | Recall_LTM_item_whose <property> is <value>, and_store_under <tag> | 0 | 7.33 | 0.50 |
| Response selection | Motor hand | Key | Press <object> | 0 | 5.84 | 0.52 |

Note. WM = working memory; LTM = long-term memory.

With stochastic variables defined for each operator in the E-GOMSL, an overall stochastic time estimate for pilot performance of an interactive task with automation (i.e., execution time) can be calculated as the summation of all operator time estimates in an E-GOMSL model capturing the target task behaviors. The time estimates can be considered to represent the range of pilot performance, including normal (average), “superskill,” and “slacker” behavior (considering performance labels used in formalized human performance rating systems; e.g., the Westinghouse Technique).
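The summation of stochastic operator times can be sketched as a small Monte Carlo simulation. The Python code below is an illustration only, under several assumptions: the operators run strictly in sequence (no parallel execution), the scale and shape values from Table 1 are the mean and standard deviation of log time, and the four-operator task is hypothetical. The function names are likewise invented for this sketch.

```python
import random
import statistics

# (scale, shape) pairs from Table 1; every listed threshold is 0, so it is omitted.
OPERATOR_PARAMS = {
    "Look_at": (7.30, 0.50),
    "Think_of": (7.17, 0.50),
    "Confirm": (7.35, 0.43),
    "Press": (5.84, 0.52),
}


def simulate_execution_time(sequence, rng):
    """Sum one random log-normal draw per operator in the sequence."""
    return sum(rng.lognormvariate(*OPERATOR_PARAMS[op]) for op in sequence)


def replicate(sequence, replications, seed=0):
    """Repeat the simulation to obtain a distribution of execution times."""
    rng = random.Random(seed)
    return [simulate_execution_time(sequence, rng) for _ in range(replications)]


# Hypothetical four-step task, for illustration only.
task = ["Look_at", "Think_of", "Press", "Confirm"]
times = replicate(task, replications=2000)
cuts = statistics.quantiles(times, n=20)  # 19 cut points: 5th, 10th, ..., 95th percentile
print(f"mean {statistics.mean(times):.0f}, 5th to 95th percentile {cuts[0]:.0f} to {cuts[-1]:.0f}")
```

The lower and upper percentiles of the replicated times give one way to read the fast and slow ends of the performance range alongside the average.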

DEVELOPMENT OF E-GOMSL MODELING TOOL

Most contemporary HPM tools include multiple functionalities, such as task analysis capability, prototyping interfaces, user modeling capability, model compilers, simulation engines, and results report generators. Such HPM tools include iGEN (Emmerson, 2000), Micro SAINT Sharp (Bloechle & Schunk, 2003; Schunk, 2000), MIDAS (Man–machine Integration Design and Analysis System; Gore, Hooey, Foyle, & Scott-Nash, 2008), the enhanced QN-MHP with additional graphical interfaces (Wu & Liu, 2007), and CogTool (John, Prevas, Salvucci, & Koedinger, 2004; Teo & John, 2008). As a basis for the E-GOMSL tool development, we focused on the QN-MHP and CogTool because they currently provide the most advanced capabilities for developing cognitive models. In general, the E-GOMSL modeling tool integrates and expands on the features of each of these two tools. E-GOMSL models are based on a preliminary CTA and the tool integrates a task flowcharting capability. The workload analysis approach of the QN-MHP was adapted for the E-GOMSL tool to represent variations in user cognitive load (when the task simulation function is used). The E-GOMSL tool integrates the concepts for visualizing HIP developed as part of the QN-MHP and CogTool, including simulating user eye-gaze patterns at an interface and presenting a Gantt chart of the pattern of activity of perceptual, cognitive, and motor processors. Beyond these integrated and expanded features, new capabilities for the E-GOMSL tool include representation of cognitive action flow diagrams (AFDs), operator parallel processing capability, and a stochastic simulation processor making use of the previously defined stochastic operator variables.

Prototype

The overall concept for the E-GOMSL tool development was to provide an easy-to-learn and easy-to-use method and procedure for creating cognitive models of human behavior. To satisfy these needs, six software modules were developed, including a prototyping module, flowchart builder, flowchart to E-GOMSL translator, E-GOMSL editor, parser and compiler, simulator, and report generator. Microsoft Excel was used as a platform to develop functions to process input data from designers, including user task and interface content. Visual Basic for Applications (VBA) macros were used to develop the modeling functions.

For the E-GOMSL tool, there are three categories of input:

1. Images of task interface prototypes, which can range from sketches or drawings to electronic drawings or complete graphical renderings.

2. Outcomes of a CTA, including lists of visual and nonvisual objects (task steps, audio objects, WM and LTM contents at different stages of task performance).

3. The AFDs, which are essentially another formalism of CTA results that are compatible with the E-GOMSL tool.

The tool generates two types of outputs, including a human behavior simulation and simulation report. During the simulation visualization, a designer can see how pilot visual attention and motor control are directed based on task information processing. The target system interface is rendered as part of the visualization and an “eye” icon is overlaid on the rendering to represent pilot gaze direction, along with icons to represent the locations of the hands and feet at controls. The workload analysis graph is also presented in this same display along with the Gantt chart for identifying the distribution of user time spent on various cognitive processes. The simulation report output includes task execution time, workload analysis results (i.e., the number of chunks in WM from moment to moment in the simulation), the number of methods used in a task, the number of WM items occurring during task processing, and the number of steps completed (or operators executed). These human performance outcomes can serve as the basis for analyzing and comparing new system design concepts.

There are seven steps to transforming E-GOMSL inputs to outputs, as part of evaluation of a new system design concept. The first step is for the designer to conduct a CTA, including identification of initial system states, user goals and decisions, user tasks and methods, the sequence of tasks and user behaviors, and information requirements (including audio and visual). One useful approach to this type of analysis is goal-directed task analysis (GDTA), as defined by Endsley (1993) and demonstrated by Usher and Kaber (2000). Second, with the modeling tool, a designer loads prototype interface images and identifies visual objects for task performance (e.g., knobs and dials; see Figure 1a). Objects are identified by locating them in the interface image with the mouse, outlining the physical area of the object, and assigning it a label as a basis for model coding. Third, the designer defines nonvisual objects, including task items, audio objects (e.g., clearances from an air traffic controller to a pilot via phone), items stored in WM during task performance, and LTM items based on the outcomes of the CTA (see Figure 1b). Fourth, the designer constructs task flowcharts, based on the information from the CTA, to represent the sequence of user physical and cognitive behaviors in response to task events (see Figure 1c). The E-GOMSL tool can then be used to translate these charts to E-GOMSL code, or a designer can use the E-GOMSL editor to manually edit model code (see Figure 1d). The fifth step is to run a model. This is an internal process of the system interface that prepares the model to be compiled in the next step. The RUN step collects all the data from the second through fourth steps and reorganizes the data for compilation. The sixth step is compilation and analysis of the model. The E-GOMSL tool compiles model code from the fifth step and analyzes results. On the basis of these results, the seventh step is execution of a stochastic simulation of user task performance at the prototype interface (see Figure 1e), including visualization of user gaze behavior and patterns of information processing in the simulation dialog (see Figure 1f). After the visualization is complete, the E-GOMSL tool generates a simulation report (see Figure 1g). As previously mentioned, the outputs include task time estimates, complexity indexes, and a Gantt chart revealing the distribution of perceptual, cognitive, and motor processing throughout the task. The time estimates include simulation start time, end time, unit time, replications, and percentage of total time accounted for by each method executed in the cognitive model. The complexity indexes table includes the number of methods, the number of steps, the total number of chunks in WM, the highest number of chunks in WM at any given time, and the number of LTM transactions occurring based on the E-GOMSL model compilation.
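The complexity indexes listed above can be illustrated by counting over a model's steps. The Python sketch below is not the E-GOMSL compiler; it assumes a hypothetical model format in which each method is a list of (operator, tag) pairs, uses `Recall_LTM` as shorthand for the recall operator, and introduces a `Delete` operator (an assumption, not named in the text) so that WM contents can shrink. The method names and tags are invented for illustration.

```python
def complexity_indexes(methods):
    """Count methods, steps, WM chunks (total and peak), and LTM transactions."""
    wm = set()  # tags currently held in working memory
    steps = total_chunks = peak_chunks = ltm_transactions = 0
    for method_steps in methods.values():
        for operator, tag in method_steps:
            steps += 1
            if operator in ("Store", "Recall_LTM"):
                if operator == "Recall_LTM":
                    ltm_transactions += 1
                if tag not in wm:
                    wm.add(tag)
                    total_chunks += 1
                peak_chunks = max(peak_chunks, len(wm))
            elif operator == "Delete":  # assumed operator for releasing a chunk
                wm.discard(tag)
    return {
        "methods": len(methods),
        "steps": steps,
        "total_wm_chunks": total_chunks,
        "peak_wm_chunks": peak_chunks,
        "ltm_transactions": ltm_transactions,
    }


# Hypothetical two-method model, for illustration only.
model = {
    "Select route": [
        ("Look_at", "route_list"),
        ("Recall_LTM", "fuel_minimum"),
        ("Store", "selected_route"),
        ("Delete", "fuel_minimum"),
    ],
    "Confirm route": [
        ("Store", "route_status"),
        ("Confirm", "route_status"),
        ("Press", "execute_key"),
    ],
}
print(complexity_indexes(model))
```

Tracking the set of tags step by step is what separates the total number of chunks from the highest number held at any one time.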

EXPERIMENT AND MODEL SIMULATION

Objective, Flight Scenario, and Response Measures

The objective of the experiment was to compare the effects of various forms of automation for aircraft route replanning in a next generation (NextGen) arrival scenario (a tailored arrival; TA) on pilot workload responses, task success rate, and time to task completion (TTC). (The design, procedures, and results of this experiment are covered in detail in Gil, Kaufmann, Kim, & Kaber [2010], with only essential information provided here as a basis for presentation of the cognitive modeling tool assessment.)

FIGURE 1 A summary of modules in the Enhanced Goals, Operators, Methods, and Selection Rules Language (E-GOMSL) tool. Panels: 1st: CTA; (a) 2nd: Defining visual objects; (b) 3rd: Defining non-visual objects; (c) 4th: AFD flowcharting tool; (d) 4th: E-GOMSL code; 5 & 6th: RUN & COMPILE; (e) 7th: Stochastic simulation; (f) 7th: Simulation dialog for visualization; (g) E-GOMSL model outputs. Note. CTA = cognitive task analysis; AFD = action flow diagram.

A high-end PC-based flight simulator (see Figure 2a), integrating an enhanced flight simulation, was set up to present interfaces and functions of existing and futuristic forms of cockpit automation, including use of a multi-control display unit (CDU; see Figure 2b) as part of a flight management system (FMS) and a CDA tool for flight planning (see Figure 2c). These tools were combined with air traffic control (ATC) datalink communications in the simulation. Flight tasks were simulated using the X-Plane simulator. Pilots flew multiple arrivals under three different modes of automation (MOAs). The TAs required route replanning to avoid convective activity and the route was constrained by a minimum fuel requirement at an initial approach fix. Figure 2 is intended to provide the reader with a clear idea of the simulator setup. This is important to establishing the degree of generalizability of the experiment findings and the model projections.

FIGURE 2 Simulator setup and simulation interfaces: (a) simulator setup (main monitor, second monitor, yoke, throttle, rudder pedal, computers, and CDA tool on a touch screen); (b) CDU; (c) CDA tool. Note. CDU = control display unit; CDA = continuous descent approach.

The three MOAs included pilot use of the conventional CDU for route selection and implementation, use of an enhanced CDU (CDU+), and use of the CDA tool for route planning. The CDU MOA was considered to be the lowest level of automation. Under this mode, the CDA tool presented pilots with all possible route waypoints. They were required to select waypoints to avoid any convective activity and to manually calculate fuel consumption associated with the planned route to ensure sufficient reserve levels. Subsequently, pilots were required to manually program the new route into the CDU of the FMS. The CDU+ MOA was considered to be a medium level of automation. In this mode, the CDA interface presented pilots with a set of four routes to choose from. Fuel consumption levels were provided for each route; however, pilots had to optimize route selection based on convective activity. Pilots were also required to transfer a selected route to the FMS via datalink and confirm the route for implementation. The CDA MOA was considered to be the highest level of automation in the experiment. The CDA tool provided pilots with the best route to avoid convective activity and to meet fuel constraints. This mode provided fully automated reroute capability as well as automatic loading of the route into the CDU/FMS.

Pilot task workload was also manipulated with two levels, including low and high, based on the starting position of the aircraft in the flight simulation relative to the point at which a reroute decision had to be made. The low workload condition allowed more time to complete the task and the high workload condition allowed less.

The success rate in completing the replanning task before the first waypoint on the arrival, as well as the TTC, were measured as performance outcomes. Pilot cardiac response (heart rate; HR) during each test trial, as compared to the minimum HR response across all pilot test trials (i.e., percentage increase in HR during testing), was measured as an indicator of pilot workload (see Scerbo et al., 2001, for support for the use of HR in representing cognitive task workload under limited physical exertion conditions).
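The HR-based workload measure amounts to a percentage increase over a minimum baseline. The short Python sketch below shows the arithmetic only. It assumes the baseline is the lowest trial HR for a given pilot (the text could also be read as a minimum across all pilots), and the heart rate values and trial labels are hypothetical, not experiment data.

```python
def percent_hr_increase(trial_hr, baseline_hr):
    """Percentage increase of a trial heart rate over the baseline heart rate."""
    if baseline_hr <= 0:
        raise ValueError("baseline heart rate must be positive")
    return 100.0 * (trial_hr - baseline_hr) / baseline_hr


# Hypothetical mean heart rates (beats per minute) for one pilot's trials.
trial_hr = {"trial_1": 72.0, "trial_2": 78.0, "trial_3": 81.0}
baseline = min(trial_hr.values())
workload = {name: percent_hr_increase(hr, baseline) for name, hr in trial_hr.items()}
print(workload)
```

Expressing HR relative to a minimum removes differences in resting heart rate between pilots, so the remaining variation can be compared across trials.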

Cognitive Task Analysis and Model Development

The purpose of this step was to identify expert pilot behaviors in flying the simulated TA scenario using the different forms of cockpit automation and to develop inputs for the E-GOMSL cognitive modeling effort. The CTA was expected to identify expert pilot knowledge required for use of the new CDA tool concept of automation. An ex-U.S. Air Force squadron leader with 17 years of C-130H flight experience (including check rides) and current Airline Transport Pilot certification participated in the CTA. Video recordings of the expert pilot simulator runs were reviewed and the GDTA methodology was applied to identify pilot goals, decisions, information requirements, and subtasks to achieving goals in the flight reroute with the CDU, CDU+, and CDA tools. The GDTA revealed sequences of pilot behaviors under each MOA, which were reflected in the E-GOMSL models. Figure 3 shows the steps in the CTA process applied as part of this research and its inputs and outputs with feedback information.

From the CTA, a unique model (i.e., an AFD) of pilot behavior was developed for each MOA and workload condition examined in the experiment. However, experiment results indicated that the workload manipulation did not have a significant effect on pilot TTC or HR for any MOA (Gil et al., 2010). Therefore, each model of pilot behavior with automation could be applied across workload conditions. Furthermore, this study focused on validation of the E-GOMSL modeling approach for forms of cockpit automation currently under design and development. Consequently, we only translated one AFD for pilot use of the CDA tool to an E-GOMSL model. We did not evaluate cognitive models for each of the six different experimental conditions (3 MOAs × 2 workload conditions). Related to this, the manner of use of perceptual, cognitive, and motor processors by an E-GOMSL model is dictated by the underlying cognitive architecture (the HIP model and its parameters). Therefore, any model representing any form of automation would make use of, for example, WM or LTM in the same way, but WM load might vary based on the designs of the automation interfaces and what information pilots must recall from moment to moment with a particular MOA.

Methodology for Comparison of Experiment and Model Results

Twelve pilots with scheduled airline, charter, or corporate experience in aircraft equipped with FMSs participated in the experiment. All had an instrument rating and average flight time was 3,706 hr. Observations were collected on pilot performance in three trials under each MOA. A portion of these data were used to develop the statistical distributions on TTC for various flight control behaviors (operators) and to determine pilot HR (or workload) responses. Specifically, the data for 30% of successful experiment trials were used to generate log-normal distributions on operator time, specified by the threshold (γ), scale (μ), and shape (σ) parameter values. The time distributions for each operator were based on large sample sizes. For example, 288 observations were used for characterizing the time distribution for Look at. We used 244 data points for defining the distribution for the Confirm operator. Comparable sample sizes were used for the Think of, Store, and Recall operators. All samples were based on observations of pilot performance of operators in 27 different videos. The data for the remaining 70% of successful experiment trials were used to validate the cognitive model time predictions. To determine the likelihood of the E-GOMSL model predicting pilot success in the reroute task with the CDA tool, only data from successful experiment trials were used. (We say more about this later.)

FIGURE 3 Cognitive task analysis flow diagram. The diagram shows six steps with their inputs, outputs, and feedback: (1) scenario development, (2) simulation prototyping, (3) verbal protocols, (4) cognitive task lists formulation, (5) action flow diagram development, and (6) E-GOMS models creation. Note. IDEF0 = Integration Definition for Function Modeling.
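
The three-parameter log-normal operator times described in this section can be illustrated with a short sketch. It assumes that an operator time is the threshold γ plus a log-normal variate with parameters μ and σ, and that a method's time is the serial sum of its operator times; the parameter values below are hypothetical because the fitted values are not reported here, and the tool's actual sampling code is not shown in the source.

```python
import random


def sample_operator_time(gamma, mu, sigma, rng):
    """Draw one operator time (sec) from a three-parameter log-normal.

    gamma is the threshold (minimum time), mu the scale, sigma the shape.
    """
    return gamma + rng.lognormvariate(mu, sigma)


def simulate_method_time(operators, rng):
    """Sum one sampled time per operator, assuming serial execution."""
    return sum(sample_operator_time(g, m, s, rng) for g, m, s in operators)


# Hypothetical (gamma, mu, sigma) values chosen only for illustration.
OPERATORS = {
    "Look at": (0.2, -0.5, 0.4),
    "Confirm": (0.3, 0.0, 0.5),
}

rng = random.Random(42)  # fixed seed so the replications are reproducible
sequence = [OPERATORS["Look at"], OPERATORS["Confirm"], OPERATORS["Look at"]]
replications = [simulate_method_time(sequence, rng) for _ in range(27)]
```

Each replication yields a different method time, which is what allows a stochastic model to produce a distribution of task times rather than a single point estimate.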

Two types of comparisons were conducted between the experiment data and E-GOMSL model output: flight TTC and pilot workload responses. The observed and model-predicted task time distributions were compared. The probability of occurrence of success in the reroute task under the CDA MOA across workload conditions, based on model predictions, was also determined and compared with the experimental task success rate for the same conditions. The PERT–3 estimate technique (project evaluation and review technique; Ravindran, Phillips, & Solberg, 1987) was used to determine the likelihood of pilot success in use of the advanced CDA automation in the TA scenario based on the model time outputs. The simulator experiment and model workload results were compared using nonparametric correlation analyses on the HR responses and E-GOMSL model complexity index values (WM item counts). Both response measures were determined for common flight tasks occurring at common times in a scenario.

EXPERIMENT RESULTS

The results of the experiment are summarized here to provide a basis for comparison with the E-GOMSL model outputs. (Again, the complete findings of the simulator study are reported in Gil et al., 2010.) In general, the use of low-level automation, the CDU condition, led to a significantly longer TTC as compared with the CDU+ and CDA modes. Contrary to our expectation, the use of high-level automation, the CDA mode, led to a lower success rate, whereas the use of intermediate-level automation (CDU+) led to the highest replanning success. This was, in part, attributable to the criterion for success established for the CDA mode based on the expert pilot simulation runs, as compared to the criteria for success under the CDU and CDU+ modes. As previously mentioned, there was no significant effect of the flight task workload manipulation (aircraft starting position relative to the reroute decision point) on the pilot workload responses (percentage increase in HR) across all modes of automation.

E-GOMSL Output

The E-GOMSL modeling tool supported stochastic simulation of the cognitive model. Each simulation run generated time estimates (TTC values) for the overall task procedure and methods used to complete flight tasks. In this study, 27 replications of the CDA model were run to match the total number of experiment trials completed by pilots under the CDA MOA in the validation data set. The data from these trials were used as a basis for validation of the model output. Because the E-GOMSL model was created without considering the workload manipulation, there were 14 high-workload trials included in the experiment data set along with 13 low-workload trials. Of the 12 pilots who participated in the experiment, performance data on 3 of them were used as a basis for developing the stochastic variables to represent the E-GOMSL model operators. The responses of the remaining 9 pilots were used to make up the experiment data set for model validation. Because each participant performed three CDA trials, regardless of workload conditions, there were 27 CDA trials. Ensuring that the number of experiment trials and the number of stochastic simulation runs are the same can prevent bias in estimating response measure variance. The time response measures were aggregated across simulation replications for the CDA MOA with calculation of overall procedure time variance and variances for the method times.

Table 2 shows the stochastic simulation results. During the experiment trials, pilots successfully completed the flight reroute planning in 13 of 13 trials under low workload and 9 of 14 trials for the high-workload condition. (Note that the decision time criterion for a successful trial was 92.112 sec for the high-workload condition and 129.112 sec for the low-workload condition, based on the aircraft starting positions and airspeed.) Table 2 reveals that the E-GOMSL model generated very similar predictions, with 8 of 14 successful trials under high workload and 10 of 13 successful trials under the low-workload condition. Therefore, the model was found to have an ∼82% accuracy relative to the actual experiment results.

TABLE 2
Stochastic Simulation Results

         High Workload                    Low Workload
No.    S/F    TTC (sec)          No.    S/F    TTC (sec)
 1      0      94.344            15      1     122.271
 2      0     104.182            16      1     124.208
 3      1      57.443            17      1      66.831
 4      1      80.776            18      1     122.954
 5      1      77.605            19      1      75.408
 6      1      78.187            20      1      87.317
 7      0     107.239            21      1      83.609
 8      1      83.002            22      0     158.749
 9      0     112.217            23      1      81.457
10      0     131.462            24      0     243.267
11      1      87.335            25      0     139.286
12      1      83.999            26      1     104.823
13      0     197.074            27      1     102.962
14      1      75.261

Success rate (E-GOMSL)       57.14% (8/14), mean TTC = 97.86614       76.92% (10/13), mean TTC = 116.3955
Success rate (Experiment)    64.29% (9/14)                            100% (13/13)

Note. S/F = Success/Failure (1 = success, 0 = failure); TTC = time to task completion; E-GOMSL = enhanced Goals, Operators, Methods, and Selection Rules language.
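
The entries in Table 2 can be cross-checked with a short sketch that classifies each simulated TTC against the decision time criterion for its workload condition. The variable and function names are illustrative, and the final ratio is only one assumed way of arriving at the reported ∼82% figure; the source does not state how that accuracy was computed.

```python
from statistics import mean, variance

# Decision time criteria (sec) reported in the text for each workload condition.
CRITERION = {"high": 92.112, "low": 129.112}

# Simulated TTC values (sec) transcribed from Table 2.
TTC = {
    "high": [94.344, 104.182, 57.443, 80.776, 77.605, 78.187, 107.239,
             83.002, 112.217, 131.462, 87.335, 83.999, 197.074, 75.261],
    "low": [122.271, 124.208, 66.831, 122.954, 75.408, 87.317, 83.609,
            158.749, 81.457, 243.267, 139.286, 104.823, 102.962],
}


def successes(ttc_values, criterion):
    """Count replications that finish within the decision time criterion."""
    return sum(1 for t in ttc_values if t <= criterion)


model_success = {w: successes(TTC[w], CRITERION[w]) for w in TTC}
mean_ttc = {w: mean(TTC[w]) for w in TTC}

# Pooled statistics across all 27 replications.
all_ttc = TTC["high"] + TTC["low"]
pooled_mean = mean(all_ttc)
pooled_var = variance(all_ttc)  # sample variance (n - 1 denominator)

# Observed successes in the experiment: 9 of 14 (high) plus 13 of 13 (low).
observed_success = 9 + 13
# Assumption: ratio of model-predicted to observed successes.
ratio = sum(model_success.values()) / observed_success
```

The sketch gives 8 and 10 model successes and reproduces the two mean TTC values in the table. The pooled mean and sample variance across all 27 replications come out at about 106.8 sec and 1674.4 sec², which agree with the E(Te) and V(Te) values used in the PERT–3 analysis. The ratio 18/22 is about 0.82.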

PERT–3 Analysis

PERT–3 was used to estimate the likelihood of pilot success in use of the advanced CDA automation in the TA scenario, based on the E-GOMSL model output. The model-based estimate was compared with the average experimental success rate for real pilots. There were 22 successful reroutes during the experiment under the CDA MOA, regardless of workload conditions. Based on the success trials, the 95th percentile TTC for reroutes (Td) was calculated to be 148 sec. (The 95th percentile of expert human performance is commonly used as a basis for system design criteria.) Based on the E-GOMSL model output, the average expected time for successful reroute completion, E(Te), was calculated at 106.8 sec and the variance, V(Te), was 1674.4 sec². With these parameters, the model-based likelihood of a success trial can be calculated using Equation 2.

$$\Pr(T \le T_d) = \Pr\left(Z \le \frac{T_d - E(T_e)}{\sqrt{V(T_e)}}\right) \tag{2}$$

The potential for pilot success in reroute planning for the TA using the CDA tool was determined to be 84.3%. This was very close to the experiment trial success rate of 81.5% (22 of 27 trials).
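
As a worked check of Equation 2, the following sketch evaluates the standard normal probability using the Td, E(Te), and V(Te) values reported above. The function names are illustrative, and treating Z as a standard normal variate is the usual PERT assumption rather than something the source spells out.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pert_success_probability(t_d, expected_time, variance):
    """Pr(T <= t_d) when T is approximated as normal, as in Equation 2."""
    z = (t_d - expected_time) / math.sqrt(variance)
    return normal_cdf(z)


# Values reported in the text: Td = 148 sec, E(Te) = 106.8 sec, V(Te) = 1674.4 sec^2.
z_value = (148.0 - 106.8) / math.sqrt(1674.4)
p_success = pert_success_probability(148.0, 106.8, 1674.4)
```

With these inputs, z is about 1.007 and the probability is about 0.843, which matches the 84.3% figure.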

TTC Validation

Due to the nature of the TTC data collected during the experiment, a nonparametric correlation analysis (Spearman’s ρ) was conducted to assess the relationship between the observations on actual pilot performance and TTC values predicted by the E-GOMSL model for the CDA trials. Again, there were 27 data points available from the experiment data set and replications of the E-GOMSL simulation for analysis. There was a marginal positive correlation between the model and experiment times (ρ = 0.3489, n = 27, p = .0745). Because ρ² ≈ 0.12, the result indicates that the E-GOMSL model accounted for roughly 12% of the variability in ranked pilot performance times in CDA trials.
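
Spearman’s ρ, as used in this analysis, is the Pearson correlation computed on ranks. The sketch below is a minimal standard-library implementation with average ranks for ties, plus the usual t statistic for testing ρ = 0. The paired times are hypothetical, because the paired experiment and model values are not listed in the article; only the reported ρ and n are taken from the text.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def rho_t_statistic(rho, n):
    """t statistic with n - 2 degrees of freedom for testing rho = 0."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))


# Hypothetical paired times (sec), for illustration only.
observed = [88.0, 95.5, 102.3, 79.1, 120.4, 110.0]
predicted = [83.0, 104.8, 87.3, 75.4, 122.3, 139.3]
rho = spearman_rho(observed, predicted)

# Reported coefficient and sample size from the text.
t_ttc = rho_t_statistic(0.3489, 27)
```

For the reported ρ = 0.3489 and n = 27, the t statistic is about 1.86 on 25 degrees of freedom, which is consistent with a marginal two-sided p value.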

HR Validation

There were two forms of statistical analysis used for validation of E-GOMSL model predictions of pilot workload based on the experiment data, including a two-sample test and a correlation analysis. The average percentage increase in HR from the minimum response across all individual pilot test trials was compared with the number of chunks of information in WM at any specific point in E-GOMSL model execution. Based on HIP theory (e.g., Wickens & Hollands, 1999), attentional resources are required to maintain information in WM. Broadbent (1958) observed that residual attentional resources are a reliable indicator of cognitive load. Therefore, WM content can be directly related to measures of cognitive load. There were 11 specific points (i.e., flight segments) during test trials and E-GOMSL model execution that were used for these analyses. The segments included starting the simulation, receiving new messages from ATC, and clicking buttons on the CDA interface to perform route planning and process clearances. Figure 4 shows the summary of segments and the number of WM chunks counted during model execution for each phase of the flight scenario. There were two peak segments during the CDA trials, including clicking the Replan button on the CDA tool interface and clicking the Accept button. At the time of replanning and acceptance of a new clearance, there were four chunks of information in WM.

Segment            No. of WM chunks
START              1
New Message        1
Click New Mes.     2
CLK Reject         3
CLK Replan         4
CLK REQUEST        3
CLK SEND           3
New Message        3
CLK New Mes.       3
CLK Accept         4
CLK LOAD           3
FMS                3

FIGURE 4 Summary of flight segments and the number of chunks of information in working memory (WM) for each segment. Note. FMS = flight management system.

Due to the nature of the HR response data, a nonparametric Wilcoxon rank-sum test was conducted to compare the average percentage increase in HR recorded for the CDA trials in the experiment at the point at which pilot WM load was at a minimum versus the point at which it was at a maximum (based on the E-GOMSL model WM chunk counts). Because there were two peak points (during replanning and clearance acceptance) for the number of model WM chunks, two tests were conducted. With respect to the replanning segment (M = 0.165, SD = 0.07), there was a highly significant difference in the HR response from the monitoring segment, which posed the minimum WM load as predicted by the model [W(n1 = 12, n2 = 12) = 192, p = .0078]. With respect to the clearance acceptance phase (M = 0.152, SD = 0.03), there was also a highly significant difference in HR from the monitoring segment [W(n1 = 12, n2 = 12) = 190.5, p = .0099]. Figure 5a presents the mean percentage increase in HR (from the minimum pilot response) for the low-load monitoring segment (WM = 1) and the high-load replanning phase (WM = 4). Figure 5b shows the mean percentage increase in HR for the monitoring versus clearance acceptance segments (WM = 4).
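
The rank-sum comparisons above can be sketched as follows. The code computes the W statistic as the sum of the ranks of one sample in the pooled data and then applies a normal approximation without tie or continuity corrections. This approximation is an assumption made for illustration; the source does not state whether an exact or approximate test, or a one- or two-sided alternative, was used.

```python
import math


def rank_sum_statistic(x, y):
    """Wilcoxon rank-sum W: sum of the ranks of x in the pooled sample (ties averaged)."""
    pooled = sorted(x + y)

    def avg_rank(v):
        first = pooled.index(v) + 1  # 1-based rank of the first occurrence
        count = pooled.count(v)
        return first + (count - 1) / 2.0

    return sum(avg_rank(v) for v in x)


def rank_sum_z(w, n1, n2):
    """Normal approximation to W under the null hypothesis (no tie correction)."""
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mean_w) / sd_w


def upper_tail_p(z):
    """One-sided (upper-tail) p value for a standard normal z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Small hypothetical samples to show how W is formed.
w_demo = rank_sum_statistic([3.0, 5.0, 7.0], [1.0, 2.0, 4.0])

# Reported statistics from the text, with n1 = n2 = 12.
z_replan = rank_sum_z(192.0, 12, 12)
p_replan = upper_tail_p(z_replan)
z_accept = rank_sum_z(190.5, 12, 12)
p_accept = upper_tail_p(z_accept)
```

Under these assumptions the one-sided p values are roughly 0.008 and 0.010, close to the reported .0078 and .0099.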

FIGURE 5 Total percentage increases in heart rate from minimum for various flight segments: (a) monitoring vs. replanning; (b) monitoring vs. accepting. Each panel plots %HR against WM. Note. WM = working memory; HR = heart rate.

Due to the nature of the HR response data, a nonparametric correlation analysis (Spearman’s ρ) was computed to assess the relationship between the observed average percentage increase in HR during the experiment trials and the number of WM chunks counted by the E-GOMSL model compiler across phases of flight in pilot use of the CDA automation. There was a significant positive correlation between the two responses with ρ = 0.2055, n = 132, p = .0181. Results indicated a statistically significant, though weak, positive association between the E-GOMSL model predictions and actual pilot workload responses as indicated by the percentage increase in HR.

DISCUSSION ON MODELING TOOL AND VALIDATION

The new E-GOMSL tool supports modeling of human behavior in complex tasks through a simulation process. The process allows for representation of several types of overt human behaviors (eye movements, hand movements, foot control, vocal responses). Such behaviors are coded in E-GOMSL models using operators to direct attention to visual objects in an interface, cause manual response execution (e.g., click mouse, press button, press pedal), and produce speech. The simulation process also supports representation of many internal behaviors, including use of WM and LTM, decision making, and higher level cognition (thinking on alternatives). Activation of all perceptual, motor, and cognitive processing channels can be animated in the tool for a designer to see the particular method a pilot takes in completing a task. This capability supports designer understanding of how the model operates in handling different task scenarios. The animation of cognitive processing channel activation also allows designers to see when E-GOMSL model operations are executed in a serial or parallel manner in pilot completion of tasks.

During visualization of the simulation, designers can observe E-GOMSL predictions of specific user behaviors (in a real or accelerated time frame). Graphical representation of user visual behavior and manual responses (i.e., eye, hand, and foot movements) is supported using icons overlaid on prototype interface images. A designer can connect the use of specific processing channels with pilot actions at various interface displays and controls. These human behavior representation methods provided through the E-GOMSL tool are expected to enhance designer productivity in the conceptual system design phase.

With respect to the experimental validation of the modeling tool, experimental data on pilot success in the use of advanced cockpit automation (the CDA tool) for route replanning were compared with E-GOMSL model predictions. In general, the E-GOMSL model was found to be highly accurate in predicting actual pilot success rates in simulated flight task performance. Table 2 supports this claim. The accuracy was ∼82%. John and Newell (1989) previously recommended 80% prediction accuracy as a criterion for cognitive model validity in human performance applications.

The time for pilots to complete flight tasks was also measured in the experiment as a basis for comparison with model predictions. It was anticipated that TTC predictions based on the E-GOMSL models would have a relationship with the TTC results from the experiment. Unlike previous GOMSL modeling approaches, E-GOMSL integrates stochastic variables for predicting task operator times. This allows for generation of a task time distribution similar to actual human behavior. In general, when testing the CDA MOA, the E-GOMSL model predictions and observations on actual pilot behavior had a marginally significant relationship. Thus, the E-GOMSL model was also considered useful for predicting specific aspects of pilot performance.

Pilot cardiac activity was also measured as an indicator of flight task workload during the experiment test trials. Pilot cognitive workload was predicted with the E-GOMSL models in terms of the number of chunks of information in WM at any given time during a simulation run. According to the cognitive model, the replanning and clearance acceptance segments of the flight task affected pilot cognitive load. In replanning, WM chunk count was high as pilots used the CDA interface and simultaneously monitored cockpit instruments. In accepting a clearance, pilots also maintained a high WM chunk count for recalling the replanned route from LTM and simultaneously monitoring displayed information (e.g., current distance measuring equipment [DME]). Results of the comparison of the E-GOMSL model outputs with the pilot HR responses indicated WM counts from the model could serve as a basis for predicting automation and task-induced cognitive load. In general, when the model-predicted WM count was at a minimum, the HR response for pilots revealed low arousal. When the model-predicted count was at a maximum, the HR response for pilots revealed high arousal. These findings indicate that the E-GOMSL model might explain differences in automation or task-induced cognitive load in terms of WM use.

Finally, it was expected that the pattern of pilot WM usage described by the cognitive model simulation would be related to the pattern of HR responses for pilots during the test trials. The CDA use scenario posed events that required specific actions by pilots. These actions included functions such as systems monitoring, generating decision alternatives, and selecting and implementing options (Endsley & Kaber, 1999). The functions were performed by pilots according to the mode of aircraft automation. Thus, the model-predicted WM chunk counts revealed the effect of the mode of cockpit automation on pilot cognitive workload. The results on the WM chunk counts revealed a pattern of fluctuation similar to the observed pattern of pilot HR responses. The E-GOMSL model was found to have utility for predicting the effect of the mode of cockpit automation, as experimentally measured, in terms of pilot cardiac responses. Based on all these analyses, it appeared that the E-GOMSL tool has utility for assessing the potential of advanced automation prototypes to support pilot performance and to facilitate effective conceptual design of automation and interfaces.

CONCLUSION

The main objective of this research was to develop an enhanced version of an existing computational cognitive modeling language to address limitations of earlier techniques and to support application to a specific cognitive work domain. A second objective was to develop a software tool to provide easy access to the modeling approach for complex system designers to evaluate automation and system interface prototypes from a human factors perspective during the conceptual design phase. Finally, the work sought to validate the outcomes of the modeling tool in terms of predicting system user behavior based on comparisons with actual performance and workload in an experimental evaluation.

E-GOMSL supports designers in representing a range of interactive system user behaviors, and specifically pilot cockpit behaviors, in serial or parallel performance. It also promotes accuracy in terms of operator and task time predictions. Whereas existing modeling approaches use deterministic operator time estimates, E-GOMSL incorporates stochastic operator times. Consequently, when using the E-GOMSL tool, designers can obtain performance estimates that would be similar to estimates based on using participants with different demographics (e.g., age), skills, and anthropometries in an experimental investigation.

The new E-GOMSL tool supports system designers in several ways. First, existing HITL approaches require empirical data as a basis for design alternative selection and are time consuming and costly. The E-GOMSL-based approach uses a cognitive simulation and models interactive operator behavior with an interface prototype. General estimates of pilot performance and cognitive workload can be obtained from the simulation or analyzed along with data from human experimentation. Thus, the use of the E-GOMSL tool can substantially reduce the time needed to test design concepts and to select among design alternatives. Second, the E-GOMSL tool provides designers with the capability to see and understand the manner in which operators can use new design concepts, thereby providing a rationale for future design iterations and reducing the design cycle time.

Caveats and Future Research

The stochastic variables used for generating E-GOMSL operator times were determined based on analysis of experiment videos for a subsample of test participants. Videos were recorded of participants’ use of the PC-based X-Plane simulator. Therefore, the stochastic time variables reflect the manner of user performance in the PC-based simulator and are limited in applicability to real flight operations. Using a higher fidelity flight simulator with actual automated aircraft cockpit controls for stochastic variable determination would increase the accuracy of model predictions of pilot time to flight task completion and cognitive load relative to the real cockpit.

As another caveat, the E-GOMSL tool is not capable of considering aircraft dynamics in modeling pilot behavior. This reduces the accuracy of task completion time estimates and decreases the generalizability of the model output to actual flight situations. This limitation could be overcome by integrating the tool with a flight simulation application, such as the X-Plane flight simulator, which is capable of providing realistic aircraft flight behavior information (Laminar Research, 2010).

Beyond this, there is a need to model pilot behavior with the range of cockpit MOAs, including those examined in our experiment. This study was a preliminary step to demonstrate the validity of the E-GOMSL modeling technique for making predictions of pilot behavior, in terms of several response measures, with the CDA tool for route replanning. A sensitivity analysis should be conducted on the new modeling approach in which differences are identified in model predictions of pilot performance under the various MOAs and differentiable workload conditions. This would further support the validity of the method.

Another limitation of the present form of the E-GOMSL modeling tool is a lack of support for modeling and simulation of team cognitive task performance (e.g., multiple pilot interaction in the aircraft cockpit for flight task performance). For example, the B-767 aircraft requires two pilots for operation, a pilot flying (PF) and a pilot not flying (PNF). Each performs different tasks in different phases of flight, largely based on the pilot’s orders. For such a scenario, additional E-GOMSL operators would need to be developed to represent pilot-to-pilot communication, task divisions, and hand-offs. Pilot collaboration and communication on tasks would have a major impact on model-based assessment of pilot cognitive workload. Related to this, the results of the E-GOMSL analysis of single pilot performance of the arrival scenario using the CDA tool presented in this study likely represent higher cognitive load than would be the case if flight tasks were distributed across multiple simulated pilots. To address this limitation, the E-GOMSL tool would need to include specific operators for representing delays in PNF processes, initiation of communications between the PF and PNF, confirmations of messages between the pilots, and so on. Extending the E-GOMSL tool in this way could support explanation of how pilots can share tasks and reduce cognitive load in specific operations. Moreover, the results of a team cognitive performance simulation could be used as a basis for pilot training on how to reduce cognitive load in terms of task sharing.

ACKNOWLEDGMENTS

This research was supported by NASA Ames Research Center under Grant No. NNH06ZNH001. Mike Feary was the technical monitor. A team from APTIMA Corporation, led by Paul Picciano, programmed the CDA tool prototype used in this experiment. The opinions and conclusions expressed in this article are those of the authors and do not necessarily reflect the views of NASA.

REFERENCES

Anderson, J., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of mind. Psychological Review, 111, 1036–1060.
Bloechle, W. K., & Schunk, D. (2003). Micro saint sharp simulation software: Micro saint sharp simulation software. In Proceedings of the 35th Conference on Winter Simulation: Driving innovation (pp. 182–187). New York, NY: WSC.
Boeing. (2009). Statistical summary of commercial jet airplane accidents (Tech. Rep.). Seattle, WA: Author.
Broadbent, D. E. (1958). Perception and communication. Oxford, UK: Pergamon.
Card, S., Moran, T., & Newell, A. (1983). The psychology of human–computer interaction. Hillsdale, NJ: Erlbaum.
Cohen, A. C. (1988). Three-parameter estimation (pp. 113–138). New York, NY: Marcel Dekker.
Emmerson, P. (2000). iGEN (software review). Ergonomics in Design, 8(3), 29–31.
Endsley, M. R. (1993). A survey of situation awareness requirements in air-to-air combat fighters. The International Journal of Aviation Psychology, 3(2), 157–168.
Endsley, M. R., & Kaber, D. B. (1999). Level of automation effects on performance, situation awareness and workload in a dynamic control task. Ergonomics, 42, 462–492.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381–391.

Foyle, D. C., & Hooey, B. L. (2008). Using human performance modeling in aviation. In D. C. Foyle &B. L. Hooey (Eds.), Human performance modeling in aviation (pp. 15–27). Boca Raton, FL: CRC.

Gil, G. H., Kaufmann, K., Kim, S. H., & Kaber, D. B. (2010). Effects of modes of cockpit automationon pilot performance and workload in a next generation flight concept of operation. In D. B. Kaber& G. Boy (Eds.), Proceedings of the 3rd International Conference on Applied Human Factors andErgonomics [CD-ROM]. Boca Raton, FL: CRC.

Gore, B., Hooey, B., Foyle, D., & Scott-Nash, S. (2008). Meeting the challenge of cognitivehuman performance model interpretability though transparency: Midas v5.x. In V. G. Duffy (Ed.),Proceedings of the 2nd International Conference on Applied Human Factors and Ergonomics[CD-ROM]. Boca Raton, FL: CRC.

John, B. E., & Newell, A. (1989). Cumulating the science of HCI: From S-R compatibility to tran-scription typing. In K. Bice & C. Lewis (Eds.), Proceedings of CHI 1989 (pp. 109–114). New York,NY: ACM.

John, B. E., Prevas, K., Salvucci, D. D., & Koedinger, K. (2004). Predictive human performancemodeling made easy. In E. Dykstrra-Erikson & M. Tschellgi (Eds.), Proceedings of the SIGCHIconference on Human Factors in Computing Systems (pp. 455–462). New York, NY: ACM.

Kieras, D. E. (1997). A guide to GOMS model usability evaluation using NGOMSL (2nd ed.). InM. G. Helander, T. K. Landauer, & P. V. Prabhu (Eds.), Handbook of human–computer interaction(pp. 733–766). Amsterdam, The Netherlands: Elsevier North-Holland.

Kieras, D. E. (2006). A guide to GOMS model usability evaluation using GOMSL and GLEAN4 (Tech.Rep.). Ann Arbor, MI: University of Michigan.

Kieras, D. E., & Meyer, D. E. (1997). An overview of the EPIC architecture for cognition andperformance with application to human–computer interaction. Human–Computer Interaction, 12,391–438.

Kieras, D. E., & Polson, P. G. (1999). An approach to the formal analysis of user complexity.International Journal of Human–Computer Studies, 51, 405–434.

Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). SOAR: An architecture for general intelligence.Artificial Intelligence, 33(1), 1–64.

Laminar Research. (2010). X-Plane 10 [Web page]. Retrieved from http://www.x-plane.com

Limpert, E., Stahel, W. A., & Abbt, M. (2001). Log-normal distributions across the sciences: Keys and clues. BioScience, 51, 341–352.

Liu, Y., Feyen, R., & Tsimhoni, O. (2006). Queuing network-model human processor (QN-MHP): A computational architecture for multitask performance in human–machine systems. ACM Transactions on Computer–Human Interaction (TOCHI), 13(1), 37–70.

Meyer, D., & Kieras, D. (1997). A computational theory of executive cognitive processes and multiple-task performance: I. Basic mechanisms. Psychological Review, 104(1), 3–65.

Olson, J. R., & Olson, G. M. (1990). The growth of cognitive modeling in human–computer interaction since GOMS. Human–Computer Interaction, 5, 221–265.

Ravindran, A., Phillips, D. T., & Solberg, J. J. (1987). Operations research: Principles and practice. New York, NY: Wiley.

Reason, J. (2000). Human error: Models and management. British Medical Journal, 320(7237), 768–770.

Sanders, M. S., & McCormick, E. J. (1993). Human factors research methodologies (pp. 23–43). New York, NY: McGraw-Hill.

Scerbo, M. W., Freeman, F. G., Mikulka, P. J., Parasuraman, R., Di Nocera, F., & Prinzel, L. J. (2001). The efficacy of psychophysiological measures for implementing adaptive technology (Tech. Rep. No. 4 NASA/TP-2001-211018). Washington, DC: NASA.

Schunk, D. (2000). Micro Saint: Modeling with the Micro Saint simulation package. In J. A. Joines, R. R. Barton, K. Kang, & P. A. Fishwick (Eds.), Proceedings of the 32nd WSC conference on Winter Simulation (pp. 274–279). San Diego, CA: WSC.

Teo, L., & John, B. E. (2008). CogTool-explorer: Towards a tool for predicting user interaction. In Proceedings of 2008 CHI extended abstracts on Human factors in computing systems (pp. 2793–2798). New York, NY: ACM.

Usher, J. M., & Kaber, D. B. (2000). Establishing information requirements for supervisory controllers in a flexible manufacturing system using goal-directed task analysis. Human Factors & Ergonomics in Manufacturing, 10, 431–452.

Wickens, C. D., & Hollands, J. G. (1999). Engineering psychology and human performance. Upper Saddle River, NJ: Prentice Hall.

Wiener, E., & Curry, R. (1980). Flight-deck automation: Promises and problems. Ergonomics, 23(10), 995–1011.

Wu, C., & Liu, Y. (2007). Usability makeover of a cognitive modeling tool. Ergonomics in Design: The Quarterly of Human Factors Applications, 15(2), 8–14.

Manuscript first received: October 2011
