An Investigation of the Use of Real-time Image Mosaicing for Facilitating

Global Spatial Awareness in Visual Search

by

Anthony Soung Yee

A thesis submitted in conformity with the requirements

For the degree of Doctor of Philosophy

Graduate Department of Mechanical and Industrial Engineering

University of Toronto

© Copyright by Anthony Soung Yee 2013

Abstract

An Investigation of the Use of Real-time Image Mosaicing for Facilitating Global

Spatial Awareness in Visual Search

Anthony Soung Yee

Doctor of Philosophy

Graduate Department of Mechanical and Industrial Engineering

University of Toronto

2013

Three experiments have been completed to investigate whether and how a software technique

called real-time image mosaicing applied to a restricted field of view (FOV) might influence

target detection and path integration performance in simulated aerial search scenarios,

representing local and global spatial awareness tasks respectively. The mosaiced FOV

(mFOV) was compared to a single-size FOV (sFOV) and to one of double that size (dFOV).

In addition to advancing our understanding of visual information in mosaicing, the present

study examines the advantages and limitations of a number of metrics used to evaluate

performance in path integration tasks, with particular attention paid to measuring

performance in identifying complex routes.

The highlights of the results are summarized as follows, according to Experiments 1 through

3 respectively.

1. A novel response method for evaluating route identification performance was developed.

The surmised benefits of the mFOV relative to the sFOV and dFOV were not observed: no significant

differences in performance were found for the relatively simple route shapes tested. Compared to the

mFOV and dFOV conditions, target detection performance in the local task was found to

be superior in the sFOV condition.

2. In order to appropriately quantify the observed differences in complex route selections

made by the participants, a novel analysis method was developed using the Thurstonian

Paired Comparisons Method.

3. To investigate the effect of display size and elevation angle (EA) in a complex route

environment, a 2x3 experiment was conducted for the two spatial tasks, at a height

selected from Experiment 2. Although no significant differences were found in the target

detection task, contrasts in the Paired Comparisons Method results revealed that route

identification performance was as hypothesised: mFOV > dFOV > sFOV for EA = 90°.

Results were similar for EA = 45°, but with mFOV being no different than dFOV. As

hypothesised, EA was found to have an effect on route selection performance, with a top

down view performing better than an angled view for the mFOV and sFOV conditions.

Acknowledgments

I would like to thank my supervisor Paul Milgram for his countless efforts in helping to make

this dissertation a reality. I can still recall the first day that I met him in Toronto; I got lost

due to the construction around what was then Taddle Creek Rd., ending up somewhere in the

Mining building. Paul had to pick me up and guide me to his office (which I’ll also never

forget), and he has been there every step of the way. I am indebted to him for his support in

research, teaching et pour nos discussions en français. Merci.

I would like to acknowledge my thesis committee members Birsen Donmez and Justin

Hollands, as well as my external members Mark Chignell and Colin Ware. Their comments

have undoubtedly served to strengthen my work, as well as my belief in it.

I would like to thank my extended family of academic brothers and sisters in the ETC Lab,

whose influences can be found everywhere in this work. In particular, the support of my

friends Winnie Chen and Bardia Bina cannot be overstated; we have stuck together through

thick and thin, and I am honoured to be graduating alongside them.

I would like to thank the lovely Audrey Kuo for her friendship, dedication and patience over

the years. Her adventurous spirit has made me realise what important things are waiting for

me after a hard day’s work, both at my doorstep and out in the world.

Finally, I would like to thank my brother and parents. Perhaps Lawrence can claim to have

first nurtured the curiosities of a budding researcher, by patiently answering his little

brother’s parade of “Why? Why? Why?”. In any case, I am grateful for his support to this

day. I am indebted to Mom & Dad for their unwavering support in whatever I choose to do.

To that point, I look forward someday to actually explaining to them what this thesis is

about!

Table of Contents

Abstract ii

Acknowledgments iv

Table of Contents v

List of Tables x

List of Figures xi

List of Abbreviations xv

Chapter 1. Introduction 1

1.1 Background and motivation 1

1.2 Image mosaicing/image stitching 3

1.3 Spatial awareness tasks in aerial search 4

1.4 Objectives 5

Chapter 2. Literature review and concepts 7

2.1 Introduction 7

2.2 Cognitive maps 7

2.3 Evaluation of cognitive maps and global awareness 9

2.4 Image mosaicing 12

2.4.1 Basic principle of mosaic construction 12

2.4.2 Off-line applications 12

2.4.3 Potential applications of real-time image mosaicing 13

2.5 Studies evaluating human spatial performance in real-time mosaicing 15

2.5.1 Aerial search – Morse et al. (2008) 15

2.5.2 Desktop augmented reality – Jeon and Kim (2008) 16

2.6 Global and local spatial awareness in (teleoperated) aerial search 18

2.6.1 Global awareness in aerial search 18

2.6.2 Local spatial awareness in aerial search 19

2.7 Relevant parameters for the present study 20

2.7.1 Viewing perspective/Elevation angle 20

2.7.2 Display size (and resolution) 22

2.7.3 Speed of traversal and height above terrain 23

2.8 Summary 24

Chapter 3. Experiment 1 26

3.1 Introduction 26

3.2 Experimental tasks 26

3.2.1 Target detection 26

3.2.2 Route identification 28

3.3 Response method 30

3.4 Platform 31

3.5 Procedure 33

3.6 Experimental parameters 35

3.7 Experimental Hypotheses 37

3.8 Results 37

3.8.1 Target detection 37

3.8.2 Route identification 40

3.9 Discussion 42

3.9.1 Route identification results 43

3.9.2 Target detection results 44

3.9.3 Comparison of results to literature 44

3.9.4 Synthesis 45


4.1 Introduction 47

4.2 Experimental task 49

4.3 Results of Analysis 53

4.3.1 Challenge of Defining Objective Scoring Method 54

4.3.2 Paired Comparisons Method 59

4.3.3 Application of Paired Comparisons 62

4.3.4 Outlier analysis 65

4.3.5 Statistical tests and checking assumptions 68

4.4 Discussion 72


5.1 Introduction 74

5.2 Experimental procedure 77

5.3 Results 82

5.3.1 Target detection task 82

5.3.2 Route identification task 83

5.3.3 Participant subjective ratings of six viewing conditions 87

5.4 Discussion 89

5.4.1 Target Detection Results 89

5.4.2 Route Identification Results 91

5.4.3 Participants’ Subjective Rating Results 94

5.4.4 Summary 95

Chapter 6. Conclusions 97

6.1 Summary of experimental results 98

6.1.1 Experiment 1 98



6.1.4 Synthesis 103

6.2 Limitations 103

6.3 Contributions 105

6.4 Suggestions for future work 106

References 108

Appendix 1. Statistical Outputs (Descriptive measures and ANOVA results) 115

A1.1 Experiment 1 results 115

A1.2 Experiment 3 results 117

Appendix 2. Parameters for the long river for Experiment 2 120

Appendix 3. Aggregated Route Selections by Participants 121

A3.1 Experiment 2 Routes 121

Appendix 4. Instructions for the set of paired comparisons 123

A4.1 Copy of Experiment 2 Paired comparisons instructions and form 123

A4.2 Copy of Experiment 3 Paired comparisons instructions and interface 125

Appendix 5. Calculations for the linear scales using the Paired comparisons method 129

A5.1 Experiment 2: Paired comparisons without the Route 5 comparisons 129

A5.2 Experiment 3: Judge Paired comparisons 130

A5.3 Experiment 3: Participant Paired comparisons 132

Appendix 6. Statistical tests for assumptions of Thurstone’s Case V method 135

A6.1 Experiment 2: Paired comparisons without the Route 5 comparisons 135

A6.2 Experiment 3: Participant Paired comparisons 139

A6.3 Experiment 3: Judge Paired comparisons 141

Appendix 7. Estimates of the Discriminal Dispersions for Paired Comparisons Method 144

Appendix 8. Contrasts for Paired comparisons 147

A8.1 Experiment 2: Height Paired comparison contrasts 147

A8.2 Experiment 3: Judge Paired comparison contrasts 148

A8.3 Experiment 3: Participant Paired comparison contrasts 149

Appendix 9. Calculation of Number of Mosaiced Frames for Equivalent Size to dFOV

condition 151

Appendix 10. Additional approaches and pilot tests 152

A10.1 Experiment 1 152



List of Tables

Table 4.1 - Aggregated confusion matrix of paired comparison judgements for performance

at four Heights: H1, H2, H3, H4. Table should be interpreted as preferences of the column

element over row element. 63

Table 4.2 - Aggregated scores converted to proportions of the total number of judgments

over all judges (in this case 126). 64

Table 4.3 - Proportion scores in the confusion matrix converted to Z scale units. The values

are then summed along the columns to compute the mean Z values. Finally, the values are

shifted by the minimum value to anchor the values to 0. 64

Table 4.4 - Aggregated confusion matrix, with column totals, ai. 71

Table 4.5 - Results of pairwise contrasts between levels of Height in Experiment 2, following

the Scheffé method outlined in Starks and David (1961). The value in each cell represents a

Q2 test statistic for the column element being preferred over the row element. Critical values

at α = 0.05 and 0.01 are indicated by * and ** respectively. 72

Table 5.1 – The six combinations of display condition and camera elevation angle used in

Experiment 3. 76

Table 5.2 – Sets of pairwise contrasts for the judges in Experiment 3, following the Scheffé

method outlined in Starks and David (1961). Each pair of contrasts is indicated by an X in a

particular row. The value in the last column represents a Q2 test statistic for the column

element being preferred over the row element. Critical values at α = 0.05 and 0.01 are

indicated by * and ** respectively. 85

Table 5.3 - Set of pairwise contrasts for the participant ratings in Experiment 3, following the

Scheffé method outlined in Starks and David (1961). The value in the last column represents

a Q2 test statistic for the column element being preferred over the row element. Critical

values at α = 0.05 and 0.01 are indicated by * and ** respectively. 88

List of Figures

Figure 1.1 - Example of wide (left) and narrow (right) fields of view, taken from Google

Earth 1

Figure 1.2 - Example of an image mosaic, generated by aligning and blending a set of images

with overlapping content. 3

Figure 1.3 - Example of an image mosaic, generated in real-time from a set of video images

as the camera pans from left to right. The white border represents the most recent image

frame in the video. 4

Figure 2.1 - Three displays providing a perspective viewpoint, (a) nominal size FOV, (b)

mosaic FOV, (c) enlarged FOV 21

Figure 3.1 - Example of target used in Experiment 1: (a) target magnified to show textures,

(b) target within flyover terrain. The red dot in (b) represents shadow of the aircraft directly

beneath. In this screenshot, the target is found to the right of the red dot. Forward flyover

motion was along the blue river, from bottom to top, resulting in overall motion of the image

from top to bottom, as indicated by the arrow. 28

Figure 3.2 - 10x10 response grid for novel response method in which participants selected

the route they flew over. 30

Figure 3.3 – Illustrations of (a) route elements, including the curved and straight portions, (b)

Grid layout from which the participants selected the route they flew over. The values for the

length ratio and curvature radii are included here for illustrative purposes; participants only

saw the grid of routes. 32

Figure 3.4 - Illustration of neutral zones and target zones in each flyover. Target zones did or

did not contain a target, while neutral zones did not contain targets. Note that the red box

representing the FOV of the sFOV (travelling from left to right) covers half the length of an

event. 33

Figure 3.5 - Screenshots of the three display conditions for Experiment 1, (a) single size:

sFOV (b) mosaic: mFOV, (c) double size: dFOV 36

Figure 3.6 - Proportion of targets detected in Experiment 1 for each display condition, for all

participants 39

Figure 3.7 - Proportion of targets detected in Experiment 1 for each display condition, for

each participant 39

Figure 3.8 - Example of Route Selected by a participant, and the Correct Route. The

measures of Euclidean and City block distance are also shown. 40

Figure 3.9 - Graph showing the Euclidean distance error over all participants for Route

identification in Exp. 1 41

Figure 3.10 - Graph showing the Euclidean distance error, for each participant for Route

identification in Exp. 1 42

Figure 4.1 – (a) Winding river landscape; (b) analogous computer generated ‘river’,

consisting of sum of four sinusoids. 48

Figure 4.2 - Screenshots of one terrain segment, with constant 60° FOV, displayed at four

heights, (a) H1 = 20m, (b) H2 = 56m, (c) H3 = 92m, (d) H4 = 164m 49

Figure 4.3 - Display of the six Routes selected for Experiment 2, chosen from the long

continuous river (Left). Routes on right show start of each Route with a green marker and

end of each Route with a red marker. 50

Figure 4.4 - Screenshot of Route identification Window. Left: top-down view of entire river.

Centre: response buttons, for controlling response; Right: instantaneous indication of selected

route. Green and red markers indicate respective start and end points of currently selected

route. 52

Figure 4.5 - Examples of ensembles of selected Routes collapsed over all participants at (a)

Height H2, (b) Height H3. Each plot contains 14 Routes in black ink (two for each of the

seven participants), as well as one Route in dashed red ink representing the correct Route.

The routes are translated so that their starting points coincide, while maintaining the original

North up representation (as seen in the Route identification window). 53

Figure 4.6 – An illustration of a Correct Route (in red) and a selected route (in black). The

resulting RMSE score between these two routes would be large, despite the fact that the

shapes are quite similar. 55

Figure 4.7 - An illustration of a Correct Route (in red) and a selected route (in black), where

small deviations in the route occur between the two routes. The resulting RMSE score

between these two routes would be large, despite the fact that the overall shapes are quite

similar. 56

Figure 4.8 - An illustration of a Correct Route (in red) and a selected route (in black), that

exhibit similar shapes but that are mirrored with respect to each other. 57

Figure 4.9 – Examples of four types of errors observed in route selections (black dotted line)

compared to Correct route (red line), (a) Translation error, (b) Phase shift error, (c) Partial

matching error, (d) Mirroring error. 58

Figure 4.10 - Illustration of four types of errors observed in route selections (black dotted

line) compared to Correct route (red line), shown with starting points matching for (a)

Translation error, (b) Phase shift error, (c) Partial matching error, (d) Mirroring error. 59

Figure 4.11 - Final PCM scale values for route identification task for four Heights, from

Table 4.3. 65

Figure 4.12 - Aggregated route selections for Route 2 and Route 5, for each of the four

Heights H1 to H4. Each plot contains 14 Routes in black ink (two for each of the seven

participants), as well as one Route in dashed red ink representing the correct Route. The

routes are translated so that their starting points coincide, while maintaining the original

North up representation (as seen in the Route identification window). 66

Figure 4.13 – Graphs for PCM results for each of the six Routes, aggregated over all

participants, for Routes 1 to 6. 67

Figure 4.14 – Final PCM scale values: (a) using all comparisons; (b) using all comparisons

except those from Route 5. 68

Figure 4.15 - Three computed scale values for Experiment 2 results, (a) all data, Case V

method (b) all data excluding Route 5, Case V method, (c) all data excluding Route 5, Case

III method. 69

Figure 4.16 - Experiment 2 PCM values for all data excluding Route 5, Case III method. The

actual Heights in metres are shown. 70

Figure 4.17 - Plot of the PCM scale values, including contrasts results. Each line indicates

that a significant contrast was found between the conditions at the endpoints of that line. 72

Figure 5.1 - Illustration of the angled (45°) and top down (90°) viewpoints used in

Experiment 3. 74

Figure 5.2 - Display of the six Routes selected for Experiment 3, chosen from the long

continuous river (Left). Routes on right show start of each Route with a green marker and

end of each Route with a red marker. 78

Figure 5.3 - Example of target used in Experiment 3: (a) target magnified to show textures,

(b) target within flyover terrain. In this screenshot, the target is located on the bottom right of

the FOV. 78

Figure 5.4 - Screenshot of the ‘Route flyover’ window for the dFOV condition. Participants

were asked to press the ‘Target detected’ button beneath the image when a target appeared

within the area designated by the red markers. Note: a target is currently showing in the

screenshot, half-covered at the top of the FOV. 79

Figure 5.5 – Screenshot of the ‘Participant Paired Comparison’ Window, presented to the

participants after completing all experimental trials. 81

Figure 5.6 - Graph of target detection performance for the six experimental conditions:

{45°,90°}x{sFOV, mFOV, dFOV}. 82

Figure 5.7 - Plots of PCM results for closeness of aggregated route selections to Correct route

performance: (a) across the three FOV conditions for each Elevation angle, (b) across the two

Elevation angles for each Display size. 83

Figure 5.8 - Two-dimensional plots for data from Experiment 3 PCM route identification

performance, with pairwise contrast results, (a) across the three FOV conditions for each

Elevation angle, (b) across the two Elevation angles for each Display size. 86

Figure 5.9 - Plots of PCM results generated by participants, for the question “which of the

two viewing conditions allowed you to more accurately identify the shape of the Route?”, (a)

across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles

for each Display size. 87

Figure 5.10 - Two-dimensional plot for participant ratings of Display conditions, with

pairwise contrast results, (a) across the three FOV conditions for each Elevation angle, (b)

across the two Elevation angles for each Display size. 89

Figure 5.11 – A route used in Experiment 3 shown at a height of 457.2 m above the terrain.

The entire route is shown in the single FOV (sFOV). 90

Figure 5.12 - Graph highlighting (with a dotted circle) the three scale values that were not

found to be significantly different from each other in pairwise contrasts. 94

List of Abbreviations

ANOVA Analysis of variance

AR Augmented reality

dFOV Double size field of view

DRDC Defence Research and Development Canada

EA Elevation angle

FLIR Forward Looking Infrared

FOV Field of view

mFOV Mosaic field of view

PCM Paired comparisons method

RMSE Root mean square error

SAR Search and rescue

SAR tech Search and rescue technician

sFOV Single size field of view

SDT Signal detection theory

UAV Unmanned aerial vehicle

Chapter 1. Introduction

1.1 Background and motivation

The tradeoff between local detail and global context in visual search tasks is commonplace in

daily life. Reading the words on this page requires that the reader ‘zoom in’ on this

particular line on the page, at the expense of the global context of the overall layout of the

page’s paragraphs. In the analogous context of camera viewing for tasks such as

surveillance, reconnaissance, command and control, quality control, etc., these two concepts

are often associated with the size of the field of view (FOV) provided. Extracting local detail

is often performed with a ‘zoomed in’ or high magnification narrow FOV, whereas the

global context of the surroundings is communicated by a relatively wide FOV. This is shown

in Figure 1.1, where the wide FOV (left side) allows the observer to extract the shape of the

canyon, while the narrow FOV (right side) affords local details of a particular area of the

canyon. Unfortunately, a narrow FOV can restrict the assimilation of more global features,

contributing to an operator becoming ‘lost’ with respect to global surroundings (Wickens &

Hollands, 1999).

Figure 1.1 - Example of wide (left) and narrow (right) fields of view, taken from Google Earth

More formally, these acts typically involve some combination of visual search and

wayfinding, both of which require attentional resources to be carried out. Visual search

involves scanning and extracting visual information from scenes to locate a particular object

or feature of interest (Wickens & Hollands, 1999), such as when looking for a particular

article of clothing on a rack, or searching for a set of keys on a cluttered desk. This may

be categorised as a local spatial awareness task, since necessary information can be gleaned

from within the current field of view (FOV). Wayfinding, the purposeful act of orienting

oneself in physical space while navigating between points of interest, is performed, for

example, when moving between landmarks in a new city or returning home from the

grocery store. In this case, on the basis of aggregated views of an environment, one can

develop a global understanding of the surrounding context.

Consider now the interplay between tasks that involve the assimilation of local detail and

those that rely on global understanding of the surrounding environment, such as is found in

visual search and wayfinding tasks. Woods (1984) coined the term ‘keyhole effect’, as an

analogue to the limited visual field experienced while peering through the keyhole in a door.

In a keyhole effect, only local information can be viewed at any one time, leaving the

observer to spatially integrate successive views through the limited FOV. The real-world

manifestation of this effect is often encountered with mediated viewing through remote

cameras, or new imaging technologies that offer high resolution or highly magnified views of

the world, often at the cost of being able to easily understand the surrounding context of the

environment being observed or scanned. Examples can be found in aerial search tasks such

as those in search and rescue (SAR) or unmanned aerial vehicle (UAV) surveillance

operations.

Whenever a task necessitates both local spatial task performance and knowledge of the

global environment simultaneously, the operator may be forced to trade off performance in

one task to maintain performance in the other task. This has been known to result in disorientation

and in poor performance in both local and global tasks. An important underlying

question therefore relates to understanding the factors that might affect the available visual

information that allows operators to perform these tasks. Consequently, with the motivation

of compensating for cognitive deficiencies in mission critical tasks that require both local and

global spatial awareness, the present study investigates whether a real-time image processing

technique called ‘image mosaicing’ (explained below) might be able to enhance spatial task

performance in (simulated) aerial search tasks.

Paramount to making meaningful claims regarding the enhancing of task performance is a

critical examination of how performance is in fact evaluated in such spatial awareness tasks.

In particular, the present study also investigates methods for evaluating wayfinding

performance over complex winding routes for which objective computational methods are

not appropriate.

1.2 Image mosaicing/image stitching

An image mosaic is a spatially continuous image created by combining a set of smaller size

images, each containing some overlapping spatial content. An example of an image mosaic is

shown in Figure 1.2, generated from a series of images using a computer software algorithm.

In their most basic form, image mosaicing techniques are based on the ability to align

different images (or tiles) from a scene into a spatially continuous set and to blend them

together (Szeliski, 1996). Research in mosaicing algorithms generally aims to develop

mosaics with as few alignment or blending errors as possible, to create seamless mosaic

images (Irani et al., 1996).
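
As an illustration of the alignment step only, the following minimal sketch estimates the translation between two overlapping images using phase correlation. It is not the algorithm used in this thesis or in the works cited above; phase correlation is simply one well-known registration technique. The function name `estimate_shift`, the restriction to integer circular shifts, and the synthetic test image are all assumptions made for illustration.

```python
import numpy as np

def estimate_shift(reference, moved):
    """Estimate the integer (row, col) shift that maps `reference` onto `moved`.

    Assumes `moved` is a circularly shifted copy of `reference` (hypothetical
    simplification; real camera frames only partially overlap).
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    # Normalised cross-power spectrum: keeps only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12
    # The inverse transform peaks at the displacement between the two images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peak positions past the midpoint correspond to negative shifts.
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )

# Usage: shift a synthetic image by a known amount and recover that shift.
rng = np.random.default_rng(0)
scene = rng.random((64, 64))
moved = np.roll(scene, shift=(5, -9), axis=(0, 1))
recovered = estimate_shift(scene, moved)
```

In practice, overlapping camera frames are not circularly shifted copies of one another, so a practical implementation would also need windowing, sub-pixel refinement, and handling of rotation and perspective, all of which are omitted here.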

Figure 1.2 - Example of an image mosaic, generated by aligning and blending a set of images with overlapping

content.

Advances in computational power and the decreasing cost of graphics hardware have

afforded many new possibilities in software implementation. For example, it is possible to

create image mosaics of impressive size and detail, on the order of Gigapixels per mosaic

using off the shelf hardware. Many modern cell phones in fact contain software applications

for generating image mosaics using the phone’s CPU and graphics chip. Of particular interest

in the present context is the existence of real-time image processing, which allows an image

mosaic to be generated from a video source as the images are captured. Whereas the image

in Figure 1.2 was created offline some time after the image tiles were captured, the

composite image mosaic in Figure 1.3 was generated in real-time from a video source. As the

camera is panned from left to right in the scene, the composited image is updated by aligning

and blending new image frames to the mosaic. The white border represents the most recent

frame in the image mosaic.
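
The incremental update described here can be sketched as follows, under strong simplifying assumptions: each frame's position on the mosaic canvas is taken as already known (in a real system it would come from an alignment step), frames are greyscale arrays, and blending is reduced to a plain average over overlapping pixels. The class `RunningMosaic` and its methods are hypothetical names introduced for illustration and do not describe the software used in this study.

```python
import numpy as np

class RunningMosaic:
    """Accumulates frames on a fixed-size canvas, averaging where frames overlap."""

    def __init__(self, height, width):
        self.total = np.zeros((height, width))  # sum of pixel values per canvas cell
        self.count = np.zeros((height, width))  # number of frames covering each cell
        self.latest = None                      # (top, left, bottom, right) of newest frame

    def add_frame(self, frame, top, left):
        """Blend one new frame into the mosaic at a known (assumed) offset."""
        h, w = frame.shape
        self.total[top:top + h, left:left + w] += frame
        self.count[top:top + h, left:left + w] += 1
        self.latest = (top, left, top + h, left + w)

    def image(self):
        """Current mosaic: average of all contributions; uncovered cells stay at zero."""
        return np.divide(self.total, self.count,
                         out=np.zeros_like(self.total), where=self.count > 0)

    def coverage(self):
        """Fraction of the canvas covered by at least one frame so far."""
        return float(np.mean(self.count > 0))

# Usage: a 4-pixel-wide 'camera' pans left to right across a 12-pixel-wide scene.
scene = np.arange(4 * 12, dtype=float).reshape(4, 12)
mosaic = RunningMosaic(4, 12)
for left in range(0, 9, 2):
    mosaic.add_frame(scene[:, left:left + 4], top=0, left=left)
```

The `latest` attribute plays the role of the white border in the figure, marking where the most recent frame sits within the larger composite, while `coverage` grows as the pan proceeds and the effective field of view is extended.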

Figure 1.3 - Example of an image mosaic, generated in real-time from a set of video images as the camera pans from

left to right. The white border represents the most recent image frame in the video.

The immediately observable effect of mosaicing in Figure 1.2 and Figure 1.3 is that, quite

simply, more visual information is present in the image mosaic than in any single image

frame. If the white border in Figure 1.3 represents the field of view (FOV) size of a standard

camera setup, one can artificially extend the FOV by aligning and blending recently

viewed image frames with the most current image frame. The ability to provide an extended

FOV in software is often cited as a benefit of the technique (Lo, 2008; Shum & Szeliski,

2000; Irani & Peleg, 1991) and is one reason why refining the algorithms remains a topic of

ongoing research in computer science.

1.3 Spatial awareness tasks in aerial search

Tasks requiring simultaneous local and global spatial awareness often pose great perceptual

and cognitive challenges to the operators performing them. For example, operators charged

with controlling a UAV through remote video feeds require an awareness of the surrounding

context to be able to understand, among other things, the vehicle’s current position relative to the

environment, the locations of objects relative to each other and the areas they have already

flown over. The search and rescue technician (SAR tech) must also maintain awareness of

the surrounding environment, as environmental features such as shapes and landmarks are

useful in communicating to the pilot where a target was spotted when the aircraft is to be

brought back around. This can be challenging if the terrain is bereft of landmarks or the

aircraft follows a path whose trajectory is complex.

In addition to maintaining global spatial awareness, performing local detection tasks can also

be very challenging. Visual search tasks in aerial SAR can be time-critical, where the lives of

crash site victims are often in danger, and finding targets is a constant challenge. SAR techs

and UAV operators may use image enhancements such as infrared imaging, with the goal of

improving detection performance. This may be necessary whenever the resolution required to

identify targets is very high. However, high resolution or highly magnified camera systems

are often limited in their field of view, as the cost inevitably increases to produce larger or

wider FOV sensors at high resolution.

Thus, it is imperative that these operators be able to maintain both local and global spatial

awareness in order to successfully complete mission goals.

1.4 Objectives

Referring to the perceptual issues experienced in mediated viewing tasks, providing a display

whose effectively large field of view comprises a real-time image mosaic would seem like a

viable solution to the keyhole effect. That is, for perceptual tasks that are normally to be

performed with a relatively narrow FOV but at an adequate resolution to accomplish the

search task, one could artificially broaden the FOV in software to supplement the restricted

view of the environment. This would in theory enhance operator performance in spatial tasks,

without any additional hardware. Despite the continued interest in improving the

computational speed and efficiency of mosaicing algorithms, relatively little empirical work

has been done to determine if there is in fact any improvement in operator performance using

real-time mosaiced displays. As such, the present study aims to determine whether the

technique of image mosaicing can enhance human operator performance in tasks requiring

both local and global spatial awareness. Furthermore, the present study uses the notable

example of aerial search as a basis from which to design three controlled studies.

Of critical importance in evaluating the effect of image mosaicing is the appropriateness of

the measures used to evaluate performance, particularly for tasks that require path

integration. Thus in addition to understanding visual information in mosaicing, the present

study examines the advantages and limitations of a number of metrics used to evaluate

performance in spatial awareness tasks, with particular attention paid to measuring

performance in identifying complex routes.

Chapter 2. Literature review and concepts

2.1 Introduction

This chapter begins with a discussion of concepts related to cognitive mapping, dedicated to

understanding the processes, strategies and difficulties involved in forming accurate

cognitive representations of the environment. Cognitive mapping is an important component of wayfinding,

the act of purposeful navigation from one place to another (Bowman, 2002), whereby

features from successive glances of the scene are assimilated and combined to continually

update current estimates of position and orientation (Kitchin & Blades, 2002). The methods

of evaluating cognitive maps are also discussed, as careful consideration must be made to

ensure that the collected data are able to provide insights into performance of the complex

spatial tasks. Next, the spatial awareness tasks in aerial search are considered, as an example

from which to base the experimental investigations in this thesis. A brief discussion of the

important concepts and applications of image mosaicing is provided to highlight the unique

properties of the image mosaic and the potential benefits of such a display system. Finally,

the factors of Viewing Perspective and Display size are discussed; both were identified as relevant to

real-life visual search conditions and controllable in an experimental setting.

2.2 Cognitive maps

Concomitant with the study of global spatial awareness is a discussion on the formation of

“cognitive maps”, also known as abstract maps, mental maps and conceptual representations.

The diversity of terms is evidence of the continued and varied interest in understanding

spatial behaviour in a number of disciplines, including geography, psychology, computer

science and anthropology. Cognitive maps are useful in that accessing them can provide

answers to questions such as “Where am I?”, “Where did I come from?” and “Along which

route did I travel to get here?”. Used to describe the spatial, geographical and environmental

knowledge acquired by people in wayfinding tasks, the term cognitive map refers to the

information encoded to embody a person’s cognitive representation of the environment

(Kitchin and Blades, 2002) using both short-term and long-term memory (Gärling et al.,

1985). The information contained in cognitive maps generally falls into three categories:

objects or places (Lynch, 1960; Canter, 1997; Russell, 1982), spatial relations between places

(Kuipers, 1978), and travel plans or routes (Siegel and White, 1975). However, one must be

careful not to regard the term cognitive map necessarily as a form of cartographic map ‘in the

head’ (Kuipers, 1982; Cadwallader, 1979; Lowrey, 1970), but rather as a human’s

representation of space and spatial relationships (Kitchin and Blades, 2002; Gärling et al.,

1985).

Related to these concepts are generally accepted forms of spatial knowledge acquired while

traversing an environment: landmark knowledge, route knowledge and configurational

knowledge. Siegel and White (1975) proposed that the levels are arranged in a set pattern of

development. Landmarks are any salient features in the environment that help to anchor

zones or regions of interest to the observer. They are seen as fundamental building blocks to

developing the second level, route knowledge, which involves understanding the connections

between salient landmarks in the environments. With the development of route knowledge,

one acquires knowledge of routes between landmarks or nodes (Golledge, 1978), as well as

estimates of the distances between salient features. The highest level of spatial knowledge

comes in the form of configurational, or survey, knowledge. Configurational knowledge

allows the observer to understand the relationships between different routes in the

environment, which can help to develop higher order spatial knowledge, such as new routes

and shortcuts between points in the environment. While a body of research has sought to

clarify the hierarchical relationships between the three levels (e.g. Golledge et al., 1993;

Ferguson and Hegarty, 1994; Gärling et al., 1981), the general consensus is that the elements

observed from the environment help to develop knowledge about higher order spatial

relationships, which can be used to accomplish spatial awareness tasks successfully.

The interest in cognitive maps and the acquisition of spatial knowledge in the present study is

two-fold. First, there is an intrinsic interest in investigating the viewpoint parameters that

potentially affect spatial behaviour in forming cognitive maps across different proposed

information displays. For the evaluation of human spatial performance using a mosaiced

display, the factors of FOV size and viewpoint height were investigated in Experiments 1 and

2 respectively, while FOV size and viewpoint perspective were varied in Experiment 3.

Second, the results of the present study have many implications for real tasks that require the

accurate formation of cognitive maps during live operations. Among the number of domains

discussed in the following, the spatial processes involved in aerial search are explored in

experimentally controlled (and admittedly contrived) spatial tasks.

2.3 Evaluation of cognitive maps and global awareness

In recognition of the fact that cognitive maps are not a physical manifestation of the spatial

representations encoded in memory, there are two important considerations for the present

study. First, cognitive maps are prone to interference, which can result in distortions of one’s

cognitive map and ultimately the feeling of disorientation or feeling lost. It is this deficiency

that new information displays may be able to help resolve. Second, for evaluating

performance in tasks involving the formation of cognitive maps, careful attention must be

paid to how cognitive maps are externalized by observers, as “spatial products” (Liben,

1982). These can be elicited using a variety of techniques, including sketching, estimation,

reproduction and modelling.

Kitchin & Blades (2002) provide a classification of tasks used for assessing spatial

performance using cognitive maps. Unidimensional tasks are used to determine a

participant’s knowledge of the relationship between two locations, and can be divided into

two categories: distance tasks and direction tasks. In distance tasks, the participant is asked to

report the distance between two points, either as a magnitude (e.g. Cadwallader, 1979) or as a

ratio relative to some standard distance (e.g. Lloyd and Heivly, 1987). In some cases,

participants may be asked to draw places on a map on a scale smaller than the estimated

environment, in order that the distances between points of interest can be compared against

the true distances (e.g. Montello, 1991).

In direction tasks, participants are asked to estimate the direction between two places in the

environment, usually requiring the participant to point from a given place to a target place,

either on paper, a computer screen using a mouse (Kearns et al., 2002; Kitchin & Blades,

2002) or pointing in the physical world (e.g. Fujita et al., 2010; Loomis et al., 1999). One

such task is the so-called triangle completion task (Tan et al., 2004), commonly used for

evaluating spatial task performance in relatively simple traversed routes. In a triangle

completion task, the participant performs three movements in sequence: translation, rotation

and another translation. The participant must then point back to where he started the

sequence of movements, and the angular error from the participant’s response to the actual

direction represents the sum of all perceptual errors in spatial task performance.
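
As an illustration of how this single angular error value is obtained, the following minimal Python sketch computes the signed pointing error for one trial. The function names, the coordinate convention (the participant starts at the origin heading along the x-axis, with counter-clockwise turns positive) and the example values are assumptions made here for illustration; they are not taken from the cited studies.

```python
import math

def homing_direction(leg1, turn_deg, leg2):
    """True bearing (degrees, world frame) from the end point back to the start.

    Assumed convention: start at the origin heading along +x, translate leg1,
    rotate by turn_deg (counter-clockwise positive), then translate leg2.
    """
    heading = math.radians(turn_deg)
    end_x = leg1 + leg2 * math.cos(heading)
    end_y = leg2 * math.sin(heading)
    # The vector from the end point back to the origin is (-end_x, -end_y).
    return math.degrees(math.atan2(-end_y, -end_x))

def angular_error(response_deg, true_deg):
    """Signed difference (response - true), wrapped to the range [-180, 180)."""
    return (response_deg - true_deg + 180.0) % 360.0 - 180.0

# Hypothetical trial: 3 m forward, 90 degree left turn, 4 m forward,
# after which the participant points along a bearing of -120 degrees.
true_bearing = homing_direction(3.0, 90.0, 4.0)
error = angular_error(-120.0, true_bearing)
```

Note that the same value of `error` could result from a misjudged first leg, a misjudged turn, or a misjudged second leg; the single number does not distinguish between them.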

While the triangle completion task is often regarded as one of the most direct ways of

measuring spatial ability (Golledge, 1999), this method, or any method that provides a single

computed value for error, may not be appropriate for evaluating global task performance

when traversing more complex routes, since the rather coarse angular error measure cannot

provide any insight into where along a complex route any misjudgements may have occurred.

Some researchers have questioned the validity of such one-dimensional techniques in

evaluating cognitive maps (e.g. Montello et al., 1999; Kitchin & Blades, 2002).

In response to the limitations of one-dimensional techniques, several two-dimensional data

collection techniques have been proposed to elicit the participant’s knowledge of the spatial

relationships between elements in the environment. Kitchin & Blades (2002) propose three

types of two-dimensional tasks: completion tasks, graphic tasks and recognition tasks.

Completion tasks involve the participant receiving some portion of a map or diagram

containing information pertinent to a spatial task, and being asked to fill in the rest of the

information. That is, the partial information is meant to prime the participant, relieving him

from having to reproduce an entire map from scratch. For example, Thorndyke & Hayes-

Roth (1982) asked participants to indicate a single location relative to two given points on a

map, while also being provided with scale and orientation information relevant to the task.

Kitchin (1996) varied the amount of information provided to the participant in similar spatial

tasks, and found that the amount of cueing information had an effect on the responses

provided.

Graphic tasks involve the participant producing a sketch or map of the environment (Kitchin

& Blades, 2002). A basic sketch map is one that is minimally defined by the experimenter.

For example, a participant might be given a blank piece of paper and asked to sketch a map

of a city, with no further instructions. By contrast, the participant may be provided some

constraints, such as being asked to sketch a city with only major roads and street names. In this

case, the participant is said to produce a normal sketch map. There are several advantages to

the sketch map technique, as it requires the participant to express environmental features in

relation to one another. The technique is also simple to employ, and most adult participants

are familiar with the idea of drawing sketches.

However, there are also a number of limitations to the technique of sketch maps. Most

notably, the quality of a sketch map depends on the graphical skill of the participant, as well as

their ability to express their cognitive map as a sketch map. Furthermore, Beck and Wood (1976)

reported that participants were also unwilling to adjust details of features they had already

positioned on paper, which may result in distorted sketch maps. Finally, once the sketch

maps are collected, a further challenge lies in evaluating or comparing the maps’ features. In

other words, because the participants are given relatively few restrictions, the resulting

sketches may be difficult to quantify.

In recognition tasks, participants are asked to identify a configuration of objects or places to

which they have been exposed. Participants may be asked to identify a feature on a map or

aerial photograph of a familiar area, or be shown several configurations and be asked to

identify the correct spatial configuration. Wang (2005) used the latter technique, showing

participants physical scale models of possible tunnel configurations in order to evaluate

global spatial awareness. Evans et al. (1980) had participants walk routes through buildings

and then asked them to identify the individual routes from a number of floor plans.

Recognition tasks may be advantageous in that they provide a closed set of possible

responses, allowing the participant to recognise the correct configuration rather than recall

its exact properties. Compared to having a participant reproduce a sketch map, a participant

may be keener at identifying the map among a set of alternatives, thus obviating the need for the

graphical skills required to produce a sketch of the environment. Finally, as with the graphic

tasks, recognition tasks may have ecological validity in that participants may also be familiar

with tasks that involve identifying areas or routes along a given map.

Given the wide variety of spatial tasks used in exploring cognitive maps, particular attention

must be paid to the methodological issues involved in selecting an appropriate method of

evaluation. Indeed, Kitchin and Blades (2002) caution that experimenters frequently do not

provide a justification for their choice of technique, which can introduce significant

problems, since some techniques may be more appropriate or less appropriate for certain

environments and populations. As is discussed in the experiment chapters of the present

study, careful attention was paid to the selection of an appropriate evaluation technique,

given the demands of the two tasks and the complexity of the environment being traversed.

2.4 Image mosaicing

2.4.1 Basic principle of mosaic construction

The implementation of real-time image mosaicing used in this study was developed in a

MASc thesis project by Hok Man Herman Lo (2008) at the University of Toronto, as part of

a suite of software tools for improving visualization in laparoscopic surgery. While the reader

is invited to consult that MASc thesis for a full description of the mosaicing algorithm, a high

level description of the process is offered here to familiarise the reader with the important

concepts.

Constructing an image mosaic consists of three major steps: image registration, projection,

and blending (Mann, 2002). In the alignment (or registration) phase, we attempt to develop a

model of the geometric relationships between pairs of images (Schmidt et al., 2000). The

mapping between two images is called the projective coordinate transformation, which

ranges from relatively simple translations and rotations, to more complex full projective

transformations (Brown, 1992). The software implementation used in the present study

matches specific groups of pixels called ‘feature points’ that appear in the images. The

images in subsequent frames are then projected to be in alignment with one of the images

(called the reference image) and then blended together using one of several techniques

(Szeliski, 2006).
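
The three steps described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the algorithm of Lo (2008): it assumes that feature points have already been matched between frames, that the mapping between frames is a pure integer translation rather than a full projective transformation, and that blending is a simple average of overlapping pixels. All function names are hypothetical.

```python
def estimate_translation(ref_pts, new_pts):
    """Registration: least-squares translation that maps matched feature
    points (x, y) in the new frame onto their positions in the reference frame."""
    n = len(ref_pts)
    dx = sum(r[0] - p[0] for r, p in zip(ref_pts, new_pts)) / n
    dy = sum(r[1] - p[1] for r, p in zip(ref_pts, new_pts)) / n
    return round(dx), round(dy)

def build_mosaic(frames, offsets):
    """Projection and blending: place each frame at its (dx, dy) offset in
    reference coordinates and average the grey levels where frames overlap.

    frames: list of 2D lists (rows of grey levels).
    offsets: top-left corner of each frame in reference coordinates.
    Cells covered by no frame are returned as None.
    """
    min_x = min(dx for dx, _ in offsets)
    min_y = min(dy for _, dy in offsets)
    max_x = max(dx + len(f[0]) for f, (dx, _) in zip(frames, offsets))
    max_y = max(dy + len(f) for f, (_, dy) in zip(frames, offsets))
    width, height = max_x - min_x, max_y - min_y
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for frame, (dx, dy) in zip(frames, offsets):
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                total[r + dy - min_y][c + dx - min_x] += value
                count[r + dy - min_y][c + dx - min_x] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else None
             for c in range(width)] for r in range(height)]

# Hypothetical example: the new frame views a region two pixels to the right
# of the reference frame, so the two frames share one column.
reference = [[10, 20, 30], [40, 50, 60]]
new_frame = [[34, 70, 80], [60, 90, 100]]
shift = estimate_translation([(2, 0), (2, 1)], [(0, 0), (0, 1)])
mosaic = build_mosaic([reference, new_frame], [(0, 0), shift])
```

In this example the mosaic is five pixels wide, wider than either input frame, which is the sense in which mosaicing broadens the effective field of view.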

2.4.2 Off-line applications

Some early examples of image mosaicing can be traced to underwater photographic

surveying, where photographs of the ocean floor were physically laid out and taped together

to form a single coherent map (Pollio, 1968). More recently, the fields of photogrammetry,

computer vision, image processing and computer graphics have sought to develop algorithms

that automate the process of mosaicing (Irani et al., 1996; Shum & Szeliski, 2000), with the

goal of minimising the ‘residuals’, or significant errors in alignment, and integrating

individual images into the mosaic. While the traditional applications for automated

mosaicing algorithms are found in satellite and aerial imagery (Shum & Szeliski, 2000), its

adoption in different fields is growing rapidly. Applications have expanded for example to

video processing, including file compression, search and indexing, and change detection

(Irani et al., 1996). Mosaicing has also been used to increase photo resolution (Irani & Peleg,

1991) and to emulate the effects of true panoramic cameras (Irani et al., 1996; Szeliski,

1994).

2.4.3 Potential applications of real-time image mosaicing

With the advent of powerful, low-cost computing, mosaicing algorithms can now be

applied to live video images, allowing mosaiced images to be ‘painted’ from the streaming

output of a video source (Shum & Szeliski, 2000). Consider a display where, instead of

continuously updated live images from a video source, an observer is presented with an

augmented video stream comprising frames captured in the recent past that are automatically

aligned and integrated into a single larger image.

This technique opens up a number of interesting possibilities for assisting an operator in

performing a live search task (i.e. in real time), where a narrow camera field of view often

restricts assimilation of global context as local details are taken in. A display augmented

with mosaiced images effectively constructs a broadened field of view using previously

captured camera images, at the same resolution needed to identify features and objects. This

mosaiced view provides a more global context of areas viewed in the recent past. The

following are examples of some domains in which such a real-time image mosaicing

capability might be used.

2.4.3.1 Histopathology

A histopathologist examines pathology slides (glass slides containing thin slices of tissue

stained with chemical dyes) to identify features in cells that are consistent with known

diseases. The histopathologist examines the slide using a microscope, with a set of eyepiece

lenses and (fixed magnification) objective lenses. The combined magnifications commonly

reach up to 400x. To examine different areas of the slide, the histopathologist either moves

the slide directly with her hand (freehand) or, depending on the circumstances, manipulates

control knobs to move the mechanical platform holding the slide. She must switch back and

forth between discrete levels of magnification to maintain context while examining local

details. This procedure is complicated by the movement of the slide, as even small

displacements cause an amplified change in scene under high magnification.

In response to these challenges, the work of Lo (2008) has been spun off into a company,

ViewsIQ (http://viewsiq.ca/), which uses his real-time image mosaicing algorithm to

automatically integrate the image frames coming from the imaging scope into a mosaic.

2.4.3.2 Remote camera surveillance

Pan, zoom and tilt cameras are commonly used for surveying large areas, allowing the

operator to control the orientation and FOV of the image produced. When objects such as

faces or license plate numbers must be identified, the human operator (HO) must zoom in, and may

then need to zoom out again to maintain an awareness of the object's location relative to the area being

surveyed. An extra challenge occurs when the object of interest is moving within the area, in

which case the operator must track the object to keep the object within the camera's FOV.

2.4.3.3 Aerial search and rescue

A search and rescue (SAR) operator scans the environment outside the cockpit of a fixed

wing aircraft traversing its flight path. Perhaps the aircraft is passing over a canopy of trees

while the operator or ‘spotter’ must identify objects of interest such as wreckage or survivors.

To complicate the task of searching out the window of a moving aircraft, consider the spotter

using a view mediated by camera sensors (using infrared imaging, for example) to extract

local details of a small portion of the terrain below. In this situation, the spotter has no

contextual information outside the narrow FOV of the binoculars. The operator must either

pan slowly to scan the forest without feeling disoriented (Carver, 1990), or lower her

binoculars to regain global context. In either case, the operator may miss important local

details as the forest canopy rushes past.

A research project at Defence Research and Development Canada (DRDC) was undertaken

to develop a real-time mosaic system using a Forward Looking Infrared (FLIR) sensor.

As part of the ‘Infrared Eye’ project, an opto-mechanical pointing system rapidly steers the

narrow FOV IR camera to capture high resolution images that are stitched together,

producing high-resolution wide FOV images at very high speed. The goal was to provide

“the operator with fast access to points of interest without losing situation awareness”

(Lavigne and Ricard, 2005). A system testbed was constructed but unfortunately never flew,

due to funding constraints.

2.5 Studies evaluating human spatial performance in real-time mosaicing

The present study focuses on investigating performance in local and global awareness tasks

using a mosaiced FOV. More specifically, I investigated global awareness, in the form of

route identification, and local spatial awareness, in the form of target detection. With regard

to previous research on this topic, the interest of the research community has focussed

primarily on improving the computational effectiveness and speed of mosaicing algorithms,

with few studies dedicated to evaluating the performance of users of mosaicing displays.

Of the research that has been found on this topic, we first consider work evaluating human

operator (HO) performance in search tasks¹ using real-time image mosaic displays, followed

by work on desktop augmented reality applications.

2.5.1 Aerial search – Morse et al. (2008)

Morse et al. (2008) conducted a HO performance evaluation of a real-time mosaicing system,

termed a ‘temporally local mosaic’, which appended a limited number of mosaiced frames to

the single size FOV. The participants were given the primary task of identifying stationary

targets (red umbrellas) embedded in short video clips of flight paths over a terrain, while

completing a secondary task of identifying red coloured spots among multi-coloured spots on

a second monitor. Two display conditions were investigated; the participants used either a

relatively small fixed size FOV or the mosaiced FOV. There were no explicit non-target

distractors in the experiment.

Morse et al. found that more targets were detected in the mosaic FOV condition than in the

single FOV condition, with no difference in secondary task performance between the two

conditions. By examining where in the display participants identified the targets, Morse et

al. also confirmed that participants did in fact detect targets in the mosaic that they had

missed in the single FOV.

¹ With regard to the potential of automatic detection algorithms in aerial SAR, Baker and Youngson (2007)

considered it unlikely that a sufficiently effective system could be developed, citing the low signal-to-noise

ratio of many targets in the environment, as well as the high false alarm rates that might be expected. As

such, human operators continue to be deployed for these demanding visual search tasks.

While the results of this investigation suggest that there may be benefits to a mosaiced image

display system for local detection tasks, two issues with the experimental design are noted.

First, as there were no explicit trials without targets present, the participants could have

simply guessed whether they saw a target. In other words, there were no recorded Correct

Rejections and thus no possibility to compute the signal detection parameters d’ and Beta.

However, Morse et al. found that participants responded to artefacts in the display caused by

noise in the video transmission or misalignment of the frames. It was found that more of

these ‘false positives’ occurred for trials using the mosaic display.
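
To make the first issue concrete, the following minimal Python sketch shows how the signal detection parameters are computed under the standard equal-variance Gaussian model, and why target-absent trials are required. The function name and the trial counts are hypothetical and are not data from Morse et al. (2008); hit or false alarm rates of exactly 0 or 1 would need a correction that is not handled here.

```python
import math
from statistics import NormalDist

def sdt_parameters(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, beta) under the equal-variance Gaussian model.

    d' = z(H) - z(F) and beta = exp((z(F)^2 - z(H)^2) / 2), where H is the
    hit rate and F is the false alarm rate.
    """
    signal_trials = hits + misses
    noise_trials = false_alarms + correct_rejections
    if signal_trials == 0 or noise_trials == 0:
        # Without target-absent trials the false alarm rate is undefined.
        raise ValueError("need both target-present and target-absent trials")
    hit_rate = hits / signal_trials
    fa_rate = false_alarms / noise_trials
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    d_prime = z_hit - z_fa
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta

# Hypothetical counts: 100 target-present and 100 target-absent trials.
d_prime, beta = sdt_parameters(hits=80, misses=20,
                               false_alarms=20, correct_rejections=80)
```

With no target-absent trials recorded, `noise_trials` is zero and neither parameter can be estimated, which is the limitation noted above.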

The second issue with Morse et al.’s design is that they provided no indication that the

effective FOV of the mosaic remained constant (within reason) throughout the experiment.

In fact, it appears from their screenshots that the size of the mosaic varied during the trial,

because the trajectory of their camera followed a non-linear path. Therefore, it

is difficult to establish any relationship between the increase in size of the effective FOV and

target detection performance.

2.5.2 Desktop augmented reality – Jeon and Kim (2008)

Jeon and Kim (2008) investigated the effect of FOV and display size (including real-time

image mosaicing) on task performance in a desktop tangible augmented reality (DTAR)

application. Augmented reality (AR) was used to enhance camera scenes viewing a desktop

environment, by overlaying virtual computer graphic generated elements onto the camera

view. A “nominal” FOV afforded by a conventional webcam mounted to the participant’s

head provided a limited FOV (40 degrees) of the desktop (72 x 120 cm). A “mosaiced” FOV

generated from the webcam provided an extended FOV (although no details on the number

of mosaiced frames were provided). The experimental setup was somewhat unconventional,

as the camera views augmented with AR elements were observed on an upright computer

monitor behind the desktop. In other words, for the “nominal” and “mosaiced” conditions,

the participant had to point his head toward the desktop while gazing at the monitor at a

different location.

Participants were shown a virtual object of a certain shape and colour in the middle of the

desktop, and were tasked with moving the virtual object to a different location on the

desktop. To investigate the effect of the two display conditions – with and without mosaicing

– on performance, Jeon and Kim collected time to completion data from 20 participants who

performed 100 trials for each condition. The results indicated that the times to complete the

trials in the mosaiced FOV condition were significantly lower than for the nominal FOV, and

there were significantly fewer movements of the head. Participants also showed a preference

for the mosaiced FOV based on Likert scale ratings to the question “How easy/convenient

was it to carry out the task in this environment?” The authors concluded that the extended

FOV afforded by the mosaiced condition provided performance enhancements, despite the

artefacts seen in the mosaiced images.

In the same paper, Jeon and Kim (2008) conducted a follow up experiment to compare

performance using a mosaiced FOV with that of a “fixed” FOV, afforded by a wide FOV

camera mounted directly above the desktop that provided a view of the entire desktop. Thus

the fixed FOV represented the widest FOV of the three DTAR conditions. The results from

12 participants indicated that the times to complete the trials in the fixed FOV condition were

significantly lower than those in the mosaiced FOV condition, and participants showed a preference for the

fixed FOV based on Likert scale ratings of the preferred viewing condition. The authors

posited that a mosaiced FOV could be useful in situations where providing a view of the

entire work area is impractical or impossible. They also reiterated the importance of

providing a FOV that is contextually relevant for the task at hand, positing that the results

may have been different for tasks involving close up manipulation.

This last point is particularly relevant for the experiments prepared in the present study. Jeon

and Kim (2008) investigated a task where there was no tradeoff in providing a larger view of

the environment. In other words, providing a view of the entire desktop area would

invariably be expected to provide the best performance compared to narrower fields of view

and thus, perhaps the results are not that surprising. In the present research, which focuses on

tradeoffs between local and global spatial awareness, it will be seen (in Experiment 2) that it is

necessary to calibrate the height parameter in order to create a fair comparison – i.e., an

experiment whose results are not easily predictable in advance – between the different

display conditions.

2.6 Global and local spatial awareness in (teleoperated) aerial search

As outlined in the introduction, it is common for the complementary features of wide and

narrow FOVs to be traded off in many tasks. As stated by Vos (1990): ‘... the gain in

visibility is obtained at the cost of “searchability”’. Thus, in situations where both local

detail and global context are needed simultaneously, the ability to trade off these forms of

visual information has a critical impact on the operator's ability to conduct visual search.

2.6.1 Global awareness in aerial search

Understanding the spatial relationships between the UAV and the terrain, other aircraft,

points of interest (such as refuelling stations), and targets in the environment is crucial to

successfully completing UAV operations (Drury et al., 2006). Because the UAV operator is

remotely located from the aircraft, the multisensory information about the surrounding

environment that one normally receives when directly flying an aircraft is no longer available.

For example, UAV operators do not have access to the kinaesthetic cues that pilots of manned

aircraft use to gain an understanding of turbulence, aircraft movement and gravitational

forces (Hopcroft et al., 2006). Thus the operator performs control manoeuvres using visual

information provided by cameras mounted onto the UAV, transmitted via a data link to the

operator. However, the visual information can be limited in terms of image quality and FOV

(Draper and Ruff, 2000; van Erp, 1999), making it challenging to maintain accurate and

up-to-date cognitive maps of the environment.

The benefits of providing greater context in global spatial awareness tasks are well known in

the literature. For example, Hodgson (1998) found that enlarging window size, and thus

global context, increases accuracy of human operators in identifying land use types from

aerial photographs. A coordinated series of zooming and panning movements was found to

be a prevalent technique for preserving global context in long distance pointing tasks using

multi-scale interfaces (Bourgeois et al., 2001; Pietriga et al., 2007).

Much research has been carried out on the impairment of spatial cognition based on display

parameters and positional relationships between the observer and the environment. In

particular, the current study focuses on global tasks such as map reading tasks or route

identification, whose response or output relies on an exocentric perspective. That is because,

for the aerial search tasks simulated in the experiments presented here, global spatial

awareness was evaluated using an exocentric recognition task, in spite of the fact that the

(simulated) world was experienced from an egocentric viewpoint, i.e. looking out the

window of an aircraft in a SAR-like task. Clearly, providing an exocentric response requires

a mental transformation between the two perspectives.

2.6.2 Local spatial awareness in aerial search

Efforts in understanding the process of visual search in aerial SAR tasks generally fall under

two categories. On the one hand, one group of studies has investigated eye movement data

collected during SAR search, to make inferences about the gaze behaviours of spotters (Croft

et al., 2007; Stager and Angus, 1978; Stager and Angus, 1975). Those studies have been

particularly useful for estimating visual field coverage during simulated SAR, for identifying

differences in scanning behaviours between novice and expert spotters, and for determining

the effectiveness of new training programs for spotters.

On the other hand, efforts have also been directed at empirically determining the factors

related to the environment, the targets, and the operational parameters that influence visual

search in aerial SAR. The work of Stager (1974, 1978) was important in this regard, as it

showed that the rate of motion perceived by an operator varies as a function of the angular

velocity away from the perpendicular beneath the aircraft to the ground (Stager, 1974). This

angular velocity is defined as the rate of change of the angle subtending points or objects

moving across the terrain. As the operator gazes away from the terrain beneath the aircraft

(i.e. as the camera elevation angle decreases) towards the horizon, the angular velocity

decreases, which allows an object to remain within a given fixation radius for a

proportionally longer time. Furthermore, the aircraft’s altitude has an effect on angular

velocity; as the altitude is increased, the angular velocity decreases. Stager (1974) cited

practical concerns of searching for objects when altitude was low, as searching through dense

forests would become “nearly impossible”.
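
The qualitative relationships described above can be illustrated with a simplified geometric sketch. It assumes flat terrain, level flight at constant ground speed, and a ground point lying in the vertical plane that contains the flight track; under these assumptions the angular velocity of the point is (v / h) cos²φ, where v is ground speed, h is altitude and φ is the viewing angle measured from the perpendicular beneath the aircraft. This is a textbook geometric derivation offered for illustration and is not Stager's (1974) own formulation; the function name and example values are hypothetical.

```python
import math

def angular_velocity_deg_per_s(ground_speed_m_s, altitude_m, angle_from_nadir_deg):
    """Angular velocity (deg/s) of a ground point seen from a moving aircraft.

    Assumes flat terrain and a point in the vertical plane of the flight track.
    With x = h * tan(phi) and dx/dt = v, it follows that
    d(phi)/dt = v * cos(phi)^2 / h.
    """
    phi = math.radians(angle_from_nadir_deg)
    omega_rad = ground_speed_m_s * math.cos(phi) ** 2 / altitude_m
    return math.degrees(omega_rad)

# Hypothetical values: 50 m/s ground speed at 300 m altitude.
straight_down = angular_velocity_deg_per_s(50.0, 300.0, 0.0)
towards_horizon = angular_velocity_deg_per_s(50.0, 300.0, 60.0)
higher_altitude = angular_velocity_deg_per_s(50.0, 600.0, 0.0)
```

Under these assumptions the angular velocity falls both as the gaze moves from directly beneath the aircraft towards the horizon and as altitude increases, in agreement with the relationships reported in the text.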

While attempts have been made to fully automate micro-UAV operations using artificial

intelligence and computer vision techniques, thereby obviating the need for operator global spatial

awareness, the fact remains that certain operations may require operators to remain “in the loop”. For a

survey of these efforts, see Michael et al. (2012). Two examples of such operations are covert

reconnaissance missions where enemy positions are to be visually confirmed and missions

for designating targets by training lasers on objects of interest (Austin, 2010). Furthermore,

in those cases where multiple objects must be tracked outside the FOV of the UAV’s camera,

errors in global and local spatial awareness can occur. Thus it is clear that both forms of

spatial awareness should be considered if image mosaicing is to be investigated as potentially

beneficial to performance in spatial tasks.

2.7 Relevant parameters for the present study

Taking into account the visual properties afforded by a real-time image mosaicing algorithm,

as well as the wealth of factors that may affect spatial task performance in aerial search type

tasks, it was crucial to identify a subset of those factors that are not only relevant to real-life

visual search conditions but also whose effects could be manipulated in a controlled

experimental setting. To that end, it was decided to consider the effects of Viewing

Perspective and Display Size.

2.7.1 Viewing perspective/Elevation angle

Wayfinding errors can occur during the encoding of spatial information due to a variety of

factors, including incorrect sensing of velocity and time and distorted frames of reference as

the world is traversed (Golledge, 1999). The viewing perspective, manifested through the

camera’s elevation angle relative to the terrain’s surface, has implications for the present

study for two reasons.

First, it stands to reason that distortions such as those caused by perspective foreshortening

due to relatively small elevation angles (Andre et al., 1991), in combination with a restricted

FOV, might contribute to encoding errors in the formation of cognitive maps, and thus

degrade global spatial awareness (Wickens et al., 1989). The relevance of this issue has

increased with the advent of computer graphics, prompting a number of investigations into

performance with the use of so-called 3D displays, which usually provide information about

a scene from a perspective viewpoint. Performance using 3D displays is normally contrasted

with that of single or multiple coplanar 2D views of the same environment, to observe any

differences in, for example, route planning, judgements of relative position and mental

workload during operation. Experimental results are decidedly mixed, as many studies have

found 3D displays to be superior to 2D displays (e.g. Ellis et al., 1987; van Breda & Veltman,

1998); some studies have found performance to be roughly equal (e.g. Wickens et al., 1996);

while others have found 3D displays to be inferior (e.g. O’Brien & Wickens, 1997; Boyer et

al., 1995). After considering the wealth of studies in planar vs. perspective viewing, Haskell

& Wickens (1993) drew the logical conclusion that the results may be task dependent.

Second, the effect of viewing perspective has practical implications for an operator who

might use a real-time image mosaicing display in practice. Because stitching algorithms are

based on geometric relationships between individual frames, the mosaic can manifest itself

on the display in a pronounced way. Figure 2.1 shows three display conditions: on the left,

Figure 2.1(a) shows a nominal size FOV. On the right, Figure 2.1(c) shows an enlarged FOV,

whose display size is twice that of the FOV in Figure 2.1(a). This view is what could be

expected if one were to use a different lens, and thus simply extend the view of the

environment at the same resolution as the nominal size FOV.

(a) (b) (c)

Figure 2.1 - Three displays providing a perspective viewpoint, (a) nominal size FOV, (b) mosaic FOV, (c) enlarged

FOV

The mosaic FOV condition shown in Figure 2.1(b) provides not only an extended size FOV,

but also, as a consequence of changes to the image content in the video frames, displays the

shape of the path of motion traversed in the most recently displayed frames. For example, the

screenshots in Figure 2.1 show the path of the camera following a curved trajectory, which in

the case of the mosaic FOV (Figure 2.1(b)) is shown directly on the screen. In other words,

the mosaic view imparts not only that there is a curved river in the environment at that

moment in time, but also that the camera just recently travelled through the scene in a curved

motion path. The image frames in the other FOV conditions in Figure 2.1 only convey the

former information, that there is a river present in the scene.

As one adds more frames to the mosaic in a perspective view, the mosaic view creates a

“tunnel effect” (a term coined by the author), where past images are appended to the outsides

of the most recent frame, as the camera traverses the scene. This additional visual

information, computed purely from the relative motion between successive frames and

superimposed along with the most recent image frame of the camera, is unique to the mosaic

display.2

Of course, the motion of the camera’s path can also be computed by spatial integration, when

viewing a sequence of images (i.e. videos) of the camera travelling through the environment

using any of these displays. However, the mosaic provides additional information of the

terrain features as seen in the doubled (in the present example) FOV directly as well as the

shape feature, without the need to mentally integrate that information over the two successive

image frames. It is these surmised advantages that form some of the bases of the empirical

investigations of the mosaic FOV in the present study.

2.7.2 Display size (and resolution)

Because human observers use knowledge of landmarks as a means to form higher order

spatial relationships, increasing the size of the FOV through mosaicing has the potential

benefit of providing more time for those landmarks to be encoded into spatial memory.

However, there appears to be little consensus in the literature as to whether a wider FOV and

larger display sizes will necessarily result in a gain in spatial performance in aerial SAR type

2 It is interesting to note that this extra cue provided by the successive frames has not been programmed to appear, but is a

direct consequence of the technology itself. It is also relevant to point out that, in contrast to the curved trajectory illustrated

in Figure 2.1, if the trajectory were to have been straight, then this extra cue would not be visible, and Figure 2.1(b)

would, in principle, be identical to Figure 2.1(c).

tasks. For example, Brickner & Foyle (1990) found that navigation performance in flying

through a computer simulated slalom course was performed more accurately using a wider

FOV of 55 degrees compared to a narrower FOV of 25 degrees. In surveying target detection

performance, Crebolder et al. (2003), for example, posited that an “intermediate” FOV size

provides optimal performance using multisensor surveillance imaging systems for supporting

visual search tasks under poor visibility.

A parameter related to Display size is that of resolution of the sensor capturing the visual

information, as well as the display resolution of the monitor used for target detection tasks.

Warner & Hubbard (1992) found that using a higher resolution narrow FOV camera sensor

improved detection performance compared to a low resolution wide FOV in a flight

simulation. From these results it was critical in the present study to only consider Display

conditions whose camera and display resolutions scale equally, so that only the size of the

Display is manipulated relative to the baseline FOV condition. To that end, the present study

includes two Display sizes: the mosaic FOV and the double size FOV, both of which

share the same resolution but at twice the size of the nominal (or single) FOV. The reader

should refer to Appendix A10.1 for a treatment of other Display conditions that were

considered for the present study.

One confounding issue is the proportion of the display area that can be effectively scanned

by an observer as the environment moves past the camera’s field of view. For example,

although a larger size display may allow the observer more time to detect an object passing

through the FOV, the amount of displayed information that must be searched also increases.

In the presence of distractors (non-target objects in the environment), a wide FOV or large

size display also means that more distractors are present at any one time. Thus, the observer

may not be able to effectively scan the entire area, and may miss targets due to poor

coverage.

2.7.3 Speed of traversal and height above terrain

Other factors that were considered for investigation were the height of the aircraft above the

terrain and the speed of traversal across the terrain. Concerning the height, it was decided

that a fixed height would be used, since the height above the terrain would influence the

visibility of the targets in the environment. In order to avoid the confound of targets being

easier to detect at lower heights, the height was fixed for all display conditions in

Experiments 1 and 3. However, it will be shown that a calibration of height was necessary for

the route identification task, which was the goal of Experiment 2, in which case no target

detection task was deployed.

Varying the speed of traversal was considered as well, as a way to control for the time that an

object (i.e. target) remains on the screen during a flyover, between one FOV and one of twice

the size. In other words, if a target remains in the FOV of Figure 2.1(a) for 2 seconds, for

example, one should double the speed of traversal to ensure that the target appears for the

same amount of time in the FOVs in Figure 2.1(b) and (c). However, one caveat in that case

is that the total flyover time would be cut in half, presenting a disadvantage to display

conditions with a larger Display size when trying to understand global features of the

environment. For this reason, the speed of traversal was fixed for all display conditions for

all three Experiments.
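The trade-off described above, between the time a target remains on screen and the total flyover time, can be made concrete with a minimal sketch. The function names and the numerical values are hypothetical; the values were chosen only so that the single FOV reproduces the 2 second example used in the text.

```python
def dwell_time(fov_length_m, speed_m_s):
    """Seconds a stationary target stays on screen while crossing the FOV."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return fov_length_m / speed_m_s


def flyover_time(route_length_m, speed_m_s):
    """Seconds needed to traverse the whole route at constant speed."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return route_length_m / speed_m_s


# Hypothetical values: a 200 m single FOV at 100 m/s gives a 2 s dwell time.
single_fov_m, speed_m_s, route_m = 200.0, 100.0, 6000.0
same_speed = dwell_time(2 * single_fov_m, speed_m_s)         # doubled FOV, same speed
matched_dwell = dwell_time(2 * single_fov_m, 2 * speed_m_s)  # doubled FOV, doubled speed
```

Doubling the speed equalises the dwell time across display sizes, but `flyover_time(route_m, 2 * speed_m_s)` is then half of `flyover_time(route_m, speed_m_s)`, which is the disadvantage that motivated fixing the speed of traversal.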

2.8 Summary

The author will henceforth refer to the local spatial awareness task as the target detection

task, and to the global spatial awareness task as the route identification task.

A review of the literature on the formation of cognitive maps as a means of maintaining

spatial awareness provides a basis from which to begin investigating the potential benefits of

real-time image mosaicing during traversal of an environment. The software technique of

image processing provides an artificially broadened effective field of view (FOV), by

aligning and blending images from a relatively narrow FOV to form a larger spatially

continuous image.

Furthermore, during a flyover manoeuvre, the real-time image mosaic also augments the

display with additional shape properties that directly indicate the path of the camera’s

motion. Thus, another potential benefit unique to mosaicing is that the spatial relationships of

the most recently viewed frames promote more efficient formation of route knowledge,

compared to a smaller FOV and an equivalently sized but fixed size FOV. Thus, the

empirical investigations presented in this study seek to investigate these potential benefits

explicitly, as well as to supplement the existing research on human performance using real-

time image mosaicing.

Chapter 3. Experiment 1

3.1 Introduction

Experiment 1 was designed as an exploratory investigation of simultaneous local and global

task performance, to investigate the theoretical benefits of using a mosaiced FOV (mFOV)

display over “conventional displays”. In the experiment, the mosaiced FOV was contrasted

with a baseline single size FOV (sFOV) as well as a FOV of twice the single size (dFOV).

Each participant was asked to watch a series of recorded videos of a flyover of a simulated

terrain from a top down perspective, and was asked to identify targets as he flew over the

terrain. The image presented looked like some variation of Figure 3.1(b), for which motion of

the image was from top to bottom (representing forward motion from bottom to top relative

to the terrain). The vertically oriented blue line in the figure represented a river, along which

the simulated aircraft flew during the flyover. After the video had ended, each participant was

asked to select the shape of the route he had flown over.

In the following sections, the rationale behind the design of the experiment is discussed, and

the platform, experiment parameters and hypotheses are described. (For further

considerations made in Experiment 1 regarding display conditions, moving targets, response

methods and data analysis methods, please refer to Appendix A10.1.) Results and analysis of

performance in the target detection and route identification tasks are discussed, followed by a

general discussion.

3.2 Experimental tasks

3.2.1 Target detection

As described in Chapter 1, the present study was inspired by the difficulty in aerial search of

discriminating between a signal (a difficult to spot target on the ground) and noise (any non-

target object), while in motion above the terrain. A number of options were considered for an

appropriate paradigm that would reflect the visual information processing demands of two

simultaneous spatial awareness tasks. While it cannot be claimed that the target detection

task devised was completely representative of an actual aerial search operation, the task was

nevertheless considered to adhere to most generic attributes of aerial signal detection.

In the study by Morse et al. (2008), participants were asked to identify all targets (appearing

as red umbrellas), without regard for misidentified or missed targets, resulting in no means of

scoring any Correct Rejections or False Alarms. In the present study, the environment was

segmented into zones, to act as discrete events and thus enable analysis using signal detection

theory (SDT). That is, some of these segments contained no targets, so that data on False

Alarms and Misses could be collected. Between 3 and 6 targets were implanted in each

flyover route, with each segment containing no target or one target.
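Treating each zone as a discrete event makes the standard signal detection outcomes and the sensitivity index d' computable. The following is a minimal sketch of that bookkeeping, not the analysis code used in the study. The function names are hypothetical, and the log-linear correction (adding 0.5 to each count) is an assumption introduced here so that the z-transform stays finite when a rate would otherwise be 0 or 1.

```python
from statistics import NormalDist


def classify_zone(target_present, responded):
    """Map one zone (event) to its signal detection outcome."""
    if target_present:
        return "hit" if responded else "miss"
    return "false_alarm" if responded else "correct_rejection"


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' under the equal-variance Gaussian model.

    The 0.5 / 1.0 terms are the log-linear correction assumed by this sketch.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

A correction of this kind matters whenever a participant produces no False Alarms at all, since the uncorrected False Alarm rate of zero has no finite z-score.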

The flyover videos were recorded from a top down or “bird’s eye” perspective, with the

camera viewpoint fixed and pointing down towards the terrain, perpendicular to the direction

of travel. This was done to avoid the potential confound of perspective foreshortening, where

objects or parts of objects would appear compressed depending on the camera’s perspective

relative to the objects. This would have potentially complicated performance in the target

detection task, as targets would have appeared as expanded (i.e. larger) for the extended FOV

display conditions.

In order to avoid potential floor and ceiling effects in the detection task, targets were

designed to be ‘moderately difficult’ to detect relative to the background texture. This

involved several iterations of pilot tests, in which several sizes, shapes and textures were

considered for the targets and matched against the background terrain. All candidate targets

were evaluated informally by both the author and his colleagues until a single target was

decided upon, as illustrated in Figure 3.1(a).

The reader should note that this approach is markedly different from that of Morse et al.

(2008). In contrast to the target red umbrellas used in their study, which presumably were the

only red objects in the environment3, the targets in the present study were designed to blend

into the surrounding environment more closely. This was done to simulate the kinds of

challenges faced in real aerial search tasks, where discrimination between targets and

distractors (any non-target object) can be difficult (Croft et al., 2007). For this reason, highly

conspicuous (red) targets such as those used by Morse et al. (2008) were not implemented in

3 This was based on the screenshots included in the Morse et al. (2008) paper, as well as the fact that no explicit

distractors were included in their study.

the present study4. This distinction will become important when discussing the results of

Experiment 1.

(a) (b)

Figure 3.1 - Example of target used in Experiment 1: (a) target magnified to show textures, (b) target within flyover

terrain. The red dot in (b) represents the shadow of the aircraft directly beneath. In this screenshot, the target is found to

the right of the red dot. Forward flyover motion was along the blue river, from bottom to top, resulting in overall

motion of the image from top to bottom, as indicated by the arrow.

Each participant was asked to identify targets as the video flyover was played. In order to

ensure that the participants were in fact responding to the target they had detected (and not

some distractor on the screen), measures were taken to record where the actual target was

located. Participants were instructed to indicate the location of the target as quickly as

possible, primarily in order to minimise any interference with the route identification task,

described below.

3.2.2 Route identification

A route identification task was devised to reflect the perceptual processes needed for some

generic aspect of global spatial awareness – that is, tasks for which spatial updating was

required, by continuous spatial integration (Tan et al., 2004). Furthermore, the desired task

would allow the component parts of the externalised cognitive map to be analysed offline.

That is, as discussed in Section 2.2, rather than relying on a single number to represent the

collectivity of all perceptual processes, one of the goals was to be able to examine the

constituent parts of the responses and the relationships among them. In other words, it

4 Furthermore, the targets used in Morse et al. (2008) were 2D images of umbrellas, seen only from a top down

view. As will be discussed in Experiments 2 and 3, 3D models were needed to simulate targets viewed from an

angled viewpoint.

seemed useful to request a response that reflects the entire shape of the route that the

participant believed he flew over. Second, it was beneficial that participants not have to rely

solely on memory to be able to identify the shape of the route. That is, providing aids to the

participants would make the act of responding less prone to errors involved in recalling from

memory the shape of the route.

It was decided that the participants would perform a route identification task, for which they

would retrospectively select the route they had just flown over from among a rectangular grid

of possible routes. The rationale was that in order to assess the participants’ cognitive

mapping performance across different display conditions, the task should require continual

attention to the shape of the flyover, as opposed to only intermittent attention. Furthermore,

whereas in a recall task such as sketch mapping participants would need to memorise the

whole route to be able to reproduce it, selecting from among a number of routes in a

recognition task would theoretically be somewhat easier for the participant. In addition, if a

sufficiently large set of routes is presented, a route identification task would also provide

some granularity to the recorded data, which could provide the opportunity for analysis that

would be more extensive than scoring by means of a single metric.

Each route that was overflown consisted of three contiguous segments, ordered as straight,

curved, straight. For carrying out the route identification after the flyover, the routes were

arranged along two dimensions, where the rows represent the curvature change in the second

segment, and the columns represent the ratio of the lengths of the two straight segments, as

shown in Figure 3.3(a). There are a number of reasons why this response method was

selected. First of all, it made the response straightforward for the participant, allowing him to

quickly communicate the cognitive map formed during the preceding flyover. It was

important to allow participants to do this quickly, since it was recognised that significant

forgetting would likely occur if the delay were too long.

Another reason for devising this method, instead of asking participants for example to sketch

their routes, as was done by Kitchin and Blades (2002), was because it was desired that the

responses provided be standardized and easily analysed. This therefore both solved the

challenge of developing a method to digitise and quantify participants’ sketches, and avoided

potential confounds in analysing drawings.

Finally, the actual arrangement of the grid provided a means for participants to provide a

discrete response to a continuous task, by allowing them to examine an essentially

continuous spectrum of responses whose individual units can be easily compared to their

neighbour in the grid.

Yet another important aspect of this response method was that the routes in the grid were

presented in a canonical (nominally north-up) representation, meaning that all routes were

aligned according to world-centred coordinates. This had implications for the participants

during the flyovers, which were presented track-up, and ultimately was identified as a

limitation of Experiment 1.

3.3 Response method

A novel response method was introduced, whereby the participant was asked to select the route

he had flown over from a number of alternatives placed on a 10x10 grid, as shown in Figure

3.2. The grid elements were arranged in a meaningful order, varying by curvature in the

horizontal direction, and ratio of the lengths of the straight segments in the vertical direction.

For further details on the values selected for these dimensions, please see Section 3.4.

Figure 3.2 - 10x10 response grid for novel response method in which participants selected the route they flew over.

The 10x10 response grid exhibits the properties discussed earlier, namely that the entire route

is preserved for later analysis, as well as the fact that participants can presumably accomplish

the task through recognition of the route, rather than recall. Furthermore, another benefit of

this response method is that the researcher is able to control the granularity of the response

by changing the number of elements as required. In this case, a 10x10 grid was selected after

pilot testing grids with both fewer and more elements.

As far as it has been determined from a review of the literature on externalised cognitive

maps, the response method described here is a novel one, with a number of potential benefits

for researchers trying to evaluate participants’ performance in identifying route traversals.

3.4 Platform

The experimental platform for Experiment 1 consisted of a computer program developed

using Matlab and Psychtoolbox (Kleiner et al., 2011). All routes were generated in a virtual

environment in Google Sketchup and Google Earth, made up of a computer generated grass

terrain and a river running the length of the route5. Targets were also placed along the length

of the route.

First, the participant watched a flyover video in the Route Flyover window. Whenever a

target was believed to be present, the participant used the space button to pause the video,

which brought up the video mask. The participant used the mouse to click on the screen

where he believed the target to be located, after which the mask was removed and the video

resumed playing. For a description of another response method that was considered, namely

an N-alternative forced choice method, please see Appendix A10.1. At the end of each

video, the Route Identification window appeared, and the participant used the mouse to click

on a Route on the 10 x 10 response grid representing his best estimate of the route that had

just been traversed. After the participant confirmed his selection, the next experimental trial

began after a 10 sec break.

Route layout: The route consisted of three contiguous segments traversed in this order:

straight, curve, straight. Both the length of the curved sections and the combined length of

the straight sections were constant for all routes, with the combined straight segments being

1.5 times as long as the length of the curved section. Given a curved section length of Lc =

2.7 km, the combined length of the straight sections was Ls1+Ls2 = 1.5*2.7 = 4.05

5 Note that the scales and distances used in all three experiments were selected arbitrarily, within the default

settings of Google Sketchup. As such, they are not representative of distances used in aerial search tasks.

km and the total length of the route was L = Ls1+Ls2+Lc = 6.75 km. For each route the

radius of each curved section remained constant along its length.
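The route layout described above (straight, constant-radius curve, straight) can be sketched geometrically as follows. This is an illustration only, not the procedure used to build the routes in Google Sketchup. The function name, the choice of origin and initial heading, and the left-curving convention are all assumptions, and the lengths are passed in as parameters rather than fixed.

```python
import math


def route_vertices(length_ratio, radius_m, curve_len_m, straight_total_m):
    """Return the four (x, y) vertices of a straight-curve-straight route.

    length_ratio is Ls1 / Ls2. The route starts at the origin heading along
    +y and curves to the left (assumed conventions).
    """
    if length_ratio <= 0 or radius_m <= 0:
        raise ValueError("length_ratio and radius_m must be positive")
    ls2 = straight_total_m / (1.0 + length_ratio)
    ls1 = straight_total_m - ls2
    turn = curve_len_m / radius_m  # heading change over the arc, in radians
    p1 = (0.0, ls1)  # end of the first straight segment
    # Left-turning arc of constant radius; its centre lies at (-radius, ls1).
    p2 = (-radius_m * (1.0 - math.cos(turn)), ls1 + radius_m * math.sin(turn))
    # The second straight segment continues along the new heading.
    heading = (-math.sin(turn), math.cos(turn))
    p3 = (p2[0] + ls2 * heading[0], p2[1] + ls2 * heading[1])
    return [(0.0, 0.0), p1, p2, p3]
```

For example, `route_vertices(0.5, 1000.0, curve_len, 1.5 * curve_len)` gives a route whose first straight segment is half as long as its second, with the combined straight length fixed at 1.5 times the curve length as stated above.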

Grid layout: A 10 x 10 matrix of routes was developed as the response grid, shown in Figure

3.3(b). The radius of curvature of the elements of the grid ranged from 687.9 to 1.91 x 10^5

metres, while the Ls1/Ls2 ratios of the straight segments had a range of 0.1 to 10. Note that

corresponding right curving grids were shown after right curving routes had been flown

(assuming that participants would never make an overall right versus left curving error).
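A response grid of this kind can be generated programmatically. The text gives only the ranges of the two dimensions, so the sketch below assumes logarithmic spacing between the stated end points; this spacing, along with the function and variable names, is an assumption and may not match the values actually used. Rows index curvature and columns index the length ratio, following the description in Section 3.2.2.

```python
def log_spaced(lo, hi, n):
    """n values from lo to hi inclusive, evenly spaced on a log scale."""
    if n < 2 or lo <= 0 or hi <= 0:
        raise ValueError("need n >= 2 and positive end points")
    step = (hi / lo) ** (1.0 / (n - 1))
    return [lo * step ** i for i in range(n)]


# End points taken from the text; the spacing in between is assumed.
radii_m = log_spaced(687.9, 1.91e5, 10)
length_ratios = log_spaced(0.1, 10.0, 10)

# grid[row][col] holds the (Ls1/Ls2 ratio, radius of curvature) of one route.
grid = [[(ratio, radius) for ratio in length_ratios] for radius in radii_m]
```

Changing the third argument of `log_spaced` changes the granularity of the response grid, which is the property of the method highlighted in Section 3.3.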

(a) (b)

Figure 3.3 – Illustrations of (a) route elements, including the curved and straight portions, (b) Grid layout from

which the participants selected the route they flew over. The values for the length ratio and curvature radii are

included here for illustrative purposes; participants only saw the grid of routes.

Targets: For the signal detection task, the terrain was populated with stationary targets, all of

the same size, shape and image texture, as illustrated in Figure 3.1(a). The targets were

inserted at predefined locations to create target zones (or ‘events’). ‘Neutral zones’

containing no targets were added between events, with the size of each neutral zone being

twice that of the sFOV, to ensure that terrain from two events could not appear on the

display at any one time. These elements are shown in Figure 3.4.

The length of the events was also set at twice the size of the sFOV, and by extension, the

dFOV and mFOV displays cover the entire length of an event. Note that the events and the

neutral zones were invisible to the participant, as he observed only a continuous forested

terrain.

Figure 3.4 - Illustration of neutral zones and target zones in each flyover. Target zones did or did not contain a target,

while neutral zones did not contain targets. Note that the red box representing the FOV of the sFOV (travelling from

left to right) covers half the length of an event.

Flyover: Each flyover took approximately 1.5 minutes to complete. As the camera passed

over the terrain, each target took approximately 3 seconds to traverse the display in the sFOV

condition (and thus approximately 6 seconds for the mFOV and dFOV conditions). The

height remained fixed at 325 m over the surface, which was selected so that the river and

trees were visible at all times.

Camera elevation angle: In order to avoid the potential confounds of perspective

foreshortening described in Section 2.7.1, the camera was oriented pointing down towards

the terrain. Because the curves of the river were very gentle, the simulated aircraft (perhaps

somewhat unrealistically) perfectly followed the path of the river. The result was a top-down

view with a continuously changing track-up orientation. This meant that the world appeared

to rotate continuously, so that the direction of travel always remained tangential to the

river, with the general flow of the terrain being from the top to the bottom of the display.

3.5 Procedure

A fully within subjects experiment was performed with 9 graduate students from the

University of Toronto. All were 18-40 years of age, with normal or corrected-to-normal

vision. None reported having experience as a search and rescue operator. Only male

participants were recruited in order to avoid any confounding factors of inter-gender

differences in spatial awareness performance. This decision was based on frequent reports in

the literature that males in general are known to perform better than females in mental

rotation and other spatial tasks (Linn & Petersen, 1985; Halpern, 2000).

Participants were briefed on the scenario and the two experimental tasks. They were asked to

perform both tasks equally well. The participants conducted 3 training trials for each display

condition, with feedback provided on both tasks. In the target detection task, the experimenter

watched the training video alongside the participant, and made note of any missed targets or

False Alarms. During training participants were told that there could be any number of

targets in each route, but that only one target would be present at any given time. At the

conclusion of each training video, the experimenter replayed the video and pointed out the

missed targets. For the global task, the participant was shown the correct route on the

response grid after having made each of his selections.

After the training period, each participant completed a total of 24 experimental trials divided

into 2 blocks, with each block containing 3 sets (for each of the 3 Display conditions) of 4

randomised trials per set. Three pseudorandom sequences of the 6 sets (2 blocks x 3 sets per

block) were generated and distributed among the 9 participants, three participants per

sequence. Trials within each set were randomised, even though participants may have

received the same pseudorandom sequence of sets.
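The trial structure described above (2 blocks, each containing 3 sets of 4 trials) can be sketched as a schedule generator. The sketch is hypothetical and is not the script used in the experiment. It assumes that each block holds one set per display condition, that every route is shown once under every display condition, and that the eight routes of a display are split at random between its two sets; none of these details is stated explicitly in the text.

```python
import random

DISPLAYS = ("sFOV", "mFOV", "dFOV")


def make_schedule(routes, rng):
    """Return 24 trials as (block, display, route) tuples.

    Assumes one set per display in each block and a random split of the
    eight routes between the two sets of each display.
    """
    if len(routes) != 8:
        raise ValueError("expected eight routes")
    split = {}
    for display in DISPLAYS:
        shuffled = list(routes)
        rng.shuffle(shuffled)  # randomises trial order within each set
        split[display] = (shuffled[:4], shuffled[4:])
    schedule = []
    for block in range(2):
        order = list(DISPLAYS)
        rng.shuffle(order)  # pseudorandom order of the 3 sets in this block
        for display in order:
            for route in split[display][block]:
                schedule.append((block, display, route))
    return schedule
```

Calling `make_schedule(list(range(8)), random.Random(seed))` with three different seeds would give three set sequences of the kind distributed among the participants.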

Each trial consisted of identifying targets during a video flyover, and then selecting the route

that had just been flown over once the video was completed. The participants completed the entire

experiment within two hours, including 10 second breaks between trials, and a 3 min break

after every set of 4 trials. Participants were compensated $30 for their participation.

3.6 Experimental parameters

Display condition was manipulated as a within subject factor with 3 levels:

Single field of view (sFOV)

Double the size of the single field of view (dFOV)

Mosaic field of view (mFOV)

The single FOV (sFOV), shown in Figure 3.5(a), was selected as a baseline display

condition. The sFOV condition has a field of view of 60°, equivalent to a display size of 8 cm

by 10 cm, or 344 pixels by 424 pixels, corresponding to a simulated real-world area of

approximately 300 x 375 m viewed from a flyover height of 325 m. The mFOV, as shown in

Figure 3.5(b), was selected to be approximately double the size of the sFOV, as a reasonable

size increase to compare performance. The resulting size is equivalent to the dFOV condition

when the movement of the camera viewpoint is purely translational. In the implementation of

the image mosaicing algorithm used, the size of the mFOV is determined by the

number of overlapping frames that are composited into the image mosaic. For a camera

viewpoint travelling in the forward direction, the image mosaic grows in size as more images

are superimposed in the mosaic. Through trial and error testing, it was decided to use 7

frames in the mosaicing algorithm.
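The way in which a fixed number of recent frames determines the shape of the mosaic can be illustrated geometrically. The sketch below is not the mosaicing algorithm used in the study: a real implementation estimates the motion between frames from the image content and warps pixels, whereas this sketch takes the frame-to-frame motion as given and only computes where the centres of the last 7 frames fall relative to the newest frame. The function name and the motion representation are assumptions.

```python
import math


def recent_frame_centres(motions, n_frames=7):
    """Centres of the last n_frames frames in the newest frame's coordinates.

    motions is a list of (distance, turn) pairs giving the relative motion
    between successive frames: distance travelled and heading change (rad).
    """
    x = y = heading = 0.0
    poses = [(x, y, heading)]
    for distance, turn in motions:
        heading += turn
        x += -distance * math.sin(heading)
        y += distance * math.cos(heading)
        poses.append((x, y, heading))
    last_x, last_y, last_heading = poses[-1]
    # Rotate by minus the newest heading so the newest frame points along +y.
    c, s = math.cos(-last_heading), math.sin(-last_heading)
    centres = []
    for px, py, _ in poses[-n_frames:]:
        dx, dy = px - last_x, py - last_y
        centres.append((c * dx - s * dy, s * dx + c * dy))
    return centres
```

With purely translational motion, for example `recent_frame_centres([(10.0, 0.0)] * 10)`, the centres fall on a straight line and the mosaic outline matches a rectangular display of extended size. With a constant turn between frames, for example `[(10.0, 0.05)] * 10`, the older centres drift sideways, which corresponds to the curved shape that the mosaic traces on the display.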

(a) (b) (c)

Figure 3.5 - Screenshots of the three display conditions for Experiment 1, (a) single size: sFOV (b) mosaic: mFOV, (c)

double size: dFOV

The double size FOV (dFOV), shown in Figure 3.5(c), provides a display of size equal to the

mFOV, but without the unique shape properties of the mFOV, described in Section 2.7.1.

The reason for including the dFOV condition is so that any potential differences in

performance between sFOV and mFOV can be also evaluated in comparison with simply

using a display with double the size. In other words, it is conceivable that any mFOV

performance benefits may be a consequence simply of the larger FOV rather than from the

unique shape properties of the mosaic.

The dFOV was twice the size of the single field of view - equivalent to a display size of 8 cm

by 20 cm, or 344 pixels by 848 pixels. This corresponds with a simulated real-world area of

approximately 300 x 750 m viewed from a flyover height of 325 m.
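The quoted ground areas can be checked against the flyover height with a simple pinhole-camera calculation. The sketch below assumes a camera pointing straight down over flat terrain, and further assumes that the 60° field of view spans the longer (10 cm, 424 pixel) side of the sFOV display, since that reading reproduces the quoted 375 m. The function names are hypothetical.

```python
import math


def ground_footprint(height_m, fov_deg):
    """Ground distance spanned by a downward-looking camera with the given FOV."""
    return 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)


def metres_per_pixel(ground_m, pixels):
    """Ground distance represented by one display pixel."""
    return ground_m / pixels


sfov_long_side_m = ground_footprint(325.0, 60.0)  # about 375 m
sfov_scale = metres_per_pixel(375.0, 424)         # sFOV, long side
dfov_scale = metres_per_pixel(750.0, 848)         # dFOV, long side
```

The two scale values are equal, which is consistent with the statement that the dFOV shows twice the area at the same spatial resolution as the sFOV.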

It should be noted that, to eliminate one potential confound, the spatial resolution of the

dFOV was set to be equal to that of the sFOV, in spite of the fact that it was twice the

physical size of the sFOV. Due to the nature of the mosaicing algorithm, the mFOV display

had the same resolution as the other two displays, thus permitting a fair comparison for

which FOV size was the main factor.

For Experiment 1, eight routes, each comprising a different combination of two straight segments

and one curved segment, were chosen. The routes were treated as a

random factor.

3.7 Experimental Hypotheses

For the target detection task, it was hypothesised that the larger the amount of time that the

target remains within an extended FOV, as in the mFOV and dFOV conditions, the better the

target detection performance compared to the sFOV condition.

Concerning the route identification task, it was hypothesised that the larger size FOV in the

mFOV and dFOV conditions would result in better route identification performance

compared to that of the sFOV.

Furthermore, it was hypothesised that the unique shape properties in the image mosaic

(mFOV) would improve route identification performance, compared to the fixed shape in the

dFOV condition.

3.8 Results

3.8.1 Target detection

As mentioned earlier, the data for the target detection task comprises a recording of where

participants indicated with the cursor the targets were located. In order for the experimenter

to determine whether participants responded to the target or some other environmental

feature, a Matlab script was written to display the screenshot that was recorded at the

moment of pausing together with the location of the recorded mouse click. Because it was

reasonable to expect that participants would report the location with some error, the location

of the mouse click was represented by a box of 100 x 100 pixels, to provide some tolerance

for defining a correct detection. The experimenter visually assessed all reported target

detections to classify them as Hits, Misses, False Alarms, or Correct Rejections.
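The tolerance rule described above can be sketched as a simple test of whether a target lies inside the 100 x 100 pixel box centred on the mouse click. In the experiment the classification was made visually by the experimenter; the automated check below is only an analogue, and it assumes that each target can be represented by a single point in screenshot pixel coordinates. The function name is hypothetical.

```python
def click_hits_target(click_xy, target_xy, box_size=100):
    """Return True if the target point lies inside a square box of
    side box_size (pixels) centred on the recorded mouse click."""
    half = box_size / 2
    dx = abs(target_xy[0] - click_xy[0])
    dy = abs(target_xy[1] - click_xy[1])
    return dx <= half and dy <= half


# A target 40 pixels away on each axis falls inside the tolerance box;
# one 60 pixels away horizontally does not.
print(click_hits_target((200, 300), (240, 260)))
print(click_hits_target((200, 300), (260, 300)))
```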

The first observation was that there were no False Alarms among any of the participants; that

is, there were no incidents of reporting a target where there was no target. Given the effort

spent developing targets that blended in with the surrounding grass and tree features, and

thus were intended to be difficult to detect, it was expected that there would be some False

Alarms. One possible explanation is that the strength (d’) of the signals (targets) was larger

than anticipated. Another is that participants may have adopted a conservative response

criterion, resulting in fewer (i.e. no) False Alarms at the cost of more Misses. Overall, this

may have been a failure to create a classic signal detection task, and thus it stands to reason

that a signal detection theory analysis (calculating d’ and Beta) was not appropriate for these

data. A simpler measure of the proportion of correctly identified targets for each route was

thus used.
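A further practical obstacle to a signal detection analysis of such data can be illustrated with a short sketch: d' is computed from the z-transforms of the hit and false-alarm rates, and the z-transform is undefined for a rate of exactly zero unless some correction is applied. This is offered as a supplementary illustration under that standard definition, not as the reasoning used in the analysis; the function names are hypothetical.

```python
from statistics import NormalDist, StatisticsError


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


def proportion_detected(n_hits, n_targets):
    """The simpler measure: proportion of targets correctly identified."""
    return n_hits / n_targets


try:
    d_prime(0.75, 0.0)
except StatisticsError:
    print("d' is undefined when the false-alarm rate is exactly 0")

print(proportion_detected(6, 8))
```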

Figure 3.6 displays the target detection performance for the three display conditions,

collapsed over all nine participants. It suggests that overall, detection performance was

highest for the single size FOV (sFOV) compared to the other two display conditions.

Similarly, for the plots of each participant, shown in Figure 3.7, participants appeared to

perform better for sFOV in comparison to either mFOV or dFOV.

Statistical analyses were conducted to verify these observations. Because Experiment 1

included one independent variable (display condition) and two dependent variables (accuracy

in the target detection task and distance error in the route identification task), a repeated

measures MANOVA was performed. All SPSS output tables are presented in Appendix

A1.1.

Mauchly’s test indicated that the assumption of sphericity was not violated for the local

measure (χ²(2) = 0.67, p > 0.05) across the three display conditions. The one-way within-subject

ANOVA indicated a significant main effect of display condition on the target

detection measure (F(1.58,14.66) = 13.55, p = 0.001). The within-subjects contrasts verified

that there were significant differences between the sFOV and mFOV conditions (F(1,8) =

26.04, p = 0.001) and between the sFOV and dFOV conditions (F(1,8) = 20.90, p = 0.002).

All other contrasts were found to be not significant. These results are consistent with the

observations made in Figure 3.6, that the target detection performance was highest using the

single FOV condition.
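For readers unfamiliar with the test, the F ratio of a one-way within-subject ANOVA can be sketched as follows. The analysis reported here was run in SPSS; the code below is only a minimal, uncorrected version of the textbook computation, and the example scores are invented purely for illustration and are not data from the experiment.

```python
def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA, uncorrected for sphericity.

    scores: one row per participant, one column per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # condition x participant residual

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error


# Invented scores for 3 participants under 3 conditions.
example = [[8, 5, 6], [7, 5, 4], [9, 6, 4]]
print(rm_anova_oneway(example))
```

With nine participants and three conditions, the uncorrected degrees of freedom from this computation would be (2, 16). Fractional degrees of freedom generally indicate that the statistics package applied a sphericity correction.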

Figure 3.6 - Proportion of targets detected in Experiment 1 for each display condition, for all participants

Figure 3.7 - Proportion of targets detected in Experiment 1 for each display condition, for each participant

[Plots for Figures 3.6 and 3.7. Panel titles: "Exp1: Local task performance for three display conditions, over all participants" (one panel) and "Exp1: Local task performance for three display conditions, for each participant" (nine panels). x-axis: Display condition (sFOV, mFOV, dFOV); y-axis: Proportion of targets detected (0 to 1).]

3.8.2 Route identification

The data recorded from the route identification task came in the form of the routes selected

by the participant after watching each flyover video. Figure 3.8 shows an example of a

response from one participant, in the form of a route selected, together with the Correct

Route. The challenge for analysing such data was to develop a measure of performance,

based on participants’ selections relative to the correct response for each flyover.

Figure 3.8 - Example of Route Selected by a participant, and the Correct Route. The measures of Euclidean and City

block distance are also shown.

Note that different routes on the grid are represented by changes in curvature in the

horizontal direction and changes in straight line (Ls1/Ls2) ratios in the vertical direction.

Consequently, the distance between the participant’s response and the correct response was

defined as a measure of the participant’s error for this task, with lower Distance error

corresponding to better route identification performance. Because errors along the horizontal

and vertical axes were weighted equally, this Distance error measure should be unbiased

along either dimension.

The Distance error was computed first as the Euclidean distance – the hypotenuse of the right

triangle formed by the vertical and horizontal errors.6 For each participant, the Euclidean

distance scores were averaged across 8 trials, for each of the three display conditions, as

shown in Figure 3.10. Although the error in the global task appeared to be lower for the

mFOV and dFOV conditions for some participants, collectively the performance plots for the

global task did not reveal any clear differences between the three displays. In addition, the

aggregate Euclidean distance error plot of route identification performance data collapsed

over all participants in Figure 3.9 suggests that the mosaiced field of view provided no

advantage for correctly identifying the flyover route.

Mauchly’s test indicated that the assumption of sphericity was not violated for the global

measure (χ²(2) = 2.19, p > 0.05) across the three display conditions. The one-way within-subject

ANOVA (display condition) indicated a non-significant display main effect on the

global measure (F(1.83,14.66) = 0.622, p > 0.05), which was consistent with the graphical

results.

Figure 3.9 - Graph showing the Euclidean distance error over all participants for Route identification in Exp. 1

6 Note that units of Distance error computed here are unrelated to physical distances in the simulated world. The difference

between adjacent objects on the grid was assumed to represent an equal change of perceived curvature and length ratio for

horizontal and vertical dimensions, respectively.

[Plot for Figure 3.9. Panel title: "Exp1: Global task performance ("Euclidean distance"), over all participants". x-axis: Display condition (sFOV, mFOV, dFOV); y-axis: "Eucl. dist." error (1 to 5).]

Figure 3.10 - Graph showing the Euclidean distance error, for each participant for Route identification in Exp. 1

After examining the participants’ responses plotted on the grid, it was observed in some trials

that errors along one dimension (curvature or length ratio) were much larger than in the

other. Because the “Euclidean distance” error involves squaring both terms, the error value

skews toward the larger of the two terms. Therefore, a “city block” analysis was also

conducted, where, as illustrated in Figure 3.8, the distance error was represented by the sum

of the errors along both grid dimensions. The goal here was to dampen the influence of any

individual dimension for trials in which large differences exist between the two dimensions.

However, the computed City-block distance error yielded results that were similar to the

Euclidean distance error; that is, no difference in global task performance was found between

the three display conditions.
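The two Distance error measures described above can be sketched as follows, treating each response and each Correct Route as a (column, row) position on the response grid. The function names and the example positions are hypothetical.

```python
import math


def euclidean_error(selected, correct):
    """Hypotenuse of the right triangle formed by the two grid errors."""
    dx = selected[0] - correct[0]
    dy = selected[1] - correct[1]
    return math.hypot(dx, dy)


def city_block_error(selected, correct):
    """Sum of the absolute errors along the two grid dimensions."""
    return abs(selected[0] - correct[0]) + abs(selected[1] - correct[1])


# Hypothetical response: 3 grid steps off in curvature, 4 in length ratio.
print(euclidean_error((6, 8), (3, 4)))
print(city_block_error((6, 8), (3, 4)))
```

Two responses with the same city-block error can have different Euclidean errors. An error of 4 steps along a single dimension gives a Euclidean error of 4, whereas 2 steps along each dimension gives about 2.83, which is the sense in which the Euclidean measure is dominated by the larger term.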

3.9 Discussion

Based on the results from Experiment 1, none of the hypotheses for the two spatial awareness

tasks were supported, as performance was not found to be higher in the mosaiced display

condition for either the global or target detection task. Performance was also not higher in

either task for the double FOV relative to the single FOV case.

[Plots for Figure 3.10. Panel title: "Exp1: Global task performance ("Euclidean distance"), for each participant" (nine panels). x-axis: Display condition (sFOV, mFOV, dFOV); y-axis: "Eucl. dist." error (1 to 5).]

3.9.1 Route identification results

The route identification task was developed to provide insight into the potential use of

mosaicing as a means of enhancing global spatial awareness. It was surmised that appending

recently viewed information to present information would aid participants in updating their

cognitive map of the environment to come up with an integrated image by the end of the trial.

Thus it was hypothesised that, compared to a single FOV size, using the image mosaic

display would on average result in smaller errors when identifying the route within a

rectangular grid of routes. The results in Figure 3.9 and Figure 3.10 suggest that overall, the

participants were able to identify the correct Route with an average error of between 2 and 3

“Euclidean” units of distance. Taking into consideration that there were 100 possible

responses presented in the 10x10 grid, as shown in Figure 3.8, performance was deemed to

be quite good overall. In fact, the average error in the sFOV condition was smaller than

expected; it was surmised that a flyover time of 1 minute and 30 seconds and the demands of

the two tasks would render the task rather difficult, leading to a larger average error in the

sFOV condition than what was observed. However, no differences were found in the route

identification task between the three display conditions. An examination of the experimental

procedure may offer some explanations for this.

Although the routes were designed so that participants would form a cognitive map of the

flyover, the simplicity of the routes may not have offered enough of a challenge for

participants to take advantage of the potential benefits of the mosaiced display. For example,

because visual cues for understanding the shape of a straight segment are not necessarily

diminished by a smaller field of view, it was, admittedly in hindsight, perhaps not

reasonable to expect any differences for those portions of the route.

However, for the curved portion of the route, the larger displays should provide a wider

portion of the curve from which the participant can more easily extract the degree of

curvature. (Note that this same effect could also be achieved for a relatively wide FOV but at

a low altitude, which will become important in a later discussion.) Thus, with more

information presented from an extended FOV, it was expected that differences would be

observed relative to the single FOV. Perhaps, in hindsight, the length of this portion of the

route was sufficiently long such that, even in the sFOV condition, the degree of curvature

could be easily determined over the duration of the flight over the curved portion. In

particular, the curved portion maintained a constant curvature throughout, perhaps lending

itself to no benefit for the mFOV and dFOV conditions when identifying the correct route.

3.9.2 Target detection results

Concerning the target detection task, the participants unexpectedly performed with higher

accuracy when using the single (smaller) FOV. Referring to Figure 3.5, it is surmised that

two factors could have played a role here: scanning coverage and the inclusion of more

clutter in the larger displays. First, note that each target would take approximately 3 seconds

to cross the display in the sFOV condition, and by extension approximately 6 seconds in the

mFOV and dFOV conditions, which were twice as long. Given that no instructions were

given that might have prescribed any particular scanning behaviour, it is conceivable that

with the extended FOV conditions, participants may have returned to areas they had

previously scanned without detecting a target, while neglecting other areas in their visual

field. In other words, within the context of the display conditions, it is possible that the

additional time available for revisiting previously scanned areas was not appropriately

exploited.

Furthermore, the presence of distractors in the visual field, in the form of trees and

background texture, could have contributed to the comparatively higher performance in the

sFOV condition. For a display with twice the size of the sFOV at the same spatial resolution,

the number of distractors present in the visual field at any one time is also doubled.

Combined with a suboptimal scanning pattern, it was surmised that these confounds for

viewing the terrain with an enlarged FOV outweighed the potential benefit of additional time

in detecting targets.

3.9.3 Comparison of results to literature

Although the target detection results were surprising, they are not unprecedented. As

discussed in Section 2.7.2, Crebolder et al. (2003) conducted a survey of Defence Research

and Development Canada (DRDC) research on the effect of Display size on target detection

performance, concluding that results are task dependent and that an “intermediate” size FOV

offers the best performance. In other words, adopting a larger FOV size did not necessarily

result in better performance. Taking this work into consideration, as well as the number of

target sizes, shapes and textures that were pilot tested to avoid “floor” and “ceiling” effects in

the target detection task, perhaps it was not unreasonable to find that performance was better

in the sFOV.

Experiment 1 was motivated by the results of Morse et al. (2008), where a significant

improvement in the number of detected targets was found using a mosaicing display.

Although the results of Experiment 1 appear to contradict those of

Morse et al., there is one important difference in the targets’ features that may explain the

discrepancy. Morse et al. (2008) reported that red umbrellas were used as targets. It is

presumed from the screenshots provided in their paper, as well as from the fact that no

explicit distractors were used, that the targets were the only red coloured features in the

environment. Thus their targets could be identified along a single salient dimension of

colour, making the task amenable to parallel search (Wickens and Hollands, 1999). In the

case of parallel search, search times for identifying targets have been shown to be relatively

unaffected by the number of distractors (or, equivalently, by an extension of the FOV), which

corresponds to the mosaic FOV used by Morse et al. (2008).

In the case of the present study, the targets were selected to have characteristics similar to

those of the surrounding environment. In order to make the target search moderately difficult,

the targets shared similar shape, size and colour with the surrounding trees and grassy terrain,

as shown in Figure 3.1(b). Because the target could not be identified along a single salient

dimension, serial search along multiple dimensions was assumed to be required, where each

object must be scanned before moving on to the next, to identify the target among the

distractors. Thus, with more distractors appearing in the extended FOV conditions at any one

time, participants perhaps adopted a serial search strategy in a larger search area, leading to

worse performance compared to the smaller FOV.

3.9.4 Synthesis

Taken together, it was perhaps possible for participants to accomplish the global task of

identifying the route without continuously updating their cognitive maps. For example, rather

than spatially integrate the information from successive views of the display, participants

could have simply memorised the time taken to traverse the straight parts. The single curve in

the route could have been determined relatively early in the curved portion, since it followed

a constant curvature throughout. Thus, instead of forming a cognitive map, the route could

have been reconstructed by combining three relatively simple judgments and then matching

the route to the corresponding shapes on the response grid.

Furthermore, although participants were asked to perform the detection and route

identification tasks “equally well”, the results suggested that, perhaps due to the actual

relative difficulties of the two tasks, the participants focused more on the target detection task

than the route identification task. Because the route identification task could be accomplished

relatively easily without continuous updating of one’s cognitive map, perhaps more attention

was placed on the detection task. Thus both tasks were too easy, allowing the tasks to be

accomplished without the benefits of an extended FOV, and hence the highest overall target

detection accuracy was found in the sFOV condition, where the smallest amount of area

needed to be searched.

In summary, this investigation into spatial task performance using three display conditions

yielded results that did not support the hypotheses predicting better performance for a

mosaiced FOV display. It was observed that for relatively simple routes, comprising portions

of two straight lines plus a constant curvature, performance using the extended FOV afforded

by either the mosaic or enlarged FOV was worse in the case of target detection and resulted

in no difference in the case of route identification. A thorough examination of the

experimental hypotheses, platform and results revealed the potential for a number of changes

in the procedure required to test the hypotheses.


4.1 Introduction

With the lessons learned from Experiment 1, it was decided to focus next on the benefits of

mosaicing for enhancing global spatial awareness, by excluding the target detection task and

thereby any confounds of dual task performance. To that end, a number of changes were

made to the simulated environment, as well as the viewpoint parameters of the virtual camera

providing a view of the terrain. First and foremost, it was decided that the most effective way

to test whether mosaicing has the potential to enhance route identification performance was

to devise a much more challenging global mental mapping task, one that encouraged

participants to sample the displayed visual information continuously. The relatively simple

routes in Experiment 1 were thus replaced by more complex winding routes that were not

predictable, thus requiring constant attention to accomplish the route identification task.

A single continuous shape consisting of a sum of four sinusoids was generated, representing

a long winding river of path length 18750m, contained in a forested area. This was inspired

by the naturalistic shapes of rivers often found in aerial landscape photography, such as that

shown in Figure 4.1(a). A number of segments within the long river were designated for the

purpose of the experiment as Correct Routes. Each of the routes had a different shape and

each had the same path length, 1500m. The amplitudes, frequencies and phase shifts of the

four sinusoids were adjusted in order to generate, through trial and error, a set of routes with

sufficient variety that the details of the entire set of routes could not reasonably be

memorised. Details of the sinusoid parameters are given in Appendix 2.
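The construction described above can be sketched as follows. The sketch assumes that the river centreline is a lateral offset expressed as a function of along-track position, which is only one way of realising a sum of four sinusoids. The amplitudes, frequencies and phase shifts below are invented for illustration; the values actually used are those given in the appendix.

```python
import math

# Hypothetical components: (amplitude in m, frequency in cycles/m, phase in rad).
SINUSOIDS = [
    (400.0, 1 / 6000.0, 0.0),
    (250.0, 1 / 2500.0, 1.0),
    (120.0, 1 / 1100.0, 2.0),
    (60.0, 1 / 500.0, 0.5),
]


def river_offset(x, components=SINUSOIDS):
    """Lateral offset of the river centreline at along-track position x."""
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in components)


def sample_river(x_max, step=5.0):
    """Sample the centreline as (x, y) points at regular spacing in x."""
    n = int(x_max / step) + 1
    return [(i * step, river_offset(i * step)) for i in range(n)]


def path_length(points):
    """Length of the polyline through the sampled points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def segment_of_length(points, start_index, length):
    """Consecutive samples from start_index covering at least `length` of path."""
    segment = [points[start_index]]
    travelled = 0.0
    for p in points[start_index + 1:]:
        travelled += math.dist(segment[-1], p)
        segment.append(p)
        if travelled >= length:
            break
    return segment


river = sample_river(10000.0)
route = segment_of_length(river, 0, 1500.0)
print(len(route), round(path_length(route), 1))
```

Cutting routes by path length rather than by straight-line extent is what keeps every route the same length regardless of how tightly it winds.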

In order to further elicit differences between the various viewing conditions, a presumably

more difficult camera elevation angle was used in Experiment 2. As described in Section

2.7.1, distortions caused by perspective foreshortening may lead to encoding errors in

cognitive mapping. Thus, the camera elevation angle was fixed at a forward facing 45 degree

angle, instead of the top down view used in Experiment 1.

It was surmised that a crucial factor in performing the spatial awareness task would lie in the

height above the terrain. Figure 4.2 shows a particular segment of the long river, displayed

with the same 60° FOV, but presented at four different heights. Flying at a higher altitude

allows the observer to view a larger portion of the terrain below, which should in principle

afford better understanding of the global shape of the flown over route. In fact, one can

imagine the extreme case where the camera viewpoint is placed at such a high altitude that

the entire route is shown in the camera’s FOV, making the task of identifying the route

essentially trivial. What was not clear for the present environmental features and camera

parameters, however, was at what altitude global performance improvements might begin to

saturate. In other words, logic suggested that a point may be reached at which increasing the

altitude has no further benefit for identifying the route.
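The dependence of visible terrain on height can be illustrated with a simple geometric sketch for a camera pitched 45 degrees below the horizon. The sketch assumes flat terrain and assumes that the 60° FOV is measured vertically; neither assumption is confirmed in this section, so the numbers should be read as indicative only. The function name is hypothetical.

```python
import math


def ground_footprint(height_m, pitch_deg=45.0, vertical_fov_deg=60.0):
    """Near and far ground distances (m ahead of the camera) that are visible.

    Assumptions: flat terrain, and a FOV angle measured vertically and
    centred on a line of sight pitched pitch_deg below the horizon.
    """
    half = vertical_fov_deg / 2
    near_angle = math.radians(pitch_deg + half)  # steepest ray, closest ground
    far_angle = math.radians(pitch_deg - half)   # shallowest ray, farthest ground
    if far_angle <= 0:
        raise ValueError("upper edge of the FOV is at or above the horizon")
    near = height_m / math.tan(near_angle)
    far = height_m / math.tan(far_angle)
    return near, far


for h in (20, 56, 92, 164):
    near, far = ground_footprint(h)
    print(f"H = {h:>3} m: ground visible from {near:6.1f} m to {far:6.1f} m ahead")
```

Under these assumptions the visible extent grows in direct proportion to height, so the geometry alone does not predict where performance should level off; that question is left to the experiment.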

Figure 4.1 – (a) Winding river landscape; (b) analogous computer generated ‘river’, consisting of a sum of four sinusoids.

Figure 4.2 - Screenshots of one terrain segment, with constant 60° FOV, displayed at four heights, (a) H1 = 20m, (b)

H2 = 56m, (c) H3 = 92m, (d) H4 = 164m

In relation to the topic of mosaicing, recall that one of the anticipated benefits of the enlarged

FOV is that more of the terrain can be seen at any one time, an effect surmised to be

analogous to an increase in height. Therefore, with regard to our goal of determining

whether such benefits actually exist, it was important to avoid a situation for which the route

was so easily identifiable at a selected height in the sFOV condition that extending the

effective FOV by introducing the mFOV and dFOV conditions would have no measurable

benefit. The primary goal of the present experiment, therefore, was to determine empirically

the effect of the camera’s height above the terrain on participants’ performance in the route

identification task, using only the single FOV display condition. This experiment was thus

considered as a ‘calibration experiment’, in that the results would be used to select an

appropriate height for which to test the mosaiced FOV in the next experiment.

The hypothesis for the present investigation, therefore, was that as height is increased,

performance in identifying the traversed route should improve.

4.2 Experimental task

The experimental task for Experiment 2 was similar to the route identification task performed

in Experiment 1, but without the target detection task. The participant was first shown a 20

sec flyover video of an out-the-window view of an aircraft flying above the terrain along a

predetermined flight path. After completion of the video, the participant was asked to

identify which route he flew over. One experimental trial consisted of watching one video

and providing the Route identification. An important distinction from Experiment 1 was the

method of responding, explained below.

The experimental platform consisted of a computer program developed using Matlab. All

routes were generated in a virtual environment created in Google Sketchup and Google

Earth, made up of a computer generated grass terrain and a river running the length of the

route. Figure 4.3 illustrates six separate Routes selected from the long river, representing the

Correct Routes for Experiment 2. Each trial consisted of an overflight of one of the six

Routes, while the participant watched the flyover video in the Route Flyover window, similar

to those shown in Figure 4.2.

Figure 4.3 - Display of the six Routes selected for Experiment 2, chosen from the long continuous river (Left). Routes

on right show start of each Route with a green marker and end of each Route with a red marker.

At the end of each video, the Route Selection window appeared, as shown in Figure 4.4, and

the participant used the mouse to interactively indicate the route he believed he had just

flown over. The Route Selection window showed on the left side a top-down view of the

entire river, with green and red markers designating the respective start and end points of the

currently selected route. The participant clicked on the buttons in the centre to shift the start

and end markers together7 in either direction along the length of the river. The window on the

right showed a magnified version of the currently selected (highlighted) route resulting from

the button clicks.

Although there were theoretically infinitely many start/stop positions that could have

been chosen along the continuous long river, the river was in fact divided into a total of 460

discrete equal length routes. The single arrow control buttons were used to displace the

markers along one segment at a time, while the double arrow control buttons moved the

markers more coarsely, 20 routes at a time. (Because of the large number of discrete

segments, and because the boundaries of those segments were not visible to the participants,

it is believed that they were not in fact aware that their inputs were not continuous.) When

the participant was content with his selection, he clicked on the Submit button at the bottom

right to enter that selection8. For a description of an alternative response method considered

in Experiment 2, combining the response grid from Experiment 1 with the complex routes,

please refer to Appendix A10.2.
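The one-dimensional response method described above can be sketched as an index into the 460 discrete routes that is shifted by the fine and coarse buttons. How the platform behaved at the two ends of the river is not stated; the clamping used below is an assumption, and all names are hypothetical.

```python
N_ROUTES = 460     # discrete equal-length routes along the river
FINE_STEP = 1      # single arrow buttons
COARSE_STEP = 20   # double arrow buttons


def move_selection(index, step, n_routes=N_ROUTES):
    """Shift the selected route index by step.

    Assumption: the selection is clamped at the first and last route
    rather than wrapping around.
    """
    return max(0, min(n_routes - 1, index + step))


selection = 100
selection = move_selection(selection, +COARSE_STEP)  # double arrow forward
selection = move_selection(selection, -FINE_STEP)    # single arrow back
print(selection)
```

Because the start and end markers move together, this single index fully describes a response.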

7 Because all Routes had a common fixed length, only one degree of freedom was necessary for manipulating one’s

response; consequently the green and red dots moved together as participants clicked on the response buttons.

8 Although the form of the response has changed from the 10x10 grid used in Experiment 1, the route selection method

exhibits many of the same characteristics, despite the absence of an actual grid. Instead of 10x10=100 choices, there are now

460 choices, placed on a 1-dimensional response layout. This should make the task arguably easier for participants selecting

a route. Furthermore, the shape of the route is still recorded as in the grid method used in Experiment 1.

Figure 4.4 - Screenshot of Route identification Window. Left: top-down view of entire river. Centre: response

buttons, for controlling response; Right: instantaneous indication of selected route. Green and red markers indicate

respective start and end points of currently selected route.

A fully within subjects experiment was conducted by recruiting seven male participants from

the University of Toronto. All participants had normal or corrected-to-normal vision, and ranged in age

from 18 to 40. None reported having had prior experience with aerial search and rescue type

tasks.

Following an explanation and two practice trials to become familiar with the platform,

participants performed 6 training trials, each at a different height and for a different Route.

The participants viewed each of the six randomised routes at each of the four heights in

Figure 4.2: H = {20, 56, 92, 164} metres. There were also two heights, 128 and 200 m, used

in the practice session that were not used during the experiment. After each selection, the

participants were shown the correct Route.

During the data gathering phase, participants completed a total of 48 experimental trials

divided into 2 blocks, with each block containing 4 sets (for each of the 4 Heights) of 6

randomised trials (for each of the 6 Correct routes) per set. Two pseudorandom sequences of

the 8 sets (2 blocks × 4 sets per block) were generated and distributed among the seven

participants: one sequence was given to 4 participants, while the

other was given to 3 participants. Trials within each set were randomised, even though

participants may have received the same pseudorandom sequence of sets. A break of at least

3 seconds was given between trials (participants could extend this if desired), and a two

minute break was enforced in between blocks.
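The trial structure described above (2 blocks, each with one set per Height, each set containing the six Routes in random order) can be sketched as follows. The separation into one seed for the order of sets and another for the order of trials mirrors the description that some participants shared a sequence of sets while trials within sets were randomised individually. The function name and seed values are hypothetical, and the sketch assumes that each block contains every Height exactly once.

```python
import random

HEIGHTS_M = [20, 56, 92, 164]
ROUTES = [1, 2, 3, 4, 5, 6]


def make_schedule(set_order_seed, trial_seed, n_blocks=2):
    """Return a list of (block, height_m, route) tuples, one per trial."""
    set_rng = random.Random(set_order_seed)  # shared by participants given the same set sequence
    trial_rng = random.Random(trial_seed)    # specific to one participant
    schedule = []
    for block in range(1, n_blocks + 1):
        heights = HEIGHTS_M[:]
        set_rng.shuffle(heights)             # order of the four sets in this block
        for height in heights:
            routes = ROUTES[:]
            trial_rng.shuffle(routes)        # order of the six trials in this set
            schedule.extend((block, height, route) for route in routes)
    return schedule


schedule = make_schedule(set_order_seed=1, trial_seed=10)
print(len(schedule))
```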

4.3 Results of Analysis

The selected routes were collected for all participants and were plotted for each Height along

with the corresponding Correct Routes. The sets of six graphs for Height 2 (56m) and Height

3 (92m) are given in Figure 4.5. The complete set of graphs for all four heights can be found

in Appendix A3.1. Each plot contained 14 selected routes in black ink (two for each of the

seven participants), as well as one Correct Route in red ink representing one of the six

Correct Routes. Note that the red Correct Routes (1 to 6) are identical in the two sets of

graphs, due to the fact that it was only the Heights that were varied.

(a) Routes selected at H2 (b) Routes selected at H3

Figure 4.5 - Examples of ensembles of selected Routes collapsed over all participants at (a) Height H2, (b) Height H3.

Each plot contains 14 Routes in black ink (two for each of the seven participants), as well as one Route in dashed red

ink representing the correct Route. The routes are translated so that their starting points coincide, while maintaining

the original North up representation (as seen in the Route identification window).

4.3.1 Challenge of Defining Objective Scoring Method

Given the variety of route selections shown in Figure 4.5, one challenge is that of defining an

objective scoring method for evaluating performance. One class of performance measures

involves a purely computational approach, where routes are broken down into constituent

parts and then objectively compared to come up with an overall measure of error. To

demonstrate how one might apply an objective quantifiable metric to assess differences

between routes, Figure 4.6 presents two routes from the long river with slightly differing start

and end points, calling one the Correct Route and the other the selected route.

Using the conventional root mean square error (RMSE) metric, one could simply sample N

points along each of the curves (CRcorrect and CRselected), define an appropriate distance score

between corresponding samples, and then compute an aggregated error over all points

between the curves by the equation:

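$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(CR_{\mathrm{correct},i} - CR_{\mathrm{selected},i}\right)^{2}}.$$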

The RMSE method is a standard practice in a number of domains, and served as a starting

point for comparing the route identifications with the Correct route. In the present context,

RMSE = 0 if the selected Route matches the Correct route, growing as differences between

samples along the two routes increase. Although this formula seems at first glance quite

straightforward, and thus appropriate for evaluating the accuracy of selected routes, it quickly

became clear that neither this RMS error measure nor most other ‘obvious’ computational

metrics would necessarily capture the real extent of errors for this particular route selection

task. The challenge posed in analysing the results of the present experiment involved how not

to inflate the error scores for route selections for which the errors were arguably in fact not

very large. The following are three examples to illustrate the inappropriateness of using RMS

error measures to score route selections used in Experiment 2.

The left side of Figure 4.6 depicts an error where the two segments lie very close to each

other on the long river, but with a slight offset. Because this discrepancy seems rather small,

one would intuitively expect a relatively low RMS error score to be produced by the formula

above. Regarding the right side of Figure 4.6, however, where two routes are presented with

a common starting point, as would be the case if the RMSE equation were to be applied, the


source of error between the two Routes lies in how far along the Route one travelled before

approaching the first curve to the left. In this case, the left turn occurs earlier in the trajectory,

as if the turn were shifted along the curve. Even though some elements of the flyover were

judged correctly, the RMSE score would still accumulate all of the differences in positions

along the Routes, potentially leading to inflated RMSE scores for curves that are shifted

relative to each other.
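The inflation described above can be illustrated with a minimal sketch. The sine-shaped route, the sample count and the 0.5 offset are hypothetical stand-ins chosen for illustration; they are not the long river or the routes used in Experiment 2.

```python
import math

def rmse(route_a, route_b):
    """Root mean square of the point-to-point Euclidean distances
    between two routes sampled at the same number of points."""
    if len(route_a) != len(route_b):
        raise ValueError("routes must be sampled at the same number of points")
    squared = [(xa - xb) ** 2 + (ya - yb) ** 2
               for (xa, ya), (xb, yb) in zip(route_a, route_b)]
    return math.sqrt(sum(squared) / len(squared))

def sample_route(start, n=200, length=4 * math.pi):
    """Hypothetical river-like route: a sine curve sampled at n points,
    beginning at parameter `start` and translated to start at the origin."""
    xs = [start + length * i / (n - 1) for i in range(n)]
    points = [(x, math.sin(x)) for x in xs]
    x0, y0 = points[0]
    return [(x - x0, y - y0) for x, y in points]

correct = sample_route(0.0)
shifted = sample_route(0.5)   # same shape, start point offset along the curve

rmse_same = rmse(correct, correct)    # identical routes give zero error
rmse_shift = rmse(correct, shifted)   # a small offset accumulates along the route
print(rmse_same, rmse_shift)
```

Although the two routes trace the same underlying curve, the offset start point makes every later sample contribute to the error, which is the behaviour the text argues against.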

Figure 4.6 – An illustration of a Correct Route (in red) and a selected route (in black). The resulting RMSE score

between these two routes would be large, despite the fact that the shapes are quite similar.

Another type of error occurs where the overall shape of the selected route tracks the Correct

route except for a small number of deviations somewhere along the trajectory. This is shown

in Figure 4.7, where the routes exhibit similar shapes despite being at different areas along

the long river. The only discrepancy lies in the relative lengths of the beginning and ending

segments. For the present example, the error in judging the first curve would carry over

throughout the rest of the error computation, causing the RMSE to become inflated in

spite of only small differences in the shapes of the two Routes.


Figure 4.7 - An illustration of a Correct Route (in red) and a selected route (in black), where small deviations in the

route occur between the two routes. The resulting RMSE score between these two routes would be large, despite the

fact that the overall shapes are quite similar.

Furthermore, given the complexity of the routes found in the long river, it was possible for

the participant to select a route with similar overall shape to the Correct Route, but simply

mirrored along the vertical axis, as shown in Figure 4.8. In other words, the shape was

correct, except for reversing the direction of the turns. Clearly the RMSE value describing

the magnitude difference between the two routes would be high, leading one to conclude that

the routes are very dissimilar. However, taking into consideration the demands of the flyover

task, I believe that the two routes are actually similar, with the important difference being the

turn directions. That is, the reader is reminded that the flyover videos are presented from a

track-up perspective without a canonical North, meaning that in sections where the path is

straight, it is impossible to know in which world-referenced direction one is travelling. In

other words, upon the basis of only watching the video, flying along an Eastbound direction

along a straight path is indistinguishable from flying along a Westbound direction along a

straight path. As such, a simple reversal of the turn directions can lead to the participant

selecting a mirrored route. In the Correct Route the first turn occurs to the left, whereas the

first turn occurs to the right in the selected route. Each subsequent turn direction is also

reversed, and even though the sequence of straight and curved portions is more or less


correct, as well as their relative proportions, the reversed turn directions lead to the mirrored

shape being selected.

Figure 4.8 - An illustration of a Correct Route (in red) and a selected route (in black), that exhibit similar shapes but

that are mirrored with respect to each other.

After considering these three examples of errors in the participants’ route selections, I

endeavoured to classify different types of errors based on the route identifications made by

the participants. Four types are identified, with examples of each type shown in Figure 4.9:

Translation errors: An error in which the overall shape was correctly identified, but

where the start point differed from the Correct route by a large amount. The

occurrence of this type of error was a consequence of the fact that the underlying long

river was a quasi-random signal, generated by the sum of a series of sine waves,

which resulted in similar (but not identical) patterns occurring along its length.

Phase shift errors: An error in which the start point of the selected route was close to

the Correct route except for some (small) offset. This type of error reflects a

participant’s having recalled the route very well, but with only a relatively small error

in recalling its start (or end) point. The result is that there is a slight offset between

the correct and designated routes, analogous to (in Engineering terms) the two routes

being “out of phase” with each other. Figure 4.6 is an example of this type of error.


Partial matching errors: An error in which most, but not all, of the overflown route

was recalled correctly, resulting in the choice of a route segment containing a small

number of deviations along the trajectory. An example of a partial matching error is

shown in Figure 4.7.

Mirroring errors: An error in which the relative distances were tracked well, but

with reversals of the left and right turns. As mentioned earlier, Figure 4.8 shows an

example of mirroring error, where even the steepness of the turns was tracked

adequately, but not the direction of the turns.

Figure 4.9 – Examples of four types of errors observed in route selections (black dotted line) compared to Correct

route (red line), (a) Translation error, (b) Phase shift error, (c) Partial matching error, (d) Mirroring error.

It became clear that any scoring method would have to adequately characterise these

different types of errors, given the cognitive challenges in accomplishing the task. Applying

an RMSE score to these four types of errors produces mixed results. Consider Figure 4.10,

which shows the routes in Figure 4.9 with the starting points aligned. In the case of a

translation error (Figure 4.10(a)), an RMSE score would appropriately score the error as low

(i.e. RMSE = 0), and thus, an RMSE score might be sufficient. However, it still remains that

for the other 3 error types (phase shift, partial matching and mirroring), RMSE would be

artificially inflated, as errors are carried over throughout the lengths of the trajectories, even

if the shapes are considered similar.
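The contrast between a translation error and a mirroring error can be sketched as follows, with starting points aligned before scoring. The route, the translation offsets and the helper names are hypothetical; mirroring is modelled here simply as a reversal of the lateral deviations, which is an assumption about how a left/right reversal would appear.

```python
import math

def rmse(route_a, route_b):
    """Root mean square of point-to-point distances between equally sampled routes."""
    squared = [(xa - xb) ** 2 + (ya - yb) ** 2
               for (xa, ya), (xb, yb) in zip(route_a, route_b)]
    return math.sqrt(sum(squared) / len(squared))

def align_start(route):
    """Translate a route so that its first point lies at the origin."""
    x0, y0 = route[0]
    return [(x - x0, y - y0) for x, y in route]

# Hypothetical route with alternating left and right bends.
correct = [(i * 0.1, math.sin(i * 0.1)) for i in range(100)]
translated = [(x + 50.0, y - 20.0) for x, y in correct]   # same shape, elsewhere
mirrored = [(x, -y) for x, y in correct]                  # turn directions reversed

rmse_translated = rmse(align_start(correct), align_start(translated))
rmse_mirrored = rmse(align_start(correct), align_start(mirrored))
print(rmse_translated, rmse_mirrored)
```

Once the start points coincide, the translated route scores essentially zero, whereas the mirrored route is penalised along its whole length even though the sequence and proportions of its segments are unchanged.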


Figure 4.10 - Illustration of four types of errors observed in route selections (black dotted line) compared to Correct

route (red line), shown with starting points matching for (a) Translation error, (b) Phase shift error, (c) Partial

matching error, (d) Mirroring error.

Taking into account the kinds of errors described above, as well as the inability of most

RMSE measures⁹ to adequately capture the errors observed, the use of purely computational

methods was abandoned. With no viable alternatives in using objective measures, I turned to

a subjective method of evaluation, namely Thurstone’s Method of Paired Comparisons to

evaluate the Route selections in Experiment 2.

4.3.2 Paired Comparisons Method

The paired comparisons method (PCM) is one of several well-known scaling methods for

comparing object attributes (Dunn-Rankin et al., 2004). Thurstone (1927) proposed the PCM

as a law of comparative judgment for placing objects along a psychological continuum of

quality. The ratings can refer to essentially any properties, ranging from perceived weights

(Thurstone, 1927), to opinions on political issues, to perceived video quality (Woods et al.,

2010).

The essence of PCM is to aggregate a set of judgments about attributes of objects, carried out

two at a time, and to transform those aggregations onto a single rating scale. Judgments of

quality between any two objects may vary across judges and may also vary in time within

judges from one comparison to the next, even for the same objects. Thus the underlying

neural or psychological processes whereby the quality of objects is judged dictate the spacing

of this scale, typically based on the premise that the underlying processes follow a normal

⁹ In addition to the computation of RMS error in terms of geometric distance between the two curves, other RMS measures

were investigated and ultimately abandoned. For example, both first and second order derivatives of the selected routes were

matched against those of the Correct routes in order to develop separate RMS velocity and acceleration measures. The

rationale there was that participants might be especially sensitive to changes in direction, and/or rate of change of direction,

and thus to recognise such performance accordingly. Similarly, a metric of RMS curvature error was also computed, under

the rationale that participants might excel in recalling the continuity of the overflown route. In all cases, inflated values of

error were observed, for essentially the same reasons as described above. It was thus concluded that these alternative

objective RMS error measures could also not adequately evaluate the participant’s performance in identifying routes.


distribution. In the present work, we consider Case V of Thurstone’s law of comparative

judgment, which provides the most simplifying assumptions for the standard deviations and

correlations of the distributions of these “discriminal processes”.
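For reference, the simplification that Case V introduces can be written out as below. The symbols are notational assumptions not used elsewhere in this chapter: S_i and S_j denote the scale values of objects i and j, z_ij the standard normal deviate corresponding to the proportion of judgments preferring i over j, σ_i and σ_j the discriminal dispersions, and r_ij the correlation between the two discriminal processes.

```latex
% General form of the law of comparative judgment for objects i and j
S_i - S_j = z_{ij}\,\sqrt{\sigma_i^{2} + \sigma_j^{2} - 2\,r_{ij}\,\sigma_i\,\sigma_j}

% Case V: equal dispersions (\sigma_i = \sigma_j = \sigma) and
% uncorrelated discriminal processes (r_{ij} = 0)
S_i - S_j = z_{ij}\,\sigma\sqrt{2}
```

Under Case V the scale separations are therefore proportional to the observed normal deviates, with the constant σ√2 acting only as the unit of the scale.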

With PCM, pairs may be evaluated by a single judge performing multiple repeated

comparisons or, as in the present work, by multiple judges, who are informed about the

criteria along which they must compare the objects’ quality. It is the multiplicity of

judgments that provides the basis for satisfying the model’s requirements as being based on

underlying discriminal processes.

For a comparison set of t objects, the total number of comparisons required is: t*(t-1)/2.

Judgments for all pairs of objects are tabulated in a confusion matrix. For example, if object

A is judged to be of higher quality than object B, then the cell entry for column A / row B is

increased by one. If the converse judgment is made, then the cell entry for column B / row A

is incremented. After all judgments have been tabulated, the cells in the matrix are converted

to proportions of the total number of comparisons, and then converted into standard normal

Z-scores. The latter values are averaged down the columns of the matrix to generate a mean

Z-score for each of the t objects.
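The tabulation step described above can be sketched as follows. The objects A, B and C, the list of judgments and the function name `tally` are hypothetical; only the column-over-row convention is taken from the text.

```python
from itertools import combinations

def tally(objects, judgments):
    """Build a confusion matrix in which counts[row][col] is the number
    of times `col` was judged to be of higher quality than `row`."""
    counts = {row: {col: 0 for col in objects} for row in objects}
    for preferred, other in judgments:
        counts[other][preferred] += 1
    return counts

objects = ["A", "B", "C"]                 # hypothetical objects, t = 3
pairs = list(combinations(objects, 2))    # t*(t-1)/2 distinct pairs

# Hypothetical judgments, each written as (preferred, other).
judgments = [("A", "B"), ("A", "B"), ("B", "A"),
             ("C", "A"), ("B", "C"), ("C", "B")]

counts = tally(objects, judgments)
print(len(pairs), counts["B"]["A"], counts["A"]["B"])
```

Here A was preferred over B twice, so the entry in column A / row B is 2, while the converse entry in column B / row A is 1.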

Due to the underlying assumption of normally distributed discriminal processes, the resulting

mean Z-scores can be shown to lie along an equal interval scale, meaning that unit

differences between two points at one place on the scale are equal to unit differences at other

places on the scale¹⁰. The spacing between objects on the scale represents the ‘psychological

distance’ along the particular perceived continuum of quality. Implicit to Thurstone’s method

is the fact that the relative spacing between objects is what is important, not the values

themselves. This has two implications for Thurstone’s scales. First, the linear scale can be

adjusted by means of any linear transformation. For example, a common approach is to scale

the PCM values by √2 to transform the scale to units of Standard Normal deviates

(Thurstone, 1927). Second, because only relative spacing is important, the assignment of the

¹⁰ A common example is that of the Celsius temperature scale. The difference between 20ºC and 35ºC is the same as the

difference between 100ºC and 115ºC.


0 value is arbitrary. Typically one assigns one of the psychological objects, usually the one

with the lowest mean Z-score, as an anchor and assigns it a value of zero.

A number of variations and modifications to Thurstone’s original method have been

proposed, including corrections for extreme proportions of less than 0.02 or greater than

0.98¹¹ (Dunn-Rankin et al., 2004), eliminating bias in presentation of objects by generating

ordered pairs (Ross, 1934), and relaxing some of Thurstone’s original assumptions

(Mosteller, 1951). Furthermore, a number of statistical tests for significance of paired

comparison scores have been proposed (David, 1988; Gridgeman, 1963; Jackson and

Fleckstein, 1957; Starks and David, 1961; Woods et al., 2010). For example, Edwards (1957)

provides a comprehensive treatment of the data from PCM, including checking for

consistency among data, testing assumptions of Thurstone’s Case V method, and estimating

the properties of the discriminal dispersions of psychological objects.

An interesting consequence of Thurstone’s model is that, even though a (preferably) large

number of judges carry out a set of comparisons between different instances of a desired

quality, it is the aggregation of those judgments that is used to form proportions, which are

then transformed, using the Normal distribution function, to generate equal interval scale

ratings. As a result, the aggregated data¹² may not be normally distributed and may have

inter-subject dependencies, which violates the assumptions of a conventional analysis of

variance (ANOVA). In other words, the transformations performed on the aggregate data do

not scale appropriately to conventional statistics, and for this reason an ANOVA was not

applied. However, statistical tests for one-way effects and contrasts between conditions are

available (Starks and David, 1961). This is discussed in further detail in Section 4.3.5.

Of particular interest in the present context was the need to evaluate performance of

participants in estimating the shape of complex routes, in the absence of an alternative

directly computable scoring method. The premise was that PCM would provide insight into

¹¹ The problem with extreme proportions is that the Z approaches -∞ as probability approaches zero, and Z approaches +∞

as probability approaches one.

¹² For example, one can imagine a table whose rows represent the judges and whose columns represent preferences of one

object over another. In other words, each cell represents the number of times each option was preferred (out of the choices

provided) within each person.


the collective performance of the participants in the assigned route identification task, based

on their aggregated data.

In concluding this section, three important points merit attention.

The paired comparisons method as used here relied on sets of pairwise judgments that

were carried out on the response data recorded during the experimental trials. As such,

applying the paired comparisons method to the participants’ route identification data

represents, as far as is known to the author, a previously unexplored method for

evaluating spatial task performance.

Further to the point above, it is important to distinguish between two very different

populations of “judges” in this experiment:

o The participants who observed the video flyovers and then used their judgement

to select for each flight the route that had just been overflown.

o The volunteer judges who were recruited (see below) to judge the responses of the

route selecting participants in the experiment.

In spite of the fact that in carrying out the paired comparisons each of the (second) group

of judges provided their individual subjective judgments regarding the different pairs of

route identification data, it is important to realise that the scale values produced by their

collective judgments represent a set of objective scores of spatial task performance.

4.3.3 Application of Paired Comparisons

In setting up the paired comparisons, it was reasoned that, because the interest was in scaling

performance as a function of height, there was no point in comparing performance across

different routes, especially given that the routes were quite different from each other. (For

example, with reference to Figure 4.5, there was no point in comparing performance at

Height 2 for Route 1 with performance at Height 3 for Route 2.) Consequently a separate set

of paired comparisons of route identifications across the different heights was carried out for

each of the six routes. That is, for Route 1, performance at each of the four heights (H1 to H4) was compared with

that at every other height, for a total of 6 (=4*3/2) judgements. This was repeated for each of the 6

routes, for a total of 36 comparisons.


Twenty-one volunteer (i.e. unpaid) judges were recruited by email to carry out the paired

comparisons. Please see Appendix A4.1 for a copy of the instructions and response format.

They were presented with pairs of ensembles of selected routes (the black curves), similar to

the sample ensembles shown in Figure 4.5, and were instructed for each pair of ensembles to

indicate which of the two more closely matched the corresponding Correct Route, shown in

the figures as the red curve. Note that for all comparisons the two red curves were always

identical; it was only the relationships of the ensembles of black curves to the red curves that

were being compared.

All judges completed the full set of 36 comparisons by replying on a Google form created for

the purposes of collecting the PCM data. Using Thurstone’s method, the judgments were

compiled into a single 4x4 confusion matrix for the four Heights, by combining judgments

for all of the six routes for each height pair. For example, all the comparisons between H1

and H2 are grouped for Routes 1 to 6 in the same cell. In other words, completing the task for

the six routes was considered to represent six instances of the same perceptual task, at a

given Height. The aggregated confusion matrix and other calculations are shown in Table

4.1.

      H1    H2    H3    H4
H1    -     87    79    103
H2    39    -     46    75
H3    47    80    -     80
H4    23    51    46    -

Table 4.1 - Aggregated confusion matrix of paired comparison judgements for performance at four Heights: H1, H2,

H3, H4. Table should be interpreted as preferences of the column element over row element.
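As a quick consistency check on the counts in Table 4.1, the sketch below enumerates the comparisons and confirms that each pair of Heights received the same number of judgments. The variable names are illustrative; the numbers of judges, routes and heights and the matrix entries are those given in the text.

```python
from itertools import combinations

heights = ["H1", "H2", "H3", "H4"]
n_routes = 6
n_judges = 21

height_pairs = list(combinations(heights, 2))
comparisons_per_judge = len(height_pairs) * n_routes   # pairs x routes
judgments_per_pair = n_judges * n_routes               # per cell pair in the matrix

# Table 4.1, with table_4_1[row][col] = preferences of column over row
# (diagonal entries are not applicable and are stored as 0).
table_4_1 = [
    [0, 87, 79, 103],
    [39, 0, 46, 75],
    [47, 80, 0, 80],
    [23, 51, 46, 0],
]

pair_totals = {
    (heights[i], heights[j]): table_4_1[i][j] + table_4_1[j][i]
    for i, j in combinations(range(len(heights)), 2)
}
print(comparisons_per_judge, judgments_per_pair, pair_totals)
```

Every pair of mirrored cells sums to 21 * 6 = 126, which is the denominator used when the counts are converted to proportions.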

The raw matrix values were converted into proportions of the total number of judgements, as

shown in Table 4.2. Note that the diagonal entries of the matrix are filled with a proportion of

0.5, as it is assumed that, when presented with two sets of the same Routes, the judges overall

would select one of those sets 50% of the time.

      H1       H2       H3       H4
H1    0.500    0.690    0.627    0.817
H2    0.310    0.500    0.365    0.595
H3    0.373    0.635    0.500    0.635
H4    0.183    0.405    0.365    0.500

Table 4.2 - Aggregated scores converted to proportions of the total number of judgments over all judges (in this case

126).

The proportions in Table 4.2 were then converted to Z-score values using standard normal

tables. The columns in the confusion matrix were then summed and averaged over the

number of objects (4 in this case) to obtain the mean Z-scale for performance at each of the

four Heights. Because this is an equal interval scale, a shift of all the values does not affect

the distances between the scale values. Thus, as a last step, the scale values are shifted so that

the lowest scale value acts as an anchor at a value of 0 for the scale. Calculations are shown in

Table 4.3, and the final scale values are plotted in Figure 4.11.

                 H1        H2        H3        H4
H1               0.000     0.497     0.324     0.906
H2               -0.497    0.000     -0.345    0.241
H3               -0.324    0.345     0.000     0.345
H4               -0.906    -0.241    -0.345    0.000
Sums             -1.727    0.601     -0.366    1.492
Means            -0.432    0.150     -0.091    0.373
Means + 0.432    0         0.582     0.340     0.805

Table 4.3 - Proportion scores in the confusion matrix converted to Z scale units. The values are then summed along

the columns to compute the mean Z values. Finally, the values are shifted by the minimum value to anchor the values

to 0.
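The chain of calculations from Table 4.1 to Table 4.3 can be reproduced with the sketch below. It uses the inverse normal function from the Python standard library in place of printed standard normal tables, so values may differ from Table 4.3 in the third decimal place; the variable names are illustrative.

```python
from statistics import NormalDist

heights = ["H1", "H2", "H3", "H4"]

# Table 4.1, with counts[row][col] = preferences of column over row.
counts = [
    [0, 87, 79, 103],
    [39, 0, 46, 75],
    [47, 80, 0, 80],
    [23, 51, 46, 0],
]
t = len(heights)
std_normal = NormalDist()   # mean 0, standard deviation 1

# Proportions of judgments per pair; the diagonal is set to 0.5 as in Table 4.2.
proportions = [
    [0.5 if i == j else counts[i][j] / (counts[i][j] + counts[j][i])
     for j in range(t)]
    for i in range(t)
]

# Standard normal deviates, then column means, then anchor the lowest at 0.
z = [[std_normal.inv_cdf(p) for p in row] for row in proportions]
column_means = [sum(z[i][j] for i in range(t)) / t for j in range(t)]
anchor = min(column_means)
scale = [mean - anchor for mean in column_means]

for name, value in zip(heights, scale):
    print(name, round(value, 3))
```

The anchoring step is a pure shift, so it leaves the spacing between the four Heights unchanged.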

Although it was hypothesised that increased height would result in better performance in

identifying the flown over Route, the results obtained in Table 4.3 were unexpected. In terms

of the order of the psychological objects, the performance at H1 and H4, corresponding to

the least and greatest heights respectively, fell in line with the expected results, with the

worst performance at the lowest altitude and the best performance at the greatest altitude.

The results also suggested, however, that performance at H2 was better than at H3, which


was inconsistent with the hypothesis for Experiment 2. These unexpected results led to further

analysis of the data, as follows.

Figure 4.11 - Final PCM scale values for route identification task for four Heights, from Table 4.3.

4.3.4 Outlier analysis

An examination of the raw data revealed an interesting observation about the participants’

responses. In particular, as illustrated in Figure 4.12, there appeared to be some confusion

between Routes 2 and 5, which had some similarities in overall shape but whose left and

right turns were reversed.¹³ In retrospect, it was surmised that this important commonality

may have confounded performance for the Route 5 trials, by leading the participants to

frequently select routes resembling Route 2 instead. This is especially evident in the bottom

row of Figure 4.12.

¹³ Although Route 2 and Route 5 might appear different in the (exocentric) North up representations shown in Figure 4.12,

the reader is reminded that the participants carried out these tasks from a track-up (rotating azimuth) perspective, meaning

that there was no easy way to maintain a sense of a canonical North as the participant flew over the route.

[Figure 4.11 plot: Scale value (0 to 2) against Height (H1 to H4)]

Figure 4.12 - Aggregated route selections for Route 2 and Route 5, for each of the four Heights H1 to H4. Each plot

contains 14 Routes in black ink (two for each of the seven participants), as well as one Route in dashed red ink

representing the correct Route. The routes are translated so that their starting points coincide, while maintaining the

original North up representation (as seen in the Route identification window).

This supposition was further supported by examining the individual PCM graphs derived

from the non-aggregated confusion matrices for each of the six Routes, shown in Figure 4.13.

For the most part, the graphs for each of the routes follow a general trend where a greater

height resulted in better Route identification performance. However, the relative scores for

the four Heights {H1, H2, H3, H4} for Route 5 lay in stark contrast to those for the other

five routes, with the greatest height H4 exhibiting the worst performance, followed by H3.


Figure 4.13 – PCM results for each of the six Routes (Routes 1 to 6), aggregated over all participants.

Taking into consideration what appeared to be a confounding outlier Route among the set of

six, PCM scale values were recomputed without the comparisons for Route 5. (The

calculations are provided in Appendix A5.1.) The resulting graph is shown in Figure 4.14(b),

along with the graph from Figure 4.11 for all six Routes reproduced for comparison in Figure

4.14(a).

Although the order of the psychological objects remained the same, comparing the graphs

reveals an interesting change in the spacing between performances at the different heights.

Whereas the objects in the original scale were spread out fairly evenly, the adjusted graph in

Figure 4.14(b) indicates non-uniform differences between objects. In particular,

performances at H2 and H3 appear to be much closer to each other, whereas the distances

between those two performances and at H1 were enlarged. In other words, it appears that

performance at both H2 and H3 was better compared to H1, but that the difference between


H2 and H3 was relatively small. Similarly, the distances between H4 and all other

performances were even more pronounced in the adjusted scale.

Figure 4.14 – Final PCM scale values: (a) using all comparisons; (b) using all comparisons except those from Route 5.

4.3.5 Statistical tests and checking assumptions

As mentioned earlier, the Case V model of PCM uses simplifying assumptions concerning

the standard deviation of a discriminal process, which is referred to as its discriminal

dispersion. In order to check the assumptions of the Case V model, Edwards (1957) provides

a significance test which is sensitive to the property of additivity among other assumptions.

Furthermore, after the original 1927 paper on the PCM method, Thurstone (1932) and Burros

(1951) proposed methods for estimating discriminal dispersions of the psychological objects.

This would allow the objects to be scaled based on the dispersion values, in the event that

assumptions of the Case V model were not found to be tenable (Edwards, 1957).

Applying this test led to the conclusion that the assumptions of Thurstone’s Case V model

were not tenable for the Experiment 2 PCM data. The explanation and calculations are

provided in Appendix A6.1. Thus, adjustments were made to the scale values by estimating

the discriminal dispersions of the psychological objects, in order to interpret the data under

Thurstone’s Case III model. The Case III model imposes fewer restrictions on the discriminal

processes, namely that the standard deviations (or discriminal dispersions) are not assumed to

be equal. As this was the case for the Experiment 2 data, the discriminal dispersions were

calculated, as shown in Appendix 7. Essentially, the Case III model could be followed

instead of the more restrictive Case V model. Figure 4.15(c) shows the linear scale values with

the inclusion of the discriminal dispersions, under the Case III model.

[Figure 4.14 plots: Scale value (0 to 2) against Height (H1 to H4)]

Figure 4.15 - Three computed scale values for Experiment 2 results, (a) all data, Case V method (b) all data excluding

Route 5, Case V method, (c) all data excluding Route 5, Case III method.

In order to further illustrate the differences at the various levels of Height, a two-dimensional

plot is presented in Figure 4.16, with the X-axis showing the Height in metres above the

simulated terrain and the Y-axis showing the scale values.

[Figure 4.15 plots, panels (a) to (c): Scale value (0 to 2) against Height (H1 to H4)]

Figure 4.16 - Experiment 2 PCM values for all data excluding Route 5, Case III method. The actual Heights in

metres are shown.

Follow-up analyses on the adjusted confusion matrix were performed to test for statistically

significant differences among scores. Starks and David (1961) developed a test statistic D for

this purpose:

$$D = \frac{4\sum_{i=1}^{t}\left(a_i - \bar{a}\right)^{2}}{Nt}$$ ¹⁴

¹⁴ Note that special consideration must be taken for the value of N with regards to Experiment 2. There were 21

judges performing the paired comparisons for the 4 objects (different heights), meaning t(t-1)/2 = 6 pairs.

Furthermore, each of the 6 pairs of comparisons was repeated for each of the six Routes, for a total of 36 paired

comparisons per judge. It is believed that the original interpretation of N representing the number of judges is

somewhat misleading in this case, as each judge in essence provides a multitude of sets of judgements for the

PCM, albeit for pairs representing different instances of the same perceptual task. In other words, five of the six

sets of judgments would not be accounted for under the original interpretation of the statistical test. For this

reason, the value was modified to be N = 21*6 = 126.


This is the special case of a more general test proposed by Durbin (1951), and equivalent to

one proposed by Kendall and Smith (1940). It follows a χ² distribution with t-1 degrees of

freedom for the number of psychological objects compared. The test statistic uses the matrix

of raw data, operating on a null hypothesis that all treatments (in this case, Heights) are all

alike in the response they evoke. In other words, the judgments from the paired comparisons

are matched against a null hypothesis of there being no preference between each pair of

objects, across all comparisons, which in turn is equivalent to postulating that all discriminal

dispersions are equal. The alternative hypothesis is that there is a preference between pairs

of objects. The ai values from the Experiment 2 data are shown in Table 4.4.

H1 H2 H3 H4

H1 - 71 73 98

H2 34 - 43 74

H3 32 62 - 75

H4 7 31 30 -

ai 73 164 146 247

Table 4.4 - Aggregated confusion matrix, with column totals, ai.

In this case, ā = 630/4 = 157.5, t = 4, N = 21*6. Analysis showed that the differences

observed in the PCM scores were statistically significant, D = 121.63 against a critical value,

χ²C(3, N = 21*6) = 7.82, p < .05. Because D > χ²C, the null hypothesis was rejected. Thus,

results from the Paired Comparison Method suggested that Height did have a significant

effect on participants’ ability to perform the route identification task.
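The reported value of D can be checked with the sketch below. It assumes the form D = 4 Σ(a_i − ā)² / (N t), which reproduces the reported figure from the column totals in Table 4.4; the value N = 21*6 and the critical value 7.82 are taken as stated in the text rather than recomputed.

```python
# Table 4.4, with table_4_4[row][col] = preferences of column over row.
table_4_4 = [
    [0, 71, 73, 98],
    [34, 0, 43, 74],
    [32, 62, 0, 75],
    [7, 31, 30, 0],
]
t = 4            # number of psychological objects (Heights)
N = 21 * 6       # value of N as stated in the text

# Column totals a_i and their mean.
a = [sum(row[j] for row in table_4_4) for j in range(t)]
a_bar = sum(a) / t

# Assumed form of the test statistic (see lead-in).
D = 4 * sum((a_i - a_bar) ** 2 for a_i in a) / (N * t)

CHI2_CRITICAL = 7.82   # critical value for 3 degrees of freedom, as quoted
reject_null = D > CHI2_CRITICAL
print(a, a_bar, round(D, 2), reject_null)
```

With these inputs the statistic far exceeds the critical value, in line with the conclusion that the Heights were not judged alike.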

Pairwise contrasts were calculated for the six pairs of combinations of Heights using a

method analogous to the Scheffé method (corrected for Type I error), developed by Starks

and David (1961). The method uses an approximation of the distribution of joint probabilities

of the preferences for one treatment against the remaining treatments in the set. For the

specified contrasts, a Q² test statistic is created to match the differences between sets of those

approximations against a critical χ² value, different from the one described above, which has

been adjusted for the contrast between two treatments. The results in Table 4.5 show that 5 of

the 6 contrasts were statistically significant for the column element being preferred over the

row element. Please refer to Appendix A8.1 for calculations of the contrasts.


        H1     H2         H3         H4
H1      -      65.72**    42.29**    240.29**
H2             -                     54.67**
H3             2.57       -          80.96**
H4                                   -

Table 4.5 - Results of pairwise contrasts between levels of Height in Experiment 2, following the Scheffé method

outlined in Starks and David (1961). The value in each cell represents a Q² test statistic for the column element being

preferred over the row element. Critical values at α = 0.05 and 0.01 are indicated by * and ** respectively.
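The values in Table 4.5 can be reproduced from the column totals of Table 4.4. The sketch below assumes that each contrast statistic has the form Q² = 4(ai − aj)² / (N t). This form is an assumption inferred from the tabulated numbers; the calculations actually used are those in Appendix A8.1, and the function name `contrast_q2` is hypothetical.

```python
from itertools import combinations

# Sketch of the pairwise contrast statistic between two treatments.
# Assumption: Q^2 = 4 * (a_i - a_j)^2 / (N * t), inferred from Table 4.5.


def contrast_q2(a_i, a_j, n_judgments, t):
    """Return the assumed Q^2 statistic for two column totals."""
    return 4 * (a_i - a_j) ** 2 / (n_judgments * t)


# Column totals a_i from Table 4.4; N = 21*6 and t = 4 as stated in the text.
totals = {"H1": 73, "H2": 164, "H3": 146, "H4": 247}
n, t = 21 * 6, len(totals)

q2 = {
    (x, y): round(contrast_q2(totals[x], totals[y], n, t), 2)
    for x, y in combinations(totals, 2)
}
for pair, value in q2.items():
    print(pair, value)
```

With these inputs the six values agree with Table 4.5, including the small H2 versus H3 contrast of 2.57.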

Figure 4.17 - Plot of the PCM scale values, including contrasts results. Each line indicates that a significant contrast

was found between the conditions at the endpoints of that line.

4.4 Discussion

Overall, the results confirmed the hypothesis of increased performance with increased height,

but with some important nuances. It was clear from Figure 4.15(c) and Figure 4.17 that

the lowest height, H1 = 20m, afforded the worst performance, and that

performance was best at the highest height, H4 = 164m. Pairwise contrasts revealed that

performance at each of these two Heights differed significantly from that at all other

Height levels. However, performance at H2 and H3 did not differ

significantly, suggesting that increasing the height from 56m to 92m provided no

additional benefit, even though an increase from 20m to 56m showed a statistically

significant performance increase.


In conclusion, the results of Experiment 2 indicated that Height did have an effect

on performance in the route identification task. Consequently, one of the height values was

selected as a fixed value for Experiment 3. In deciding which height that would be, it should

be recalled that increasing the height above the terrain with a fixed size FOV increases

the amount of terrain that can be observed in any one image. A functionally equivalent effect

might be achieved by providing a FOV whose size is double that of the single size (i.e. in the

dFOV condition), such that it affords a larger portion of the flyover route to be viewed at any

one time, albeit with a larger display size¹⁵. Therefore, in order to select a height value from

Experiment 2 to be used in Experiment 3, a value was chosen with the intention that there

would be potential for improved performance from a larger view of the terrain, whether by

increased height or by extending the FOV. In the case of Experiment 3, I sought to

investigate whether performance could be improved by extending the FOV through mosaicing.

Within the context of the Experiment 2 results, H4 = 164m was found to produce much

improved performance relative to H3 = 92m. Thus setting the height at H3 for the single

FOV condition was expected to allow sufficient room for potential improvements to be found

using the dFOV and mFOV conditions, whose extended FOVs might provide an effect

analogous to increasing the height to H4.

In summary, given the complex winding routes and perspective viewpoint introduced in

Experiment 2, this investigation was carried out to determine the effect of height on route

identification performance. As such, the results had important implications for the follow-up

experiment, as the equal interval scale allows the appropriate height to be selected at which

the route identification task can be accomplished. Furthermore, a new method was developed

for evaluating human performance in route identification involving complex winding routes.

The Paired Comparisons Method has proven useful in objectively evaluating aggregate data

along a psychological continuum of quality, to provide a measure of performance that could

not be attained by purely computational methods.

¹⁵ The only other difference would be that the spatial resolution with which the terrain is viewed changes as the

height is varied. However, I do not believe that this is important for the purposes of selecting the height for the

global spatial awareness task in the present investigation.



5.1 Introduction

After developing a complex route in the form of a long winding river, the results of

Experiment 2 served to determine an appropriate height above the terrain, H3 = 92m, at

which some performance differences were expected between the three display

conditions. Furthermore, as a consequence of the limitations of conventional computational

methods for evaluating global task performance, the method of Paired comparisons was

adopted for evaluating the routes selected by the participants.

It should be recalled that a perspective viewpoint was adopted in Experiment 2, as another

factor that was presumed to make the task of performing route identification more difficult. It

was surmised that the introduction of perspective foreshortening in an angled view would

lead to a greater challenge in forming an accurate cognitive map of the terrain compared to a

top down view. In Experiment 3, I sought to explicitly test the effect of viewing perspective

on route identification performance. In particular, two camera elevation angle (EA) values

were tested: EA = 90°, corresponding to the top down view used in Experiment 1, and EA =

45°, equivalent to the viewpoint used in Experiment 2. The two EA values are shown in

Figure 5.1. The results are expected to have practical implications for the use of mosaic

displays, whose shape properties depend on the parameters of the camera’s viewpoint

relative to the terrain. For a discussion of changes made to the flyover environment based on

pilot testing of EA, please refer to Appendix A10.3.

Figure 5.1 - Illustration of the angled (45°) and top down (90°) viewpoints used in Experiment 3.


Using the results and lessons learned from the first two experiments, Experiment 3 sought to

determine the effect of the mosaiced FOV on performance in a spatial awareness task

involving traversal over and identification of a complex winding route. Thus in Experiment 3

I revisited the hypothesis presented in Experiment 1 – that the unique shape properties and

increased size of the mosaic FOV condition should afford increased performance in both the

local and global spatial awareness tasks compared to a fixed single FOV. Three display

conditions were therefore tested in Experiment 3 to investigate the effect of image mosaicing

on spatial task performance:

Single field of view (sFOV)

Double the size of the single field of view (dFOV)

Mosaic field of view (mFOV)

In order to determine the number of superimposed frames for the mosaic condition to have

roughly equivalent display sizes for the mFOV and dFOV conditions, an analysis of the

display areas determined that an image mosaic composed of 10 frames was equivalent to the

display size of the dFOV. Details of the procedure are provided in Appendix 9.

Taken together, the two camera elevation angles and the three display sizes produce a total of

six combinations of viewing conditions, as shown in Table 5.1.


                          Display size
Camera elevation angle    sFOV    mFOV    dFOV
45°
90°

Table 5.1 – The six combinations of display condition and camera elevation angle used in Experiment 3.

As the Elevation angle is changed from 90° to 45°, the viewpoint will cause perspective

foreshortening, which is expected to distort the participants’ judgements of the spatial

relationships of terrain features in the scene viewed from the camera’s FOV. In the 90°

viewpoint, the participant should be better able to integrate the spatial information of the

environment between successive views, and thus it was hypothesised that performance in

identifying the traversed route should be better in the 90° condition compared to the 45°

condition. Concerning the target detection task, the EA=45° provides a ‘preview’ of the

upcoming terrain compared to the view afforded by the EA=90° condition. Compared to the

targets in the EA=90° condition, whose size remains fixed during the flyover, the foreshortening at EA=45° causes

the targets to appear smaller as they enter the FOV from the top of the display, and then

to grow in apparent size as they approach the bottom of the display. As Stager (1974) pointed out,


targets appear in the fixation field for a longer period of time as the operator gazes away

from the terrain beneath himself, and thus it was hypothesised that the target detection

performance would be higher at EA = 45°.

The single FOV (sFOV) acts as a baseline condition, whose display size is smaller than both

the mosaic FOV (mFOV) and the double size FOV (dFOV). It was hypothesised that the

larger size afforded by the mFOV and dFOV would result in better route identification

performance compared to the sFOV condition. In addition, the unique shape properties of the

mFOV were expected to help in forming the spatial relationships between the objects and textures in the

environment. Accordingly, it was hypothesised that task performance in the mFOV

condition would be better than in the dFOV condition. Concerning the target detection task, as in

Experiment 1, it was hypothesised that the extended FOV would afford higher detection

performance in the dFOV and mFOV conditions, compared to the sFOV condition.

5.2 Experimental procedure

Combining the procedures of Experiments 1 and 2, participants were asked to watch a series

of 25-second flyover videos, generated in a virtual environment created in Google SketchUp and

Google Earth. As shown in Figure 5.2, six routes were selected from the same long river used

in Experiment 2. A similar approach to that of Experiment 1 for defining ‘target zones’ was

taken (as described in Section 3.3), by including ‘neutral zones’ to ensure that only one target

could appear within the display at any one time. Each video consisted of a computer

generated grass terrain and a river running the length of the route, with each route containing

seven targets, shown in Figure 5.3, placed within the grass terrain. During the

flyover, participants indicated that they had detected a target by using a computer mouse to click a large button located

at the bottom of the ‘Route flyover’ window, shown in Figure 5.4.

After each video, participants were asked to select the route flown over from within the long

winding river using the buttons in the ‘Route identification’ window, as in Experiment 2.


Figure 5.2 - Display of the six Routes selected for Experiment 3, chosen from the long continuous river (Left). Routes

on right show start of each Route with a green marker and end of each Route with a red marker.

(a) (b)

Figure 5.3 - Example of target used in Experiment 3: (a) target magnified to show textures, (b) target within flyover

terrain. In this screenshot, the target is located on the bottom right of the FOV.

In order to ensure that display areas of equal size were being searched for targets for all

conditions, and thus that no advantage was given to the larger mFOV or dFOV conditions,

participants were asked to detect the targets within a portion of the screen demarcated by red

markers superimposed on the display window. In the screenshot shown in Figure 5.4, the

markers flanked the sides of the entire sFOV display area. In the mFOV and dFOV


conditions, the markers demarcated an area equal to that of the sFOV condition in the top

half of the display.

In their briefing on the scenario and the two experimental tasks, participants were asked to

perform both tasks equally well. They then conducted 12 training trials with feedback on

both tasks. Two trials for each combination of display condition and viewing perspective

were provided. In the target detection task, the experimenter watched the training video

alongside the participant, and made notes of any missed targets or False Alarms. Participants

were told that there could be any number of targets in each route during the training and

experimental trials. At the conclusion of the video, the experimenter replayed the video and

pointed out the missed targets. For the global awareness task, the participant was shown the

correct route on the response grid after having made his selection.

Figure 5.4 - Screenshot of the ‘Route flyover’ window for the dFOV condition. Participants were asked to press the

‘Target detected’ button beneath the image when a target appeared within the area designated by the red markers.

Note: a target is currently showing in the screenshot, half-covered at the top of the FOV.


To ensure that participants were searching within the markers, as well as to generate baseline

performance data, 3 trials were conducted after the training session for which participants

were asked to perform only the target detection task. The participants then completed a

series of 42 experimental trials, carried out in 6 sets of 7 trials each. In each of the six

sets, one combination of display size (3) and camera elevation angle (2) was presented. The

first trial of each set of 7 trials was a flyover video different from the 6 Correct routes chosen

for the experiment¹⁶. (This first trial from each block was excluded from the analysis to avoid

any confounding transfer effect between blocks.) The remaining 6 trials corresponded to the

6 Correct routes illustrated in Figure 5.2, randomised for each set and for each participant.

Four pseudorandom sequences of the 6 sets of display conditions were generated, and

distributed among the 13 participants. One sequence was given to 4 participants while the

other three were each given to 3 participants. A break of 1 minute was enforced between trial

blocks.
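The block structure described above can be sketched as follows. This is only an illustration under stated assumptions: the route labels are hypothetical, and the sketch draws a fresh random order of the six sets on each call, whereas the experiment used four fixed pseudorandom sequences distributed among the participants.

```python
import random

DISPLAYS = ["sFOV", "mFOV", "dFOV"]
ELEVATION_ANGLES = [45, 90]
CORRECT_ROUTES = ["route_%d" % i for i in range(1, 7)]   # hypothetical labels
FILLER_ROUTES = ["filler_%d" % i for i in range(1, 4)]   # hypothetical labels


def build_schedule(rng):
    """Return 6 blocks of 7 trials: one filler trial, then the 6 Correct routes."""
    conditions = [(ea, d) for ea in ELEVATION_ANGLES for d in DISPLAYS]
    rng.shuffle(conditions)            # order of the six viewing conditions
    fillers = FILLER_ROUTES * 2        # each filler route is used twice
    rng.shuffle(fillers)
    schedule = []
    for (ea, display), filler in zip(conditions, fillers):
        routes = CORRECT_ROUTES[:]
        rng.shuffle(routes)            # route order randomised within each set
        schedule.append({
            "elevation_angle": ea,
            "display": display,
            "trials": [filler] + routes,
            "analysed": routes,        # the first (filler) trial is excluded
        })
    return schedule


schedule = build_schedule(random.Random(2013))
total_trials = sum(len(block["trials"]) for block in schedule)
print(total_trials)  # 42
```

Each block pairs one viewing condition with seven trials, and only the six trials after the filler trial enter the analysis.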

After completing all of the experimental trials, the participants conducted a subjective rating

of the six viewing conditions, to elicit some knowledge about which conditions they felt

afforded the best performance in the global task. The participants were asked to perform a

series of randomised paired comparisons of each of the 15 (=6*5/2) combinations of display

conditions and viewing parameters. In each comparison, the Participant Paired Comparison

Window showed two flyover videos of the same Route but with different viewing parameters

playing simultaneously. The participant was then asked to indicate “which of the two viewing

conditions allowed you to more accurately identify the shape of the Route?” A screenshot of

the ‘Participant Paired Comparison’ Window is shown in Figure 5.5. The participant

responded to each comparison by filling out a Google form on another computer monitor

containing the options LEFT or RIGHT for each pair.

¹⁶ Three different routes were chosen for the purposes of being the first trial in the set. Thus, across the 6 sets of

routes, each of these three routes was used twice.


Figure 5.5 – Screenshot of the ‘Participant Paired Comparison’ Window, presented to the participants after

completing all experimental trials.

A fully within subjects experiment was performed with 13 graduate students from the

University of Toronto. All were between the ages of 18 and 40, with normal or corrected to

normal vision. Only male participants were recruited in order to avoid any confounding

inter-gender differences in spatial awareness performance. None reported any previous experience

as a search and rescue operator.

The participants completed the entire experiment within approximately two hours and were

compensated with $30.

As in Experiment 2, the Paired Comparisons

method was used to evaluate performance in the route identification task. The route selections were

aggregated over all participants in order to provide a representation of the collective

performance across the 6 viewing conditions. For each of the routes, 15 pairs (=6*5/2) of

display plus viewing conditions were compared. For the 6 routes, this resulted in a total of 90

pairs (6*15) of diagrams. Twenty-four volunteer judges were recruited to carry out the

paired comparisons. Appendix A4.2 shows the instructions sent to the judges, as well as

screenshots of the interface for performing the paired comparisons.
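The pair counts quoted above follow directly from the design. As a brief sketch (the variable names are hypothetical), the 15 pairs of viewing conditions and the 90 pairs of diagrams can be enumerated as:

```python
from itertools import combinations, product

elevation_angles = [45, 90]
displays = ["sFOV", "mFOV", "dFOV"]
n_routes = 6

# Six viewing conditions: every elevation angle crossed with every display.
conditions = list(product(elevation_angles, displays))

# All unordered pairs of distinct conditions: 6*5/2 = 15.
pairs = list(combinations(conditions, 2))

# One set of 15 comparisons for each of the 6 routes.
diagram_pairs = len(pairs) * n_routes

print(len(conditions), len(pairs), diagram_pairs)  # 6 15 90
```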


5.3 Results

5.3.1 Target detection task

The procedure for analysing the target detection performance was similar to that of

Experiment 1. Screenshots of the display were recorded when the participants clicked on the

‘Target detected’ button on the interface. To measure performance on the task, the

experimenter visually assessed all instances of reported target detections for all participants.

While there were instances where some targets were missed, there were no instances for

which a participant had detected a target when there was in fact no target present (i.e. False

Alarms). As was done in Experiment 1, a simple performance measure of proportion of

identified targets was thus adopted.

The baseline target detection trials, carried out prior to the regular flyovers, when the

participants performed only the detection task, showed perfect performance – that is, all

targets were detected, with no False Alarms. Figure 5.6 shows target detection performance

for the six combinations of display conditions, aggregated over all 13 participants. The graph

suggests that, overall, FOV size and viewing perspective did not

influence the detection rate.

Figure 5.6 - Graph of target detection performance for the six experimental conditions: {45°,90°}x{sFOV, mFOV, dFOV}.

[Plot title: “Exp3: Local task performance for six display conditions, over all participants”. Horizontal axis: Viewing condition (45,sFOV; 45,mFOV; 45,dFOV; 90,sFOV; 90,mFOV; 90,dFOV). Vertical axis: Proportion of targets detected (0 to 1).]


A 2-way within-subject ANOVA (elevation angle × display condition) found no significant

main effect of elevation angle (F(1,12) = .156, p > 0.05), no significant main effect of display condition

(F(2,11) = .156, p > 0.05), and no significant interaction effect (F(2,11) = .156, p >

0.05). Please refer to Appendix A1.2 for the statistical tables and the results of the

assumption tests.

5.3.2 Route identification task

Using Thurstone’s method under Case V, the judgments were compiled into a

6x6 confusion matrix for the six combinations of display condition and viewing condition, by

combining judgments for all of the six routes¹⁷. Given that the computed scale values

represent six combinations of a two-factor design, the scale values were plotted into two 2-

dimensional graphs separated by Display size and Elevation angle in Figure 5.7, to illustrate

the relationships between the treatments. The scale values are shown in units of Standard

Normal deviates, for reasons of convenience when discussing the results. The calculations for

the scale values are shown in Appendix A5.2.
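For readers unfamiliar with Case V scaling, the following minimal sketch shows one common way of turning a matrix of preference counts into scale values in units of Standard Normal deviates. It is an illustration under assumptions, not the calculation of Appendix A5.2: the function name is hypothetical, the shift to a zero minimum is an assumed convention, and because the 6x6 matrix for Experiment 3 is not reproduced in this section, the 4x4 matrix of Table 4.4 serves purely as example input.

```python
from statistics import NormalDist


def case_v_scale_values(prefs):
    """Thurstone Case V scale values from a square preference-count matrix.

    prefs[i][j] is the number of times column item j was preferred over
    row item i; diagonal entries are ignored.
    """
    t = len(prefs)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for j in range(t):
        z_sum = 0.0
        for i in range(t):
            if i == j:
                continue  # an item is not compared with itself (z = 0)
            n_pair = prefs[i][j] + prefs[j][i]
            p = prefs[i][j] / n_pair      # proportion preferring j over i
            z_sum += inv_cdf(p)           # requires 0 < p < 1
        scale.append(z_sum / t)           # column mean, zero diagonal included
    lowest = min(scale)
    return [s - lowest for s in scale]    # assumed convention: lowest item at 0


# Example input: the Experiment 2 counts from Table 4.4 (rows and columns H1..H4).
table_4_4 = [
    [0, 71, 73, 98],
    [34, 0, 43, 74],
    [32, 62, 0, 75],
    [7, 31, 30, 0],
]
scale = case_v_scale_values(table_4_4)
for name, value in zip(["H1", "H2", "H3", "H4"], scale):
    print(name, round(value, 3))
```

Proportions of exactly 0 or 1 have no finite Normal deviate, so a matrix containing unanimous judgments would need separate treatment before this sketch could be applied.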

(a) (b)

Figure 5.7 - Plots of PCM results for closeness of aggregated route selections to Correct route performance: (a)

across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for each Display size.

¹⁷ Note that there was no interest in rating performance for the different routes, which is why data were

combined over routes.

[Figure 5.7 axes: (a) horizontal axis: Display size (sFOV, dFOV, mFOV); vertical axis: Scale value (0 to 1); series: EA = 45, EA = 90. (b) horizontal axis: Elevation angle (45°, 90°); vertical axis: Scale value (0 to 1); series: sFOV, dFOV, mFOV.]


It was observed that the worst route identification performance was judged to be at {EA=45°,

sFOV}, while the best performance was judged to be at {EA=90°, mFOV}. The results in

Figure 5.7(a) suggest that for each of the three Display sizes, performance at EA = 90° was

judged to be better than at EA = 45°. The separation between the EA values for the sFOV

and mFOV appears to be larger than that for the dFOV. The results in Figure 5.7(b) suggest

that for each level of Elevation angle, performance in route identification was judged to be

best for the mFOV, followed by the dFOV and then the sFOV. However, for EA = 45°, the

separation between dFOV and mFOV was relatively small, compared to the equivalent

conditions for EA = 90°. Among the three Display sizes, the route selections in the sFOV

were judged to be of the lowest overall quality, that is, they least closely matched the correct

route.

A one-way statistical analysis showed that the differences observed in the PCM scores were

statistically significant, D = 233.16 against a critical value χC²(5, N = 24*6) = 11.07, p < .05.

Because D > χC², the null hypothesis was rejected. In other words, there were significant

differences among the six combinations of Display size and Elevation angle; consequently, a

set of pairwise contrasts between viewing condition combinations was computed¹⁸.

The contrasts between the three Display sizes (contrasts 1-6) and between the two Elevation

angles (contrasts 7-9) are shown in Table 5.2. The calculations for the complete set of all 15

pairwise treatment contrasts are provided in Appendix A8.2.

¹⁸ One issue that arises here is that of being able to collapse across conditions, such as Display size or Elevation

angle, to investigate the overall effect of each factor on its own using pairwise contrasts. In order to do this, one

would require that a separate set of paired comparisons be carried out with the routes presented on the same

graph across conditions. For example, to perform pairwise contrasts collapsed over Elevation angle, one would

need to present judges sets of routes for the various Display sizes, aggregated over EA = 45° and 90°. Such

data were not collected in this study, and thus pairwise contrasts to test for such main effects are not possible.


Critical values for Q²: χC²(5,0.05)* = 22.14, χC²(5,0.01)** = 30.18

Contrast   Treatment                                                        Q²
number     45°,sFOV   45°,mFOV   45°,dFOV   90°,sFOV   90°,mFOV   90°,dFOV
1          x          x                                                     185.19**
2          x                     x                                          114.12**
3                     x          x                                          8.56
4                                           x          x                    183.34**
5                                           x                     x         31.89**
6                                                      x          x         62.30**
7          x                                x                               45.38**
8                     x                                x                    44.46**
9                                x                                x         2.89

Table 5.2 – Sets of pairwise contrasts for the judges in Experiment 3, following the Scheffé method outlined in Starks

and David (1961). Each pair of contrasts is indicated by an x in a particular row. The value in the last column

represents a Q² test statistic for the column element being preferred over the row element. Critical values at α = 0.05

and 0.01 are indicated by * and ** respectively.

The results revealed a number of significant pairwise contrasts, as 7 of the 9 contrasts were

found to be significant at α = 0.01. Contrasts 1 to 6, between the three Display sizes for each

Elevation angle, were found to be significant except for that between {45°, mFOV} and

{45°, dFOV}. Contrasts 7 and 8, between the two Elevation angles for the sFOV and mFOV

respectively, were found to be significant. Contrast 9, for {45°, dFOV} and {90°, dFOV}

was not found to be significant. Figure 5.8 shows the results from the pairwise

contrasts graphically, where each line on the graph represents a significant contrast between

the conditions at the end points of the line.


(a)

(b)

Figure 5.8 - Two-dimensional plots for data from Experiment 3 PCM route identification performance, with pairwise

contrast results, (a) across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for

each Display size.


5.3.3 Participant subjective ratings of six viewing conditions

The participants were asked to conduct a series of paired comparisons for the 6 different

viewing conditions, each of which represented a combination of viewing perspective and

display FOV. The participants were told that they should indicate which viewing condition

they felt supported better performance in the global task (regardless of the particular route

shown in the paired comparison). In other words, the videos shown in the paired comparisons

were meant to remind the participant of all of the different viewing conditions used during

the experimental task for the particular route shown. The resulting scale values are shown as

two-dimensional graphs in Figure 5.9. Please see Appendix A5.3 for the related

computations.

(a) (b)

Figure 5.9 - Plots of PCM results generated by participants, for the question “which of the two viewing conditions

allowed you to more accurately identify the shape of the Route?”, (a) across the three FOV conditions for each

Elevation angle, (b) across the two Elevation angles for each Display size.

Overall, participants found that identifying the route shape was easier in the EA = 45°

condition than in the EA = 90° condition. Participants also indicated that an enlarged FOV, either mFOV

or dFOV, made identifying the route shape easier compared to sFOV. In the EA = 90°

condition, participants found the global task easier with the dFOV than with the

mFOV. The positions were reversed in the angled viewpoint, where participants found the

global task easier with the mFOV than with the dFOV.

[Figure 5.9 axes: (a) horizontal axis: Display size (sFOV, dFOV, mFOV); vertical axis: Scale value (0 to 2); series: EA = 45, EA = 90. (b) horizontal axis: Elevation angle (45°, 90°); vertical axis: Scale value (0 to 2); series: sFOV, dFOV, mFOV.]


Analysis showed that the differences observed in the PCM scores were statistically

significant, D = 68.38 against a critical value χC²(5, N = 13) = 11.07, p < .05. Because D >

χC², the null hypothesis was rejected. Thus, results from the paired comparison method

suggested that there were overall significant differences between the participants’ subjective

ratings of the viewing conditions.

Pairwise contrasts were computed for all combinations of the six Display conditions. The

contrasts between the three Display sizes (contrasts 1-6) and between the two Elevation

angles (contrasts 7-9) are shown in Table 5.3; only 3 of the 9 contrasts were

significant. Please refer to Appendix A8.3 for calculations, as well as the full set

of contrasts.

Critical values for Q²: χC²(5,0.05)* = 22.14, χC²(5,0.01)** = 30.18

Contrast   Treatment                                                        Q²
number     45°,sFOV   45°,mFOV   45°,dFOV   90°,sFOV   90°,mFOV   90°,dFOV
1          x          x                                                     11.54
2          x                     x                                          18.51
3                     x          x                                          0.82
4                                           x          x                    16.62
5                                           x                     x         27.13*
6                                                      x          x         1.28
7          x                                x                               34.67**
8                     x                                x                    37.38**
9                                x                                x         16.62

Table 5.3 - Set of pairwise contrasts for the participant ratings in Experiment 3, following the Scheffé method

outlined in Starks and David (1961). The value in the last column represents a Q² test statistic for the column element

being preferred over the row element. Critical values at α = 0.05 and 0.01 are indicated by * and ** respectively.

Figure 5.10 shows the two-dimensional graphs with the statistically significant pairwise

contrasts highlighted.


(a) (b)

Figure 5.10 - Two-dimensional plot for participant ratings of Display conditions, with pairwise contrast results, (a)

across the three FOV conditions for each Elevation angle, (b) across the two Elevation angles for each Display size.

5.4 Discussion

Experiment 3 built on the results of Experiment 2, from which a Height was

selected at which differences could reasonably be expected

between the three display conditions: {sFOV, dFOV, mFOV}. Two camera elevation angles

were also tested: {45°, 90°}, the perspectives used in Experiments 2 and 1 respectively.

5.4.1 Target Detection Results

While there were some missed targets in the participants’ target detection responses, no False

Alarms occurred, which was consistent with results for Experiment 1. The results showed no

significant differences in performance across the six display FOV and elevation angle

conditions, with a mean detection rate of approximately 78% of targets. (See Figure 5.6.)

Thus, the hypotheses were not confirmed for the target detection task.

As described in Section 2.6.2, Stager’s (1974) investigation of the effect of Elevation angle

(EA) on target detection performance revealed that an angled viewpoint would allow targets

to remain in the visual field for a longer period of time compared to a top down view. This

led to the hypothesis that target detection performance would be better at EA = 45° compared

to EA = 90°. However, the target detection results revealed no differences between the two

levels of EA. In explaining this result, it should be noted that, although the simulated


environment was created to replicate some of the environmental features of aerial search, the

height above the simulated terrain was not scaled to real aerial search scenarios. Typically

real aerial search operations may be conducted at heights of up to 1000 ft (approximately 305

m) above the terrain (Stager, 1974). Although it was anticipated that the effect would still

manifest itself at a lower height, as indicated by pilot studies showing encouraging results, no

effect was observed in Experiment 3.

The reader is reminded that the height above the terrain in Experiment 3 was set at 92 m as a

result of calibrating the route identification task in Experiment 2. Presenting the terrain at a

height of 457.2 m would have resulted in the entire route being visible in the sFOV case, as

shown in Figure 5.11, making the route identification task trivial. However, in hindsight,

perhaps an important factor is the apparent size of the targets appearing at a height of 305 m.

In other words, the targets appearing at a height of 92 m in Experiment 3 could have been

scaled to be approximately the same size as objects appearing in the visual field at a height of

305 m. Accomplishing this would involve conducting a search in the literature of the size of

typical objects searched in aerial search scenarios, and calculating the size of the object’s

subtended visual angle at 305 m. Had this been done, the change might have resulted in some

advantages for the angled viewpoint, in accordance with those reported by Stager (1974).
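The scaling suggested above can be made concrete with the standard visual angle relation, θ = 2·atan(s / 2d), for an object of size s viewed at distance d. The sketch below is illustrative only: the object size of 2 m is a hypothetical value, not one taken from the aerial search literature, and the viewing distance is taken to be the height above the terrain, which holds for an object directly beneath the camera.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))


def size_for_same_angle(size_m, reference_distance_m, new_distance_m):
    """Size that subtends, at new_distance_m, the same visual angle as
    size_m does at reference_distance_m."""
    return size_m * new_distance_m / reference_distance_m


real_size_m = 2.0            # hypothetical object size, for illustration only
search_height_m = 305.0      # approximately 1000 ft
simulated_height_m = 92.0    # height used in Experiment 3

scaled_size_m = size_for_same_angle(real_size_m, search_height_m, simulated_height_m)
angle_real = visual_angle_deg(real_size_m, search_height_m)
angle_sim = visual_angle_deg(scaled_size_m, simulated_height_m)

print(round(scaled_size_m, 3), round(angle_real, 3), round(angle_sim, 3))
```

Because the subtended angle depends only on the ratio of size to distance, the scaled target is smaller than the real object by the factor 92/305.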

Figure 5.11 – A route used in Experiment 3 shown at a height of 457.2 m above the terrain. The entire route is shown

in the single FOV (sFOV).


Display size was hypothesised to result in better performance in the dFOV and mFOV

conditions, as targets appeared in the FOV longer compared to the sFOV condition.

However, the results showed no differences between the three Display sizes. It is possible to

explain this result, admittedly in hindsight, by considering the particular structure of the task.

In order to ensure that the participants were searching in approximately equal areas of the

display, they were asked to identify targets only within markers placed on the screen. The

markers demarcated an area equal to the size of the single size FOV in all three conditions.

Thus, in retrospect perhaps it is not surprising that no differences in target detection

performance were found. It should also be noted that two participants found the mFOV

condition to be distracting during the flyover, as the mosaic FOV displayed the occasional

jittery frame. This may have shifted their attention away from the zone demarcated by the red

markers, contributing to lower target detection performance in the mFOV. However, one side

effect of the apparently uniform performance levels in the target detection task across

conditions is that any observed differences in the route identification task can be argued to be

the result solely of differences in the independent variables, and not the influence of the

participants’ performance in the target detection task.

5.4.2 Route Identification Results

The Route selections were aggregated across participants to gain insight into collective

performance across the different conditions. The 24 volunteer judges performed the Paired

comparisons method by selecting, for each pair of Route ensembles, which set of Routes

collectively more closely resembled the Correct Route. Figure 5.7 shows that

performance was judged to be highest at {90°, mFOV}, while the worst performance was judged to

be at {45°, sFOV}, with the remaining four clustered together. Because Thurstonian scaling

produces an equal-interval scale, the smaller relative distances between the four points

indicate that these conditions were judged to be closer in quality (i.e. in matching the Correct

route). A one-way analysis confirmed that there were significant differences between the six

viewing conditions, and two-dimensional graphs (Figure 5.10) with contrasts were generated

to illustrate the differences in performance between the conditions.


It was hypothesised that performance at EA = 90° (top-down view) would be better than at

EA = 45°, due to the difficulties imposed by perspective foreshortening from an angled

viewpoint. Figure 5.9(a) illustrates the relationships between the six conditions separated by

Display size for each level of Elevation angle. For each Display size, performance in the EA

= 90°, or top down view condition, was judged to be better compared to EA = 45°, as

hypothesised. In both the sFOV and mFOV, pairwise contrasts revealed significant

differences between the two levels of EA; however this was not the case for dFOV.

The reasons are not clear, as the perceived expansion and compression of objects in the FOV

from an angled viewpoint would be even more pronounced in an extended FOV compared to the

sFOV (where an effect was observed). Perhaps in the dFOV condition the effect of preview

was sufficient to aid participants in forming a cognitive map of the flyover route, countering

the potential negative effects of perspective foreshortening in identifying the route.

Furthermore, the fact that route features remained in the FOV longer compared to the sFOV

may help to explain the differing results between the sFOV and dFOV conditions. That is,

the extended FOV allowed the participant to sample terrain features over a longer period of

time compared to the sFOV, facilitating the formation of the cognitive map, such that parity

was reached between the two levels of Elevation angle.

The relative placement of the sFOV, mFOV and dFOV on Figure 5.10(b) followed the same

pattern for each camera Elevation angle (EA), EA=45° and EA=90°. Performance in

identifying the Route was judged to be worst in the sFOV, likely due to the relatively limited

FOV that displayed fewer features of the environment at any one time compared to the

extended FOV conditions (mFOV and dFOV). Because the displayed features remained in

the FOV for a relatively short period of time, less accurate cognitive maps were formed,

leading to less accurate Route selections.

Performance in the dFOV condition was judged to be better than for sFOV. The extended

FOV allowed more environmental features of the complex route, as well as their spatial

relationships, to be viewed at any given time, so that features could be integrated more easily

during the traversal.


Finally, the mFOV performance was judged to be the highest among the three display

conditions, for both EA = 45° and EA = 90°. In addition to having an extended FOV at

roughly the same size as the dFOV, the important difference between the dFOV and mFOV

conditions was the shape property of the mosaiced image, which directly displayed the shape

of the path followed by the simulated aircraft traversing the terrain.

Contrasts revealed significant pairwise differences between the three Display sizes for each

level of EA, which was consistent with the hypotheses, except for one contrast between {45°,

dFOV} and {45°, mFOV}. In the condition where the angled viewpoint was used, the

potential advantage of the explicit shape of the camera’s path for the mosaic FOV did not

produce significantly better performance.

As discussed earlier, performance at EA = 90° was found to be better than at EA = 45°,

which was consistent with the hypothesis that the perspective foreshortening would cause

distortions in cognitive maps formed during a flyover. Perhaps there was a “ceiling effect” in

the performance being reached at EA = 45°, in that the benefits of the additional shape

information could not completely overcome the deficiencies imposed by the perspective

foreshortening.

Note that a significant contrast was found between dFOV and mFOV in the top-down EA =

90° condition, however. In order to investigate at what value of EA the advantage of viewing

the shape of the camera’s path on the screen manifests itself over having an extended FOV

(i.e. the dFOV condition), more levels of Elevation angle should be tested. This is noted in

Section 6.4 as a suggestion for future work.

Relating to the previous discussion of the effect of Elevation angle, Figure 5.12 highlights the

three scale values {45°, dFOV}, {45°, mFOV} and {90°, dFOV} that were not found to be

significantly different from each other. Perhaps collecting additional data from

participants would have elicited further separation between these conditions.


Figure 5.12 - Graph highlighting (with a dotted circle) the three scale values that were not found to be significantly

different from each other in pairwise contrasts.

5.4.3 Participants’ Subjective Rating Results

In addition to carrying out the route identification task, the participants were also asked to

give subjective ratings of the effectiveness of the different display conditions for carrying out

that task. After completing all flyover trials, they used the paired comparison procedure to

compare the six display conditions with regard to which conditions allowed them in general

to achieve the best performance in identifying routes. The resulting graphs (Figure 5.7) had

both similarities and differences in comparison with the route identification results obtained

from the judges (Figure 5.9).

One interesting result from the subjective rating scale was that participants felt that the

viewpoint perspective EA = 45° allowed them to perform better than with the top down view

at EA = 90°. Significant contrasts found this to be the case for sFOV and mFOV, but not for

mFOV. A number of participants noted that the angled viewpoint allowed them to see further

ahead, providing the effect of “preview” for the incoming terrain features. However, the

results from the external judges’ evaluation showed that performance in the top down view

was generally better than with the angled viewpoint. Referring to known difficulties in

acquiring spatial knowledge from perspective views, the distortions caused by perspective

foreshortening for EA = 45° appeared to be greater than the participants’ perceived benefits

of preview. However, from the top down viewpoint, no perspective foreshortening occurred,

allowing the benefits of the real-time mosaic to manifest themselves.


In general, the conditions with the extended FOV (mFOV and dFOV) were found to be more

helpful than the sFOV. However, there was a reversal between the mFOV and dFOV

conditions about which provided better performance in the EA = 45° and EA = 90° cases,

respectively. In the EA = 45° condition, the mFOV was found to be better, whereas the dFOV

was found to be better in the EA = 90° condition.

Although these differences were not found to be statistically significant, the written

comments provided by the participants after they completed the experiment may offer some

insights into the preference for the extended FOV. One participant wrote that the 45° angle

condition with the “mosaic view also helped a lot, mostly through redundancy (not just the

line of river indicating curvature, but angle at which frames are connected, too)”, referring

presumably to the shape directly displayed in the mFOV. As described earlier, the tunnel

effect may have provided extra visual cues about the sharpness of curves along the route.

It should be recalled that no such “tunnel effect” was present for the mFOV condition in the

top down view. Two participants noted verbally that the mFOV condition could have

distracted them during the flyover, as the overlapping frames of the mosaic FOV displayed

the occasional jittery frame, depending on the terrain features captured in the camera’s field

of view. One of these two participants reported that in a few of the EA = 90° trials, his

attention shifted to the bottom of the mFOV as it changed shape and size. This may have led

participants to rate the dFOV condition as making it easier to identify the shape of the route.

In summary, the participants’ ratings of the Display conditions revealed some interesting

findings. First, the ordinal relationships between the single FOV and the extended FOV

conditions were consistent with the hypotheses, as well as the judges’ PCM results. However,

participants expressed a clear preference for the angled viewpoint over the top down view,

which was counter to the hypotheses and the evaluation of their route identification data.

5.4.4 Summary

The results for Experiment 3 did not support the hypothesised improvements in target

detection performance with angled viewpoints and in the extended FOV conditions: no

differences in target detection performance were found between the six combinations of

Display size and Elevation angle.


The route identification results, on the other hand, did support both hypotheses concerning

the Display size and Camera Elevation angle. For Display size, performance was judged to

be highest in the mFOV condition, followed by dFOV and then sFOV, for both EA = 45° and

90°. Pairwise contrasts revealed significant differences in the hypothesised directions, except

between {45°, dFOV} and {45°, mFOV}. For the Elevation angle, performance using the

top-down view (EA = 90°) was judged to be better than at an angled viewpoint (EA = 45°)

for all three Display sizes. Contrasts revealed that all pairwise differences were significant,

except between {45°, dFOV} and {90°, dFOV}.

The participants’ ratings of the Display conditions revealed some interesting findings. First,

the ordinal relationships between the single FOV and the extended FOV conditions were

consistent with the hypotheses, as well as the judges’ PCM results. However, participants

expressed a clear preference for the angled viewpoint over the top down view, which was

counter to the hypotheses and the evaluation of their route identification data.


Chapter 6. Conclusions

The demands of visual tasks in a number of domains have brought to light the frequently

encountered difficulty in forming and maintaining an accurate cognitive map while scanning

a scene for objects of interest. One notable example of this challenge is in aerial search, in

which operators may be tasked with searching for targets using a narrow field of view (FOV)

camera system, while maintaining global awareness of the environment in the event that a

target is detected. In doing so, the operator may be forced to trade off performance in one

task in order to accomplish the other, which may lead to targets being missed, disorientation

and/or a loss of understanding of the surrounding environment. An investigation of the

research literature on target detection and route identification tasks revealed a concerted

effort to provide means for enhancing human performance in aerial search type tasks. More

broadly, the extensive literature on how humans form cognitive maps and the range of

methods related to how cognitive maps can be externalised and evaluated provides a strong

theoretical basis for the tasks and evaluation methods used in the present study.

In considering ways to potentially improve human performance in these types of tasks, the

continued increase in computer processing power and the dropping cost of computing

hardware present an intriguing software solution in the form of real-time image mosaicing.

Essentially one is able to generate and present an artificially broadened FOV by using a

series of previously viewed image frames, aligned and stitched together from a video source.

This has some interesting properties, particularly for cases in which the camera is traversing

a landscape. The ribbon of images formed from image mosaicing directly displays the path of

the camera, without requiring the observer to mentally integrate successive image frames.

Furthermore, whenever the mosaicing algorithm uses specific features of the input image as a

reference, and whenever any camera rotations relative to that reference occur, the result is an

easily perceivable explicit rotation of the outer mosaic image frame, which is able to

communicate the extent of the relative rotation directly to the observer. It was surmised that

this shape property, in addition to the extended FOV size, would enhance operator

performance in spatial awareness tasks. Furthermore, the present study sought to investigate

the effect of viewing perspective on spatial awareness, in light of its practical implications


for conducting search tasks from an elevated viewpoint, as well as the many issues cited in

the research on perspective views in 2D and 3D displays.

6.1 Summary of experimental results

6.1.1 Experiment 1

Experiment 1 was an exploratory investigation into the potential benefits of real-time image

mosaicing. Inspired by the work of Morse et al. (2008), who found that an extended FOV

through mosaicing was helpful in a target detection task, the first experiment comprised three

major modifications to Morse et al.’s experiment. First, an additional type of display that was

twice the size (dFOV) of the single FOV (sFOV) was introduced, in order to determine

whether any potential improvements in performance with a mosaic display (mFOV) were due

to the size of the mosaic or the shape property of the mosaic. Second, the target detection

task was modified to provide a means to analyse the task under a signal detection theory

paradigm. This was accomplished by creating discrete events that either did or did not

contain targets. Third, a route identification task was introduced, by having the camera fly

over Routes that contain straight and curved sections of different lengths and curvatures. The

9 participants were asked to watch a video of the flyover for approximately 90 sec while

searching for targets. After the video ended, they were asked to identify the Route overflown

from within a 10x10 grid that contained the correct Route. It was hypothesised that

performance using the mosaiced FOV (mFOV) would be greater for both the target detection

and route identification tasks compared to sFOV and dFOV.

The results for the target detection and global tasks for Experiment 1 were not consistent

with the hypotheses. In the target detection task, there were no False Alarms recorded, thus

preventing a signal detection theory analysis. Using a simpler percent accuracy metric,

performance was found to be highest for the smallest size FOV, despite the fact that targets

appeared in the FOV twice as long in the mFOV and dFOV. The lower levels of performance

in the two extended FOV conditions were approximately equal. The influence of distractors

in the environment may have been the cause for this, as the number of target and non-target

objects to scan increased in the extended FOV conditions.


Whereas Morse et al. (2008) used targets that could be identified along a single salient

dimension of colour, the targets in the present study were more closely matched to the

environment with respect to colour, texture and size. It was surmised that this effect, in

retrospect, required participants to serially search the enlarged FOVs containing more

distractors. Coupled with inadequate scanning of the display as the objects passed across the

extended FOV, this likely led to better detection performance in the relatively small size

sFOV. Although the target detection results were surprising, they were not unprecedented.

For example, Crebolder et al. (2003) concluded that an “intermediate” FOV offered best

performance in a survey of DRDC research in aerial search tasks, stating that the results are

task dependent. Indeed, this appears to be the case here (and in Experiment 3) where the

characteristics of the targets appeared to play a role as well.

In the route identification tasks, the participants’ route selections were compared to the

correct Route by computing a metric of “distance” between the two Routes using a novel grid

response technique. Two metrics, the “Euclidean” and “City block”, were used. Neither set

of results showed any significant differences between the three display conditions. In other

words, the participants performed the task with similar accuracy despite having an extended

FOV in some conditions. It was surmised that the parity in the performances was due to the

simplicity of the scene about which they were asked to maintain spatial awareness. The

sequence of straight and curved sections in the Routes was too predictable, and the long

flyover time may not have demanded that continual attention be paid to the global task.
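
The two distance metrics named above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each Route in the response grid is identified by a (row, column) cell index, so that the "distance" between a selected Route and the correct Route is computed between two grid cells; the function names and the cell representation are hypothetical and are not taken from the experiment software.

```python
import math

def euclidean_distance(selected, correct):
    """Straight-line distance between two grid cells given as (row, col)."""
    return math.hypot(selected[0] - correct[0], selected[1] - correct[1])

def city_block_distance(selected, correct):
    """Sum of the absolute row and column offsets between two grid cells."""
    return abs(selected[0] - correct[0]) + abs(selected[1] - correct[1])

# Hypothetical example: the participant selects cell (2, 3); the correct Route is at (5, 7).
selected_cell = (2, 3)
correct_cell = (5, 7)
print(euclidean_distance(selected_cell, correct_cell))   # 5.0
print(city_block_distance(selected_cell, correct_cell))  # 7
```

Under this representation the City block metric penalises errors along each grid dimension additively, whereas the Euclidean metric treats a diagonal error as smaller than the sum of its two components.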

6.1.2 Experiment 2

The lessons learned from Experiment 1 led to a focus in Experiment 2 on the issue of

enhancing global task performance. Target detection was thus put aside in favour of devising

an appropriate environment that would require continual attention to form an accurate

cognitive map. To this end, two major changes were made. First, the top-down view of

Experiment 1 was replaced with an angled viewpoint, in order to simulate the kind of

viewing perspective that might be adopted in a SAR type task. Secondly, inspired by

complex terrain features often found in naturalistic landscapes, the environment was changed

to consist of a simulated forest terrain with a winding river running through it. The river was

designed with a number of turns of various curvatures to ensure that sections would not


reasonably be memorised. The participants flew over sections (called “Routes”) of the long

river for approximately 20 seconds, after which they were asked to identify which section of

the long river they had just flown over.

The critical factor investigated in Experiment 2 was the Height at which the camera flew

over the terrain. Increasing the Height has an effect similar to extending the FOV, as a

greater portion of the terrain can be viewed at once. On the one hand, using an altitude that

was too high might allow the route identification task to be completed without any need for

an extended FOV. Conversely, using an altitude that was too low might render the task too

difficult to accomplish under any conditions. Therefore, in order to avoid potential floor and

ceiling effects in the Route identification data, one of the goals of Experiment 2 was to select

an appropriate Height for the new spatial task environment and the sFOV condition. Four

levels of Height were tested; it was hypothesised that increasing Height above the terrain

would result in progressively better Route identification performance.

Data from 7 participants were collected in the form of their Route selections. Observing the

aggregate Route data from all participants relative to the correct Routes, it was clear that

objective computational measures such as RMS error could not adequately characterise the

subtleties in evaluating the closeness of the participants’ selections relative to the correct Route.

As Kitchin and Blades (2002) cautioned, careful attention must be paid to the method of

evaluating the accuracy of cognitive maps. As such, another objective method was devised,

whereby performance was evaluated as a series of paired comparisons. Volunteer judges

were asked to select which of two sets of route selections (each representing the collective

responses from the participants for a particular Route) more closely matched the correct

Route. The aggregated paired comparison data from 21 volunteer judges were processed

using Thurstone’s (1927) Paired Comparisons method (PCM) to produce an equal interval

scale that quantified route identification performance in terms of closeness to the Correct

Route.
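
As an illustration of how paired comparison counts can be turned into an equal interval scale, the following is a minimal sketch of Thurstone's Case V solution. It assumes Case V (equal, uncorrelated discriminal dispersions), which may differ from the exact variant used in this study, and it assumes every pair was judged at least once. The function name, the clamping of unanimous proportions and the example counts are all hypothetical.

```python
from statistics import NormalDist

def thurstone_case_v(wins):
    """Return scale values from a square matrix of preference counts.

    wins[i][j] is the number of judges who preferred condition i over
    condition j. The lowest scale value is shifted to zero.
    """
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # a condition compared with itself contributes z = 0
            total = wins[i][j] + wins[j][i]
            p = wins[i][j] / total
            # Assumption: clamp unanimous proportions so the z-score stays finite.
            p = min(max(p, 0.5 / total), 1 - 0.5 / total)
            z_sum += inv_cdf(p)
        scale.append(z_sum / n)
    lowest = min(scale)
    return [s - lowest for s in scale]

# Hypothetical counts from 24 judges comparing three conditions A, B and C.
counts = [
    [0, 18, 21],  # A preferred over B 18 times, over C 21 times
    [6, 0, 16],   # B preferred over A 6 times, over C 16 times
    [3, 8, 0],    # C preferred over A 3 times, over B 8 times
]
print(thurstone_case_v(counts))
```

Because the scale values are averages of normal deviates, differences between them can be read as distances on an interval scale, which is what allows conditions to be described as "clustered" or "well separated".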

The PCM results confirmed the hypothesis that the Height has a significant effect on Route

identification. The one way analysis and contrasts revealed that as the Height increased, the

aggregated Routes were judged to more closely match the Correct Route. The only

discrepancy occurred between Heights of 56m and 92m, where no significant difference was


found. Of equal importance within the broader research context was that a particular Height

for this particular winding river plus route identification task, H3 = 92m, could be selected

for the subsequent experiment. The third outcome of Experiment 2 was confirmation of the

viability of using the Paired Comparisons method and a group of informed external judges to

evaluate subtle yet complex differences between selected routes, in the absence of simpler

objective metrics.

6.1.3 Experiment 3

The final experiment in the present study revisited the original research question regarding

the effect of image mosaicing on target detection and route identification. As in Experiment

1, performance in both target detection and route identification tasks was evaluated, once

again across three display conditions: {sFOV, mFOV, dFOV}. Concurrently, the effect of

viewing perspective was tested, for two camera elevation angles: {EA=45°, EA=90°}, in

order to determine whether the Experiment 2 supposition regarding the expected difficulty in

forming accurate cognitive maps from an angled viewpoint was in fact valid. In total, 6

combinations of conditions were evaluated. The 13 participants were again asked to perform

a target detection task during a 25 second flyover video and then identify a Route within the

long winding river, using the same response method as in Experiment 2. They were also

presented a series of paired comparisons to indicate which conditions they felt allowed them

to more accurately identify the Correct Route.

The target detection data contained, as in Experiment 1, no False Alarms, and a simple metric

of percent detection accuracy was thus used. No difference was found in performance across

the three display conditions, and thus the hypotheses were not supported. Concerning the

factor of Elevation angle, the work of Stager (1974) suggests that an angled viewpoint would

allow objects to remain in the visual field longer due to the decreased angular velocity when

gazing away from the perpendicular underneath the aircraft. Thus it was hypothesised that

target detection at EA = 45° would be better than at EA = 90°. However, it was found that

there was no difference at the two levels of EA. It is possible that scaling the terrain and its

features to more closely approximate a realistic altitude (of approximately 1500 ft) would

have brought the results in line with those of Stager (1974).
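
The geometric argument behind this hypothesis can be sketched as follows. The sketch assumes level flight at constant speed over flat terrain and considers a ground point directly ahead on the ground track, viewed at a depression angle θ measured from the horizontal (so that θ = 90° corresponds to looking straight down). Under these assumptions the apparent angular velocity of the point is (v / h) · sin²θ. The speed and height values below are hypothetical and are not the experimental parameters.

```python
import math

def angular_velocity(speed, height, depression_deg):
    """Apparent angular velocity (rad/s) of a ground point on the track ahead.

    Assumes level flight at `speed` (m/s) and `height` (m) over flat terrain;
    `depression_deg` is the viewing angle below the horizontal.
    """
    theta = math.radians(depression_deg)
    return (speed / height) * math.sin(theta) ** 2

# Hypothetical flight parameters, for illustration only.
speed_m_s = 50.0
height_m = 92.0
top_down = angular_velocity(speed_m_s, height_m, 90.0)
angled = angular_velocity(speed_m_s, height_m, 45.0)
print(angled / top_down)  # approximately 0.5
```

In this idealised geometry a point viewed at 45° sweeps across the visual field at about half the angular rate of a point directly beneath the aircraft, which is the basis for expecting objects to remain in view longer from an angled viewpoint.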


Concerning the effect of Display size, it was hypothesised that target detection would be

better for an enlarged FOV (mFOV and dFOV) compared to sFOV. However, no significant

differences were found between the Display sizes. An important factor may have been

the structure of the task, in which participants were asked to respond to targets within a

designated area of the screen. Despite the assurance gained from knowing where on the

screen a target was detected (as opposed to the more disruptive response method used in

Experiment 1), this may have inadvertently led to no differences in performance between

Display sizes.

For the route identification task, PCM data from 24 external judges were used to derive an

equal interval scale for the six conditions along the continuum of closeness to matching the

correct Route. Performance was found to be significantly different among the six conditions.

Plotting the scale values on 2D graphs showed that route identification performance for

mFOV was judged to be superior, followed by dFOV, and then sFOV for both levels of

Elevation angle. This was confirmed by contrasts, which showed significant pairwise

differences between conditions in all but one contrast. The unique shape properties of the

image mosaic, created by the trail of images aligned and stitched together, are believed to have

facilitated a more accurate cognitive map of the routes traversed during the flyover.

Furthermore, performance at EA = 90° was judged to be superior to that at EA = 45° for all

three Display sizes, but only reached statistical significance for sFOV and mFOV.

The results from the participants’ subjective ratings of the effectiveness of the different

display conditions with regard to route identification revealed that they found the extended

FOV conditions (dFOV and mFOV) to afford better performance, although the judgments of

which of these two conditions was more effective appeared to depend on the camera

Elevation angle. Curiously, the participants found that the angled viewpoint allowed them to

identify the Route more accurately, even though the data evaluated by the external judges

were in disagreement. The evaluation of the selected Routes showed that performance in the

top down view was generally better. This may have been due to problems of perspective

foreshortening outweighing the benefits of “preview” of upcoming spatial information.


6.1.4 Synthesis

The work in this dissertation comprises three experiments motivated by the question of

whether a software technology called real-time image mosaicing can enhance performance in

spatial awareness tasks. The conclusions drawn from this investigation suggest that the

answer is a (qualified) “yes”. For the route identification tasks developed throughout

Experiments 1 – 3, the hypothesised advantages of mosaicing, including the extended FOV

and the shape of the camera’s path being displayed directly on the screen, appeared to be

helpful in tasks for identifying complex routes, relative to a FOV of equivalent or smaller

fixed size. Furthermore, the effect of Elevation angle was consistent with the literature, in

that route identification from a top down view was better than at an angled (45°) viewpoint.

However, counterintuitive results were discovered in the target detection tasks in

Experiments 1 and 3. These results were explainable when compared to previous work in

visual search (e.g. Crebolder et al., 2003), and when the important differences in target

characteristics between the present work and the work of Morse et al. (2008) were considered.

6.2 Limitations

Although the three experiments were carefully designed on the basis of both surmised

advantages of image mosaicing and past research in cognitive mapping in real-time spatial

tasks, a number of limitations are acknowledged.

The route identification task was completed in two steps: the participants watched the flyover

video using an ego-centric track up view, and then identified the route they flew over from a

number of options presented from an exocentric North up perspective. Even though this may

have been representative of actual aerial search scenarios, a transformation from one

perspective to the other nonetheless may have been a confounding factor in the experiment.

In Experiments 1 and 3, efforts were made to design the target detection task so that the data

could be analysed under a signal detection theory paradigm. This involved iterating through

several different target sizes, shapes and textures in order to develop a target, and a

background, whose discriminability was such that participants would occasionally respond to

targets when in fact there were none (i.e. False Alarms). However, no False Alarms occurred

in either Experiment. As such, although a preliminary measure of target detection


performance could be gleaned from the data, no insights could be gained about detection

sensitivity (d’) nor about the biases of the participants in responding YES or NO to targets in

the search environment.
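
The reason that the absence of False Alarms blocks a signal detection analysis can be seen from the standard definition of sensitivity, d’ = z(hit rate) − z(false alarm rate). The following minimal sketch uses hypothetical counts, not the experimental data, and a hypothetical function name; it returns no value whenever either rate is exactly 0 or 1, because the corresponding z-score is then unbounded.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Return d' = z(hit rate) - z(false alarm rate), or None if undefined."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    # z(0) and z(1) are infinite, so d' cannot be computed for these rates.
    if not (0 < hit_rate < 1 and 0 < fa_rate < 1):
        return None
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts for one participant.
print(d_prime(hits=8, misses=2, false_alarms=2, correct_rejections=8))
print(d_prime(hits=8, misses=2, false_alarms=0, correct_rejections=10))  # None
```

With a false alarm rate of zero, only the proportion of targets detected remains interpretable, which is consistent with the use of percent accuracy in Experiments 1 and 3.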

In Experiment 1, it was identified that the route identification task was too easy for the

participants, due to the relatively simple route shapes. As such, no differences were found

between the three Display sizes, which was contrary to the hypotheses.

In Experiment 2, an outlier analysis revealed that the similarity between Routes 2 and 5

resulted in a scale for Route 5 that was inconsistent compared to the scales for the other five

Routes. This presented a limitation in the selection of the Routes, and another set of Routes

was selected for Experiment 3.

In Experiment 3, markers on the Route flyover window demarcated where participants were

asked to search for targets. This was done to ensure that the participants were searching in

approximately equal areas of the display. However, admittedly in hindsight, this may have

contributed to there being no differences in performance across the display conditions.

Although the method of Paired Comparisons appeared to be appropriate for evaluating the

complex Routes in the present study, one limitation of the technique concerns the granularity

of the information contained in each individual judgement. Due to the practical

considerations of presenting a reasonable number of paired comparisons to the judges (footnote 19), the

route selections were aggregated over all participants for each of the conditions.

Nevertheless, however impractical, providing comparisons between two individual route

selections would have provided more granular data.

Another limitation of the Thurstonian method of Paired Comparisons was that the aggregate

data do not scale to conventional methods for testing statistical significance. Although a

one-way analysis and contrasts based on the Scheffé method using χ² test statistics are available,

the preferences indicated by judges violate the assumptions of the more conventional

ANOVA.

19 For example, one approach is to present only two selected routes along with one Correct route for each paired

comparison. For the three experiments in the present study, this would have resulted in hundreds of judgments

to be made. Therefore, for practical considerations, the route selections were aggregated over all participants for

each of the conditions.

Furthermore, the reader is reminded that in order to obtain data from a large number of

judges, they were recruited on a volunteer basis and were asked to complete the paired

comparisons on the Web. Although considerable effort was put into providing instructions

that explained the task the participants performed as well as the types of route selection

errors that could occur, it was impossible to know how well the judges understood the nature

of the task they were asked to perform.

6.3 Contributions

The primary contribution of this research was in showing that global spatial awareness can be

enhanced for some tasks by using real-time image mosaicing. The experimental results

revealed that the mosaicing software, by using recently captured image frames, both

extended the size of the useful FOV and directly displayed information about the shape of the

camera’s path. These features are believed to have provided a means for participants to more

easily form a cognitive map of the environment in comparison to fields of fixed shape and/or

smaller size.

A careful examination of the most common methods of evaluating cognitive maps led to the

conclusion that conventional algorithmic methods would be inadequate for assessing the

subtle complexities of Route selections carried out on the basis of the recently formed

cognitive maps. The Paired Comparisons method (PCM) proved to be an effective, although

labour-intensive, method of objectively evaluating Route selections relative to a known

Correct Route, especially when those Routes are complex. The reliance on a large number of

judges reduces the effects of bias, as long as they are well informed about the desired quality

that is to be evaluated.

Given the variety of domains in which the conflicting demands of maintaining both local and

global spatial awareness exist, it is envisioned that the concept introduced in this research of

providing an extended FOV through image mosaicing may be readily transferrable beyond

the class of visual search (and rescue) scenarios used here as examples. Furthermore, because

such solutions can be implemented solely in software, without the need for additional camera


hardware, there is a potential for the output of any existing video system to be retrofitted with

this software.

Finally, in recognition of a number of desirable characteristics for researchers to be able to

evaluate performance in cognitive mapping, novel response methods were developed for

evaluating performance in identifying flyover routes. In Experiment 1, participants selected

routes within a two-dimensional grid of alternatives, varying along two dimensions

(curvature and length ratio). The response method provides a number of benefits including

granularity, the ability to recognise the route rather than recall it, and the recording of the

entire route. Furthermore, these benefits were carried over into the response method for

Experiments 2 and 3 when more complex routes were adopted.

6.4 Suggestions for future work

The three experiments presented here represent a starting point for investigating the potential

benefits of real-time image mosaicing for enhancing spatial task performance. During the

development of the experiments, a number of issues surfaced. For example, although the

factor of Height was identified as critical to the research, there were others that fell beyond

the scope of this project but nevertheless have theoretical and practical implications for the

use of real-time mosaicing.

One of the more intriguing issues is that of objects moving in the environment. Consider, for

example, an object such as a car travelling alongside the river as the camera flies over the

terrain. Not surprisingly, using a camera system with a single size FOV, the car would be

seen to be moving within the scene until it falls outside the boundaries of the display.

Similarly, in the mosaicing condition, the car would move within the boundaries of the most

recent image frame. However, when the car exits that last image frame, that final frame

becomes stitched into the mosaic. In other words, the object that was just seen moving now

appears frozen in the image mosaic (Szeliski, 1996).

One can imagine a host of interesting real-time scenarios where this might be either useful or

a potential hindrance. For example, if a UAV were tracking a moving object using a camera

system with real-time image mosaicing, the human operator observing the camera feed may

misjudge the location of the object, as it may have changed speed or direction after appearing


in the mosaic. On the other hand, a frozen image mosaic may be useful for an operation in

which the instantaneous positional relationships among multiple objects must be tracked. In

such situations, creating a mosaic from multiple images may reveal configurations that

would not be possible with a conventional, relatively limited FOV.

Finally, as mentioned earlier, the experiments in the present study cannot be claimed to be

directly representative of actual live aerial search tasks. However, the experimental results

make a compelling case for continuing the effort to move this technology forward, into the

domains discussed earlier, such as telerobotics and surgery, histopathology and remote

camera surveillance.


References

Andre, A., Wickens, C., & Moorman, L. (1991). Display formatting techniques for

improving situation awareness in the aircraft cockpit. The International Journal of

Aviation Psychology, 1(3), 205–218.

Austin, R. (2010). Unmanned aircraft systems: UAVS design, development and deployment.

West Sussex, UK: John Wiley and Sons.

Baker, K., & Youngson, G. (2007). Advanced Integrated Multi-sensor Surveillance (AIMS)

Operator Machine Interface (OMI) Definition Study (Tech. Rep.). Toronto: Defence

R&D Canada.

Beck, R. J., & Wood, D. (1976). Cognitive Transformation of Information from Urban

Geographic Fields to Mental Maps. Environment and Behavior, 8(2), 199–238.

Bourgeois, F., Guiard, Y., & Lafon, M. B. (2001). Pan-zoom coordination in multi-scale

pointing. In CHI ’01 extended abstracts on Human factors in computing systems - CHI

’01 (p. 157). New York, New York, USA: ACM Press.

Bowman, D. (2002). Principles for the design of performance-oriented interaction

techniques. In K. Stanney (Ed.), Handbook of Virtual Environments (pp. 277–300).

Mahwah, NJ: Lawrence Erlbaum.

Boyer, B., Campbell, M., May, P., Merwin, D., & Wickens, C. D. (1995). Three-

Dimensional Displays for Terrain and Weather Awareness in the National Airspace

System. Proceedings of the Human Factors and Ergonomics Society Annual Meeting,

39(1), 6–10.

Brickner, M. S., & Foyle, D. C. (1990). Field of View Effects on a Simulated Flight Task

with Head-Down and Head-Up Sensor Imagery Displays. Proceedings of the Human

Factors and Ergonomics Society Annual Meeting, 34(19), 1567–1571.

Brown, L. (1992). A survey of image registration techniques. ACM computing surveys

(CSUR), 24(4), 325 –639.

Burros, R. (1951). The application of the method of paired comparisons to the study of

reaction potential. Psychological review(2), 60–66.

Cadwallader, M. (1979). Problems in Cognitive Distance: Implications for Cognitive

Mapping. Environment and Behavior, 11(4), 559–576.

Canter, D. (1977). The Psychology of Place. London: Architectural Press.


Carver, E. (1990). Search of imagery from airborne sensors-implications for selection of

sensor and method of changing field of view. In D. Brogan (Ed.), Visual Search.

London, United Kingdom: Taylor & Francis.

Crebolder, J., Unruh, T., & McFadden, S. (2003). Search performance using imaging

displays with restricted field of view (Tech. Rep.). Toronto, Canada: Defence R&D

Canada.

Croft, J., Pittman, D., & Scialfa, C. (2007). Gaze behavior of spotters during an air-to-ground

search. Human Factors, 49(4), 671–678.

David, H. (1988). The method of paired comparisons (2nd ed.). London: Griffin.

Draper, M., & Ruff, H. (2000). Multi-sensory displays and visualization techniques

supporting the control of unmanned air vehicles. In IEEE International Conference on

Robotics and Automation. San Francisco, California.

Drury, J. L., Riek, L., & Rackliffe, N. (2006). A decomposition of UAV-related situation

awareness. In Proceeding of the 1st ACM SIGCHI/SIGART conference on human-robot

interaction - HRI ’06 (pp. 88–94). New York, New York, USA: ACM Press.

Dunn-Rankin, P., Knezek, G. A., Wallace, S. R., & Zhang, S. (2004). Scaling methods.

Mahwah, NJ: Lawrence Erlbaum.

Durbin, J. (1951). Incomplete blocks in ranking experiments. British Journal of Statistical

Psychology, 4(2), 85–90.

Edwards, A. (1957). Techniques of attitude scale construction. New York, NY: Appleton-

Century-Crofts, Inc.

Ellis, S., McGreevy, M. W., & Hitchcock, R. J. (1987). Perspective traffic display format and

airline pilot traffic avoidance. Human Factors: The Journal of the Human Factors and

Ergonomics Society, 29(4), 371–382.

Evans, G. W., Fellows, J., Zorn, M., & Doty, K. (1980). Cognitive mapping and architecture.

Journal of Applied Psychology, 65(4), 474–478.

Ferguson, E. L., & Hegarty, M. (1994). Properties of cognitive maps constructed from texts.

Memory & cognition, 22(4), 455–73.

Fujita, N., Klatzky, R. L., Loomis, J. M., & Golledge, R. G. (2010). The Encoding-Error

Model of Pathway Completion without Vision. Geographical Analysis, 25(4), 295–314.

Gärling, T., Böök, A., & Lindberg, E. (1985). Adults’ memory representations of the spatial

properties of their everyday physical environment. In R. Cohen (Ed.), The development

of spatial cognition (pp. 141–184). Hillsdale, NJ: Lawrence Erlbaum.


Gärling, T., Böök, A., Lindberg, E., & Nilsson, T. (1981). Memory for the spatial layout of

the everyday physical environment: Factors affecting rate of acquisition. Journal of

Environmental Psychology, 1(4), 263–277.

Golledge, R. (1993). Geographical perspectives on spatial cognition. In T. Garling & R. G.

Golledge (Eds.), Behavior and environment psychological and geographical

approaches (pp. 16–46). Amsterdam: Elsevier.

Golledge, R. (1999). Human wayfinding and cognitive maps. In R. G. Golledge (Ed.),

Wayfinding behavior: Cognitive mapping and other spatial processes (pp. 5–45).

Baltimore, MD: The Johns Hopkins University Press.

Golledge, R. G. (1978). Representing, Interpreting and Using Cognized Environments.

Papers in Regional Science, 41(1), 169–204.

Gridgeman, N. (1963). Significance and adjustment in paired comparisons. Biometrics, 19(2),

213–228.

Halpern, D. (2000). Sex differences in cognitive abilities (3rd ed.). Mahwah, NJ: Lawrence

Erlbaum.

Haskell, I., & Wickens, C. (1993). Two-and three-dimensional displays for aviation: A

theoretical and empirical comparison. The International Journal of Aviation

Psychology, 3(2), 87–109.

Hodgson, M. (1998). What size window for image classification? A cognitive perspective.

Photogrammetric Engineering & Remote Sensing, 64(8), 797–807.

Hopcroft, R., Burchat, E., & Vince, J. (2006). Unmanned aerial vehicles for maritime patrol:

human factors issues (DSTO-GD-0463) (Tech. Rep.). Victoria, Australia: DSTO

Defence Science and Technology Organisation.

Irani, M., Anandan, P., Bergen, J., Kumar, R., & Hsu, S. (1996). Efficient representations of

video sequences and their applications. Signal Processing: Image Communication, 8(4),

327–351.

Irani, M., & Peleg, S. (1991). Improving resolution by image registration. CVGIP: Graphical

Models and Image Processing, 53(3), 231–239.

Jackson, J., & Fleckenstein, M. (1957). An evaluation of some statistical techniques used in

the analysis of paired comparison data. Biometrics, 13(1), 51–64.

Jeon, S., & Kim, G. J. (2008). Providing a Wide Field of View for Effective Interaction in

Desktop Tangible Augmented Reality. In 2008 IEEE Virtual Reality conference (pp. 3–

10). IEEE.


Kearns, M. J., Warren, W. H., Duchon, A. P., & Tarr, M. J. (2002). Path integration from

optic flow and body senses in a homing task. Perception, 31(3), 349–374.

Kendall, M., & Smith, B. (1940). On the method of paired comparisons. Biometrika, 31(3),

324–345.

Kitchin, R. (1996). Methodological convergence in cognitive mapping research:

Investigating configurational knowledge. Journal of Environmental Psychology, 16,

163–185.

Kitchin, R., & Blades, M. (2002). The cognition of geographic space. New York, NY: I.B.

Tauris & Co Ltd.

Kleiner, M., Brainard, D., & Pelli, D. (2011). Psychtoolbox Wiki. Retrieved June 14, 2013,

from http://psychtoolbox.org/

Kuipers, B. (1978). Modeling Spatial Knowledge. Cognitive Science, 2, 129–153.

Lavigne, V., & Ricard, B. (2005). Step-Stare Image Gathering for High-Resolution

Targeting - RTO-MP-SET-092 (Tech. Rep.). Neuilly-sur-Seine, France: RTO. Retrieved

from http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA472001

Liben, L. S. (1982). Children’s large-scale spatial cognition: Is the measure the message?

New Directions for Child and Adolescent Development, 1982(15), 51–64.

Linn, M. C., & Petersen, A. C. (1985). Emergence and characterization of sex differences in

spatial ability: a meta-analysis. Child development, 56(6), 1479–98.

Lloyd, R., & Heivly, C. (1987). Systematic Distortions in Urban Cognitive Maps. Annals of

the Association of American Geographers, 77(2), 191–207.

Lo, H. M. H. (2008). ImProViSur : An Image Processing System for Improving Visualization

for Laparoscopic Surgery (Unpublished doctoral dissertation). University of Toronto.

Loomis, J. M., Klatzky, R. L., Golledge, R. G., & Philbeck, J. W. (1999). Human navigation

by path integration. In R. G. Golledge (Ed.), Wayfinding behavior: Cognitive mapping

and other spatial processes (pp. 125–151). Baltimore, MD: Johns Hopkins University

Press.

Lowrey, R. (1970). Distance concepts of urban residents. Environment and Behavior, 2, 52–

73.

Lynch, K. (1960). The Image of the City. Harvard, MA: The MIT Press.


Mann, S. (2002). Intelligent image processing. Wiley-IEEE Press.

Michael, N., Scaramuzza, D., & Kumar, V. (2012). Special issue on micro-UAV perception

and control. Autonomous Robots, 33, 1–3.

Montello, D. (1991). The measurements of cognitive distance: methods and construct

validity. Journal of Environmental Psychology, 11, 101–122.

Montello, D., Lovelace, K. L., Golledge, R. G., & Self, C. M. (1999). Sex-Related

Differences and Similarities in Geographic and Environmental Spatial Abilities. Annals

of the Association of American Geographers, 89(3), 515–534.

Morse, B. S., Gerhardt, D., Engh, C., Goodrich, M. A., Rasmussen, N., Thornton, D., &

Eggett, D. (2008). Application and evaluation of spatiotemporal enhancement of live

aerial video using temporally local mosaics. In IEEE conference on computer vision and

pattern recognition (pp. 1–8). IEEE.

Mosteller, F. (1951). Remarks on the method of paired comparisons: I. The least squares

solution assuming equal standard deviations and equal correlations. Psychometrika,

16(1), 3–9.

O’Brien, J., & Wickens, C. (1997). Free flight cockpit displays of traffic and weather: Effects

of dimensionality and data base integration. In Proceedings of the Human Factors and

Ergonomics Society Annual Meeting (pp. 18–22).

Pietriga, E., Appert, C., & Beaudouin-Lafon, M. (2007). Pointing and beyond: an

operationalization and preliminary evaluation of multi-scale searching. In Proceedings

of the SIGCHI conference on Human factors in computing systems (pp. 1215–1224).

Pollio, J. (1968). Stereo-Photographic Mapping From Submersibles. In C. N. DeMund (Ed.),

Underwater photo-optical instrumentation applications ii.

Ross, R. (1934). Optimum orders for the presentation of pairs in the method of paired

comparisons. Journal of Educational Psychology, 375–382.

Russell, J. A., & Ward, L. M. (1982). Environmental Psychology. Annual review of

psychology, 32, 651–688.

Schmid, C., Mohr, R., & Bauckhage, C. (2000). Evaluation of Interest Point Detectors.

International Journal of Computer Vision, 37(2), 151–172.

Shum, H., & Szeliski, R. (2000). Systems and experiment paper: Construction of panoramic

image mosaics with global and local alignment. International Journal of Computer

Vision, 36(2), 101–130.


Siegel, A., & White, S. (1975). The development of spatial representations of large-scale

environments. In H. W. Reese (Ed.), Advances in child development and behavior.

Academic Press, Vol. 10.

Stager, P. (1974). Visual search capability in Search And Rescue (SAR) – DCIEM report no.

74-R-1009 (Tech. Rep.). Toronto: DCIEM: Defence and Civil Institute of

Environmental Medicine.

Stager, P., & Angus, R. (1975). Eye-movements and related performance in SAR visual

search - DCIEM report no. 75-X11 (Tech. Rep.). Toronto: DCIEM: Defence and Civil

Institute of Environmental Medicine. Retrieved from

http://pubs.drdc-rddc.gc.ca/BASIS/pcandid/www/engpub/DDW?W%3DSYSNUM=93393

Stager, P., & Angus, R. (1978). Human Factors : The Journal of the Human Factors and

Ergonomics Society.

Starks, T., & David, H. (1961). Significance tests for paired-comparison experiments.

Biometrika, 48(1), 95–108.

Szeliski, R. (1994). Image mosaicing for tele-reality applications. In Proceedings of the

Second IEEE Workshop on Applications of Computer Vision (pp. 44–53). Cambridge:

IEEE Computer Society Press.

Szeliski, R. (1996). Video mosaics for virtual environments. IEEE Computer Graphics and

Applications, 16(2), 22–30.

Szeliski, R. (2006). Image Alignment and Stitching: A Tutorial. , 273–292.

Tan, D. S., Gergle, D., Scupelli, P. G., & Pausch, R. (2004). Physically large displays

improve path integration in 3D virtual navigation tasks. Proceedings of the 2004

conference on Human factors in computing systems - CHI ’04, 6(1), 439–446.

Thorndyke, P. W., & Hayes-Roth, B. (1982). Differences in Spatial Knowledge Acquired

and Navigation from Maps and Navigation. Cognitive Psychology, 14(4), 560–589.

Thurstone, L. (1927). A law of comparative judgment. Psychological Review (34), 273–286.

Thurstone, L. (1932). Stimulus dispersions in the method of constant stimuli. Journal of

Experimental Psychology, 15(3), 284–297.

van Breda, L., & Veltman, H. A. (1998). Perspective information in a cockpit as a target

acquisition aid. Journal of Experimental Psychology: Applied, 4(1), 55–68.

van Erp, J. B. (2000). Controlling unmanned vehicles: The human factors solution

(ADPO10325) (Tech. Rep.). Soesterberg, Netherlands: TNO Human Factors Research

Institute.


Vos, J. (1990). Visual search: Trade off between magnification and field width. In D. Brogan

(Ed.), Visual search. London, United Kingdom: Taylor and Francis.

Wang, W. (2005). Human navigation performance using 6 degree of freedom dynamic

viewpoint tethering in virtual environments (Unpublished doctoral dissertation).

University of Toronto.

Warner, H., & Hubbard, D. (1992). Area-of-Interest Display Resolution and Stimulus

Characteristics Effects on Visual Detection Thresholds (Report No. AL-TR-1991-0134)

(Tech. Rep.). Williams Air Force Base: Armstrong Laboratory.

Wickens, C. D., & Hollands, J. G. (1999). Engineering psychology and human performance

(3rd ed.). Upper Saddle River, NJ: Prentice Hall.

Wickens, C. D., Liang, C. C., Prevett, T., & Olmos, O. (1996). Electronic maps for terminal

area navigation: effects of frame of reference and dimensionality. The International

journal of aviation psychology, 6(3), 241–71.

Wickens, C. D., Todd, S., & Seidler, K. (1989). Three-dimensional displays: Perception,

implementation, and applications: CSERIAC state-of-the-art report (Tech. Rep.). (Tech.

Rep. No. ARL-89-11/CSERIAC-89-1). Savoy, IL, Aviation Research Laboratory:

Aviation Research Laboratory. Retrieved from

http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA259937

Woods, D. (1984). Visual momentum: a concept to improve the cognitive coupling of person

and computer. International Journal of Man-Machine Studies, 21, 229–244.

Woods, R. L., Satgunam, P., & Bronstad, E. P. (2010). Statistical analysis of subjective

preferences for video enhancement. In Proceedings of SPIE – Human Vision and

Electronic Imaging XV (pp. 1–10).


Appendix 1. Statistical Outputs (Descriptive measures and ANOVA results)

A1.1 Experiment 1 results

Descriptive Statistics

             Mean     Std. Deviation   N
localm_D1    .8878    .10341           9
localm_D2    .7344    .15396           9
localm_D3    .7222    .18827           9
globalm_D1   2.5589   .68490           9
globalm_D2   2.3289   .50513           9
globalm_D3   2.4511   .34916           9


A1.2 Experiment 3 results

Descriptive Statistics

Mean Std. Deviation N

a45sFOV .7802 .03525 13

a45mFOV .7821 .03990 13

a45dFOV .7930 .03280 13

a90sFOV .7766 .03577 13

a90mFOV .7875 .03566 13

a90dFOV .7875 .03431 13


Appendix 2. Parameters for the long river for Experiment 2

The long river consisted of four sine functions superimposed to form a continuous winding

path. The total path length of the river was 18750m. The amplitude, frequency and phase

shift values of the four sine waves were adjusted to achieve an overall terrain with a number

and variety of curves along its path.

The following function was selected for the long river:

\[ R(x) = A_1 \sin(2\pi f_1 x + p_1) + A_2 \sin(2\pi f_2 x + p_2) + A_3 \sin(2\pi f_3 x + p_3) + A_4 \sin(2\pi f_4 x + p_4) \]

The parameters are listed in the table below.

Sinusoidal Component (i)   Amplitude (Ai)   Frequency (fi)         Phase shift (pi)
1                          0.5              1/14 = 0.0714          0.2
2                          0.04             2.7*2.97/10 = 0.8      0
3                          0.25             2.97/10 = 0.3          0
4                          0.03             4.5*2.97/10 = 1.34     0
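As an illustration only, the superposition of the four sine components can be evaluated with a short Python sketch. The names RIVER_COMPONENTS, river_offset and sample_river are hypothetical and do not come from the software used in the experiments; the units and range of x are not specified here, so the sampling interval in the usage note is an arbitrary assumption.

```python
import math

# (amplitude A_i, frequency f_i, phase shift p_i) for the four components,
# copied from the parameter table in this appendix.
RIVER_COMPONENTS = [
    (0.5, 1 / 14, 0.2),
    (0.04, 2.7 * 2.97 / 10, 0.0),
    (0.25, 2.97 / 10, 0.0),
    (0.03, 4.5 * 2.97 / 10, 0.0),
]


def river_offset(x, components=RIVER_COMPONENTS):
    """Return R(x), the sum of the sinusoidal components at position x."""
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in components)


def sample_river(x_start, x_end, n_points):
    """Sample R(x) at n_points evenly spaced positions (hypothetical helper)."""
    step = (x_end - x_start) / (n_points - 1)
    return [(x_start + k * step, river_offset(x_start + k * step))
            for k in range(n_points)]
```

For example, sample_river(0.0, 14.0, 200) returns 200 (x, R(x)) pairs spanning one period of the lowest-frequency component, which could then be plotted to inspect the variety of curves.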


Appendix 3. Aggregated Route Selections by Participants

A3.1 Experiment 2 Routes

Height 1 = 20m

Height 2 = 56m


Height 3 = 128m

Height 4 = 164m


Appendix 4. Instructions for the set of paired comparisons

A4.1 Experiment 2 Paired comparisons instructions and form

Copy of instructions for Paired comparisons method


Samples of diagram pairs in the Paired comparisons method


A4.2 Experiment 3 Paired comparisons instructions and interface

You'll be asked to respond to a series of 90 comparisons, each comparison showing two diagrams. It should take less than 20 minutes to complete. An example is shown in Figure 1.

Figure 1

Each diagram contains a single dotted red line as well as a set of solid black lines. The red line

represents the shape of part of a river that I asked participants to identify from a very long winding

river in a perceptual experiment. They first watched a video flyover over a small section of the river,

and then had to identify which section they flew over. Here’s a video showing one trial of the

experiment:

http://www.youtube.com/watch?v=dr266Kin2ws

The two diagrams represent participant responses under different experimental conditions; however, the red lines in the two diagrams will always be identical. The black lines represent the responses that the participants provided. So you will see a set of black lines whose shapes approximate the shape of the red line. But, as you can see in the diagrams, they made some errors in responding to the shape of the line. For each comparison, please indicate which collective set of black lines (TOP or BOTTOM) more closely resembles the shape of the red line.

In each set of diagrams, you may see a few black lines that are “outliers”, so that at first glance they

may seem very different from the red shape. Here are some descriptions of possible outliers.


1 - The shapes might be “mirrored” so that the turns end up reversed (i.e., turning to the left when the

river actually turned to the right), such as shown in Figure 2. In other words, you may decide that this

is not such a large error if, for example, everything else about that particular shape is very close to

the actual (red) shape.

Figure 2

2 - Only a small portion of the route may be incorrect, which may nevertheless make the overall

shape look very different. For example, participants may have misjudged the length of only one

straight segment, which may have had the effect of causing the overall path to appear to deviate

greatly after that part. Or, similarly, they may have misjudged the sharpness of only one curve, which

may have caused an apparently large change to the overall shape. In other words, as before, you

have to decide whether deviations from the ideal red shape due to such factors should be weighted

strongly or weakly.

3 - Similar to the point above, in some cases participants may have committed a relatively small error in selecting the starting point for their chosen shape. Such a slight error could have resulted in an apparently large discrepancy between the chosen and the ideal (red) shape, even if the shapes were to match almost perfectly, simply due to the fact that they are “out of phase.” This is shown in Figure 3.


Figure 3

It is imperative, therefore, that you make your selection based on the ensemble of all the routes. In

other words, make sure you consider the collective behaviour of all of the black lines, including those

for which there are apparently large deviations that may have been a consequence of some of the

relatively minor errors discussed above. Conversely, do not make your selection based on only the

smallest number of apparent outliers.

In summary, for each of the 90 comparisons, please indicate which collective set of black lines (TOP or BOTTOM) more closely resembles the shape of the red line.

Thank you for your time!

==========

The judges were presented with pairs of ensembles of selected routes (the black curves) and

were instructed for each pair of ensembles to indicate which of the two ensembles in general

more closely matches the corresponding correct route (the red curve). All judges completed

the full set of 90 comparisons by visiting a Website that presented the 90 pairs in a


randomised sequence. The judges responded by clicking on buttons indicating their selection

of either the TOP or BOTTOM set of lines, before moving onto the next comparison.
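A randomised sequence of this kind could be generated along the following lines. This is only a sketch under stated assumptions, not the code of the Website that was actually used: the function name build_comparisons is hypothetical, the figure of six routes is inferred from 90 comparisons divided by the 15 possible pairs of the six conditions, and the random assignment of each ensemble to the TOP or BOTTOM position is an assumption.

```python
import itertools
import random

# The six Experiment 3 conditions (elevation angle, display type).
CONDITIONS = ["45,sFOV", "45,mFOV", "45,dFOV", "90,sFOV", "90,mFOV", "90,dFOV"]
N_ROUTES = 6  # assumption: 90 comparisons / 15 condition pairs


def build_comparisons(conditions=CONDITIONS, n_routes=N_ROUTES, seed=None):
    """Return every condition pair for every route, in shuffled order."""
    rng = random.Random(seed)
    trials = []
    for route in range(1, n_routes + 1):
        for first, second in itertools.combinations(conditions, 2):
            # Assumption: which ensemble appears on top is decided at random.
            if rng.random() < 0.5:
                first, second = second, first
            trials.append({"route": route, "top": first, "bottom": second})
    rng.shuffle(trials)  # randomise the presentation sequence
    return trials
```

Passing a different seed for each judge would give each judge an independent ordering of the same 90 pairs.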


Appendix 5. Calculations for the linear scales using the Paired Comparisons Method (PCM)

A5.1 Experiment 2: Paired comparisons without Route 5

Raw matrix values for the paired comparisons method without the Route 5 comparisons.

H1 H2 H3 H4

H1 - 71 73 98

H2 34 - 43 74

H3 32 62 - 75

H4 7 31 30 -

The raw matrix values were converted into proportions of the total number of judgements,

5*21 = 105. Note that the diagonal entries of the matrix are filled with a proportion of 0.5, as

it is assumed that, when presented with two sets of the same Routes, the judges overall would

select one of those sets 50% of the time.

H1 H2 H3 H4

H1 0.500 0.676 0.695 0.933

H2 0.324 0.500 0.410 0.705

H3 0.305 0.590 0.500 0.714

H4 0.067 0.295 0.286 0.500

The proportions are then converted to Z-score values using the standard normal tables. The

columns in the confusion matrix are then summed and averaged over the number of stimuli

(4 in this case) to obtain the mean Z-scale for performance at each of the four Heights.

Because this is an equal interval scale, a shift of all the values does not affect the distances

between the scale values. Thus, as a last step, the scale values are shifted so that the lowest

scale value acts as an anchor at a value of 0 for the scale.


H1 H2 H3 H4

H1 0 0.457 0.511 1.501

H2 -0.457 0 -0.229 0.538

H3 -0.511 0.229 0 0.566

H4 -1.501 -0.538 -0.566 0

Sums -2.469 0.148 -0.284 2.605

Means -0.617 0.037 -0.071 0.651

Means + 0.617 0 0.654 0.546 1.269
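The steps above (proportions, Z-scores, column means, and anchoring to the lowest value) can be sketched in Python as follows. This is an illustrative sketch rather than the procedure actually used: the function name case_v_scale is hypothetical, and the standard normal tables are replaced by the inverse normal CDF from the Python standard library, so results may differ from the tabled values in the third decimal place.

```python
from statistics import NormalDist


def case_v_scale(counts, n_judgements):
    """Thurstone Case V scale values from a square matrix of raw counts.

    counts[i][j] is the raw count in row i, column j of the matrix;
    diagonal entries are ignored and treated as a proportion of 0.5 (Z = 0).
    Note: a proportion of exactly 0 or 1 has no finite Z-score and would
    raise an error here; the data in this appendix contain no such cell.
    """
    k = len(counts)
    inv_cdf = NormalDist().inv_cdf
    z = [[0.0 if i == j else inv_cdf(counts[i][j] / n_judgements)
          for j in range(k)]
         for i in range(k)]
    # Sum each column and average over the number of stimuli.
    col_means = [sum(z[i][j] for i in range(k)) / k for j in range(k)]
    # Shift so that the lowest scale value becomes the anchor at 0.
    lowest = min(col_means)
    return [m - lowest for m in col_means]


# Raw matrix for Experiment 2 without Route 5 (rows and columns H1 to H4).
raw_exp2 = [
    [0, 71, 73, 98],
    [34, 0, 43, 74],
    [32, 62, 0, 75],
    [7, 31, 30, 0],
]
scale_exp2 = case_v_scale(raw_exp2, 5 * 21)
```

The values in scale_exp2 should agree with the "Means + 0.617" row of the table to within rounding.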

The final equal interval scale is presented below.

A5.2 Experiment 3: Judge Paired comparisons

45,sFOV 45,mFOV 45,dFOV 90,sFOV 90, mFOV 90, dFOV

45, sFOV - 97 100 91 120 108

45, mFOV 47 - 68 64 80 57

45, dFOV 44 76 - 56 95 88

90, sFOV 53 80 88 - 108 88

90, mFOV 24 64 49 36 - 45

90, dFOV 36 87 56 56 99 -

[Figure: equal interval scale lines (0 to 2) locating H1 to H4, panels (a) to (c), with corresponding plots of Scale value against Height, panels (a) to (c).]


Proportions of the total number of judgements, 6*24 = 144 in this case.

45, sFOV 45, mFOV 45, dFOV 90, sFOV 90, mFOV 90, dFOV

45, sFOV 0.500 0.674 0.694 0.632 0.833 0.750

45, mFOV 0.326 0.500 0.472 0.444 0.556 0.396

45, dFOV 0.306 0.528 0.500 0.389 0.660 0.611

90, sFOV 0.368 0.556 0.611 0.500 0.750 0.611

90, mFOV 0.167 0.444 0.340 0.250 0.500 0.313

90, dFOV 0.250 0.604 0.389 0.389 0.688 0.500

Z-score values using the standard normal tables, with calculations of adjusted Z-score values

anchored to lowest value.


                45,sFOV   45,mFOV   45,dFOV   90,sFOV   90,mFOV   90,dFOV
45,sFOV          0         0.450     0.508     0.337     0.967     0.674
45,mFOV         -0.450     0        -0.070    -0.140     0.140    -0.264
45,dFOV         -0.508     0.070     0        -0.282     0.412     0.282
90,sFOV         -0.337     0.140     0.282     0         0.674     0.282
90,mFOV         -0.967    -0.140    -0.412    -0.674     0        -0.489
90,dFOV         -0.674     0.264    -0.282    -0.282     0.489     0
Sums            -2.937     0.784     0.027    -1.042     2.682     0.486
Means           -0.490     0.131     0.005    -0.174     0.447     0.081
Means + 0.490    0         0.620     0.494     0.316     0.937     0.571
Means*√2         0         0.877     0.699     0.447     1.32      0.807

The final equal interval scale and 2D plots for the Judge Paired comparisons in Experiment 3

are shown below.


A5.3 Experiment 3: Participant Paired comparisons

45, sFOV 45, dFOV 45, mFOV 90, sFOV 90, dFOV 90, mFOV

45, sFOV - 11 10 2 4 5

45, dFOV 2 - 8 1 4 2

45, mFOV 3 5 - 1 3 1

90, sFOV 11 12 12 - 12 11

90, dFOV 9 9 10 1 - 6

90, mFOV 8 11 12 2 7 -

Proportions of the total number of judgements, 13 in this case.

[Figure: equal interval scale line (0 to 1.4) locating the six conditions (45,sFOV; 45,mFOV; 45,dFOV; 90,sFOV; 90,mFOV; 90,dFOV); plot of Scale value against Display size (sFOV, dFOV, mFOV) for EA = 45 and EA = 90; plot of Scale value against Elevation angle (45°, 90°) for sFOV, dFOV and mFOV.]



           45,sFOV   45,dFOV   45,mFOV   90,sFOV   90,dFOV   90,mFOV
45,sFOV    0.5       0.85      0.77      0.15      0.31      0.38
45,dFOV    0.15      0.5       0.62      0.08      0.31      0.15
45,mFOV    0.23      0.38      0.5       0.08      0.23      0.08
90,sFOV    0.85      0.92      0.92      0.5       0.92      0.85
90,dFOV    0.69      0.69      0.77      0.08      0.5       0.46
90,mFOV    0.62      0.85      0.92      0.15      0.54      0.5

Z-score values using the standard normal tables, with calculations of adjusted Z-score values

anchored to lowest value.


               45,sFOV   45,dFOV   45,mFOV   90,sFOV   90,dFOV   90,mFOV
45,sFOV         0         1.02      0.74     -1.02     -0.50     -0.29
45,dFOV        -1.02      0         0.29     -1.43     -0.50     -1.02
45,mFOV        -0.74     -0.29      0        -1.43     -0.74     -1.43
90,sFOV         1.02      1.43      1.43      0         1.43      1.02
90,dFOV         0.50      0.50      0.74     -1.43      0        -0.10
90,mFOV         0.29      1.02      1.43     -1.02      0.10      0
Sums            0.06      3.68      4.62     -6.32     -0.22     -1.82
Means           0.01      0.61      0.77     -1.06     -0.04     -0.30
Means + 1.06    1.07      1.67      1.82      0         1.02      0.76

The final equal interval scale and 2D graphs for the Participant Paired comparisons in

Experiment 3 are shown below.

[Figure: equal interval scale line (0 to 2) locating the six conditions (45,sFOV; 45,dFOV; 45,mFOV; 90,sFOV; 90,dFOV; 90,mFOV); plot of Scale value against Display size (sFOV, dFOV, mFOV) for EA = 45 and EA = 90; plot of Scale value against Elevation angle (45°, 90°) for sFOV, dFOV and mFOV.]


Appendix 6. Statistical tests for assumptions of Thurstone’s Case V method

A6.1 Experiment 2: Paired comparisons without Route 5

Edwards (1957) details a procedure to check the assumptions of Thurstone’s Case V method

for the Paired Comparisons method (PCM), namely the property of additivity of the obtained

scale values. For example, consider three stimuli with the scale values ZS1 < ZS2 < ZS3 with

equal discriminal dispersions. If additivity holds, then obtaining D12 as the distance between

ZS1 and ZS2 on the scale and D23 as the distance between ZS2 and ZS3 implies that the

distance between ZS1 and ZS3 should equal D12 + D23. To test this assumption, Mosteller

(1951) developed a χ² test of significance that is sensitive to this property of additivity as

well as the other assumptions of the Case V model (Edwards, 1957).

The test involves the discrepancies between the ‘observed’ and ‘theoretical’ proportions of

the experimental data using an arcsine transformation given by:

\[ \Theta = \arcsin\sqrt{p} \]

where Θ, expressed in degrees, is approximately normally distributed with variance equal to:

\[ \sigma_{\Theta}^{2} = \frac{821}{N} \]

The ‘observed’ values refer to the proportions obtained from the paired comparisons while

the ‘theoretical’ proportions are determined by taking the mean scale values obtained from

Thurstone’s PCM, and computing back to proportions for each of the entries in the confusion

matrix (Edwards, 1957).

Note that special consideration must be taken for the value of N with regard to Experiment

2. There were 21 judges performing the paired comparisons for the 4 stimuli (different

heights), meaning 4(4 − 1)/2 = 6 pairs. Furthermore, each of the 6 pairs of comparisons

was repeated for each of the six Routes, for a total of 36 paired comparisons per judge. It is

believed that the original interpretation of N representing the number of judges is somewhat

misleading in this case, as each judge in essence provides a multitude of sets of judgements

for the PCM, albeit for pairs representing different instances of the same perceptual task. In

other words, five of the six sets of judgments would not be accounted for under the original

interpretation of the statistical test. For this reason, the value was modified to be N = 21*6 = 126.

First, the observed proportions are taken from the lower half of the confusion matrix,

representing the 6 paired comparisons.

Observed proportions, pij

H1 H2 H3 H4

H1

H2 0.324

H3 0.305 0.590

H4 0.067 0.295 0.286

The arcsine transformation is applied to all the values to form Θij.

Arcsine transformed observed proportions, Θij

H1 H2 H3 H4

H1

H2 34.68

H3 33.51 50.21

H4 14.96 32.91 32.31
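As a minimal sketch of this step, the transformation can be reproduced from the tabulated proportions. Python is used here purely for illustration, and the helper name `arcsine_deg` is hypothetical. The results agree with the table to within a few hundredths of a degree; the small differences arise because the tabulated proportions are rounded to three decimals.

```python
import math

def arcsine_deg(p: float) -> float:
    """Arcsine transformation of a proportion, expressed in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))

# Observed proportions from the lower half of the Experiment 2 matrix,
# keyed by (row, column).
observed = {
    ("H2", "H1"): 0.324,
    ("H3", "H1"): 0.305, ("H3", "H2"): 0.590,
    ("H4", "H1"): 0.067, ("H4", "H2"): 0.295, ("H4", "H3"): 0.286,
}

theta = {pair: arcsine_deg(p) for pair, p in observed.items()}

for pair, angle in theta.items():
    print(pair, round(angle, 2))
```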

Next, the theoretical proportions p’ij are calculated from the scale values obtained from the PCM, Zi = {0, 0.654, 0.546, 1.27}. To find each entry, with i denoting the column and j the row:

$$Z'_{ij} = Z_i - Z_j$$

For example, Z’12, designating the theoretical Z score for the preference of stimulus 2 over stimulus 1, is:

$$Z'_{12} = Z_1 - Z_2 = 0 - 0.654 = -0.654$$

The matrix of theoretical scale values is filled in for the lower half of the matrix.

Theoretical scale values, Z’ij

      H1       H2       H3       H4
H1
H2    -0.654
H3    -0.546   0.108
H4    -1.270   -0.616   -0.724

The theoretical Z’ij values are then converted back to proportions:

Theoretical proportions, p’ij

      H1     H2     H3     H4
H1
H2    0.26
H3    0.29   0.54
H4    0.10   0.27   0.23

The theoretical proportions are converted using the arcsine transformation:

Arcsine transformed theoretical proportions, Θ’ij

      H1      H2      H3      H4
H1
H2    30.43
H3    32.74   47.47
H4    18.63   31.24   28.97
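The chain from scale values to transformed theoretical proportions can be sketched as follows. This is an illustrative reconstruction rather than the original calculation: it assumes that each lower-half entry is the column scale value minus the row scale value (which reproduces the tabulated Z’ij), and that the conversion back to proportions uses the standard normal cumulative distribution. The variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Case V scale values reported for the four heights in Experiment 2.
scale = {"H1": 0.0, "H2": 0.654, "H3": 0.546, "H4": 1.27}
labels = list(scale)

def arcsine_deg(p: float) -> float:
    """Arcsine transformation of a proportion, expressed in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))

z_theory, p_theory, theta_theory = {}, {}, {}
for r, row in enumerate(labels):
    for col in labels[:r]:  # lower half of the matrix only
        z = scale[col] - scale[row]        # assumed: column minus row
        p = NormalDist().cdf(z)            # Z score back to a proportion
        z_theory[(row, col)] = z
        p_theory[(row, col)] = p
        theta_theory[(row, col)] = arcsine_deg(p)
```

Working from unrounded proportions, the resulting angles match the tabulated Θ’ij values to two decimals or very nearly so.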

Using the Θij and Θ’ij matrices, the test statistic can be computed over the entries of the lower half of the matrix as:

$$\chi^2 = \frac{N}{821}\sum_{i<j}\left(\Theta_{ij} - \Theta'_{ij}\right)^2$$

Under this test, failing to reject the null hypothesis indicates that the assumptions of the Case V method are tenable (Edwards, 1957). For the data in Experiment 2, $\chi^2 = 8.23$. This was compared to a critical value $\chi^2_C(3, N = 21 \times 6) = 7.82$. As such, $\chi^2 > \chi^2_C$, and the assumptions of the Case V method involved in finding the scale values for the Experiment 2 data were not tenable.

It should be noted that using the original value of N = 21 would have resulted in the test statistic $\chi^2 = 1.37$, and therefore $\chi^2 < \chi^2_C$. In this case, the assumptions of the Case V method

would have been tenable. However, follow up calculations of the discriminal dispersions

revealed that they were in fact quite different from each other (as described in Appendix 7),

which was in violation of one of the assumptions of the Case V model. Although that test is

most sensitive to the property of additivity, it has also been shown in some cases to detect

violations of equal dispersions between the stimuli (Edwards, 1957). Taking into

consideration the interpretation proposed by Edwards (1957), the modification of the value of

N was deemed to be appropriate for the data in Experiment 2.
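The effect of the choice of N can be checked with a short calculation. The sketch below assumes that the statistic is the sum of squared discrepancies between the transformed observed and theoretical proportions divided by a variance of 821/N (angles in degrees); this assumption reproduces the statistics quoted above. The function name `mosteller_chi2` is hypothetical, and the critical value is simply the one quoted in the text.

```python
# Arcsine transformed observed and theoretical proportions (degrees),
# copied from the Experiment 2 tables in the order
# (H2,H1), (H3,H1), (H3,H2), (H4,H1), (H4,H2), (H4,H3).
theta_obs = [34.68, 33.51, 50.21, 14.96, 32.91, 32.31]
theta_theory = [30.43, 32.74, 47.47, 18.63, 31.24, 28.97]

CRITICAL = 7.82  # critical value quoted in the text (3 degrees of freedom)

def mosteller_chi2(obs, theory, n):
    """Sum of squared discrepancies divided by the assumed variance 821/n."""
    ss = sum((o - t) ** 2 for o, t in zip(obs, theory))
    return n * ss / 821.0

results = {}
for n in (21, 21 * 6):
    chi2 = mosteller_chi2(theta_obs, theta_theory, n)
    results[n] = chi2
    verdict = "not tenable" if chi2 > CRITICAL else "tenable"
    print(f"N = {n}: chi2 = {chi2:.2f}, Case V assumptions {verdict}")
```

With the tabulated angles this gives approximately 1.37 for N = 21 and 8.22 for N = 126; the latter agrees with the reported 8.23 to within the rounding of the tabulated values.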


A6.2 Experiment 3: Participant Paired comparisons



Observed proportions, pij

45, sFOV
45, dFOV   0.15
45, mFOV   0.23   0.38
90, sFOV   0.85   0.92   0.92
90, dFOV   0.69   0.69   0.77   0.08
90, mFOV   0.62   0.85   0.92   0.15   0.54

45, sFOV
45, dFOV   1.33
45, mFOV   1.69   1.82
90, sFOV   1.80   1.92   2.19
90, dFOV   1.01   1.21   1.60   1.71
90, mFOV   0.93   1.14   1.55   1.67   0.75

Theoretical scale values, Z’ij

45, sFOV
45, dFOV   -0.52
45, mFOV   -0.69   -0.18
90, sFOV   0.92    1.44    1.62
90, dFOV   0.03    0.55    0.72    -0.90
90, mFOV   0.35    0.87    1.04    -0.57   0.32

Theoretical proportions, p’ij

45, sFOV
45, dFOV   0.30
45, mFOV   0.24   0.43
90, sFOV   0.82   0.93   0.95
90, dFOV   0.51   0.71   0.77   0.18
90, mFOV   0.64   0.81   0.85   0.28   0.63

Arcsine transformed theoretical proportions, Θ’ij

45, sFOV
45, dFOV   33.33
45, mFOV   29.58   40.99
90, sFOV   65.08   74.17   76.73
90, dFOV   45.64   57.28   61.00   25.47
90, mFOV   52.95   63.98   67.37   32.12   52.32

The test statistic, computed as $\chi^2 = \frac{N}{821}\sum\left(\Theta_{ij} - \Theta'_{ij}\right)^2$, was compared to the critical value $\chi^2_C(6, N = 13) = 18.31$. As such, $\chi^2 < \chi^2_C$, and the assumptions of the Case V method involved in finding the scale values for the Experiment 3 data were tenable.

A6.3 Experiment 3: Judge Paired comparisons



Observed proportions, pij

45, sFOV
45, mFOV   0.33
45, dFOV   0.31   0.53
90, sFOV   0.37   0.56   0.61
90, mFOV   0.17   0.44   0.34   0.25
90, dFOV   0.25   0.60   0.39   0.39   0.69

Arcsine transformed observed proportions, Θij

45, sFOV
45, mFOV   34.84
45, dFOV   33.56   46.59
90, sFOV   37.35   48.19   51.42
90, mFOV   24.09   41.81   35.69   30.00
90, dFOV   30.00   51.01   38.58   38.58   56.01

Theoretical scale values, Z’ij

45, sFOV
45, mFOV   -0.62
45, dFOV   -0.49   0.13
90, sFOV   -0.32   0.30    0.18
90, mFOV   -0.94   -0.32   -0.44   -0.62
90, dFOV   -0.57   0.05    -0.08   -0.25   0.37

Theoretical proportions, p’ij

45, sFOV
45, mFOV   0.27
45, dFOV   0.31   0.55
90, sFOV   0.38   0.62   0.57
90, mFOV   0.17   0.38   0.33   0.27
90, dFOV   0.28   0.52   0.47   0.40   0.64

Arcsine transformed theoretical proportions, Θ’ij

45, sFOV
45, mFOV   31.15
45, dFOV   33.87   47.88
90, sFOV   37.82   51.92   49.06
90, mFOV   24.69   37.81   35.00   31.14
90, dFOV   32.21   46.13   43.25   39.20   53.30

The test statistic, computed as $\chi^2 = \frac{N}{821}\sum\left(\Theta_{ij} - \Theta'_{ij}\right)^2$, was compared to the critical value $\chi^2_C(6, N = 24 \times 6) = 18.31$. As such, $\chi^2 < \chi^2_C$, and the assumptions of the Case V method involved in finding the scale values for the Experiment 3 data were tenable.


Appendix 7. Estimates of the Discriminal Dispersions for Paired Comparison Method (PCM)

Estimates of the standard deviations of each of the psychological stimuli, referred to as

discriminal dispersions, can be calculated from the Paired comparisons Z score matrix. A

derivation of the formulas used to calculate these dispersions is offered in Edwards (1957). In

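essence, the equations for the scale values are rearranged to relate the dispersion terms:

$$S_i - S_j = Z_{ij}\sqrt{\sigma_i^2 + \sigma_j^2}$$

$$S_k - S_j = Z_{kj}\sqrt{\sigma_k^2 + \sigma_j^2}$$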

These equations can be manipulated such that all variables can be related through

expressions containing the subscripts i and k only. In addition, a substitution Vi is used to

represent the standard deviation of the ith row of the confusion matrix. A series of

calculations are carried out, shown in the table below to calculate the discriminal dispersions.

                        H1       H2       H3       H4
H1                      0        0.457    0.511    1.501
H2                      -0.457   0        -0.229   0.538
H3                      -0.511   0.229    0        0.566
H4                      -1.501   -0.538   -0.566   0
(1) ∑Zij²               2.723    0.551    0.634    2.863
(2) ∑Zij                -2.469   0.148    -0.284   2.605
(3) (∑Zij)²/n           1.524    0.005    0.020    1.697
(4) ∑Zij² − (∑Zij)²/n   1.199    0.545    0.613    1.166
(5) V²                  0.300    0.136    0.153    0.292
(6) V                   0.548    0.369    0.392    0.540
(7) 1/V                 1.826    2.708    2.554    1.852
(8) σ                   0.634    1.423    1.285    0.657

Taking the values from Row 7, the value of the constant ‘a’ can be calculated:

$$a = \frac{2n}{\sum_i \left(1/V_i\right)} = \frac{2(4)}{8.940} = 0.895$$

The discriminal dispersions for each of the column entries of the Z scale matrix can be computed as:

$$\sigma_i = a\left(\frac{1}{V_i}\right) - 1$$

As shown in Row 8, the discriminal dispersions of the four stimuli were quite different,

ranging from 0.657 to 1.423. This was consistent with the results of testing the assumptions

of the Case V model for the Experiment 2 data, which indicated that the assumptions were

not tenable. The scale values using the Case III method can now be computed using the

estimates of the discriminal dispersions and the scale values from the Case V method

(Edwards, 1957). For each entry in the corrected Z matrix, denoted Zc, the new value is computed as:

$$Z_{c,ij} = Z_{ij}\sqrt{\sigma_i^2 + \sigma_j^2}$$

Using the corrected Z matrix, the new scale values are computed.

Corrected Z matrix, Zc

                    H1       H2       H3       H4
H1                  0        0.712    0.732    1.371
H2                  -0.712   0        -0.439   0.844
H3                  -0.732   0.439    0        0.817
H4                  -1.371   -0.844   -0.817   0
(1) Sums            -2.815   0.307    -0.524   3.032
(2) Means           -0.704   0.077    -0.131   0.758
(3) Means + 0.704   0        0.781    0.573    1.462

The corrected Z scale values for the four heights, as shown in Row 3, are Zc = {0, 0.781,

0.573, 1.462}. The equal interval scale and 2D plot for the Experiment 2 data under the Case

III model are shown in the figure, top and bottom respectively.

[Figure: equal interval scales (top; panels a, b and c, each with a horizontal axis from 0 to 2 on which H1 to H4 are marked) and 2D plots of Scale value against Height (bottom; panels a, b and c) for the Experiment 2 data.]
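The calculations in this appendix can be retraced with the sketch below. It is an illustrative reconstruction, not the original worksheet: it assumes that V is the population standard deviation of each column of the Z matrix, that the constant a equals 2n divided by the sum of the 1/V values, that each dispersion is a(1/V) − 1, and that each corrected entry is the Case V entry multiplied by the square root of the sum of the two squared dispersions. These assumptions reproduce the tabulated values to within rounding. All variable names are hypothetical.

```python
import math

labels = ["H1", "H2", "H3", "H4"]
# Case V Z score matrix for Experiment 2 (rows and columns H1..H4).
Z = [
    [0.0, 0.457, 0.511, 1.501],
    [-0.457, 0.0, -0.229, 0.538],
    [-0.511, 0.229, 0.0, 0.566],
    [-1.501, -0.538, -0.566, 0.0],
]
n = len(Z)

def column(matrix, j):
    return [row[j] for row in matrix]

def pop_sd(values):
    """Population standard deviation (divides by n, as in Rows 4 to 6)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Rows 1 to 7: standard deviation V of each column and its reciprocal.
inv_v = [1.0 / pop_sd(column(Z, j)) for j in range(n)]

# Constant a and the discriminal dispersions (Row 8).
a = 2 * n / sum(inv_v)
sigma = [a * iv - 1.0 for iv in inv_v]

# Case III correction of every entry of the Z matrix.
Zc = [[Z[i][j] * math.sqrt(sigma[i] ** 2 + sigma[j] ** 2) for j in range(n)]
      for i in range(n)]

# Column means, shifted so that the scale value of H1 is zero.
means = [sum(column(Zc, j)) / n for j in range(n)]
scale_case3 = [m - means[0] for m in means]

print([round(s, 3) for s in sigma])
print([round(s, 3) for s in scale_case3])
```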

Appendix 8 – Contrasts for Paired Comparisons

A8.1 Experiment 2: Height Paired comparison contrasts

The contrast method creates a $Q^2$ test statistic, matched against a critical $\chi^2$ value, based on

the aggregated column data of the raw matrix PCM data. For detailed proofs of the method,

please refer to Starks and David (1961).

The sum of each column is computed as ai.

      H1    H2    H3    H4
H1    -     71    73    98
H2    34    -     43    74
H3    32    62    -     75
H4    7     31    30    -
ai    242   151   169   68

For each desired treatment contrast, compute $Q^2$ using the difference between the two values of ai for the respective treatments:

$$Q^2 = \frac{4\,(a_i - a_j)^2}{n\,t}$$

where n is the number of replications of each paired comparison and t is the number of treatments.

The value of $Q^2$ is compared against the critical value of the D statistic developed in the one-way statistical analysis for differences among the treatments, evaluated at α = 0.01 and α = 0.05. Contrasts whose value of $Q^2$ exceeds the critical value of 22.89 are considered statistically significant.
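The contrast calculation can be illustrated with the aggregated scores for the four heights. The sketch assumes that the statistic is four times the squared difference of the two scores divided by the product of n and t, with n = 21 × 6 replications per pair and t = 4 treatments; these values are inferred because they reproduce the tabulated contrasts, and the function name `q_squared` is hypothetical.

```python
from itertools import combinations

labels = ["H1", "H2", "H3", "H4"]
a = {"H1": 242, "H2": 151, "H3": 169, "H4": 68}  # aggregated scores a_i

t = len(labels)   # number of treatments
n = 21 * 6        # replications per pair (assumed; reproduces the table)
CRITICAL = 22.89  # critical value quoted in the text

def q_squared(ai, aj, n, t):
    """Contrast statistic for a pair of treatments (assumed form)."""
    return 4.0 * (ai - aj) ** 2 / (n * t)

contrasts = {(i, j): q_squared(a[i], a[j], n, t)
             for i, j in combinations(labels, 2)}
significant = {pair for pair, q in contrasts.items() if q > CRITICAL}

for pair, q in contrasts.items():
    flag = "significant" if pair in significant else "not significant"
    print(pair, round(q, 2), flag)
```

Under these assumptions, only the contrast between H2 and H3 falls below the critical value.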


      H1    H2        H3        H4
H1    -     65.72**   42.29**   240.29**
H2          -                   54.67**
H3          2.57      -         80.96**
H4                              -

* indicates that the contrast between column and row elements was found to be statistically significant at α = 0.05, with the column element being dominant over the row element.

** indicates that the contrast between column and row elements was found to be statistically significant at α = 0.01, with the column element being dominant over the row element.
A8.2 Experiment 3: Judge Paired comparison contrasts

            45, sFOV   45, mFOV   45, dFOV   90, sFOV   90, mFOV   90, dFOV
45, sFOV    -          97         100        91         120        108
45, mFOV    47         -          68         64         80         57
45, dFOV    44         76         -          56         95         88
90, sFOV    53         80         88         -          108        88
90, mFOV    24         64         49         36         -          45
90, dFOV    36         87         56         56         99         -
ai          204        404        361        303        502        386

             45°, sFOV   45°, mFOV   45°, dFOV   90°, sFOV   90°, mFOV   90°, dFOV
45°, sFOV    -           185.19**    114.12**    45.38**     411.13**    154.35**
45°, mFOV                -                                   44.46**
45°, dFOV                8.56        -                       92.04**     2.89
90°, sFOV                47.23**     15.57       -           183.34**    31.89**
90°, mFOV                                                    -
90°, dFOV                1.5                                 62.30**     -





A8.3 Experiment 3: Participant Paired comparison contrasts


45, sFOV    -     11    10    2     4     5
45, dFOV    2     -     8     1     4     2
45, mFOV    3     5     -     1     3     1
90, sFOV    11    12    12    -     12    11
90, dFOV    9     9     10    1     -     6
90, mFOV    8     11    12    2     7     -
ai          32    17    13    58    35    40

             45°, sFOV   45°, dFOV   45°, mFOV   90°, sFOV   90°, dFOV   90°, mFOV
45°, sFOV    -           11.54       18.51
45°, dFOV                -           0.82
45°, mFOV                            -
90°, sFOV    34.67**     86.21**     103.85**    -           27.13*      16.62
90°, dFOV    0.46        16.62       24.82*                  -
90°, mFOV    3.28        27.13*      37.38**                 1.28        -

Appendix 9 – Calculation of Number of Mosaiced Frames for Equivalent Size to dFOV Condition

In order to determine the number of frames ‘N’ to create an image mosaic whose size was

approximately equal to that of the double size field of view (dFOV) condition, a Matlab

program was written to compute the number of screen pixels contained in an image frame.

Image frames from the single field of view (sFOV) and dFOV conditions were fed through

the algorithm. Next, the same subroutine was run for a series of videos using an mFOV display condition with different values of N, to determine empirically the size of mFOV that was reasonably close to the size of the dFOV condition. It was determined that an image mosaic composed of 10 frames was approximately equivalent to the display size of the dFOV.
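The original Matlab program is not reproduced here. As a rough sketch of the idea only, the Python fragment below counts the non-background pixels in a frame and picks the number of mosaiced frames whose pixel count lies closest to that of the dFOV frame. The function names, the background value and the toy frames are all hypothetical and stand in for captured screen images.

```python
def count_image_pixels(frame, background=0):
    """Count the pixels in a frame that are not background."""
    return sum(1 for row in frame for px in row if px != background)

def closest_frame_count(mosaic_frames_by_n, target_frame):
    """Return the N whose mosaic pixel count is closest to the target frame."""
    target = count_image_pixels(target_frame)
    return min(
        mosaic_frames_by_n,
        key=lambda n: abs(count_image_pixels(mosaic_frames_by_n[n]) - target),
    )

def make_frame(filled_cols, rows=4, cols=12):
    """Toy frame: the first `filled_cols` columns contain image pixels."""
    return [[1 if c < filled_cols else 0 for c in range(cols)]
            for _ in range(rows)]

# Toy data only: one captured frame per candidate value of N.
dfov_frame = make_frame(8)
mosaics = {5: make_frame(6), 10: make_frame(8), 15: make_frame(11)}

best_n = closest_frame_count(mosaics, dfov_frame)
print(best_n)
```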


Appendix 10 – Additional approaches and pilot tests

A number of alternatives for the experimental tasks, response methods, experimental factors

and analyses were investigated throughout the present study. The following is a brief account

of those considerations.

A10.1 Experiment 1

Display sizes

Three Display sizes were selected for the experiments in the present study: {sFOV, mFOV,

dFOV}. The double size FOV was selected to be the same size as that of the mFOV, but

without the unique shape property showing the shape of the camera’s path on the screen.

Both the dFOV and mFOV were twice the size of the single FOV.

Prior to selecting these three Display sizes, a number of alternatives were considered, shown

in the table below, taking into account both the resolution of the camera sensor and the

resolution of the monitor on which the information is being displayed. As well, the table

includes Display conditions where the speed and height of traversal are also manipulated.

The displays selected for investigation in the present study are Cases A, B and F for the

sFOV, mFOV and dFOV, respectively.

                                                       Monitor                    Camera
Case   Description                                     No. of    Monitor   No. of    Distance
                                                       monitor   size      sensor    covered
                                                       pixels              pixels    by FOV
A      Control condition                               MR        MS        K         L
B      Add mosaicing                                   2MR       2MS       2K        2L
C      ½ live FOV + ½ mosaic                           MR        MS        K         L
D      Add mosaicing + 2X speed                        2MR       2MS       2K        2L
E      Add mosaicing + 2X height                       2MR       2MS       2K        4L
F      2X screen size + wide angle camera system       2MR       2MS       2K        2L
G      2X screen size + keep narrow FOV                2MR       2MS       K         L
H      Same screen size + wide angle camera system     MR        MS        2K        2L
I      Same screen size + ½X speed                     MR        MS        K         L
J      Same screen size + ½X height                    MR        MS        K         ½L

Monitor
MR = no. of monitor pixels, in units
MS = monitor size, in cm

Camera
K = no. of camera sensor pixels, in units
L = distance covered by camera, in metres

The Display condition in Case C was pilot tested but ultimately abandoned. Case C

represents a Display whose total display area equalled the sFOV, but was composed of a

fixed size display of half the length of the sFOV with the remaining half as a mosaic. It was

of academic interest since it provided a display exhibiting properties of a mosaic, but equal in

size to the single FOV. However, in pilot testing Case C, it was observed that, for the most

part, the displayed information looked identical to that of the sFOV. In the absence of a

surmised benefit of this admittedly contrived display configuration, Case C was abandoned.

Targets

Although stationary targets were ultimately used in Experiments 1 and 3, considerable effort

was put into investigating the effect of moving targets on spatial performance. Because the

mosaic operates on images of the recent past, objects moving in the “live FOV” appear

frozen in the mosaic. However, because the objects continue to move in the terrain, the

mosaic shows “stale” information, in that the display no longer displays accurate spatial

information of the objects in the terrain. It was surmised that observers tasked with detecting

and localising moving targets might benefit from the extra time to locate targets, but their

ability to localise them in the real world would suffer upon the basis of the mosaic presenting

“stale” imagery.

Another interesting consequence is that objects leaving the “live FOV” and entering the

mosaiced portion of the display appear distorted in the mosaic, shown below, creating a

number of artefacts in the displayed information. This is cited in a number of papers,

including Morse et al. (2008), as a source of error in the mosaicing algorithm. The extent to

which the object appears distorted depends on the relative velocity (i.e. both magnitude and

direction) between the FOV and the object. It was discovered through several iterations of

pilot testing that the effects of distortion were too difficult to control for, because of the

interaction between the magnitude and direction of travel of the FOV and objects over

straight and curved trajectories. For this reason, the investigation of moving objects was put

aside as important future work for investigating the effect of mosaicing on target detection.


Target detection task

Participants responded in the target detection task in Experiment 1 by first hitting the

spacebar on the keyboard and then indicating the location of the target using the mouse. This

was done to ensure that participants were actually responding to targets they detected in the

environment, as opposed to simply guessing that a target was present. This was one of

several alternatives considered for the response method for the target detection task.

One method that was considered was to use an N-alternative forced choice model of signal

detection. In a forced choice model, each event (i.e. a portion of the flyover route) is divided

into N areas, or ‘alternatives’, only one of which contains a target. The figure below shows

an event divided into alternatives A and B with a target located in alternative A. The

participant is asked to indicate whether the target appears in A or B, after having flown over

the terrain. In other words, he is forced to select one of the alternatives. The rationale is that

the participant must indicate with some granularity, where along the event the target was

detected. One can imagine subdividing the event into any number of alternatives so that the

participant provides information about the approximate location of the target while

performing the task.


This approach was abandoned for two reasons. First, it became clear that the attention paid to

the target detection task would vary depending on if and when the participant had detected

the target. For example, if the participant detected the target towards the start of the zone, he

would no longer need to search for the target until the start of the next event. In this case, the

participant would be able to allocate his attention to the route identification task. However, if

the participant had not detected a target, because no target was present up until that point or

the target was missed, then attention would have to be devoted to target detection. In other

words, the attentional demands in accomplishing the detection task would vary throughout

the task, which would have varying effects on performance in the route identification task.

Second, it was unclear how to score a situation where the participant detected a target in one

zone, when in fact it appeared in a different zone. For example, if the participant selects

alternative B, when the correct answer was alternative A, it could be scored as a Miss (since

he missed the target in the first half of the event) or a False Alarm (since he detected a target

in the second half of the event when there was none). Because of the ambiguity resulting

from this approach, the N-alternative forced choice model was abandoned.

Analysis of route identification data

A number of analysis methods were attempted for the route identification data in Experiment

1. The methods described in the main text relate to “distance errors” between the selected

route and the Correct route.


One alternative approach was to allow some error tolerance in the participants’ responses.

For example, errors below some threshold distance away from the Correct route could be

treated as being a correct response. The figure below shows examples of ‘Hit zones’, for

what would be considered correct route selections for distances of less than 2, 3 and 4 units

away from the Correct route.
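The hit zone scoring can be sketched as below. The distance errors are hypothetical illustrative values rather than experimental data, and the function name `hit_rate` is an assumption; a response counts as correct when its distance from the Correct route is less than the chosen threshold.

```python
def hit_rate(distance_errors, threshold):
    """Percentage of responses whose distance error is below the threshold."""
    hits = sum(1 for d in distance_errors if d < threshold)
    return 100.0 * hits / len(distance_errors)

# Hypothetical distance errors (in grid units) for one Display condition.
errors = [0, 1, 1, 2, 3, 3, 4, 5]

# Hit zones of less than 2, 3 and 4 units from the Correct route.
rates = {threshold: hit_rate(errors, threshold) for threshold in (2, 3, 4)}
print(rates)
```

Repeating the calculation for each Display condition gives the percentages that would be compared across conditions.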

The graph below shows the results of using hit zones to determine the percentage of routes

that were correctly identified for each of the three Displays. Unfortunately, this approach

revealed no differences between the three Displays.


A10.2 Experiment 2

With the realisation that the routes in Experiment 1 may not have required the participant to

pay continual attention to the route identification task, a set of more complex routes was

designed. A number of alternatives were considered for the response method before deciding

on the one-dimensional route identification task used in Experiments 2 and 3.

One alternative that was considered was the two-dimensional grid that was used in

Experiment 1, with more complex routes. The figure below shows an 8 x 8 grid of more

complex routes varying along two dimensions, manipulating one of the component sinusoids:

frequency along the horizontal axis, and phase along the vertical axis.
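A grid of this kind can be generated along the following lines. This is only a sketch under stated assumptions: each route is modelled as a fixed sinusoid plus one manipulated sinusoid, and the specific frequencies, phases and number of sample points are hypothetical choices that do not come from the study.

```python
import math

def route(freq, phase, n_points=50):
    """Lateral offsets of a route: a fixed sinusoid plus a manipulated one."""
    xs = [2 * math.pi * k / (n_points - 1) for k in range(n_points)]
    return [math.sin(x) + math.sin(freq * x + phase) for x in xs]

# Hypothetical parameter levels for the manipulated component sinusoid.
freqs = [1.0 + 0.5 * col for col in range(8)]           # horizontal axis
phases = [2 * math.pi * row / 8 for row in range(8)]    # vertical axis

# grid[row][col] holds the route for that phase (row) and frequency (column).
grid = [[route(f, p) for f in freqs] for p in phases]
```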


This response method was eventually abandoned, since the presentation of a large set of

complex routes ultimately made the task of recognising the route too difficult. With a large

set of alternatives, the participant was forced to serially search through each route in the grid,

and ultimately began to forget the shape of the route he was trying to identify. Pilot

participants reported being frustrated by this particular response method, and it was dropped

in favour of the one-dimensional response method.

As described in the study, the final response method in the form of the long river retains

many of the same benefits as the two-dimensional grid, including high granularity, recording

of the entire route, and the affordance of recognising rather than recalling the shape of the

route. Furthermore, the presentation of the long river has ecological validity, as participants

have experience in using maps containing winding roads and rivers.

A10.3 Experiment 3

In Experiment 3, pilot studies were used to redesign the search environment for the target

detection task. Whereas it was initially thought that the terrain features and targets from

Experiment 1 could be reused in Experiment 3, it quickly became clear that the experimental


parameters selected would cause problems concerning the visibility of the targets. In

Experiment 1, the 3D modelled trees placed in the terrain were viewed only from a top down

view. Targets were placed so as to ensure that the trees did not occlude the targets in the

environment from the top down view.

However, Experiment 3 investigated two Camera Elevation angles: the top down and angled

viewpoints. It was discovered that from the angled viewpoint, the trees occluded many of the

targets that would have been visible from the top down view, presenting an unfair advantage

to the top down view. For this reason, the trees were removed in Experiment 3.

Furthermore, pilot testing revealed that the targets used in Experiment 1 were much easier to detect in Experiment 3, because the Height was lower in Experiment 3 (92 m, compared to 325 m in Experiment 1). For this reason, the targets had to be redesigned for the fixed height of 92 m.
