Comparing the usefulness of video and map information in navigation tasks

Curtis W. Nielsen
Brigham Young University

3361 TMCB
Provo, UT 84601

[email protected]

Michael A. Goodrich
Brigham Young University

3361 TMCB
Provo, UT 84601

[email protected]

ABSTRACT

One of the fundamental aspects of robot teleoperation is the ability to successfully navigate a robot through an environment. We define successful navigation to mean that the robot minimizes collisions and arrives at the destination in a timely manner. Video and map information are often presented to a robot operator to aid in navigation tasks. This paper addresses the usefulness of map and video information in a navigation task by comparing a side-by-side (2D) representation and an integrated (3D) representation in both a simulated and a real-world study. The results suggest that video is sometimes more helpful than a map and at other times a map is more helpful than video; from a design perspective, an integrated representation seems to help navigation more than placing map and video side-by-side.

Keywords

Human Robot Interaction, HRI, User studies, Integrated display, Information presentation

1. INTRODUCTION

One of the fundamental aspects of robot teleoperation is the ability to successfully navigate a robot through an environment. We define successful navigation to mean that the robot minimizes collisions with obstacles and arrives at a destination in a timely manner. In order to support an operator in navigational tasks it is important to present navigation-relevant information to the operator. In remote, mobile robot navigation, it is common to use video and/or range information to inform the operator of obstacles and available directions of travel [1, 3, 6, 7, 19].

Both video and range information provide distinct sets of information that have advantages and disadvantages for navigation tasks. For example, a video stream provides a visually rich set of information for interpreting the environment and comprehending obstacles, but it is usually limited by a narrow field of view and it is often difficult to comprehend how the robot's position and orientation relate to an environment.

In contrast, range information is typically generated from IR sensors, laser range finders, or sonar sensors, which detect distances and directions to obstacles but do not provide more general knowledge about the environment. Advancements in map-building algorithms allow the integration of multiple range scans into maps that help an operator visualize how the robot's position and orientation relate to the environment.
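As a concrete illustration of how range scans can be folded into a map, the sketch below updates a simple log-odds occupancy grid from one scan. It is not the map-building algorithm used in these studies (Experiment 2 relies on Konolige's algorithm developed at SRI [8]); the grid resolution, log-odds increments, and function name are illustrative assumptions.

    import numpy as np

    def update_occupancy_grid(grid, robot_xy, hits, cell_size=0.1,
                              l_occ=0.85, l_free=-0.4):
        # Fold one range scan into a log-odds occupancy grid.
        #   grid     : 2D numpy array of log-odds values (0.0 = unknown)
        #   robot_xy : (x, y) robot position in meters
        #   hits     : iterable of (x, y) obstacle endpoints reported by the sensor
        rx, ry = int(robot_xy[0] / cell_size), int(robot_xy[1] / cell_size)
        for hx, hy in hits:
            gx, gy = int(hx / cell_size), int(hy / cell_size)
            # Cells along the ray between the robot and the hit are evidence of free space.
            n = max(abs(gx - rx), abs(gy - ry))
            for i in range(1, n):
                fx = rx + round(i * (gx - rx) / n)
                fy = ry + round(i * (gy - ry) / n)
                grid[fy, fx] += l_free
            # The endpoint cell is evidence of an obstacle.
            grid[gy, gx] += l_occ
        return grid

    # Example: a 10 m x 10 m area at 0.1 m resolution, one scan with two hits.
    grid = np.zeros((100, 100))
    update_occupancy_grid(grid, robot_xy=(5.0, 5.0), hits=[(7.2, 5.0), (5.0, 8.4)])

Cells with strongly positive log-odds are then drawn as obstacles in the map the operator sees, and strongly negative cells as free space.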

In previous studies we used both video and range information (current readings or a map) to navigate a robot [13, 15]. During the experiments we observed that operators sometimes focused their attention on the map section of the interface and other times focused their attention on the section that contains the video.

These anecdotal observations led to the question of whether video or map information is more useful for teleoperation. Although ways to combine maps and visualization tools have been studied in other domains such as aviation (see, for example, [4, 16]), this problem has not been well studied in human-robot operation with occupancy grid maps.

This paper seeks to understand the usefulness of video and map information in navigation by comparing a prototypical 2D interface and a 3D augmented virtuality interface [13]. Specifically, we hypothesized that for navigational tasks video would hinder performance with the 2D interface but would minimally affect performance with the 3D interface. Further, we hypothesized that map information is more helpful to navigation than video information for both types of interface.

2. MOTIVATION

During the World Trade Center disaster in September 2001, Casper and Murphy used robots to search the rubble for victims [5]. Their robots were primarily operated via a video stream from a camera on the robot. One of their observations was that it was very difficult for operators to handle the navigation and the exploration of the environment with only video information.

In a separate study, Yanco and Drury had first responders search a mock environment using a robot that had camera and map-building capabilities. One of their conclusions was that some participants considered the map useless because they felt it did not help them understand the robot's location [18].

Further, in an analysis of a robot competition, Yanco, Drury, and Scholtz observed that many operators demonstrated a lack of awareness of the robot's location and surroundings [19].

Most mobile robot interfaces implement some aspect of video and/or range information to inform the operator of the environment around the robot. Some of these approaches present the information in a 2D, side-by-side approach [1, 3, 19] and others present the information integrated into a single 3D display [12, 7]. In previous work we developed an integrated display and found it more useful for some navigation tasks in comparison to a side-by-side display [3, 13, 15].

To test the usefulness of map and video information in 2D and 3D interfaces, we next present two user studies: one in simulation and one using a real robot.

3. EXPERIMENT 1

In the first experiment we look at the usefulness of video and map information as aids for navigation with both a side-by-side approach (2D) and an integrated approach (3D). We hypothesized that with 2D interfaces video may negatively influence an operator's ability to perform a navigation task because it does not provide sufficient lateral information and it may draw the operator's attention away from more useful places on the interface such as map or range information [9]. Furthermore, we hypothesized that with a 3D interface, video information will not hinder navigation when other range information is present. To explore the effect of range and video information on navigation, we assess an operator's ability to navigate a maze environment with two interfaces (2D and 3D) and three conditions for each interface (map-only, video-only, and map+video).

3.1 Framework

For this experiment we used a simulator based on the popular Unreal Tournament game engine as modified by Michael Lewis and colleagues at the University of Pittsburgh [11, 17]. Their modifications originated with the intent of providing an inexpensive yet realistic simulator for studying urban search and rescue with mobile robots. The Unreal Tournament game engine provides a rich visual environment which, when combined with accurate models of common research robots and the game's physics engine, makes for a very good mobile robot simulator [10].

We used the Unreal Tournament level editor to create maze environments that have the appearance of concrete bunkers which are filled with pipes, posters, windows, cabling, and electronic devices to provide a detailed environment for the robot to travel through. Some examples are shown in Figure 1.

The environment we created has seven separate mazes which are designed to explicitly test low-level navigation skills. There is only one path through each maze and no dead-ends, but it takes considerable teleoperation skill to navigate a maze from start to finish without crashing the robot. One of the mazes is used for training and the other six mazes are used for testing. The training maze contains a continuous path without an exit so that participants can practice driving the robot as long as desired.

Figure 1: Images from the Unreal Tournament environment used for Experiment 1.

Each maze is an 8x8 grid where each cell in the grid is 2x2 meters, for a total maze area of 256 m². Each maze is designed to have 42 turns and 22 straight cells (64 cells in all, matching the 8x8 grid) to minimize differences in results between mazes (see Figure 2). The simulated robot used for this experiment is a model of the ATRV-Jr robot and has a width and length of 0.6 meters.

Figure 2: A map of one of the mazes used in Experiment 1.

3.1.1 Procedure

Operators were instructed on how to drive the robot and how to perform the experiment through speakers on a headset, and they were told that their goal was to get the robot out of the maze as quickly as possible without hitting too many walls.

Before testing, operators were given a chance to practice driving the robot with both the 2D and the 3D interfaces. Each interface displayed both map and video information. The operators were asked to drive at least once through the training maze to ensure a minimum amount of training. Once an operator had completed the training maze, they were asked to continue practicing until they felt comfortable controlling the robot with the interface (most participants stopped training at this point). Following each training session and each experiment, participants were given a questionnaire to evaluate their performance. The purpose of the questionnaires after the training sessions was to familiarize the operators with the questions we would ask after each experiment.

Once training was complete, each participant was asked if they had any questions and was told that the experiments would be very similar to the training, except that there would be an exit to the maze and that different sets of information would be visible on the interface for each test. Specifically, participants were given conditions of video-only, map-only, and map+video for both the 2D and 3D interfaces. For testing, we used a within-subjects, counterbalanced design in which each operator performed one test with each of the six conditions. The conditions were presented in a random order, with the constraints that the 2D and 3D interfaces were used alternately and that the order of the conditions was counterbalanced across participants. The interfaces for the map+video conditions are shown in Figure 3.
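As an illustration of how such an ordering can be produced, the sketch below assigns each participant a six-trial sequence that alternates the 2D and 3D interfaces while rotating the order of the three information conditions. The paper does not specify its exact counterbalancing scheme, so the rotation rule, function name, and labels are illustrative assumptions.

    import itertools

    INTERFACES = ("2D", "3D")
    CONDITIONS = ("map-only", "video-only", "map+video")
    ORDERS = list(itertools.permutations(CONDITIONS))  # the 6 possible condition orders

    def trial_sequence(pid):
        # Alternate which interface comes first, and rotate the condition orders
        # with the participant id so they balance across the participant pool.
        first = INTERFACES[pid % 2]
        second = INTERFACES[(pid + 1) % 2]
        conds_first = ORDERS[pid % len(ORDERS)]
        conds_second = ORDERS[(pid // 2) % len(ORDERS)]
        sequence = []
        for a, b in zip(conds_first, conds_second):
            sequence.append((first, a))
            sequence.append((second, b))
        return sequence

    for pid in range(4):
        print(pid, trial_sequence(pid))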

Figure 3: The 2D interface (top) and the 3D interface (bottom) used for Experiment 1.

3.2 Results

Twenty-four participants were paid to navigate a simulated robot with six different conditions of information presentation. Participants were recruited from the Brigham Young University community, with most subjects enrolled as students. Two participants terminated the experiment prior to completing all six conditions, but the portions of the experiment they did complete were used in our analysis. Throughout the discussion of the results, significance was obtained with a paired, two-tailed t-test with n = 24 samples unless otherwise specified.

              2D Interface   3D Interface
Map-only           258            196
Video-only         366            351
% Change           42%            79%
p                 7.8e-4         1.6e-7

Table 1: Time to completion in Experiment 1.

              2D Interface   3D Interface
Map-only          9.83           1.25
Video-only       19.10          22.71
% Change           94%            18x
p                1.3e-3         1.3e-6

Table 2: Number of collisions in Experiment 1.

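For readers who want to reproduce this kind of comparison, the sketch below runs a paired, two-tailed t-test over per-participant completion times and reports the percent change used in Tables 1 and 2. It is a minimal illustration of the stated analysis, not the authors' code; the function name and the use of SciPy are assumptions.

    from scipy import stats

    def compare_conditions(times_a, times_b, label_a, label_b):
        # times_a and times_b hold one measurement per participant (n = 24 here),
        # taken under the two conditions being compared (e.g. map-only vs.
        # video-only with the same interface).
        t_stat, p_value = stats.ttest_rel(times_a, times_b)   # paired, two-tailed
        mean_a = sum(times_a) / len(times_a)
        mean_b = sum(times_b) / len(times_b)
        pct_change = 100.0 * (mean_b - mean_a) / mean_a        # relative to condition A
        print(f"{label_a}: {mean_a:.0f}  {label_b}: {mean_b:.0f}  "
              f"change: {pct_change:+.0f}%  p = {p_value:.2g}")

Given per-participant times for the 2D map-only and video-only conditions, such a call would report the percent change and p-value of the kind summarized in Table 1.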

3.2.1 Map-only vs. Video-only

The results indicate that the video-only condition took significantly longer than the map-only condition for both the 2D (42%) and the 3D (79%) interfaces (see Table 1). Additionally, there were nearly twice as many collisions with the video-only condition in 2D as with the map-only condition, and there were 18 times as many collisions with the 3D video-only condition as with the 3D map-only condition (see Table 2). There was no statistical difference between the 2D video-only condition and the 3D video-only condition in either time to completion or number of collisions, as we expected.

3.2.2 Map+video

We found that with both the 2D and 3D interfaces, the map+video condition produced results closer to the map-only condition than to the video-only condition. In particular we found that, on average, there were exactly the same number of collisions with the 3D interface for the map-only and map+video conditions, and that there was no significant difference between the 2D map-only and map+video conditions. Figure 4 shows the average number of collisions for each of the six conditions.

On average there was an insignificant change in time to completion when video information was added to map information for both the 2D and 3D interfaces. However, we noticed a learning effect with the 2D map-only condition and the 3D map+video condition. In particular, the participants that used the 2D map-only condition after the 2D map+video condition finished the task 14% faster than the participants that used the 2D map-only condition before the 2D map+video condition (x̄ = 278 when used first, x̄ = 238 when used second; p = .0953, n = 12, unpaired t-test; see Table 3).
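The order analysis splits the participants into two independent groups (those who saw a given condition first and those who saw it second), so an unpaired t-test is the appropriate comparison. The sketch below is a plausible way to organize that analysis; the record layout and function name are assumptions, not the authors' code.

    from scipy import stats

    def learning_effect(runs, condition):
        # runs: list of dicts such as
        #   {"condition": "2D map-only", "order": 1 or 2, "time": seconds}
        # where "order" records whether the condition was used before (1) or
        # after (2) its paired condition. The two groups are independent
        # (n = 12 each here), hence the unpaired, two-tailed t-test.
        first = [r["time"] for r in runs if r["condition"] == condition and r["order"] == 1]
        second = [r["time"] for r in runs if r["condition"] == condition and r["order"] == 2]
        t_stat, p_value = stats.ttest_ind(first, second)
        mean_first = sum(first) / len(first)
        mean_second = sum(second) / len(second)
        pct_change = 100.0 * (mean_second - mean_first) / mean_first
        return pct_change, p_value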

Figure 4: Number of collisions in Experiment 1.

                First   Second   % Change      p
2D map-only      278      238       -14%     0.0953
2D map+video     269      273       1.7%     0.849
% Change        -3.1%     15%
p               0.736    0.0909

Table 3: Time to completion in 2D after adjusting for learning.

Similarly, the participants that used the 3D map+video condition after the 3D map-only condition finished the task 15% faster than the participants that used the 3D map+video condition before the 3D map-only condition (x̄ = 225 when used first, x̄ = 191 when used second; p = .0115, n = 12, unpaired t-test; see Table 4). We did not notice a learning effect between any of the other conditions.

When we compare the set of experiments in 2D where the map-only and map+video conditions were used first (Table 3), we find that adding video to the map has an insignificant effect. However, in the set of experiments where the map-only and map+video conditions were used second, the time to completion increases by 14.8% with the map+video condition in comparison to the map-only condition. This suggests that, after accounting for learning, adding video to the map hurts navigation by increasing the time it takes an operator to navigate the robot out of a maze.

When we compare the set of experiments in 3D where the map-only and map+video conditions are used first (Table 4), we find that adding video to the map increases the time to completion by 15.2%. However, in the set of experiments where the map-only and map+video conditions are used second, the difference in the time to complete the task is insignificant. This suggests that, after accounting for learning, adding video to the map in the 3D interface does not affect the time it takes to navigate the robot out of the maze (see Figure 5).

                First   Second   % Change      p
3D map-only      195      196      0.32%     0.961
3D map+video     225      191       -15%     0.0115
% Change         15%     -2.7%
p               0.0357    0.626

Table 4: Time to completion in 3D after adjusting for learning.

Figure 5: Time to completion after accounting for learning in Experiment 1.


3.3 Discussion

These results suggest that video can hurt navigation when the video does not contain sufficient navigational cues and video and map information are placed side-by-side. Even when map information is present and more useful than video for navigating, a novice operator's attention tends to be drawn towards the video, which, in this case, negatively affects their ability to navigate. These results make sense in light of research by Kubey and Csikszentmihalyi, which has shown that television draws attention because of the constantly changing visual scene [9]. It is interesting that even though it took longer to navigate, there were not more collisions with the 2D map+video condition than with the 2D map-only condition, which implies that operators were not bumping into walls more, just moving more slowly through the maze.

4. EXPERIMENT 2

Experiment 1 provided an initial analysis of the usefulness of video and map information for performing navigation tasks with a remote, mobile robot in simulation. It is also useful to verify that the results and conclusions in simulation carry over and are applicable to environments and robots in the real world. For this purpose we designed the second experiment to compare the usefulness of video and map information when navigating a robot in the real world. We hypothesized that the results would be similar to the results in simulation.

4.1 Framework

For this experiment we converted part of the second floor of the Computer Science Department at Brigham Young University into an obstacle course for our robot to travel through. The normal hallway width is 2 meters, and we used cardboard boxes, Styrofoam packing, and other obstacles to create a 50-meter course with a minimum width of 1.2 meters.

Figure 6 shows images of the robot and the two hallways used in the experiment.

Figure 6: Images of the environment and the robot used for Experiment 2.

4.1.1 The Robot

The robot we used for the experiment is an ATRV-Jr developed by iRobot, which is approximately 0.6 meters in width and 0.7 meters in length (see Figure 6). The robot uses artificial intelligence algorithms developed at the Idaho National Laboratory (INL) to safeguard it from colliding with walls and obstacles as it is teleoperated [2]. Additionally, the robot uses a map-building algorithm developed by Konolige at the Stanford Research Institute (SRI) to represent the environment and localize the robot within the map [8].

An operator controls the ATRV-Jr with a Microsoft Sidewinder 2 joystick (the INL base station does not support the steering wheel we used in Experiment 1), and range and video information from the robot are presented to the operator via our 3D interface [13, 14]. The 3D interface is integrated with the INL base station, which handles the communication of movement commands and general information between the operator and the robot via radio modems. Live video from the robot is transmitted to our interface via 802.11b wireless Ethernet. The interfaces used for this experiment were modified from the previous experiment by including icons that indicate where the robot's intelligence identifies obstacles that might interfere with robot movement. These interfaces are shown in Figure 7.

4.1.2 Procedure

Before using the real robot, operators were trained to drive the robot with the Unreal Tournament training maze we used in the first experiment. While training, operators drove the simulated robot with a joystick for a few minutes with each of the five conditions that they would be tested on (2D map-only, 2D map+video, video-only, 3D map-only, and 3D map+video). We did not compare separate 2D and 3D video-only conditions because we found in the previous experiment that the video-only condition was similar for both the 2D and 3D interfaces. Upon completion of the training, the operators were moved to a different base station which was communicating with the real robot.

For testing, we used a within-subjects, counterbalanced design in which each operator used all five conditions in a pseudo-random order, with the constraints that the 2D and 3D interfaces were used alternately and the conditions were counterbalanced on the order in which they were used. The experiment was set up such that an operator would drive the robot through the obstacle course with one condition; then at the end of the course an assistant would change the condition, turn the robot around, reset the map information, and start the next test. After every two runs the robot was plugged in for three to five minutes to keep the batteries charged.

4.2 Results

Fifteen participants were paid to navigate the ATRV-Jr robot with five different conditions of information presentation. Participants were recruited from the Brigham Young University community, with most subjects enrolled as students. The first three participants were part of a pilot study to determine a sufficient complexity for the obstacle course and to determine how best to use the robot while maintaining a sufficiently high charge on the batteries; therefore, their results were not included in the analysis. Additionally, during four of the runs with the robot, the battery level got too low on the last condition and adversely affected navigation, so these data were also discarded from the analysis.

One of the differences between this experiment and the previous one is that the real robot has intelligence on board to protect itself from hitting obstacles. For each test we recorded the number of times the robot acted to protect itself, and we discuss these results as robot initiative. Statistical significance was determined using a paired, two-tailed t-test with n = 12 samples except as otherwise noted. (Twelve samples is not really sufficient to discuss significance; however, we use the results to discuss general trends.)

4.2.1 Map-only vs. Video-only

With the 3D interface, there was not a significant difference in the time to completion between the map-only and video-only conditions; however, the robot took initiative to protect itself nearly twice as often with the video-only condition as with the map-only condition (x̄_map = 18.7, x̄_video = 36.6, p = .0378; see Table 6).

With the 2D interface, there was not a significant difference in the number of times the robot took initiative to protect itself between the map-only and video-only conditions; however, there was a significant difference in the time to complete the task. In fact, the results were opposite those from the simulated experiment. In particular, it was 17% faster to use the video-only condition as opposed to the map-only condition (x̄_map = 287 s, x̄_video = 239 s, p = 8.0e-3; see Table 5).


              2D Interface   3D Interface
Map-only           287            209
Video-only         239            239
% Change          -17%            15%
p                8.0e-3           .394

Table 5: Time to completion in Experiment 2.

              2D Interface   3D Interface
Map-only          39.3           18.7
Video-only        36.6           36.6
% Change         -6.7%            96%
p                 .853          .0378

Table 6: Number of times the robot took initiative to protect itself in Experiment 2.


Most likely these results differ from the previous experiment because the environment in the second experiment provided more navigational cues in the video than the environment in the first experiment. In the first environment it was often the case that the video image was filled by a wall and none of the edges of the wall were visible. In contrast, in this second experiment the edges of obstacles were nearly always visible through the camera, and the operator could see future parts of the map because most obstacles were not taller than the height of the camera.

4.2.2 Map+video

When map and video information were combined with the 2D interface, we found the results to be similar to the video-only condition, with negligible difference in the time to completion and the number of collisions (see Figures 8 and 9).

When map and video information were combined with the 3D interface, we found the number of collisions to be nearly identical to the map-only condition, but we found that operators finished the obstacle course 9.4% faster with the map+video condition in comparison to the map-only condition (x̄_map+video = 189 s, x̄_map = 209 s, p = .0519; see Figure 8).

This result is interesting because it suggests that when useful information is available in both the map and the video, the 3D interface supports the complementary nature of the information and can lead to improved performance over the individual parts. In contrast, performance with the 2D interface seems to be constrained by the best one can do with an individual set of information.

4.3 Discussion

In contrast to the previous experiment, these results suggest that video can help navigation when placed side-by-side or integrated with map information. When we consider the difference in the environments used in the current and previous experiments, this result makes sense because in this environment the robot's camera can see over most obstacles and there are useful navigational cues in the video stream, such as edges of obstacles, that help an operator navigate the environment. In Experiment 1, most of the walls went all the way to the ceiling and the path had many 180-degree turns that made it impossible to see very far ahead in the map.

Figure 7: The 2D interface (top) and 3D interface (bottom) used for Experiment 2.

Figure 8: Time to completion for the five conditions in Experiment 2.

Figure 9: Number of collisions for the five conditions in Experiment 2.


In Experiment 2 with the 2D interface, the map is not very useful in comparison to the video. The most common complaints among participants were that the map was too small (although it was the same relative size as in the previous experiment) and that it was difficult to correlate the direction of joystick movement with how the robot would move because the robot icon in the map was not always pointing up. Further, the map+video condition had results most similar to the video-only condition because the video tends to "pull" an operator's attention and hold it more than the map [9]. Additionally, in our questions following the experiments, operators claimed that most of their time was spent focused on the video.

5. CONCLUSION

Mobile robot navigation depends on the ability to see and comprehend information in the environment surrounding the robot. Typically, information from the environment is presented to the operator via range and/or video; however, the manner in which this information is presented to an operator may affect navigational performance.

This paper has shown that map-only conditions can be more useful than video-only conditions if the map resolution is of sufficient quality. Additionally, we have shown that video is helpful in environments where there are navigational cues in the video information, but video can diminish performance when there are no navigational cues and it is placed side-by-side with map information.

For design purposes, integrating maps with video in a 3D perspective seems much better than presenting map and video side-by-side in a 2D perspective. Most likely this is because the maps are always visible, even if the operator pays too much attention to the video. These results are consistent with previous results [13, 15].

In the future we plan to look at how delay affects navigation with both the 2D and 3D interfaces. Additionally, we plan to look at exploration tasks using different interfaces and different sets of information.

6. REFERENCES

[1] M. Baker, R. Casey, B. Keyes, and H. A. Yanco. Improved interfaces for human-robot interaction in urban search and rescue. In Proceedings of the IEEE Conference on Systems, Man and Cybernetics, October 2004.

[2] D. J. Bruemmer, J. L. Marble, D. Dudenhoeffer, M. Anderson, and M. McKay. Mixed-initiative control for remote characterization of hazardous environments. In Proceedings of the Hawaii International Conference on System Sciences, Waikoloa, Hawaii, January 2003.

[3] D. J. Bruemmer, J. L. Marble, D. A. Few, R. L. Boring, M. C. Walton, and C. W. Nielsen. Let rover take over: A study of mixed-initiative control for remote robotic search and detection. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 35(4):494–504, July 2005.

[4] G. L. Calhoun, M. H. Draper, M. F. Abernathy, F. Delgado, and M. Patzek. Synthetic vision system for improving unmanned aerial vehicle operator situation awareness. In J. G. Verly, editor, Proceedings of SPIE Vol. 5802, Enhanced and Synthetic Vision 2005, pages 219–230, May 2005.

[5] J. Casper and R. R. Murphy. Human-robot interactions during the robot-assisted urban search and rescue response at the World Trade Center. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 33(3):367–385, June 2003.

[6] T. W. Fong and C. Thorpe. Vehicle teleoperation interfaces. Autonomous Robots, 11(1):9–18, July 2001.

[7] T. W. Fong, C. Thorpe, and C. Baur. Advanced interfaces for vehicle teleoperation: Collaborative control, sensor fusion displays, and remote driving tools. Autonomous Robots, 11(1):77–85, July 2001.

[8] K. Konolige. Large-scale map-making. In Proceedings of the National Conference on AI (AAAI), San Jose, CA, 2004.

[9] R. Kubey and M. Csikszentmihalyi. Television addiction is no mere metaphor. Scientific American, 286(2):62–68, 2002.

[10] M. Lewis and J. Jacobson. Game engines in research. Communications of the ACM, 45(1):27–48, 2002.

[11] M. Lewis, K. Sycara, and I. Nourbakhsh. Developing a testbed for studying human-robot interaction in urban search and rescue. In Proceedings of the 10th International Conference on Human-Computer Interaction, Crete, Greece, 2003.

[12] R. Meier, T. Fong, C. Thorpe, and C. Baur. A sensor fusion based user interface for vehicle teleoperation. In International Conference on Field and Service Robotics (FSR), 1999.

[13] C. W. Nielsen, M. A. Goodrich, and R. J. Rupper. Towards facilitating the use of a pan-tilt camera on a mobile robot. In Proceedings of the 14th IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN), Nashville, TN, 2005.

[14] C. W. Nielsen, B. Ricks, M. A. Goodrich, D. J. Bruemmer, D. A. Few, and M. C. Walton. Snapshots for semantic maps. In Proceedings of the 2004 IEEE Conference on Systems, Man, and Cybernetics, The Hague, The Netherlands, 2004.

[15] B. W. Ricks. An Ecological Display for Robot Teleoperation. PhD thesis, Brigham Young University, August 2004.

[16] L. C. Thomas and C. D. Wickens. Eye-tracking and individual differences in off-normal event detection when flying with a synthetic vision system display. In Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting, Santa Monica, CA: Human Factors and Ergonomics Society, 2004.

[17] J. Wang, M. Lewis, and J. Gennari. A game engine based simulation of the NIST urban search and rescue arenas. In Proceedings of the 2003 Winter Simulation Conference, 2003.

[18] H. A. Yanco and J. L. Drury. "Where am I?" Acquiring situation awareness using a remote robot platform. In Proceedings of the IEEE Conference on Systems, Man, and Cybernetics, October 2004.

[19] H. A. Yanco, J. L. Drury, and J. Scholtz. Beyond usability evaluation: Analysis of human-robot interaction at a major robotics competition. Journal of Human-Computer Interaction, 19(1–2):117–149, 2004.

