
ORIGINAL ARTICLE

Touch and gesture: mediating content display, inscriptions, and gestures across multiple devices

Gerard Oleksik • Natasa Milic-Frayling • Rachel Jones

G. Oleksik, Dovetailed Ltd, Studio 151, 23 Kings Street, Cambridge, UK. e-mail: [email protected]
N. Milic-Frayling, Microsoft Research Ltd, 7 J J Thomson Avenue, Cambridge, UK. e-mail: [email protected]
R. Jones, Instrata Ltd, 12 Warkworth Street, Cambridge, UK. e-mail: [email protected]

Received: 14 January 2013 / Accepted: 25 July 2013 / Published online: 11 December 2013
© Springer-Verlag London 2013

Pers Ubiquit Comput (2014) 18:1243–1257. DOI 10.1007/s00779-013-0724-5

Abstract Recent advances in computer design and technology have broadened the range of devices enabled for inscription and touch-based interaction and increased their adoption in collaborative work settings. Since most of the past research has focused on optimal use of individual devices, we now need to expand our understanding of how these devices are used in concert, particularly in collaborative work settings where touch and gesture facilitate communication and may interfere with touch-based input. We conducted in situ observations of team meetings that involve the use of a tabletop computer, tablet personal computers (tablet PCs) with handwriting support, and a vertical display. The study shows how inscriptions and gestures naturally emerge around the content displayed on the devices and how important it is to maintain their spatial congruence. Furthermore, it reveals that the combination of the tablet PCs and the tabletop computer encourages the use of gestures and touch across devices as part of sense-making. While discussing the content, the users apply sequential and synchronous gestures to bind content and inscriptions across devices. Our observations of binding gestures extend the gesture taxonomies from previous studies and expand the notion of multi-touch beyond individual devices. We stipulate that designing support for touch and gestures across devices requires a holistic approach. Only through coordinated design of touch, inscription, and gesture input, and consideration of broader usage scenarios, can we ensure minimal interference with naturally emerging touch and gestures and provide effective mechanisms for disambiguating general user behavior from device input actions.

Keywords Gesture · Inscription · Touch · Tabletop · Tablets · Binding gestures

1 Introduction

Touch-enabled displays and user interfaces have long been researched as a means of facilitating natural interaction with computing devices. Recently, the commercialization and take-up of touch-enabled mobile phones, slate computers, and tabletop computers have increased the use of touch interactions and opened up opportunities for studying emerging user practices and experiences. Particularly interesting are scenarios where multiple devices are used to facilitate collaborative work, each device contributing its specific interaction facilities and adding to the spectrum of gesture and touch actions that arise.

Generally, multi-device settings have been studied [10, 20], but only a few studies have looked at real usage scenarios [3]. As new devices and form factors proliferate, there is a greater need to understand how such devices are adopted and used in concert in authentic work settings. Design of touch and inscription support on individual devices often attempts to provide intuitive interaction models that resemble touch and inscription activities in the physical environment. In usage scenarios bound to individual devices, we expect that touch and inscription input will not significantly interfere with the broader use of similar actions. However, in settings with multiple devices, particularly in group work scenarios where touch and gestures are essential for communication, the possibility of collision is much increased. Indeed, the concern is that the touch and gesture design of individual devices may, in aggregate, impinge on the touch and gestures that naturally emerge in user interaction with the environment. In particular, communication around displayed content increases the spectrum of gestures and touch actions used to support sense-making, which may clash with those reserved to operate the devices and interact with computer applications.

In order to deepen our understanding of the issues that arise, we conducted in situ observations of collaborative meetings as a team of researchers evolved their practices in using a touch-based tabletop computer, tablet personal computers (tablet PCs) with handwriting support, and a PC with a large vertical display. Our observations took place over a period of 6 months, starting with a setup that first included only tablet PCs and a large vertical display and was then extended with a tabletop computer. Based on previous studies of tabletop use, we anticipated increased use of gestures [17, 24]. However, through our research, we gained new insights about the emergence and purpose of touch and gestures across the devices and the issues that arise purely from the physical placement of multiple devices in use.

First, our study shows that the spatial configuration of the devices may cause a tension between the need for a group of participants to view specific content on a shared device and the need of a speaker to use gestures and inscriptions on a device that is not visually accessible to all the participants. This leads to two observed phenomena:

• A split of the participants' attention between two focal points that are not in the same visual space, which compromises the effectiveness of the meeting setup
• A concerted effort by the participants to maintain the congruence of the content, inscriptions, and gestures that are used in the group communication.

In order to address the problem, we advise considering technologies and methods such as 3D gesture tracking, 3D modeling, and animation that could be used to mediate the issue of spatial separation. For example, a solution may involve capturing 3D gestures above the speaker's device through gesture tracking technologies such as Microsoft Kinect and visually representing the gestures on the shared display.

Furthermore, we provide:

• Evidence of binding gestures that involve multiple touch points on a single device or across devices and can interfere with touch support for operating the devices and applications
• Recommendations for the design of touch support on a single device and across devices.

We characterize the binding gestures as sequential and synchronized deictic gestures that are used to indicate connections among related resources on the same device and across devices. Implicitly, the binding gestures extend the notion of multi-touch across devices and call for design considerations to prevent interference with touch support on individual devices. Making provisions for binding gestures illustrates the need for a unified approach to designing support for touch, gestures, and inscriptions. Only by considering all three will we reduce the risk of impinging on the natural occurrences of gestures and touch actions observed in communication and sense-making.

In the following sections, we first introduce the basic notions and reflect on the work that relates to the phenomena observed in the study. We then describe the study findings in detail and put forward design recommendations for improved multi-device meeting environments. While our study considers sense-making activities within a specific work setting, our findings provide insights that can be applied more broadly, promoting an integrated approach to supporting touch, gestures, and inscriptions across devices.

2 Background and related research

In the quest to enable natural interaction with computing devices, the design of input mechanisms has expanded from support for stylus-based handwriting to sophisticated touch gestures involving multiple touch points on the display surface. The most recent advances in 3D gesture tracking and recognition, such as the Microsoft Kinect technology [21, 22], have opened up new possibilities. At each stage of innovation, one has to reconcile possible conflicts of the new mode of interaction with previous design choices [1]. More importantly, however, one has to understand how our effort to support intuitive touch, inscription, and gesture interaction with devices may affect natural user behavior in the broader context [15, 16]. For that reason, we observe the use of touch technologies on multiple devices as part of collaborative meetings. From previous studies, we expected that meeting conversations would be strongly supported by gestures and inscriptions [2, 5, 20]. Thus, we paid particular attention to how that is manifested in the interaction with devices that aspire to support natural interaction through touch. Here, we first reflect upon the basic notions of touch, gesture, and inscription, starting with gestures as the most encompassing one.

2.1 The meaning and role of gesture

Gestures refer to physical movements of the hands, head, and other parts of the body used in information exchange and interaction. They are fundamental to our communication. Indeed, McNeill [20] finds a close semiotic relationship between speech and naturally occurring gestures and argues that speech and gesture form a unified communicative unit between speaker and listener. As their meaning is co-defined, speech and accompanying gestures cannot easily be separated.

The range of human gestures is broad, and they have been studied in conjunction with inscriptions, touch, and speech. Kendon [15] proposes a 5-point continuum for describing the degree of formalism underpinning human gestures. That continuum ranges from free-form gesticulation that accompanies speech to sign language, complete with vocabulary and grammar. In between are 'language-like gestures' in which the speaker uses a gesture in place of a word, 'pantomimes' in which the gesturer physically mimics what is being referenced in speech, and 'emblems' which are 'codified gestural expressions that are not governed by any formal grammar.'

Focusing on discursive human gestures, McNeill [20] identified five categories: (1) iconic gestures that relate to 'the semantic content of the speech' and provide a visual backup for what is being said, (2) metaphoric gestures that are pictorial but present an abstract idea rather than a concrete object or event, (3) beat gestures that are rhythmic accompaniments to speech and may emphasize the importance of particular words, (4) cohesive gestures that bind together what is being said, and (5) deictic or pointing gestures that direct listeners' attention to specific objects as they are mentioned.

In addition to the classification of individual gestures, Bekker et al. [2] point out that users often perform multiple gestures in sequence. Such sequenced gestures are intended to work in concert with each other rather than independently. The authors describe how the designers in their study string kinetic gestures into walkthrough sequences that mimic the interactions that they expect between the user and a designed product.

2.1.1 Gestures in collaborative multi-device environments

While McNeill's [20] and Kendon's [15] explorations are grounded in psycholinguistics, research in collaborative work has considered the function and importance of gestures in collaborative environments. Bekker et al. [2] studied the use of gesture in face-to-face meetings among 10 design teams and isolated four different types of gesture: (1) kinetic gestures, body movements that execute all or part of an action; (2) spatial gestures, movements that indicate distance, location, or size; (3) point gestures, normally made with the fingers to point to a person, an object, or a place, thus denoting an attitude, attribute, affect, direction, or location; and (4) other gestures that have a purpose but do not fall into the above categories.

Most of the research concerning multi-device environments has focused on support for sharing and replication of data across devices [31]. These systems involved prescribed interactions that users needed to follow in order to achieve a given objective.

For example, Toss-It facilitates transfer of data between PDAs and mobile devices through simple 'throwing' gestures between mobile devices [33]. Point&Connect enables users to pair mobile devices by moving the devices closer together [25]. With Touch and Interact, the user can pass data from a mobile device to a large display by touching the screen with the phone [9]. Some systems include a pen to enable users to move data and pair devices. For example, Pick-and-Drop allows the user to use a pen to touch a digital object on a display and drop it onto another display or a different part of the same display [26]. The system by Lee et al. [19] enables users to connect mobile, large-screen, and tabletop devices and share data between them through semaphore and pointing gestures. In order to share digital objects, the user touches the item and then points to a screen or a device where the data need to be transported. Hinckley et al. [11] use synchronous gestures to enable users to establish connections between tablet devices by bumping them together. Through a tilting gesture, the user can then 'pour' data from one device to another. Similarly, Dachselt and Buchholz [5] facilitate data transfer between a phone and large displays through device tilting and throwing gestures. Other studies explore new interaction techniques to improve collaboration in multi-device environments. For example, Seifert et al. [14] seek to overcome problems associated with people using individual devices to complete shared tasks. They enable users to transfer data to and from personal devices to a large shared tabletop display using simple touch gestures.

2.2 Inscription

Inscription refers to persistent marks, sketches, or images made through the act of writing, drawing, printing, or engraving onto a surface. In the case of tablet PCs, many applications aim to record and recognize handwritten inscriptions.

Cortina et al. [4] report on the importance of inscriptions in support of mathematical learning and problem solving. They describe how the inscription of a mathematical problem in the classroom becomes a representation of the problem and a scaffold for collective reasoning and attention. Student comments are added to the inscription by the teacher in order to capture the direction and development of thought and to be further used in the class-wide discussion.

The work by Goodwin [6] underlines the importance of placing inscriptions in close proximity to their focal point, e.g., an archeological artifact that cannot be physically moved. Goodwin describes how students and supervisors make sense of the dig site by creating inscriptions in the dirt around archeological features which must not be disturbed: 'the inscription creates a special kind of liminal representation. Unlike situations when the pattern is further transduced, say into a map, here the representation and the entity being represented co-exist within the same perceptual field, and thus remain in a state where each can be used to judge the other.' This highlights the importance of positioning an inscription within the same 'perceptual field' as the object, since it enables the two to be juxtaposed. The interpretative action of inscription is realized within the same visual field as the content which inspired it.

2.2.1 Interplay between inscription and gesture

Furthermore, research shows a fine interplay between inscriptions and gestures. Streeck and Kallmeyer [30] state that, because of their persistent nature, 'inscriptions can become the targets or components of further symbolic actions,' including physical gestures. Goodwin [6] points out that inscriptions made around archeological features become an immediate resource for speech and gesture. These gestures enter the dialog that unfolds around inscriptions, enabling archeologists to make directed suggestions and comments. In the work by Cortina et al. [4], we see how gestures around inscriptions function as key elements of collaborative learning in the classroom, serving to direct attention and illustrate mathematical relationships. In fact, inscriptions, gestures, and speech seem to be tightly intertwined. Streeck and Kallmeyer [30] argue that inscriptions are part of the mise-en-scène of interaction and become 'interfaces-mediating structures' between people during face-to-face interaction. To paraphrase Streeck and Kallmeyer's [30] assertion, social interaction can be seen as a 'vociferous process,' hungrily consuming inscription, speech, and gesture, and weaving them into meaning.

2.3 Research objectives

The aim of our research is to explore (1) the interactions that naturally arise in multi-device environments and (2) the role and meaning of the user gestures that emerge. Based on the findings, we provide design guidelines for addressing the observed issues and averting those that may arise from further development of touch and gesture technologies.

As noted in the previous section, research on multi-device scenarios has focused on pairing devices and sharing data between them, and lacks a deep, empirical understanding of user needs in these environments [18]. In our approach, we observe real meeting settings and the interactions that unfold as the supporting technology evolves, without interference or instrumentation on our part. Because of the incompatibility of several software applications with the multi-touch support on the tabletop computer, we could observe the touch gestures that emerge as part of the user communication and sense-making, uninhibited by the multi-touch input features. That enabled us to identify the risk of interference with natural user behavior, should similar touch gestures be reserved for device input and interaction with applications.

3 Method

We conducted in situ observations of technology use during meeting sessions at a university research center. While we seek to understand the broader context, our primary goal is to study the technology framework that evolves over time to support communication and interaction with devices during meetings. Thus, our observations over the 6-month period are timed to follow the changes in the technology setup of the meeting place.

3.1 Environment

The observed environment is a nano-photonic research center, focused on devising and analyzing characteristics of materials that arise from various molecular structures and properties. Researchers use high-precision scientific instruments to collect measurements and apply specific software packages to visualize and analyze experimental data. Their practices involve various forms of meaning-making as part of the individual and group work.

Meetings are held in the research leader's office and are attended by the research leader, postdoctoral staff, and doctoral students from two closely related research groups. Meetings occurred on a bi-weekly basis, and their purposes were threefold: to present progress to the research leader and other group members, to discuss and interpret the findings collectively, and to set next steps in terms of experiments or different approaches to pursue.

The computing infrastructure of the meeting setup involved several networked computing devices: a static PC with a large display, HP tablet PCs with stylus input and handwriting recognition, and a Microsoft Surface v1.0 tabletop computer. Each researcher was equipped with their own tablet PC and was skilled in touch-based interaction and inscription using a stylus. The tabletop computer provided touch interaction with those software applications that take advantage of the touch capability; otherwise, the content was accessed using the mouse. In fact, that was the case with the documents used in most meetings. The vertical display was primarily used to project content for group viewing. These multiple devices were used in concert to facilitate the meeting.

A typical meeting involved PostDocs and PhD students presenting overviews of their recent work in the form of summaries: succinct documents created as MS PowerPoint slides to visualize graphs and data generated during experiments. The slides included experimental measurements and statistical analyses that were produced using specialized tools during and after the lab experiments. The researchers adapted MS OneNote to serve as a lab book and to record explanatory descriptions and information while conducting experiments. The same software was used to take notes during meetings.

The format of the meetings was consistent. Students presented their findings in turn, describing the graphs and figures in their summary slides and raising questions and concerns with the research leader. During presentations, the research leader typically asked questions about the collected experimental data. This naturally led to in-depth discussions of the results and often required the student to access additional data about the experiments, which was typically facilitated by the student's tablet PC. Throughout the meeting, the research leader took notes on his tablet on behalf of the group and produced sketches to illustrate solutions to problems, explain concepts, or describe experimental setups. The notes included outlines of further explorations and specific tasks for the students. Thus, they served as records of the meetings and resources for further work.

These group meetings were meant to facilitate learning and support innovative and ground-breaking research through a collective inquiry, as described in the research leader's own words: 'We sit round this table just figuring out what on earth is going on here [within the presented summaries].' This practice of collective sense-making is considered essential for the success of individual students and the research group as a whole. Thus, the research leader continues to optimize the meeting space to increase the productivity and effectiveness of the teams. This includes experimentation with new hardware and software technologies. For that reason, he decided to introduce a tabletop computer and tried various spatial configurations to optimize the effectiveness of the meeting place.

3.2 Study design

3.2.1 Data collection

Our study involved in situ observations and video recording of 7 separate meetings including 13 researchers from two research teams. The resulting data amounted to 10 h of recordings and the notes taken by a study researcher who attended each meeting to observe, take notes to aid later analysis, and oversee the video recording process.

The data collected in the meeting space were, in fact, part of a wider collaborative effort to understand the entire work environment at the research center and identify opportunities for improvements [23]. This background provided us with an in-depth understanding of the work practices of the study participants and proved beneficial for interpreting meeting behaviors. Furthermore, the months of user research preceding the current study meant that the study researcher was already known to the group and his presence at the meetings could be placed within a meaningful context. We feel that this familiarity reduced any 'observer effect' that this presence might have had on the meetings.

All the meetings were held in the same physical location but in 3 different meeting setups, as the meeting space evolved over time and assumed different spatial configurations of devices (Fig. 1):

Setup 1: Attendees sat around the research leader's desk, with their tablet PCs. A 26-inch vertical monitor was used to display content for group viewing.

Setup 2: Attendees sat around the tabletop computer located next to the leader's desk, with their tablet PCs. The tabletop computer and the vertical monitor on the leader's desk were used for group viewing of the meeting material.

Setup 3: Attendees sat around the research leader's desk with the integrated tabletop computer, with their tablet PCs. The vertical monitor and the tabletop computer were used for group viewing of the meeting material. The applications used on the tabletop were not touch enabled.

3.2.2 Data analysis

We started our analyses by selecting two videos, one from the first and one from the second setup. We analyzed them manually by reviewing the video tapes and the notes from the meeting observations. We focused on the participants' interactions among themselves and with the technology, looking for those user actions that were attempted and revealed suboptimal support in the two meeting configurations. This initial analysis confirmed prominent use of gestures, use of inscriptions, and reproduction of content across computing devices. They were applied in various ways to communicate and direct attention to specific content during the exchange and interpretation of information. Based on these initial observations, we focused our analysis on the interplay of these phenomena and analyzed the remaining videos. By studying the specific user actions, we uncovered patterns in the emerging practices and opportunities for their better support.
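For readers who want to picture how such video observations could be structured for analysis, the following sketch shows one plausible record format for coding the recordings: each coded event captures who acted, on which device, and which gesture or inscription category the action falls into. This is an illustrative data structure only; the study reports manual analysis, and the field names and category labels below are assumptions drawn from the taxonomies and findings discussed in this paper.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    """Illustrative coding categories, drawn from the gesture and inscription
    types discussed in this paper (not the authors' actual coding tool)."""
    INDIRECT_DEICTIC = "indirect deictic gesture"    # e.g. pointing with the mouse cursor
    DIRECT_DEICTIC = "direct deictic gesture"        # touching the surface to point
    GESTURAL_WALKTHROUGH = "gestural walkthrough"
    INSCRIPTION = "inscription"                      # stylus sketch or handwritten note
    SEQUENTIAL_BINDING = "sequential binding gesture"
    SYNCHRONOUS_BINDING = "synchronous binding gesture"

@dataclass
class CodedEvent:
    meeting_setup: int   # 1, 2, or 3
    participant: str     # anonymized name, e.g. "John"
    device: str          # "tablet", "tabletop", "vertical display", or "mouse"
    action: Action
    time_s: float        # offset into the video recording, in seconds

# Example: John makes a mouse gesture over the shared vertical display in setup 1.
event = CodedEvent(meeting_setup=1, participant="John", device="mouse",
                   action=Action.INDIRECT_DEICTIC, time_s=742.0)
print(event.action.value)
```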

3.3 Participants

In total, 13 researchers from two separate research groups at the center took part in the study. Table 1 provides anonymized details of the participants, their roles, research group, and the meeting setups in which they were observed. The research staff includes students at various stages of their PhD program, PostDocs, visiting researchers, and the research leader. The observations focused on two research groups within the laboratory: Group A and Group B. The three Group A observations occurred in meeting setup 1; two of the four Group B observations occurred in meeting setup 2 and the remaining two in meeting setup 3.

In the following section, we describe in detail several meetings, selected to illustrate the typical interactions that unfold and to highlight both the issues that the participants experienced and the workarounds they adopted to deal with suboptimal support.

4 Findings

In this section, we discuss the core insights gained from the study. The introduction of the tabletop to the meeting environment afforded the participants the opportunity to establish new planar and spatial configurations of meeting content. These new content arrangements opened up fresh collaborative activities that pivoted around touch and gesture across one or more meeting devices. We begin to characterize these changes in collaborative activities by analyzing gestural behavior between the different setups. We then consider the issues surrounding content replication and the meaning of observed touch gestures that cross the boundaries of individual devices.

4.1 Gesturing in different meeting setups

Comparison of the gestural behaviors across the three meeting setups shows that the gestural language differs markedly between the meeting setups with and without the tabletop computer. It changes in the type and complexity of gestures.

As expected from previous research [2], deictic gestures were used extensively across all meeting setups and fell into two main types: indirect deictic gestures, used, for example, to indicate a part of the screen with the mouse cursor or by pointing a finger at a distant display, and direct deictic gestures, exhibited, for example, when the user touched the surface to point to an artifact [7].

Fig. 1 Three different device configurations used in the meetings. Setup 1: meeting attendees sit around the research leader's desk with a vertical monitor connected to the local area network; participants bring to the meeting the tablet computers that they regularly use in their everyday work. Setup 2: meeting attendees sit around the tabletop computer next to the leader's desk; the vertical monitor on the desk is also used, as well as the tablet computers brought in by the individuals. Setup 3: the tabletop computer is integrated into the research leader's desk; both the vertical surface and the tablet PCs are used in the meetings.

Table 1 Anonymized study participants

Participant | Role | Group | Meeting setup
John | Research leader | All | All
Ali | PostDoc | A | 1
Steven | PostDoc | B | 2, 3
Darren | Visiting student | A | 1
Josh | Visiting student | B | 2
Paul | 1st year PhD | A | 1
Anthony | 1st year PhD | B | 2, 3
James | 2nd year PhD | A | 1
Mike | 2nd year PhD | A | 1
George | 2nd year PhD | B | 2, 3
Charles | 3rd year PhD | A | 1
Keith | 3rd year PhD | A | 1
Ralph | 3rd year PhD | A | 1

The most frequent use of indirect deictic gestures was observed in meeting setup 1, where the vertical display provided the shared view of the content. Here, John, the research leader, frequently used mouse gestures to indicate the areas of content to which he was referring and, on occasion, other participants directed attention to content in a similar way by taking control of the mouse. Furthermore, John often performed kinetic mouse gestures over displayed content, sweeping over parts of the screen to describe a concept or, more specifically, to indicate the movement of particles in the discussed experiments.

Due to the participants' positions relative to the shared display, a high proportion of the deictic gestures in setup 1 were made over a distance (Fig. 2, left). Participants sitting closer to the display were able to point directly to parts of the screen to indicate what they were referring to (Fig. 2, right).

In meeting setups 2 and 3, shared content was displayed on the tabletop computer. Gesturing to the content was markedly different from setup 1, with a high incidence of direct deictic gestures from both the meeting leader and the other meeting participants. The gestural language increased in complexity to include one- and two-handed gestural walkthroughs and finger tracing over content to support verbal explanations. Figure 3 shows a sequence of multi-touch gestures on the tabletop used in communication about the content.

Such direct deictic gestures and gestural walkthroughs were also observed with the research leader across all three setups on his tablet PC, as he used them to elaborate and clarify the meaning of sketches he made on the device.

Fig. 2 Distant and direct deictic gesture. Proximity to the display enabled more precise pointing gestures (right).

Fig. 3 Example of a gestural walkthrough on the MS Surface. The applications were not touch sensitive; thus, gestures could be freely used for pointing, supporting communication.

4.2 Content display across devices

In addition to the gestures, we observed how participants managed content displays and inscriptions across devices and how that affected the type and the role of the gestures used. The use of multiple devices in meetings amplified the importance of the spatial configuration of devices and participants. By describing two specific scenarios (Scenarios 1 and 2 below), we illustrate how, in some configurations, the replication of content across devices led to a split in the participants' attention, causing individuals to miss explanatory gestures in the attempt to optimize their view of the content.

4.2.1 Content replication and spatial congruence with gestures

Throughout the meetings, the research leader took notes on behalf of the group using the MS OneNote application on his tablet. These inscriptions often consisted of sketches that depicted experimental setups, graphs, or physical processes. In order to make them visible to the group, he would display them on the shared monitor. While this action increased the visibility of the content, it had a knock-on effect on certain interpretative gestures:

Scenario 1 (Setup 1). Paul and John are in a one-to-one meeting, discussing a series of graphs which Paul generated in the laboratory and laid out on a PowerPoint slide for the meeting. Paul is puzzled by an aspect of the graph, and John is explaining what the graph lines depict, commenting on their shapes. In order to elaborate further, John displays the contents of his tablet on the shared vertical monitor and, using his stylus, begins to make an explanatory sketch on his tablet. The sketch, now also displayed on the desk monitor, facilitates explanation and learning.

At first, Paul views John's sketching on the monitor. John talks about the sketch while producing it and, after several seconds, begins to gesture to specific parts of the graph with his stylus, explaining what they represent. At this point, Paul switches his attention from the desk monitor to the tablet and keeps it there for the rest of the discussion, as John continues to sketch, interspersing explanatory gestures (Fig. 4).

Fig. 4 Viewing gestures on the tablet. The participant's attention is switched from the larger view of the shared content and inscriptions to the tablet PC, where content and gestures are unified.

We can extract several important points from this example scenario:

• When creating a sketch to help explain the contents of Paul's summary slides, John uses his tablet as an inscription device.
• The inscription becomes a resource for interpretation and action itself. This is evident from the periodic gesturing to parts of the inscriptions and the interpretative speech acts that occur throughout the creation of the inscription.
• As John begins gesturing to the inscription on the tablet, a split between content and interpretation is created. The interpretative gestures made by John are not available on the desk display.
• Paul switches attention to the tablet to mediate the bifurcation of his attention caused by the two displays. He chooses the display which unifies inscription and gestures.

Scenario 2 (Setup 3). A group meeting is taking place between the research leader, first- and second-year PhD students Anthony and George, and a PostDoc named Steven. John and George sit on opposite sides of the meeting desk, facing each other over the tabletop computer. Steven and Anthony sit immediately to the right of John (Fig. 5).

As in the previous example, John is sketching on the tablet and periodically making gestures to explain his sketch and its relevance to the discussion at hand. Throughout, Steven and Anthony, sitting immediately to the right of John, view the sketches on the tablet itself, where they can see both the sketch and the gestures. George, due to his distance from and orientation to the tablet, views John's sketches on the vertical monitor, occasionally looking over to view John's gestures above the tablet. Furthermore, the main slides of the meeting are presented on the tabletop incorporated into the office desk.

The meeting is approximately 30 min in when, due to the failure of the wireless network (an occasional event), the connection between the vertical display and the tablet is severed. Thus, the tablet becomes the sole display of the meeting notes and sketches. To compensate, John repositions the tablet, placing it on the tabletop in such a way that all meeting members can view the content.

This movement of the tablet to a more central and accessible location has an immediate effect on the flow of the meeting. Because George is now closer to the tablet, he can see the content and the gestures within the same visual space. He can also gesture over the content himself (Fig. 5, right). Indeed, Fig. 5 shows John motioning over part of the diagram with his stylus (left), to which George responds with an accompanying gesture (right).

This example provides further insight into the users' response to the spatial separation of content and interpretative gesture as multiple displays are used. The participants mediate the tension between the optimal view of the content on a display closer to them and the important gestures and inscriptions that may occur outside that visual space. They refocus their attention on the area where content, inscriptions, and gestures are unified and thus provide higher value than the content display itself. In addition, when the content and inscription are brought into the user's physical proximity, as the mobility of the tablet PC allows, the person can actively participate in gesturing and inscription. This resonates with Hawkey et al.'s [8] finding that the proximity of content expands the user's interpretive modes of action.

4.2.2 Content binding

In the observed meetings, the resources used in the discussions were typically displayed on different devices: sketches on the tablet PCs, notes on the vertical display, and slides on the tabletop. Explanations often required repeated cross-referencing of distinct resources, using gestures to direct the users' attention to a specific resource. Our analysis revealed a central role for the specific gesture patterns that we refer to as binding gestures. They serve to indicate associations and make explicit the connections among displayed resource items.

Binding gestures manifest themselves differently across the meeting setups, ranging from sequences of direct and indirect gestures that indicate relationships, to synchronous directed gestures that are used to bind two objects and emphasize their link (summarized in Table 2). We provide example scenarios (Scenarios 3, 4, and 5) where four different types of binding gestures occur.

Scenario 3 (Setup 1): Hybrid sequential binding. John and Peter are in a one-to-one meeting, and John has produced a sketch to describe a process that Peter needs to follow in his next experiment. On completing the sketch, John turns to Peter's slides on the shared vertical display and uses the mouse to gesture to a graph which Peter included in the presentation. Immediately upon doing so, John points directly to a specific part of his sketch on the tablet and then back to the monitor, emphasizing the connection between the two. This form of sequential binding is a hybrid of an indirect gesture via the mouse and a direct touch on the tablet.

Fig. 5 Content, inscriptions, and gestures occur above the tablet, which is brought centrally to bridge the spatial gap between the content and the participants.

Table 2 Observed binding gestures

Setup | Type of binding | What happens | Form of gesture | Function of gesture
1 | Hybrid sequential binding | Binding between summary on monitor and sketch on tablet | Mouse point to summary, followed by direct point to sketch | Associate graph and sketch
2 | Direct sequential binding | Binding between sketch on tablet and summary displayed on tabletop | Direct point to sketch, followed by direct point to summary | Associate graph and sketch; explain graph by way of sketch
2, 3 | Direct synchronous binding | Binding between sketch on tablet and summary displayed on tabletop | Synchronous direct point to tablet and tabletop | Associate sketch and summary image; explain sketch with reference to image
2, 3 | Direct synchronous binding on single device | 'Binding' between two different parts of the summary displayed on tabletop | Synchronous pointing to two points on the tabletop | Explain the relationship between two elements of the summary

Scenario 4 (Setup 2): Direct sequential binding. As in the previous example, meeting participants are discussing summary slides prepared by a student. However, in this setup, the slides are displayed on the tabletop computer and John is resting his tablet on the tabletop surface. He and Zak, a PostDoc, are tracing their fingers over a graph displayed on the tabletop as they talk. John begins to describe a solution and starts to sketch it on his tablet as he talks.

As he explains the finished sketch, he motions over it extensively. He then makes a number of binding gestures between parts of the slide on the tabletop and the sketch on the tablet. He does this by tracing over parts of the summary with his hands as he talks and then pointing to parts of the tablet sketch, saying: 'this [pointing to the summary on the tabletop] is this [pointing to part of the tablet sketch]' (Fig. 6). Unlike the previous example, this binding is achieved through a sequence of two direct deictic gestures.

Fig. 6 Direct sequential binding of the content displayed on the tabletop and the sketch created on the tablet PC.

Scenario 5 (Setup 3): Direct synchronous binding. John is meeting with two PhD students and one PostDoc, discussing a set of experimental results displayed on the tabletop. The results include an overexposed image taken during a recent experiment. John moves the tablet from his lap to a position close to the image and begins to sketch on the tablet. He gestures over the completed sketch with his stylus while explaining its meaning. He holds the stylus on a part of the sketch and simultaneously places and holds his finger on the tabletop image, stating: 'it depends if the camera saturates—you can be measuring here to here [indicating one part of the tablet sketch] or here to here [indicating another part of the tabletop slide]' (Fig. 7).

As in the previous example, this binding gesture helps John elaborate the relationship between displayed content, i.e., the image in the presentation and the graph he sketched. However, the mode of interaction differs in that the direct deictic gestures occur synchronously across the two devices.

Fig. 7 Examples of direct synchronous gestures binding the inscription created on the tablet and the content displayed on the tabletop.

4.2.3 Single-device content binding

In addition to the cross-device binding gestures, we observed a similar pattern on the tabletop computer alone.

Scenario 6 (Setup 2): Direct synchronous binding on a single device. John is meeting with two PhD students and one PostDoc, discussing a slide presented on the tabletop. Addressing a particular graph within the slide, John simultaneously places the index fingers of both hands on two parts of the graph, saying 'there should be no overlap between this transition and this transition. What is the energy gap between these transitions?' (Fig. 8, left). Through this gesture, John binds two aspects of the summary content to illustrate a relationship between two different points in the image and to pose a question about that relationship.

Fig. 8 Direct synchronous binding on a single device; this occurred on the tabletop only.

4.2.4 Implications of the binding gestures

The ease with which binding gestures occur is strongly influenced by the setup of the meeting place. In setup 1, binding occurred through two different modes of gesture: indirect gesturing to content via the mouse, followed by a direct deictic gesture to the tablet. Compared to the other meeting setups, this appeared cumbersome due to the different modes of interaction and the wide visual frame that the gestures need to cover, from the vertical monitor to the horizontal tablet and back again.

In contrast, in meeting setups 2 and 3, both the tabletop and the tablet are on the same horizontal plane when the binding occurs. This appears to increase the ease with which binding gestures can be performed. Furthermore, the mobility of the tablet PC appears to play a significant role, enabling the inscription space to be brought into close proximity to the content and, consequently, increasing the ease of content binding. Overall, the introduction of the tabletop in setups 2 and 3 extended the gestural language and was instrumental in providing the scope for such gestures to emerge.
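To make the distinction between these binding patterns concrete, the sketch below shows one way a multi-device meeting system could classify a pair of deictic events into the binding types summarized in Table 2. It is a minimal illustration rather than part of the studied system; the event fields and the 1 s synchrony and 5 s sequence windows are assumed values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

SYNC_WINDOW_S = 1.0   # assumed: near-overlapping touches within 1 s count as synchronous
SEQ_WINDOW_S = 5.0    # assumed: consecutive points within 5 s count as sequential

@dataclass
class DeicticEvent:
    device_id: str    # e.g. "tablet-john", "tabletop", "vertical-display"
    timestamp: float  # seconds since the start of the meeting
    direct: bool      # True for a touch/stylus point, False for a mouse-pointer gesture

def classify_binding(a: DeicticEvent, b: DeicticEvent) -> Optional[str]:
    """Map a pair of deictic events to one of the binding types in Table 2."""
    first, second = sorted((a, b), key=lambda e: e.timestamp)
    gap = second.timestamp - first.timestamp

    if gap <= SYNC_WINDOW_S and first.direct and second.direct:
        if first.device_id == second.device_id:
            return "direct synchronous binding on single device"
        return "direct synchronous binding"

    if gap <= SEQ_WINDOW_S and first.device_id != second.device_id:
        if first.direct and second.direct:
            return "direct sequential binding"
        if first.direct != second.direct:
            return "hybrid sequential binding"

    return None  # unrelated deictic events

# Example: John points at the summary with the mouse, then touches his tablet sketch.
mouse_point = DeicticEvent("vertical-display", 12.0, direct=False)
tablet_point = DeicticEvent("tablet-john", 14.5, direct=True)
print(classify_binding(mouse_point, tablet_point))  # hybrid sequential binding
```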

4.3 Summary

The manner in which participants interacted around meeting content differed significantly between meeting setup 1 and setups 2 and 3.

Table 3 illustrates the differences in the use of touch and inscription and in the nature of the gestural language between the setups. It shows that, in setup 1, participants resorted to indirect mouse-based gestures and, overall, employed a narrower range of gestures. In setups 2 and 3, the participants utilized a wider set of direct gestures, encompassing content on one and two devices.

We attribute these changes between setup 1 and setups 2 and 3 to the fact that the screen size and horizontal aspect of the tabletop computer display:

• Afforded a more complex gestural language
• Showed content large enough for participants to perform binding gestures within a single item of content
• Coupled with the mobility of the tablet, afforded participants the opportunity to position devices in such a way as to align content, gesture, and inscription in the same visual frame.

Taken together, these factors appear to have enriched the collaborative sense-making behavior during the meetings.

Table 3 Differences in observed gestures across meeting setups (✓ observed, ✗ not observed)

Touch, gesture, and inscription actions | Setup 1 | Setups 2 and 3
Indirect deictic | ✓ | ✗
Indirect hybrid binding | ✓ | ✗
Direct deictic | ✓ | ✓
Explanatory over inscription | ✓ | ✓
Tablet inscriptions | ✓ | ✓
Gestural walkthrough over meeting content | ✗ | ✓
Two-handed gestures over meeting content | ✗ | ✓
Direct sequential binding | ✗ | ✓
Direct synchronous binding on one, or across two, devices | ✗ | ✓

Drawing connections and conclusions is an important aspect of sense-making. Supporting binding gestures in a multi-device environment would increase the ability to pull together, view, compare, and discuss data from disparate sources and thus aid scientific discovery. This would be an important extension of the projects that have attempted to do so, e.g., in the context of scientific work, through the use of databases [29], electronic lab books [32], and tabletop computing [28].

5 Design recommendations

5.1 Alignment of content, inscriptions, and gestures

Our in situ study of collaborative meetings revealed the complementary nature of gestures and inscriptions as a support for discussions and collective sense-making. However, the use of multiple devices presents a significant challenge to preserving the congruence between the display of the content and the visibility of the inscriptions and gestures that unfold during interactions with the content and among the meeting participants. As noted, the gestures are likely to occur above the device used by the speaker, while the content may be projected on a shared display for the participants' convenience. This causes a split of the participants' attention between two focal points that are not in the same visual space and can impact the effectiveness of the meeting. In the observed meetings, the individuals chose to trade the quality of the content display for a unified view of the gestures, inscriptions, and content.

This suggests techniques to project or simulate gestures on the shared display. While touch gestures could be easily captured and overlaid over the content, capturing the gestures above the display surfaces would require more sophisticated approaches. They could be registered and represented digitally with the use of 3D gesture detection and tracking technologies. C-Slate by Izadi et al. [12] demonstrates that gesture tracking can be achieved in real time and overlaid on top of the content to support turn-taking in remote authoring scenarios. Generally, the existing techniques for projecting gestures into remote collaboration spaces [17] could be adopted for that purpose. At the same time, a literal gesture projection may obstruct the view of the content and, thus, we may need to support different trade-offs between content visibility and gesture visibility. A plausible approach would be to simulate gesture effects by first identifying the type and objective of the gesture, e.g., an indirect deictic gesture aimed at drawing attention to a specific graph in the document or a kinetic gesture used to underscore a formula in a paragraph. Once the gesture characteristics are known, various display strategies can be used to highlight or animate the elements that are the focus of the gestures, touch, and inscription.
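As an illustration of this 'simulate rather than project' strategy, the sketch below maps a recognized gesture, described by its type and the content region it targets, to a highlight or animation command for the shared display. The gesture type names, the Region structure, and the render_highlight call are hypothetical placeholders; a real system would obtain these from the gesture tracker and the display toolkit actually in use.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """Bounding box of the content element a gesture refers to (document coordinates)."""
    x: float
    y: float
    width: float
    height: float

# Assumed mapping from gesture type to a display strategy on the shared screen.
GESTURE_EFFECTS = {
    "indirect_deictic": "outline",    # draw a border around the referenced element
    "direct_deictic": "spotlight",    # dim everything except the referenced element
    "kinetic_sweep": "animate_path",  # replay the sweep as an animated trail
}

def simulate_gesture(gesture_type: str, target: Region, display) -> None:
    """Render a gesture's effect on the shared display instead of projecting the hand itself."""
    effect = GESTURE_EFFECTS.get(gesture_type)
    if effect is None:
        return  # unknown gesture types are ignored rather than guessed at
    # `display` stands in for a shared-display client; render_highlight is hypothetical.
    display.render_highlight(effect=effect, region=target, duration_s=2.0)

class SharedDisplayStub:
    """Stand-in for the shared display, used only to make the example runnable."""
    def render_highlight(self, effect, region, duration_s):
        print(f"{effect} at ({region.x}, {region.y}) for {duration_s} s")

# Example: a direct deictic touch on a graph region becomes a spotlight on the shared display.
simulate_gesture("direct_deictic", Region(120, 80, 200, 150), SharedDisplayStub())
```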

In relation to the adoption of the tabletop computer within the meeting space, we should consider methods to address the content orientation issue inherent to horizontal surfaces (Fig. 5). While tablet PCs can be temporarily re-oriented to improve the content view for an individual, that is harder to achieve for a group of individuals around the tabletop. In some instances, it would be appropriate to create a projection onto a vertical display, showing the first-person view of the contents referred to by the speaker. An added benefit would be to bring into the same visual view multiple documents that are subject to content binding across horizontal surfaces involving the tabletop and tablet PCs. In other instances, it may be plausible to replicate that first-person view from the tabletop onto each tablet PC.
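A minimal sketch of the proposed first-person projection follows, under the assumption that the speaker's seating position around the tabletop is known as an angle: the tabletop content is rotated about the table centre by that angle before being mirrored to the vertical display (or to the tablet PCs), so that the projected view matches the speaker's reading orientation. The angle source, coordinate system, and values are assumptions for illustration.

```python
import math

def first_person_transform(x: float, y: float, speaker_angle_deg: float,
                           table_w: float, table_h: float) -> tuple:
    """Rotate a point in tabletop coordinates about the table centre so that the
    mirrored view on the vertical display is upright for the speaker."""
    cx, cy = table_w / 2.0, table_h / 2.0
    # Rotate by the negative of the speaker's seating angle (0 degrees = the 'home' edge).
    theta = math.radians(-speaker_angle_deg)
    dx, dy = x - cx, y - cy
    rx = cx + dx * math.cos(theta) - dy * math.sin(theta)
    ry = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return rx, ry

# Example: George sits at the opposite edge of the tabletop (180 degrees), so the content
# is rotated by 180 degrees before being shown upright for him on the vertical display.
print(first_person_transform(100.0, 50.0, 180.0, table_w=1024.0, table_h=768.0))
# approximately (924.0, 718.0)
```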

5.2 Touch and gesture space

The user activities observed during the study revealed

intricate connections between touch, gesture, and inscrip-

tions. Thus, designing adequate support requires a proper

care to avoid possible conflicts and inhibitions of one type

of user activities over the others.

Inscriptions and gestures are essential for communica-

tion [4, 6, 30]. Both fine-level sketches on the tablets and

coarser touch-based annotations on the tabletop support

discussions. It is, thus, essential that the display surfaces

preserve the space for inscription and gestures to occur.

Inscription should be treated as a general facility, easily

accessible across devices, and supporting different types,

styles, and purposes of inscription. Since inscriptions often

refer to a specific piece of content, it is important to pre-

serve the associations and make them easily retrievable.

Furthermore, adjacency of the content and inscriptions

within the same plane appears to help with their binding

through gestures and should be supported.
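
One way to preserve these associations is to store every inscription with an explicit anchor to the content element it refers to, so it can be retrieved later and redisplayed next to that element on any device. The following data model is a minimal sketch under that assumption; it is not the one used by the applications observed in the study.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ContentAnchor:
    document_id: str        # e.g., the report or spreadsheet under discussion
    element_id: str         # e.g., a chart, paragraph, or cell range
    device_id: str          # tablet PC or tabletop on which it was displayed

@dataclass
class Inscription:
    strokes: list           # raw ink strokes in device coordinates
    kind: str               # "sketch", "annotation", "handwriting", ...
    anchor: ContentAnchor   # the content the inscription refers to
    author: str
    created: datetime = field(default_factory=datetime.now)

class InscriptionStore:
    """Keeps inscriptions retrievable by the content they annotate, across devices."""
    def __init__(self):
        self._by_element = {}   # (document_id, element_id) -> [Inscription, ...]

    def add(self, inscription: Inscription) -> None:
        key = (inscription.anchor.document_id, inscription.anchor.element_id)
        self._by_element.setdefault(key, []).append(inscription)

    def for_element(self, document_id: str, element_id: str) -> list:
        """All inscriptions attached to a given piece of content, from any device."""
        return self._by_element.get((document_id, element_id), [])
```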

Similarly, gestures above and in front of the display

surfaces are essential for communication. There have been

attempts to use that space for additional display function-

alities [13]. However, that has to be considered with the

full understanding of gesture roles, particularly in multi-device environments. Our observations of sequential

and synchronized deictic gestures are good examples of

how new gestures form as an integral part of the sense-

making process. In contrast to the standard touch gestures

associated with specific commands on the tablets and

tabletops, the observed gestures have the function of

binding-related content. They extend the notion of multi-

touch across the boundaries of individual devices to convey

the association among different content pieces.

This practice raises important requirements for the design of touch support. First, it calls for consistency and coordination of touch commands across devices to

optimize the user experience. Second, it requires that the


selection of touch commands does not overlap with the

direct deictic gestures that are likely to evolve during

sense-making and may be confused with the touch com-

mands. In our observations, the conflict did not occur

because the MS Office applications used by the participants

were not touch enabled. Otherwise, the binding gestures

that naturally evolved, as shown in Fig. 7, would have

caused unintended movements of documents and activation

of application windows. That would have likely deterred

the participants from using touch and gesture and thus

diminished the value of the technology in supporting meeting communication.
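
Had the applications been touch enabled, an intermediate layer would have been needed to tell binding and deictic gestures apart from operational commands, for example by treating near-synchronous, sustained touches on different devices as a binding act and withholding the usual window-manipulation command. The heuristic below is one hypothetical sketch of such a filter; the thresholds and the application methods are assumptions, not observed system behavior.

```python
import time

BINDING_WINDOW_S = 0.5   # assumed: touches on two devices within 0.5 s may form a binding gesture
HOLD_THRESHOLD_S = 0.3   # assumed: a held (rather than tapped) touch suggests deictic/binding intent

class TouchEvent:
    def __init__(self, device_id, element_id, duration=0.0, timestamp=None):
        self.device_id = device_id
        self.element_id = element_id
        self.duration = duration
        self.timestamp = timestamp if timestamp is not None else time.time()

def is_binding_pair(a: TouchEvent, b: TouchEvent) -> bool:
    """Heuristic: two sustained touches on different devices, close in time, bind content."""
    return (a.device_id != b.device_id
            and abs(a.timestamp - b.timestamp) <= BINDING_WINDOW_S
            and a.duration >= HOLD_THRESHOLD_S
            and b.duration >= HOLD_THRESHOLD_S)

def dispatch(touch: TouchEvent, recent_touches: list, application) -> None:
    """Withhold the normal command when the touch looks like part of a binding gesture."""
    for other in recent_touches:
        if is_binding_pair(touch, other):
            application.show_binding(touch.element_id, other.element_id)  # hypothetical visual link
            return
    application.handle_touch_command(touch)  # otherwise, fall back to the ordinary touch command
```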

5.3 Designing for interaction across multiple devices

Support for intuitive and natural use of devices and appli-

cations has been a long-pursued goal in computer system design. Design efforts range from techniques for soliciting

natural user gestures to ensure easy-to-learn interaction

models to adoption of user-defined sets of gestures [34].

However, this very objective makes it likely that touch and

gestures reserved for operating the devices and applications

resemble user actions that naturally occur in a broader con-

text. Confined to individual devices, such design choices

may not interfere with the user’s interaction with the envi-

ronment. However, when the user’s actions involve multiple

devices and contexts with increased interactions, as in the observed meeting setting, the possibility of such interference is higher.

Thus, designing for the use of multiple devices in concert

requires deeper design considerations.

The study highlights two aspects that appear critical for

success, i.e., the need for design practices that

• Define the input and interaction mechanism for a device

and across devices by considering naturally emerging

user behavior around and beyond the technology

• Provide clear delineation of the boundary, spatial or

otherwise, within which these input and interaction

mechanisms have influence.

Essentially, we promote a holistic approach to designing

input and interaction mechanisms for devices that considers relevant aspects of human behavior, including touch, gesture, speech, and the like, and proactively designs for possible clashes with and impediments to such behavior in common contexts. The observed pointing and binding gestures, for example, could have been wrongly interpreted as standard operational gestures and caused unintentional movement of application windows, had all the applications on the tablet computer been multi-touch enabled.

Since it is difficult to eliminate potential conflicts in all

circumstances, we propose that the scope of device-specific

gestures, touch, and speech commands be clearly explicated

in the design. This has two desired effects. First, it

empowers the users to work around the limitations imposed

on their behavior. Second, it enables other designers to

develop intervening technologies.
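
Explicating the scope could be as lightweight as each device or application publishing a small, machine-readable manifest of the input channels and commands it claims and the region over which it listens. The format below is invented purely for illustration.

```python
# Hypothetical scope manifests: each device declares which input channels and
# commands it claims, and where (spatially) those claims apply.
TABLETOP_MANIFEST = {
    "device": "tabletop-1",
    "claims": [
        {"channel": "touch", "commands": ["move-window", "resize", "flick"],
         "region": "surface"},        # only touches on the surface itself
    ],
}

TABLET_MANIFEST = {
    "device": "tablet-A",
    "claims": [
        {"channel": "stylus", "commands": ["handwriting"], "region": "screen"},
        {"channel": "touch", "commands": ["scroll", "zoom"], "region": "screen"},
    ],
}
```

With such manifests exposed, users can anticipate which of their movements a device will intercept, and other designers can build mediating components that detect overlapping claims before they clash.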

The problem of conflicting input interactions has been

an issue even within a single device. For example, early

handheld devices involved manual switching between the

handwriting input mode and the standard touch mode. In

order to enable use of multiple devices, we need to gen-

eralize the approach of discrete mode switches to methods

of adjusting the scope of interaction mechanisms for each

device. These methods would, at minimum, take into

account the co-presence of other devices and applications

and the scope of their interaction mechanisms.
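
Generalizing discrete mode switches, a coordinating component could collect the manifests of co-present devices and narrow each device's active scope to whatever the current activity requires. The sketch below assumes the manifest format introduced above and an invented policy for sense-making; it is illustrative rather than prescriptive.

```python
class ScopeCoordinator:
    """Adjusts which input claims are honored on each co-present device."""

    def __init__(self):
        self.manifests = {}   # device id -> published manifest
        self.active = {}      # device id -> set of currently honored commands

    def register(self, manifest: dict) -> None:
        device = manifest["device"]
        self.manifests[device] = manifest
        self.active[device] = {c for claim in manifest["claims"] for c in claim["commands"]}

    def enter_activity(self, activity: str) -> None:
        """Narrow each device's scope for the given activity (assumed policy)."""
        if activity == "sensemaking":
            # During discussion, keep only stylus handwriting and suspend operational
            # touch, so deictic and binding gestures are not misread as commands.
            for device, manifest in self.manifests.items():
                self.active[device] = {
                    c for claim in manifest["claims"] for c in claim["commands"]
                    if claim["channel"] == "stylus"
                }

    def accepts(self, device: str, command: str) -> bool:
        return command in self.active.get(device, set())
```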

For example, suppose we plan to implement a 3D gesture tracking mechanism to capture hand movements above the tablet PC and project them on the vertical display

alongside the content that is shown on the tablet. This may

be achieved with the camera integrated with the tablet or

Kinect cameras positioned on the table used by the meeting

participants. Each of these solutions will narrow the space

around technology that is normally available for human-to-

human interaction. In order to minimize this effect, the

scope of the Kinect cameras and gesture tracking, for

example, should be adjustable to focus on the relevant device for the duration of the gesticulation.
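
Concretely, the tracking pipeline would need to know which device is currently the locus of gesticulation so that it interprets hand movement only within that device's bounding volume and leaves the rest of the room to ordinary human-to-human interaction. Everything in the sketch below, including the device volumes and the shape of the tracked data, is assumed rather than taken from the Kinect SDK.

```python
from dataclasses import dataclass

@dataclass
class Volume:
    """Axis-aligned region above a device, in the room's coordinate frame (meters)."""
    x: tuple
    y: tuple
    z: tuple

    def contains(self, point) -> bool:
        px, py, pz = point
        return (self.x[0] <= px <= self.x[1]
                and self.y[0] <= py <= self.y[1]
                and self.z[0] <= pz <= self.z[1])

# Assumed volumes above the devices currently in use.
DEVICE_VOLUMES = {
    "tablet-A":   Volume(x=(0.0, 0.3), y=(0.6, 0.9), z=(0.75, 1.05)),
    "tabletop-1": Volume(x=(0.4, 1.4), y=(0.3, 1.1), z=(0.75, 1.15)),
}

def gestures_for_display(hand_points, focus_device: str):
    """Keep only tracked hand points above the device the speaker is currently using."""
    volume = DEVICE_VOLUMES[focus_device]
    return [p for p in hand_points if volume.contains(p)]
```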

Similar adjustment and tuning of interaction mechanisms

is illustrated in setups 2 and 3 by the use of the tablet PCs

and tabletop computer, both touch enabled. During sense-

making activities, the touch input on the two devices was

not required except for handwriting on the tablet. Thus, a

mechanism for adjusting the scope of input functionality on

both devices would be highly desirable.
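
Continuing the ScopeCoordinator sketch above, and assuming the same invented device names and manifests, such an adjustment would amount to a single scope change at the start of the discussion:

```python
coordinator = ScopeCoordinator()
coordinator.register(TABLET_MANIFEST)
coordinator.register(TABLETOP_MANIFEST)

# During sense-making, only stylus handwriting on the tablet stays active;
# operational touch on both devices is suspended until the discussion ends.
coordinator.enter_activity("sensemaking")
assert coordinator.accepts("tablet-A", "handwriting")
assert not coordinator.accepts("tabletop-1", "move-window")
```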

6 Conclusions and future work

In this paper, we present findings from an observational study

of technology use in collaborative settings, such as meeting

places, where individuals use multiple devices and engage

around shared content through discussions and sense-

making. We focus on the role and the meaning of the

emerging gestures, touch, and inscriptions, some of which

are primarily defined by the functionality of the devices

and others that result from communication and engagement

around the content. The observed environment enabled us

to study the changes in the user behavior as the meeting

practices changed from using HP tablets with stylus input

and a shared vertical display to an extended setup that also

included MS Surface 1.0, a multi-touch tabletop computer.

We outline two types of challenges that the use of

multiple devices poses in collaborative settings.

The first is related to the configuration of devices to provide

optimal view of the content display and the gestures, touch,


and inscriptions that can be initiated by any of the partic-

ipants. We observe the ways the participants tried to

resolve the separation of gestures and content by bringing

them into the same focal point. That was successfully

managed in the case of horizontal surface and tablets that

are mobile and easy to reposition. However, other situa-

tions, e.g., those that involve fixed vertical displays, require

a different approach, such as gesture tracking and repre-

sentation alongside the content.

The second challenge involves a possible conflict between

the gesture, touch, and inscription support on the devices

and the user behavior that naturally occurs in communi-

cation and sense-making activities. Because the latter may

include gestures and touch actions that resemble those

incorporated in the device features, there is a risk of mis-

interpreting and inhibiting the natural user behavior. Our

study uncovered the important role of binding gestures across

all the observed meeting settings. They, for example, could

not be freely applied by the users without interfering with

the current touch support on the tabletop computer.

As we continue to enhance the use of computing devices

through intuitive interaction modes that include gesture,

touch, and speech, it is important to address the issue of

reconciling the design choices across different modes of

interaction and minimizing the possible inhibition of the

natural behavior in the space where technology is used.

The latter is particularly important when technology has a

supportive role and, thus, is meant to enhance rather than limit user actions.

Using the example of binding gestures, we illustrate the

need for a unified approach that takes into consideration all

three modes, touch, gesture, and inscription, simultaneously to arrive at the optimal design within individual devices and across devices. Furthermore, as these behaviors occur

naturally outside the use of technology, it is important to

design for a clear boundary, i.e., space of influence for each

interaction mechanism. By delineating and exposing the

scope of each interaction mechanism, we enable users to

manage the potential conflicts effectively and we provide

opportunities for the designers to devise mediating solu-

tions, such as context-sensitive adjustment of the interac-

tion mechanisms. The latter is an extension of the mode-

switching mechanisms that have been adopted in the past

when similar gestures may be used for different actions on

the device, e.g., touch in handwriting mode and touch in

operational mode on early handhelds. In contrast to the

single-device management of the mode, the adjustment of

interaction mechanisms across devices requires a more

sophisticated model that takes into account the spatial

configuration and the context of use.

The presented research can be extended by considering

devices with 3D gesture and speech input in collaborative

settings. We anticipate that they will further emphasize the

importance of an adjustable scope of input mechanisms as

multiple devices are used in concert. Furthermore, both of

these modes more pervasively invade the space around

the technology, leaving less room for physical movement

and voice communication among the participants. Investi-

gation into these issues is beyond the standard usability

work and requires observations of real environments where

natural user behavior unfolds around the technology. We

expect that the latest efforts behind commercialization of

dialog-enabled systems, such as Siri [27] for interaction

with online services and applications that utilize 3D gesture

tracking, such as games for Microsoft Xbox Kinect [22],

will provide opportunities to observe adoption of such

technologies in a variety of usage scenarios. They will

enable us to study the breakdowns and workarounds that

emerge to address the conflicting aspects of different input

and interaction technologies as they are used in the same

setting.

References

1. Andreychuk D, Ghanam Y, Maurer F (2010) Adapting existing

applications to support new interaction technologies: technical

and usability issues. In: Proceedings of the 2nd ACM SIGCHI

symposium on engineering interactive computing systems (EICS

‘10). ACM, New York, NY, USA, pp 199–204

2. Bekker MM, Olson JS, Olson GM (1995) Analysis of gestures in

face-to-face design teams provides guidance for how to use

groupware in design. In: Proceedings of the conference on

designing interactive systems, pp 157–166

3. Biehl JT, Baker WT, Bailey BP, Tan DS, Inkpen KM, Czerwinski

M (2008) Impromptu: a new interaction framework for support-

ing collaboration in multiple display environments and its field

evaluation for co-located software development. In: Proceeding

of SIGCHI conference on human factors in computing systems

4. Cortina LJ, Zhao Q, Cobb P, McClain K (2003) Supporting

students’ reasoning with inscriptions. In: Proceedings of 6th

international conference of learning science, pp 124–149

5. Dachselt R, Buchholz R (2009) Natural throw and tilt interaction

between mobile phones and distant displays. In: Proceedings of

the 27th international conference extended abstracts on human

factors in computing systems 2009 (CHI ‘09). ACM,

pp 3253–3258

6. Goodwin C (2003) Pointing as situated practice. In: Kito S (ed)

Pointing: where language, culture, and cognition meet. Lawrence

Erlbaum Associates, Inc., Mahwah, NJ, pp 217–242

7. Ha V, Inkpen KM, Whalen T, Mandryk RL (2006) Direct

intentions: the effects of input devices on collaboration around a

tabletop display. In: Proceedings of the first IEEE international

workshop on horizontal interactive human-computer systems,

pp 177–184

8. Hawkey K, Kellar M, Reilly D, Whalen T, Inkpen KM (2005)

The proximity factor: impact of distance on co-located collabo-

ration. In: Proceedings of the 2005 international ACM SIG-

GROUP conference on supporting group work

9. Hardy R, Rukzio E (2008) Touch & interact: touch-based inter-

action of mobile phones with displays. In: Proceedings of the

10th international conference on human computer interaction

with mobile devices and services, Amsterdam, The Netherlands


10. Hinckley K, Ramos G, Guimbretiere F, Baudisch P, Smith M

(2004) Stitching: pen gestures that span multiple displays. In:

Proceedings of the working conference on advanced visual

interfaces, Gallipoli, Italy

11. Hinckley K (2003) Synchronous gestures for multiple persons

and computers. In: Proceedings of the 16th annual ACM sym-

posium on user interface software and technology, pp 149–158

12. Izadi S, Agarwal A, Criminisi A, Winn J, Blake A, Fitzgibbon A

(2007) C-Slate: exploring remote collaboration on horizontal

multi-touch surfaces. Proc IEEE Tabletop 2007:3–10

13. Izadi S, Hodges S, Taylor S, Rosenfeld D, Villar N, Butler A,

Westhues J (2008) Going beyond the display: a surface tech-

nology with an electronically switchable diffuser. Proc ACM

UIST 2008, pp 269–278

14. Seifert J, Simeone A, Schmidt D, Holleis P, Reinartz C, Wagner

M, Gellersen H, Rukzio E (2012) MobiSurf: improving co-

located collaboration through integrating mobile devices and

interactive surfaces. In: Proceedings of the 2012 ACM interna-

tional conference on interactive tabletops and surfaces (ITS ‘12).

ACM, New York, NY, USA

15. Kendon A (1988) How gestures can become like words. In:

Potyatos F (ed) Crosscultural perspectives in nonverbal commu-

nication. Hogrefe, Toronto, pp 131–141

16. O’Hara K, Harper R, Mentis H, Sellen A, Taylor A (2013) On the

naturalness of touchless: putting the ‘‘interaction’’ back into NUI.

ACM Trans Comput Hum Interact 20(1):1–25

17. Kirk D, Crabtree A, Rodden T (2005) Ways of the hands. In:

Proceedings of the ninth conference on european conference on

computer supported cooperative work, September 18–22, 2005,

Paris, France, pp 1–21

18. Kray C, Nesbitt D, Dawson J, Rohs M (2010) User-defined

gestures for connecting mobile phones, public displays, and

tabletops. In: Proceedings of the 12th international conference on

human computer interaction with mobile devices and services,

Lisbon, Portugal

19. Lee H, Jeong H, Lee J, Yeom K, Park J (2009) Gesture-based

interface for connection and control of multi-device in a tabletop

display environment. In: Proceedings of the 13th international

conference on human-computer interaction, pp 216–225

20. McNeill D (1992) Hand and mind: what gestures reveal about

thought. University of Chicago Press, Chicago

21. Microsoft Kinect. http://research.microsoft.com/en-us/collaboration/

focus/nui/kinect-windows.aspx

22. Microsoft Kinect for Xbox. http://www.xbox.com/en-GB/KINECT

23. Oleksik G, Milic-Frayling N, Jones R (2012) Beyond data shar-

ing: artifact ecology of a collaborative nanophotonics research

centre. In: Proceedings of CSCW ‘12. ACM, New York, NY,

USA, pp 1165–1174

24. Olsen J, Olsen G (2004) Distance matters. Hum Comput

Interact 15:139–179

25. Peng C, Shen G, Zhang Y, Lu S (2009) Point&Connect: inten-

tion-based device pairing for mobile phone users. ACM/USENIX

MobiSys, Krakow

26. Rekimoto J (1997) Pick-and-drop: a direct manipulation tech-

nique for multiple computer environments. In: Proceedings of

UIST 1997, pp 31–39

27. Siri, Apple Corporation. http://www.apple.com/ios/siri

28. Shaer O, Kol G, Strait M, Fan C, Grevet C, Elfenbein S (2010)

G-nome surfer: a tabletop interface for collaborative exploration

of genomic data. ACM CHI 2010, Atlanta

29. Smalheiser NR, Torvik VI, Zhou W (2009) Arrowsmith two-node

search interface: a tutorial on finding meaningful links between

two disparate sets of articles in MEDLINE. Comput Methods

Programs Biomed 94(2):190–197

30. Streeck J, Kallmeyer W (2004) Interaction by inscription.

J Pragmat 33(4):465–490

31. Streitz NA, Geißler J, Holmer T, Konomi S, Muller-Tomfelde C,

Reischl W, Rexroth P, Seitz P, Steinmetz R (1999) i-LAND: an

interactive landscape for creativity and innovation. In: Proceed-

ings of the SIGCHI conference on human factors in computing

systems, May 15–20, 1999, Pittsburgh, Pennsylvania, United

States, pp 120–127

32. Tabard A, Eastmond E, Mackay WE (2008) From individual to

collaborative: the evolution of prism, a hybrid laboratory note-

book. CSCW’08, November 8–12, 2008, San Diego, California,

USA

33. Yatani K, Tamura K, Hiroki K, Sugimoto M, Hashizume H

(2006) Toss-it: intuitive information transfer techniques for

mobile devices using toss and swing actions. IEICE Trans Inf

Syst 89:150–157

34. Wobbrock JO, Morris MR, Wilson AD (2009) User-defined

gestures for surface computing. In: Proceedings of the 27th

international conference on Human factors in computing systems

CHI 09, pp 1083–1092
