Post on 24-Mar-2022

Spatial References and Perspective in Natural Language Instructions for Collaborative Manipulation

Rosario Scalise, Shen Li, Henny Admoni, Stephanie Rosenthal, Siddhartha S. Srinivasa

● Background, why tabletop is important
● Problem: object uniqueness
  ○ Solution 1: spatial reference
  ○ Solution 2: perspective
● Study 1
  ○ Image generation
  ○ Study design
  ○ Result
    ■ Human vs robot
    ■ Visual search + word frequencies
    ■ Difficulty
● Study 2
  ○ Data coding
  ○ Study design
  ○ Result
    ■ Block ambiguity
    ■ Perspective
● Discussion
  ○ 3 approaches to give instructions
  ○ Block ambiguity and perspective ambiguity
  ○ Neither perspective is the best
  ○ Future work - interactivity

Herb image courtesy of Pittsburgh Post-Gazette

“I am going to pick up the cup on the right!”

Key Issue: Ambiguity

Question by Jessica Lock from the Noun Project

As scene complexity increases, so does the difficulty in specifying an object.

Natural language is inherently ambiguous.

Forms of Ambiguity

Visual Appearance

“Pick up the coffee cup.”

Which one?

Perspective

“Pick up the coffee cup on the right.”

Whose right?

Proximity

“Pick up the coffee cup next to the donuts.”

How close is ‘next to’?

Can you uniquely describe this block?

How can we best overcome ambiguity when grounding our references while keeping communication natural?

Approach

Learn by observing what humans do and extract best-practices from the examples that are most successful.

bender by Jordan Díaz Andrés from the Noun Project

Collect Corpus

Gain Insights

Evaluate Corpus

Extract Guidelines

+ Analysis Tools

Study 1 : Collecting Instructions for Corpus

[Figure labels: person, person]

[Figure label: robot]

1400 instructions total

Evaluating

How do we tell how good any specific instruction is?

“Pick up the blue block”

Given an instruction and the stimulus it corresponds to, can people infer the correct block?

Study 2 : Corpus Evaluation

Metrics

For each instruction, we calculate:

Accuracy: (# of successful block selections) / (total # of times instruction is shown)

Avg. Completion time: How long it takes to select the indicated block on average
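
As a rough illustration of the two metrics above, the following Python sketch computes them from a list of trial records. The record layout (instruction id, whether the correct block was selected, seconds taken) and the function name `instruction_metrics` are assumptions made for this example and are not taken from the study. It also assumes completion time is averaged over all trials, whether or not the selection was correct.

```python
from collections import defaultdict
from statistics import mean


def instruction_metrics(trials):
    """Compute per-instruction accuracy and average completion time.

    `trials` is an iterable of (instruction_id, correct, seconds) tuples.
    This layout is hypothetical; adapt it to the real data format.
    """
    grouped = defaultdict(list)
    for instruction_id, correct, seconds in trials:
        grouped[instruction_id].append((correct, seconds))

    metrics = {}
    for instruction_id, rows in grouped.items():
        shown = len(rows)  # total number of times the instruction was shown
        successes = sum(1 for correct, _ in rows if correct)
        metrics[instruction_id] = {
            "accuracy": successes / shown,
            # Assumption: average over all trials, not only the correct ones.
            "avg_completion_time": mean(seconds for _, seconds in rows),
        }
    return metrics
```

For example, an instruction shown twice with one correct selection gets an accuracy of 0.5, and its average completion time is the mean of the two recorded times.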

Full investigation and results TBR in:

“Spatial References and Perspective in Natural Language Instructions for Collaborative Manipulation”

at IEEE Ro-Man 2016 (Late August)

Perspectives

Partner

Participant (Speaker)

Types of Perspective:

Participant: “Pick up the blue block on my right”

Partner: “Pick up the blue block on your left”

Neither: “Pick up the blue block closest to the orange block.”

Unknown: “Pick up the blue block to the right of the orange block.”
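
The four categories above can be illustrated with a small keyword heuristic. This is only a sketch: the regular expressions and the function name `code_perspective` are hypothetical, the rules are guessed from the example instructions in these slides, and this is not the coding procedure used in the study.

```python
import re

# Hypothetical cue patterns, guessed from the example instructions in the slides.
# In "right next to", "right" is an intensifier, so it is not counted as a direction.
DIRECTION = re.compile(
    r"\b(?:left|leftmost|rightmost|right(?!\s+next\s+to))\b", re.IGNORECASE
)
PARTNER_CUE = re.compile(r"\byour\b", re.IGNORECASE)
PARTICIPANT_CUE = re.compile(r"\bmy\b", re.IGNORECASE)


def code_perspective(instruction: str) -> str:
    """Return 'Partner', 'Participant', 'Neither', or 'Unknown'."""
    # No left/right term: the reference does not depend on a viewpoint.
    if not DIRECTION.search(instruction):
        return "Neither"
    partner = bool(PARTNER_CUE.search(instruction))
    participant = bool(PARTICIPANT_CUE.search(instruction))
    if partner and not participant:
        return "Partner"
    if participant and not partner:
        return "Participant"
    # A direction is used, but it is unclear whose viewpoint it refers to.
    return "Unknown"
```

A heuristic like this only captures the surface pattern. Instructions that mention "you" or "me" without a left/right term, such as "closest to me", fall under "Neither" here, which matches how the slides label them.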

Perspective vs Accuracy and Completion Time

Pick up the box furthest to your left.

Partner perspective

Partner

Participant

Pick up the orange block closest to my right hand side.

Participant perspective

Partner

Participant

Please pick up the orange block that is closest to me.

Partner

Participant

Neither perspective

Pick up the rightmost orange block

Partner

Participant

Whose right?

Unknown perspective

Hypothesis:

The “Neither” perspective is better

Result:

Prefer the “Neither” perspective

Other Factors

Pick the blue block that is closer to you and right next to the yellow block

Partner

Participant

Neither perspective

Pick up the blue block on your far right.

Partner

Participant

Partner perspective

Tradeoff

Robot Partner vs Human Partner

Robot Partner

Human Partner

Pick up the third blue block from your left

Rosario Scalise, Shen Li

rscalise@andrew.cmu.edu, shenli@cmu.edu

Thank You!

Learn More @ Poster Session

Investigated

Visual features

Perspectives

Dataset will be made available soon!
