FP7 ICT Contract No. 247772 1 February 2010 – 30 April 2013 Page 1 of 42
Deliverable D4.5.2 – Context-aware Virtual 3D Display
Final Report
Contract number : 247772
Project acronym : SRS-EEU
Project title : Multi-Role Shadow Robotic System for Independent Living –
Enlarged EU
Deliverable number : D4.5.2
Nature : R – Report
Dissemination level : PU – Public
Delivery date : 31-03-13 (month 38)
Author(s) : Michal Spanel, Zdenek Materna, Vit Stancl, Tomas Lokaj,
Pavel Smrz, Pavel Zemcik
Partners contributed : BUT
Contact : [email protected]
The SRS-EEU project was funded by the European
Commission under the 7th Framework Programme (FP7) –
Challenge 7: Independent living, inclusion and
governance
Coordinator: Cardiff University
SRS-EEU
Multi-Role Shadow Robotic
System for Independent Living –
Enlarged EU
Small or Medium-Scale Focused Research
Project (STREP)
DOCUMENT HISTORY
Version Author(s) Date Changes
V1 Michal Spanel 23rd January 2013 First draft version
V2 Michal Spanel, Vit Stancl,
Zdenek Materna March 2013
Document structure updated, description of new parts added
V3 Michal Spanel 30th March 2013 Second draft version
EXECUTIVE SUMMARY
SRS (Multi-Role Shadow Robotic System for Independent Living) focuses on the development and
prototyping of remotely-controlled, semi-autonomous robotic solutions in domestic environments
to support elderly people. SRS solutions are designed to enable a robot to act as a shadow of its
controller. For example, elderly parents can have a robot as a shadow of their children or carers. In
this case, adult children or carers can help them remotely and physically with daily living tasks as if
the children or carers were resident in the house. Remote presence via robotics is the key to
achieving the targeted SRS goal.
The SRS-EEU extension of the SRS project focuses on multi-modal HRI support aimed at usability,
safety and situation awareness for remote users. The task T4.7 (WP4) aims at the development and
integration of a novel Context Aware Virtual 3D Display that will be able to cope with the project
visualisation needs resulting from the existing requirement specifications over the thin links and
possibly unreliable connection. The new work will produce a Virtual Display on the client side based
on predefined 3D maps. The virtual map will be updated online by combining real-time 3D robot
perception, real-time 2D video processing and remote operator perception.
Deliverable D4.5.2 (M38) comprises a full report on the specification and performance of the developed
software components.
TABLE OF CONTENTS
1 Introduction ............................................................................................................................................. 5
2 Virtual 3D Display .................................................................................................................................. 6
2.1 GUI primitives for HRI .......................................................................................................................... 7
2.2 Stereoscopic visualization ............................................................................................................... 14
2.3 Assisted arm navigation .................................................................................................................... 16
2.4 Assisted Grasping ................................................................................................................................ 18
2.5 Space Navigator .................................................................................................................................... 19
2.6 Assisted detection ............................................................................................................................... 19
3 Dynamic environment model and 3D mapping ........................................................................ 22
3.1 Related work ......................................................................................................................................... 22
3.2 Environment model ............................................................................................................................ 23
3.3 Discussion .............................................................................................................................................. 25
4 Virtual 3D Display in RViz ................................................................................................................ 27
4.1 RViz plugins for assisted arm manipulation and grasping .................................................. 27
4.2 3D environment mapping plugins ................................................................................................. 30
4.3 Stereoscopic visualization ............................................................................................................... 32
5 UI_PRO user tests ................................................................................................................................. 34
6 Prerequisites ......................................................................................................................... 35
7 Documentation of Packages ............................................................................................................ 36
7.1 GUI primitives for HRI ....................................................................................................................... 36
7.2 Assisted arm manipulation and trajectory planning ............................................................. 38
7.3 Assisted grasping ................................................................................................................................. 39
7.4 Dynamic environment model ......................................................................................................... 39
References ....................................................................................................................................................... 42
1 INTRODUCTION
Based on the study of SRS project requirements and principles of user interaction with remotely
controlled robots, a detailed proposal of technical solution has been prepared. Key components of
the designed solution are
● a dynamic environment model that fuses different sources of data (point-clouds from the
Kinect RGB-D sensor, results of object detection and environment perception, results of
trajectory prediction, etc.) and builds a more complex model of the environment – a 3D map;
● a proposal of principles for visualizing the environment, putting all the data sources
together, and a proposal of new HRI (Human-Robot Interaction) patterns and corresponding
graphic primitives;
● real-time 3D visualization of the environment, the so-called Virtual 3D Display, that will be
able to cope with the needs of the SRS project;
● a new assisted arm manipulation module that allows the remote operator to plan and
execute the arm trajectory manually;
● and a new assisted grasping module for grasping unknown objects.
The proposed 3D display is an extension to the current SRS concept that increases its usability. A
user benefits from
● an improved field of view by means of an exocentric camera and 3D visualization of the
environment inside RViz;
● stereoscopic visualization of the 3D environment using NVidia technology based on shutter
glasses;
● a single view of the environment fusing different kinds of data (point-clouds, laser scans,
camera images, results of the object detection, etc.);
● visualization of distance indicators, trajectories, etc. using the newly defined HRI graphic
primitives;
● the ability to adjust the position of the robot and the robot arm manually using in-scene
primitives;
● the ability to use the Space Navigator device (a 3D mouse) to move the robot or adjust the
position of the robot’s gripper.
In order to prototype the system, BUT develops the display as a new part of the existing ROS utility
called RViz. The new Virtual 3D Display will be a part of the UI_PRO interface.
2 VIRTUAL 3D DISPLAY
The general result of the development comprises: the concept of a simplified real-time 3D
visualization of the environment; the dynamic environment model that fuses different sources of
data (point-clouds, video stream, etc.) and builds a more complex model (i.e. a 3D map) of the
environment; principles of how to visualize the environment putting all the data together; a
proposal of basic interaction patterns for working with the display; advanced visualization and
interaction techniques based on the NVidia stereoscopic technology and the Space Navigator (i.e. a
3D mouse); and a functional concept of the display integrated into the existing SRS interfaces.
Required features of the display can be summarized as follows:
● nearly exocentric view + ability to adjust the camera,
● single view of the 3D scene (point-clouds, laser scans, etc.),
● visualization of how the robot perceives the environment: simplified avatars of detected
objects and output from the human sensing module (i.e. icons, textured billboards,
bounding boxes, etc.),
● virtual 3D map (voxel-based, basic geometry, etc.) built in runtime,
● visualization of the robot itself (URDF model),
● ability to load predefined pre-recorded 3D map of the environment,
● special markers illustrating real dimensions of objects and distances (e.g. distance indicator
from gripper to next object),
● stereoscopic visualization of the 3D environment based on NVidia 3D technology,
● ability to use Space Navigator as the input device instead of the common mouse,
● visualization of predicted future movement and trajectories if needed,
● and advanced interaction patterns (e.g. assign jobs by clicking on highlighted objects, place
an object model into the scene, etc.).
In order to prototype the system and demonstrate its functionality, BUT has developed the concept
of the display as a new part of the existing ROS/RViz utility, and the display will be a new part of the
UI_PRO interface.
Figure 1: Basic scheme of the existing ROS/SRS packages and newly proposed modules.
Alternative communication schemes not based on the standard ROS communication scheme will be
explored and tested. Where appropriate, these alternative communication schemes will be
integrated into the UI_PRO interfaces to increase their robustness when running over a WiFi network.
2.1 GUI PRIMITIVES FOR HRI
To visualize the environment and interact with detected objects, new GUI primitives have been
defined. These primitives are based on Interactive Markers [Gos11], an existing part of the ROS
system. Primitives can be created either manually or by using predefined services that insert newly
created primitives into the Interactive Marker Server.
Figure 2: Virtual 3D Display based on the developed HRI primitives (left); and detail of distance indicators
(right).
Most of the following primitives are able to show their own context menu (right click in the
interactive mode) and some of them can be grouped together (e.g. via the bounding box parameter
object_name).
BILLBOARD
The billboard is a simple object created as a plane mesh with a texture representing a real-world
object. The billboard faces the camera and can illustrate the movement of the represented object.
Currently, there are four predefined objects available: chair, table, person and milk.
Figure 3: Sketchy visualization of detected objects using the billboard GUI primitive.
Figure 4: The billboard can also illustrate movement of an object, e.g. walking person.
PLANE
The plane is a primitive that visualizes a simple un-textured plane without any interaction allowed.
The plane can be tagged.
Figure 5: Tagging a plane in the 3D scene.
BOUNDING BOX
The bounding box allows interaction with the selected object, such as movement or rotation. All
actions are available and configurable from the menu (right mouse click on the bounding box).
Moreover, the bounding box primitive is able to show the real object dimensions.
Figure 6: The bounding box primitive in two different modes: the object manipulation and visualization of the real dimensions.
UNKNOWN OBJECT
“Unknown objects” may represent obstacles in the environment or manually added objects that
were not detected automatically. The unknown object can be arbitrarily rotated, translated and
scaled.
Figure 7: An obstacle manually inserted into the scene using the unknown object primitive.
REGULAR OBJECT
The regular object represents a detected or real-world object which has its mesh in an object
database. The object can show its bounding box (if specified) and it can be manually rotated,
translated and scaled in the scene. Possible pre-grasp positions can also be shown around the
regular object. The visualization of pre-grasp positions helps the operator move the gripper to a
correct position for grasping.
Figure 8: Visualization of detected objects and possible pre-grasp positions which are stored in the SRS object
database.
The mesh can be specified in a resource file, which can be any type of surface mesh supported by
RViz – an .stl model, Ogre’s .mesh version 1.0, or the COLLADA .dae format version 1.1. The
resource file must be specified using a URI-form syntax; see the ROS package resource_retriever for
details, including the package:// specification.
PREDICTED ROBOT POSITION
For visualization of the robot's predicted movement, a special primitive has been prepared. The
predicted robot positions after 1, 2 and 3 seconds are visualized using interactive markers.
Figure 9: Visualization of the predicted robot position in RViz.
VELOCITY LIMITED MARKER
In many real-world situations the robot may be prevented from moving or rotating the platform in
some directions because the platform or the arm is very close to moving or static obstacles. In
these situations it is very frustrating if the remote operator cannot easily tell in which directions
movement is allowed and in which the robot cannot be moved.
Figure 10: Velocity limited marker shown when the robot cannot move in a particular direction (left) and rotate
in place (right).
To help the remote operator decide how to manually drive the robot while avoiding obstacles, we
have prepared another HRI primitive – the velocity limited marker. When the robot is close to an
obstacle, it automatically reduces its maximum velocity, down to zero, in that particular direction to
avoid the collision. This is the standard behaviour of the robot’s low-level motion interface. The
velocity limited marker shows special markers around the robot in the 3D scene indicating in which
directions the velocity of the robot is limited (see Figure 10). This helps the remote operator to
quickly identify the problematic obstacle and decide how to drive the robot.
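The velocity limiting behaviour described above can be sketched as follows. The function names, the linear ramp and all thresholds are illustrative assumptions, not the actual implementation of the robot's low-level motion interface:

```python
def limited_velocity(max_velocity, obstacle_distance, slowdown_distance, stop_distance):
    """Scale the maximum velocity down to zero as the robot approaches an obstacle.

    slowdown_distance is where limiting starts, stop_distance is where the
    velocity reaches zero; both are hypothetical parameters."""
    if obstacle_distance >= slowdown_distance:
        return max_velocity                      # far away: no limit
    if obstacle_distance <= stop_distance:
        return 0.0                               # too close: movement blocked
    # linear ramp between stop_distance and slowdown_distance
    scale = (obstacle_distance - stop_distance) / (slowdown_distance - stop_distance)
    return max_velocity * scale

def directions_to_mark(max_velocity, distances_by_direction, slowdown, stop):
    """Return the directions in which a velocity limited marker should be shown."""
    return [d for d, dist in distances_by_direction.items()
            if limited_velocity(max_velocity, dist, slowdown, stop) < max_velocity]
```

A marker is thus drawn exactly for those directions in which the low-level interface would not let the robot move at full speed.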
IN-SCENE TELEOP
In order to provide an intuitive way to drive the Care-O-bot directly from the Virtual 3D Display, we
have developed a special in-scene marker, COB Interactive Teleop, that is based on ROS Interactive
Markers.
Figure 11: Driving the robot using the in-scene teleop. Driving forward and backward is done using the red arrows.
Strafing to the left and to the right is done using the green arrows. Rotating is done using the blue circle. The robot
can also be driven to a specified position by moving the yellow disc.
The in-scene teleop allows the operator to move the robot and rotate the robot in place; the
operator can also grab the yellow disc in the middle (Figure 11) and the robot will automatically
start to follow the disc, trying to reach its position.
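The follow-the-disc behaviour can be illustrated with a simple proportional controller that turns the disc position into a velocity command. The gains, limits and the function itself are hypothetical; the actual teleop relies on the robot's navigation components:

```python
import math

def follow_disc(robot_x, robot_y, robot_yaw, disc_x, disc_y,
                gain_lin=0.5, gain_ang=1.0, max_lin=0.3, max_ang=0.5):
    """Compute a (linear, angular) velocity command moving the robot towards
    the position of the yellow disc. All constants are illustrative."""
    dx, dy = disc_x - robot_x, disc_y - robot_y
    distance = math.hypot(dx, dy)
    # angle between the robot heading and the direction to the disc,
    # normalised to (-pi, pi]
    heading_error = math.atan2(dy, dx) - robot_yaw
    heading_error = math.atan2(math.sin(heading_error), math.cos(heading_error))
    lin = max(-max_lin, min(max_lin, gain_lin * distance))
    ang = max(-max_ang, min(max_ang, gain_ang * heading_error))
    return lin, ang
```

Calling the controller repeatedly as the robot moves makes it converge on the disc and stop there, which matches the behaviour visible in Figure 11.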
FOV AND LIVE KINECT DATA VISUALIZATION
An important question was how to combine the “historical” data stored in the 3D map of the
environment with the live RGB-D data coming from the Kinect device (i.e. a colored point cloud). It
is obviously important to show the remote operator the latest data and not obstruct the view with
artifacts stored in the 3D environment map – the previous recordings. Moreover, the resolution
of the 3D map is lower than the resolution of the live Kinect data. Even though our 3D mapping
module is able to filter out outdated data, it is preferable to present the data in the maximum
available quality.
Figure 12: Visualization of the current FOV (yellow lines) and combination of live Kinect data rendering inside
the FOV and historical data from the 3D voxel-based map outside of the field of view.
The final solution, tested during the UI_PRO user tests in February, uses the information about the
position of the robot and its torso to cut out the part of the 3D map inside the current field of view
and show the live Kinect data there. The maximum distance from the camera up to which points are
filtered can be limited, because the effective range of the Kinect sensor is limited too.
To make the difference between the live and the historical data clear, the current field of view of the
Kinect sensor is visualized using two thin lines so that it does not obstruct the view.
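The cut-out logic can be sketched as follows, with the Kinect field of view simplified to a cone. The half-angle, the range and the function names are illustrative assumptions, not the actual implementation:

```python
import math

def inside_fov(point, cam_pos, cam_forward,
               half_angle=math.radians(28.5), max_range=4.0):
    """True if a map point falls inside the (simplified, conical) Kinect field
    of view and within its effective range. cam_forward must be a unit vector;
    half_angle and max_range are illustrative values."""
    v = [p - c for p, c in zip(point, cam_pos)]
    dist = math.sqrt(sum(x * x for x in v))
    if dist == 0.0 or dist > max_range:
        return False
    # cosine of the angle between the view direction and the point direction
    cos_a = sum(a * b for a, b in zip(v, cam_forward)) / dist
    return cos_a >= math.cos(half_angle)

def filter_map(points, cam_pos, cam_forward):
    """Keep only the map points outside the current field of view; the cut-out
    region is rendered from the live Kinect point cloud instead."""
    return [p for p in points if not inside_fov(p, cam_pos, cam_forward)]
```

Points inside the cone are dropped from the rendered 3D map, so the higher-resolution live data is shown there without being obscured by older recordings.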
2.2 STEREOSCOPIC VISUALIZATION
Stereoscopic visualization greatly improves the user experience and the feeling of "being in" the
scene. It also simplifies common tasks that depend on the operator's orientation in space – for
example grasping, navigating the robot in the room, positioning the robot arm, judging the mutual
position and orientation of scene objects, avoiding obstacles, and making better and faster use of
helper geometry without the need for additional scene views – and, of course, it gives a much better
perception of distances. All these tasks are greatly simplified by the added third dimension. Without
stereo visualization, the operator often needs to change the view direction to see the scene from
more angles.
Visualization of the robot is already solved in ROS by the RViz program, which allows us to
represent all the required elements such as point clouds and user-defined geometry. An easy way to
provide stereo visualization is to adapt this program to take advantage of the available solutions.
The scene in RViz is rendered using the Ogre library, which, however, in the version used (1.7.3)
is not ready for stereoscopic display on the Linux operating system. Thus it was
necessary to modify the Ogre library as well as RViz itself.
There are several commercial solutions for stereo display in computer graphics. We use the NVidia
3D Vision technology to achieve the stereoscopic effect in RViz. NVidia 3D Vision is a stereoscopic
kit from NVidia consisting of LC shutter glasses and driver software. The glasses use a wireless (IR)
protocol to communicate with an emitter connected to a USB port; this emitter provides the timing
signal. The stereo driver software performs the stereoscopic conversion by taking the 3D models
submitted by the application and rendering two separate views from two slightly shifted points. A
fast stereo LCD monitor (120 Hz) shows these two images alternately, and the shutter glasses,
controlled by the emitter, present the image intended for the left eye while blocking the right eye's
view, and vice versa.
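The two shifted rendering points can be illustrated by computing the left and right eye positions from the camera pose. The eye separation value and the helper function are illustrative, not taken from the actual driver:

```python
def stereo_eyes(cam_pos, forward, up, separation=0.064):
    """Compute the left and right eye positions for rendering the two
    stereoscopic views. forward and up must be unit vectors; 0.064 m is a
    typical interocular distance, used here only as an illustrative default."""
    # right vector = forward x up (cross product)
    right = (forward[1] * up[2] - forward[2] * up[1],
             forward[2] * up[0] - forward[0] * up[2],
             forward[0] * up[1] - forward[1] * up[0])
    half = separation / 2.0
    left_eye = tuple(c - half * r for c, r in zip(cam_pos, right))
    right_eye = tuple(c + half * r for c, r in zip(cam_pos, right))
    return left_eye, right_eye
```

The scene is then rendered once from each eye position and the two images are presented alternately to the shutter glasses.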
Figure 13: The stereoscopic visualization principle.
2.3 ASSISTED ARM NAVIGATION
As an alternative to fully autonomous manipulation operations, a new semi-autonomous
solution has been designed and developed by BUT as a part of the UI_PRO interface. Assisted arm
navigation can be used in cases when automated planning of the arm trajectory fails or is not
applicable.
This alternative solution is based on a set of packages and offers a complete pipeline for
manipulation tasks consisting of object detection (see Chapter 2.6), arm trajectory planning and
grasping (Chapters 2.3 and 2.4). The arm trajectory planning is based on the functionality of the
arm_navigation stack (a standard ROS stack) and BUT’s environment model. The voxel-based 3D
map of the environment is used to provide collision-free arm planning.
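A minimal sketch of testing an arm-path segment against the voxel map is shown below. The real planner queries the full octree through the arm_navigation collision environment; the sampling approach, resolution and function names here are simplifications for illustration only:

```python
import math

def to_voxel(point, resolution=0.05):
    """Map a 3D point to its voxel index at the given map resolution (5 cm here)."""
    return tuple(int(math.floor(c / resolution)) for c in point)

def segment_collides(start, goal, occupied_voxels, resolution=0.05, steps=100):
    """Crude collision test: sample points along the straight segment from
    start to goal and check each against the set of occupied voxels."""
    for i in range(steps + 1):
        t = i / steps
        p = tuple(s + t * (g - s) for s, g in zip(start, goal))
        if to_voxel(p, resolution) in occupied_voxels:
            return True
    return False
```

A trajectory is accepted only if none of its segments passes through an occupied voxel of the environment map.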
When needed, the human operator can set a goal position of the arm end effector in the 3D virtual
environment. The goal position can be set using interactive markers or, more intuitively, by a 3D
positioning device – the Space Navigator. While adjusting the virtual end effector position, the real
manipulator does not move. The interface indicates whether the desired position is reachable by
the arm and whether it collides with the environment model or object representations. A collision-free
trajectory from the start position to the goal one is planned automatically. Before executing the
planned motion on the robot, the operator can run its visualization several times and decide
whether the motion plan is safe.
Figure 14: The arm goal position visualization and the planned trajectory animation.
The assisted arm navigation has been prepared with generality in mind, so it can be used on any
robot supporting the arm_navigation stack. The solution is divided into several ROS packages to
separate the API definition (messages, services and actions), the backend, the GUI (RViz plugins)
and the SRS integration. The integration into the SRS ecosystem has been developed in the form of
SMACH generic states (one generic state for each scenario). These generic states utilize the assisted
arm navigation API and provide integration with other components (object DB, grasping, etc.).
There are basically three scenarios in which the assisted arm navigation can be used.
● The first is the case when there is a known object but the robot is not able to plan the arm
trajectory fully autonomously for some reason (a too complex environment, for instance). In
this case, the operator is asked to select one of the pre-computed pre-grasp positions
(provided by srs_grasping), simulate the movement, execute it and give control back to the
decision making module.
● The second scenario is the situation when the robot cannot finish some manipulation task
and the operator might be asked to move the arm to a safe position.
● The third scenario is grasping an object which is not stored in the object database or
cannot be automatically detected, for instance because of poor lighting or occlusion. In this
case, the remote operator is first asked to manually specify a rough bounding box of the
object, then to navigate the arm, grasp the object (using the assisted grasping module) and
finally navigate the arm to put the object on the robot’s tray.
Figure 15: Example of collision checking against the environment. The goal position cannot be
reached because it is in collision.
2.4 ASSISTED GRASPING
Assisted grasping has been developed to allow safe and robust grasping of unknown or
unrecognized objects by the SDH gripper equipped with tactile sensors. It has a separate API
definition (i.e. an actionlib interface), code and a GUI in the form of an RViz plugin.
When calling the grasp action, it is possible to specify a target configuration (angles) for all joints
of the gripper, the grasp duration and the maximum forces for all tactile pads. Then, for each joint,
velocities and acceleration and deceleration ramps are automatically calculated in such a way that
all the joints reach the target configuration at the same time.
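The synchronization of the joints can be sketched as follows, assuming a symmetric trapezoidal velocity profile. The ramp fraction and the function itself are illustrative, not the actual SDH controller code:

```python
def joint_ramps(current, target, duration, ramp_fraction=0.2):
    """For each gripper joint, compute the cruise velocity and the ramp
    acceleration of a symmetric trapezoidal profile so that every joint
    reaches its target angle at the same time. ramp_fraction (the share of
    the duration spent accelerating) is an illustrative choice."""
    ramp_t = ramp_fraction * duration
    cruise_t = duration - 2.0 * ramp_t
    profiles = []
    for c, t in zip(current, target):
        delta = t - c
        # distance travelled equals the area under the trapezoid:
        # delta = v * (cruise_t + ramp_t)
        v = delta / (cruise_t + ramp_t)
        a = v / ramp_t       # acceleration during the ramps
        profiles.append((v, a))
    return profiles
```

Because every joint shares the same timing and only its velocity scales with the distance to travel, all joints close on the object simultaneously.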
If the measured force on a pad exceeds the requested maximum force, the movement of the
corresponding joint is stopped. With different target configurations and appropriate forces, a wide
range of objects can be grasped – squared, rounded, thin, etc. The assisted grasping package also offers a node
for preprocessing of the tactile data by using median and Gaussian filtering with configurable
parameters.
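The preprocessing can be illustrated with minimal 1D median and Gaussian filters; the window sizes, border handling and function names are illustrative choices, not the parameters of the actual node:

```python
import math

def median_filter(samples, window=3):
    """Sliding-window median over a 1D tactile signal (removes spikes)."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        w = samples[max(0, i - half):i + half + 1]
        out.append(sorted(w)[len(w) // 2])
    return out

def gaussian_kernel(size=5, sigma=1.0):
    half = size // 2
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    s = sum(k)
    return [v / s for v in k]   # normalise so the kernel sums to 1

def gaussian_filter(samples, size=5, sigma=1.0):
    """Gaussian smoothing of the (already median-filtered) signal."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for j, kv in enumerate(kernel):
            idx = min(max(i + j - half, 0), len(samples) - 1)  # clamp at borders
            acc += kv * samples[idx]
        out.append(acc)
    return out
```

The median stage removes isolated spikes from the tactile pads, and the Gaussian stage then smooths the remaining measurement noise.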
2.5 SPACE NAVIGATOR
Space Navigator (SN) is a 6 DOF positioning device. It has been integrated into numerous
professional interfaces to make specific 3D manipulation tasks more intuitive and faster. In the
assisted arm navigation scenarios, it is used to set the goal position and the orientation of the
virtual end effector. Control input from the SN is recomputed to make the changes in the position
of the end effector view-aligned. All the changes are made in the user's perspective (i.e. the viewing
camera coordinate system). This helps to lower the mental load of the operator and leads to a very
intuitive way of interaction.
Figure 16: Space Navigator - an alternative UI_PRO input device.
Moreover, the sensitivity of the SN control is non-linear, which allows the operator to make very
precise changes and at the same time to move the end effector over relatively long distances.
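A possible shape of such a non-linear mapping is a cubic response with a dead zone; all constants and the function name below are illustrative assumptions, not the mapping actually used:

```python
def sn_scale(raw, dead_zone=0.05, exponent=3, max_step=0.01):
    """Map a normalised Space Navigator axis value (-1..1) to an end-effector
    displacement step. The cubic response keeps small deflections very precise
    while full deflection still produces fast, large motions."""
    if abs(raw) < dead_zone:
        return 0.0                       # ignore sensor noise around zero
    sign = 1.0 if raw > 0 else -1.0
    return sign * (abs(raw) ** exponent) * max_step
```

With this shape, half deflection produces far less than half of the maximum step, which is what makes precise positioning and long reaches possible with the same device.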
In addition, a package for driving the robot base using the SN device has been developed. It also
considers the position of the observing camera in the 3D virtual environment, so the control of the
robot is also view-aligned. The remote operator may, in certain situations, decide to switch off the
collision avoidance using an SN button. The second button may be used to switch control to a
"robot perspective" mode.
2.6 ASSISTED DETECTION
In the assisted arm navigation scenario, when there is an unknown object, it is necessary to obtain
its bounding box, which is then considered when planning the arm trajectory. For this reason, a
solution based on the BB estimator (see D4.6.2) has been developed. There is an actionlib interface
to call for the user action. An image stream from the robot's camera is shown to the remote operator, who is
asked to select a region of interest in the image. Then the approximate position and size of the 3D
bounding box are estimated and the result is placed in the 3D scene. The operator can fine-tune this
rough estimate using interactive markers or, if the estimate is not good enough, he can select a new
region of interest in the image.
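The rough estimation can be sketched with a pinhole back-projection of the ROI corners at a representative depth. The intrinsics are typical Kinect values and the whole function is a simplified illustration of the BB estimator described in D4.6.2, not its actual implementation:

```python
def estimate_bbox(roi, depth_at, fx=525.0, fy=525.0, cx=319.5, cy=239.5,
                  depth_margin=0.2):
    """Rough 3D bounding box from a 2D ROI: take a representative depth inside
    the ROI, back-project the ROI corners through a pinhole camera model and
    add a margin along the view axis. roi is (u0, v0, u1, v1) in pixels;
    depth_at(u, v) returns the depth in metres."""
    u0, v0, u1, v1 = roi
    # representative depth: median of the ROI corner and centre samples
    samples = sorted(depth_at(u, v) for u, v in
                     [(u0, v0), (u1, v0), (u0, v1), (u1, v1),
                      ((u0 + u1) // 2, (v0 + v1) // 2)])
    z = samples[len(samples) // 2]
    x0, x1 = (u0 - cx) * z / fx, (u1 - cx) * z / fx
    y0, y1 = (v0 - cy) * z / fy, (v1 - cy) * z / fy
    return (min(x0, x1), min(y0, y1), z - depth_margin), \
           (max(x0, x1), max(y0, y1), z + depth_margin)
```

The returned min/max corners are then placed in the 3D scene as the initial bounding box, which the operator refines with interactive markers.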
Figure 17: Selection of ROI containing the target object in an image stream.
Figure 18: Adjusting the rough bounding box estimated from the chosen ROI in the 3D virtual scene.
3 DYNAMIC ENVIRONMENT MODEL AND 3D MAPPING
In order to plan the robot motion, more than just sensor scans of the visible environment is
required. It must be possible to process incoming information, categorize it and save it for later use.
It is clear that the remembered (and currently invisible) scene will change over time. This
information is needed for planning a longer route from the current position to an already visited
place. Another specific case is collision-free arm planning, which requires a very detailed model, or
map, of the 3D environment.
3.1 RELATED WORK
There are several approaches to the problem of mapping the environment in which the robot
moves. Beyond registering and integrating incoming scans (whether from the Kinect, a laser sensor
or any other device) into the existing structure, some approaches attempt not only to accumulate
incoming data but also to recognize and describe what the robot sees, where it is and what this
means for future motion planning. This amounts to building a semantic model of the environment –
assigning importance and meaning to the incoming data and deriving new information from it.
This subject (the assignment of meaning to detected geometric objects) is investigated in a large
number of works with different approaches. Some articles aim at recognition of position or locality
(whether a place was already visited [Cum08], room detection by placed signs [Tap05], or an
attempt to interpret the room purpose from the objects found [Vas07]), others at categorization of
space (in relation to the types of objects found at a given point [Tor03], or classification using
AdaBoost [Moz07]).
Figure 19: 3D environment mapping using the OctoMap library. Image adopted from [Wur10].
Our system belongs to the category of low-level mapping. It takes care of processing data coming
from the input sensors and placing them in the current model. Gradually, this creates a data set
suitable for further work. The advantage of this approach is the ability to process any piece of data
independently of the real-time input. It is thus possible to gradually replace parts of the map with
detected entities and create a higher-level semantic model.
3.2 ENVIRONMENT MODEL
The environment model serves as an encapsulation of data received from sensors, automatically
detected objects and geometric primitives, and entities marked by the user (i.e. the operator) – all
in one place. The environment model provides services for data mining (for example, it can give all
obstacles near the robot), information needed for more sophisticated robot or arm navigation and
orientation in 3D space, and it also allows data compression for transmission – instead of large
point-cloud data, predefined “object shortcuts” can be sent for recognized objects.
The current version of the environment model is built upon the OctoMap library [Her11], which
implements octree space partitioning and a voxel occupancy system. OctoMap models the
environment as a grid of cubic volumes of equal size. An octree structure is used to hierarchically
organize this grid: each node in the octree represents the space contained in a cubic volume, and
this volume is recursively subdivided into eight subvolumes until a preset minimum voxel size is
reached. The OctoMap library uses probabilistic estimation of volume occupancy to cope with
problems associated with input sensor noise.
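The probabilistic occupancy estimation can be illustrated with the standard log-odds update scheme. The constants below are the commonly cited OctoMap defaults, used here only for illustration:

```python
import math

def log_odds(p):
    """Convert a probability to its log-odds representation."""
    return math.log(p / (1.0 - p))

L_OCC = log_odds(0.7)    # sensor model: voxel was hit by a measurement
L_FREE = log_odds(0.4)   # sensor model: a ray passed through the voxel
L_MIN, L_MAX = log_odds(0.12), log_odds(0.97)  # clamping bounds

def update_voxel(l, hit):
    """One probabilistic occupancy update of a voxel: add the log-odds of the
    sensor model and clamp, so the estimate stays revisable under noise."""
    l += L_OCC if hit else L_FREE
    return min(max(l, L_MIN), L_MAX)

def probability(l):
    """Convert a log-odds value back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))
```

Repeated noisy measurements thus accumulate evidence gradually, while the clamping keeps the map adaptable when the scene changes.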
Figure 20: BUT environment model showcase.
The heart of the whole system, and its most complex part, is the octomap module. It stores the
incoming data in the octomap structure described above and provides some additional
functionality. It can load and save the whole data set to a local file. The incoming cloud is segmented
into ground-plane and non-ground parts, speckles can be removed, and outdated data and noise are
removed by our modified ray-cast
algorithm. As a complement, manual octomap modifications can be made – a box-like part of the
map can be marked as free or occupied.
Figure 21: The environment model manual edit.
The environment model is divided into so-called plugins, which makes it easy to extend its
structure. There is a whole family of plugins available:
● Simple point cloud plugin. This plugin can operate in two modes: as an input plugin it
transforms the incoming cloud into an internally used representation, and in the output
mode it scans through the octomap data (at the preset level of detail) and publishes a map
of the environment as a point cloud on the corresponding topic.
● Limited point cloud plugin. The mentioned limitation lies in the octomap scanning phase.
The plugin subscribes to the RViz camera position topic and publishes only the part of the
map visible from the operator's perspective. This can be of great utility when some external
device is used to control the robot and only the part of the map visible on screen is needed.
● Compressed point cloud plugin. Works like the plugin described before but the robot
internal camera position is used as a view position so the plugin publishes only differences
made to the internal octomap. The plugin can be used in cooperation with the compressed
point cloud publisher node. This node collects published partial clouds send over network
to the remote operator’s PC and combines them into the same octomap model as the
environment model has. This heavily reduces transferred data amount because only a
fraction of that whole 3D map is usually sent over the network.
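The differential-update idea behind the compressed point cloud plugin can be sketched as follows (a simplified Python model; the names and the dictionary-based map are illustrative stand-ins, not the actual SRS interfaces):

```python
# Sender transmits only voxels that changed since the last snapshot;
# the receiver merges them into its own copy of the map.
# Maps are modelled as {voxel_key: occupancy} dictionaries.

def diff_update(previous, current):
    """Return (added, removed) voxel entries between two map snapshots."""
    added = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = [k for k in previous if k not in current]
    return added, removed

def apply_update(local_map, added, removed):
    local_map.update(added)
    for k in removed:
        local_map.pop(k, None)

server = {(0, 0, 0): 1.0, (1, 0, 0): 1.0}
client = dict(server)                       # receiver starts in sync

server[(2, 0, 0)] = 1.0                     # a new scan adds one voxel
del server[(1, 0, 0)]                       # and clears another
added, removed = diff_update(client, server)
apply_update(client, added, removed)
assert client == server                     # only 2 voxels crossed the "network"
assert len(added) + len(removed) == 2
```

Only the two changed entries need to be transmitted, instead of the whole map of (typically) tens of thousands of voxels.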
Figure 22: Example of the published collision grid used for collision-free arm trajectory planning.
Other output data formats are covered by the following publishing plugins:
● Collision map plugin,
● Collision grid plugin,
● Collision object plugin,
● 2D map plugin.
Each of these plugins publishes messages of the appropriate data type, and each can scan the
octomap tree to a different depth.
An important property of the environment model is the possibility of saving the currently known
and scanned surroundings to a file and the ability to restore and further update the whole
previously recorded model. Navigation and path-finding algorithms can obtain and use the whole
data set at different resolutions. The octomap and the collision map can be locked and modified
manually, for example to remove an unimportant part that confuses the arm trajectory planning
algorithm. In general, this concept allows several data inputs to be connected and combined in
one collecting channel, and the acquired information to be easily used for higher-level
planning, visualization and robot motion control.
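Reading the map at a preset detail level, as the publishing plugins do, can be sketched as follows (a simplified Python stand-in for the octree; the resolution and depth values are illustrative):

```python
# At the maximum tree depth a cube edge equals the map resolution;
# each level up the edge doubles, so a shallower scan yields a coarser map.

def node_key(point, depth, max_depth, resolution):
    """Index of the cube containing `point` at the given tree depth."""
    edge = resolution * 2 ** (max_depth - depth)
    return tuple(int(c // edge) for c in point)

def downsample(occupied_points, depth, max_depth, resolution):
    """Collapse fine occupied points into the coarser cubes of `depth`."""
    return {node_key(p, depth, max_depth, resolution) for p in occupied_points}

points = [(0.01, 0.01, 0.01), (0.03, 0.01, 0.01), (0.30, 0.0, 0.0)]
fine = downsample(points, depth=16, max_depth=16, resolution=0.025)
coarse = downsample(points, depth=14, max_depth=16, resolution=0.025)
assert len(fine) == 3      # 2.5 cm voxels keep all three points apart
assert len(coarse) == 2    # 10 cm voxels merge the two nearby points
```

A path-finding consumer that only needs a rough map can thus request a shallower scan and receive far fewer cells.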
3.3 DISCUSSION
The presented environment model is the computationally most demanding part of the whole system.
Therefore, the possibility to configure individual processing steps is important: it allows
scaling and load balancing depending on the requirements for map accuracy and speed of response.
Moreover, it is not necessary (and in fact not even desirable) to process each incoming frame.
The model is intended rather for the construction, maintenance and use of the stable parts of
the scene around the robot. With all standard functions turned on (input in the form of a point
cloud, filtering of solitary cells in the octomap, rapid noise removal from the visible part of
the map, and output again in the form of a point cloud and a collision map), approximately 3
frames per second are processed with a load of one processor core in the range of about 50–70%.
The most demanding operations are writing to the hierarchical octomap structure, data filtering,
and registration of the point cloud against the existing part of the map.
4 VIRTUAL 3D DISPLAY IN RVIZ
The prototype of the Virtual 3D Display (an extension of the basic UI_PRO functionality),
introducing the visualization principles and interaction patterns related to the Dynamic
Environment Model, is developed in ROS/RViz [Her11]. Several new loadable plugins are provided
that extend the functionality of the standard RViz module:
● a simplified, user-friendly interface for manual arm trajectory planning and assisted
grasping,
● controls to adjust the behaviour (enable/disable features) of the Dynamic Environment Model,
● extension displays able to visualize data (point clouds, etc.) according to our specific needs,
● visualization of the described HRI primitives using Interactive Markers,
● and stereoscopic 3D environment visualization by means of NVidia 3D Vision technology.
4.1 RVIZ PLUGINS FOR ASSISTED ARM MANIPULATION AND GRASPING
The user interface for assisted arm navigation consists of a virtual manipulator representation
in the 3D scene and of an RViz plugin. The controls in the plugin are disabled by default. When
there is a task (the action interface was called), the operator is notified by a pop-up window
and the appropriate controls become active. A simple GUI allows the operator to start setting a
goal position, plan a trajectory to the desired position, execute it, and stop the execution in
case of emergency.
The operator may decide to plan more than one trajectory for one task. When the task is
finished, the operator clicks the “Task completed” button. If the operator finds the assisted
detection not precise enough during an attempt to perform a manipulation task, it is also
possible to repeat the detection process.
Several additional controls have been developed to help the operator fulfil tasks. When the
environment is very complex, or the 3D model is not precise because of gaps or noise, the
operator can enable the “Avoid fingers collisions” functionality. The virtual gripper is then
extended by a slightly bigger cylinder, and this cylinder is taken into account when planning
the trajectory.
Figure 23: Artificial cylinder to prevent finger collisions against the environment.
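The padding idea can be sketched as a simple geometric test (illustrative Python; the cylinder dimensions are made up, and the real check runs inside the collision-aware planner):

```python
import math

# Surround the gripper with a slightly bigger cylinder and test environment
# points against it, so small gaps or noise in the 3D model do not lead to
# finger collisions. Geometry and dimensions are illustrative.

def in_cylinder(point, base, axis_len, radius):
    """Point-in-upright-cylinder test; cylinder starts at `base`, axis +z."""
    dx, dy, dz = (point[i] - base[i] for i in range(3))
    return 0.0 <= dz <= axis_len and math.hypot(dx, dy) <= radius

def pose_collides(env_points, gripper_base, finger_len=0.12, padding=0.03):
    radius = 0.05 + padding            # finger envelope plus safety margin
    return any(in_cylinder(p, gripper_base, finger_len, radius)
               for p in env_points)

scene = [(0.45, 0.02, 0.05), (0.90, 0.0, 0.0)]     # one nearby obstacle
assert pose_collides(scene, gripper_base=(0.40, 0.0, 0.0))
assert not pose_collides(scene, gripper_base=(0.60, 0.0, 0.0))
```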
To speed up the process, the operator may choose one of the predefined goal positions. The
predefined positions are of two types: absolute (denoted “A”) and relative (“R”). An absolute
position is defined in the robot coordinate frame and can help the operator move the virtual
gripper faster, for instance to the robot’s tray. A relative position is defined relative to the
current virtual gripper pose and can be used, for instance, when it is necessary to lift an
object. There is also an “undo” functionality which provides a configurable number of back
steps. If the operator wants to keep the current orientation (or position) of the virtual
gripper, it is possible to lock it by pressing the right (or left) button of the Space
Navigator. These locks are indicated by checkboxes in the “SpaceNavigator locks” section.
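The interplay of absolute goals, relative goals and the bounded undo history can be sketched as follows (a simplified Python model; poses are reduced to positions and the class is illustrative, not the actual implementation):

```python
# An absolute ("A") goal replaces the virtual gripper pose, a relative ("R")
# goal offsets the current one, and every change is pushed on a bounded
# history stack that backs the "undo" feature.

class VirtualGripper:
    def __init__(self, pose, undo_depth=5):       # configurable back steps
        self.pose, self.history, self.depth = pose, [], undo_depth

    def _remember(self):
        self.history = (self.history + [self.pose])[-self.depth:]

    def goto(self, goal, kind):
        self._remember()
        if kind == "A":                            # robot-frame absolute goal
            self.pose = goal
        else:                                      # "R": relative to current
            self.pose = tuple(c + d for c, d in zip(self.pose, goal))

    def undo(self):
        if self.history:
            self.pose = self.history.pop()

g = VirtualGripper((0.5, 0.0, 1.0))
g.goto((0.25, -0.5, 0.5), "A")                     # e.g. "to the tray"
g.goto((0.0, 0.0, 0.25), "R")                      # e.g. "lift the object"
assert g.pose == (0.25, -0.5, 0.75)
g.undo()
assert g.pose == (0.25, -0.5, 0.5)
```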
Figure 24: Assisted arm navigation GUI in different states.
When grasping, the operator can select an object category and then press the “Grasp” button.
The categories are predefined in a configuration file; each category has a name, a target
configuration of the SDH joints, and the desired maximum forces for all fingers. In the UI, the
operator can see the names and the corresponding maximum forces. After the grasp has been
executed, the operator can decide whether it was successful using the tactile data
visualization.
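The category configuration and the success check can be sketched as follows (illustrative Python; the category names, joint values and force limits are made up, not taken from the real configuration file):

```python
# Each category carries a name, a target SDH joint configuration and
# per-finger maximum forces; the tactile readings after execution tell
# whether the grasp succeeded. All values below are illustrative.

CATEGORIES = {
    "small_object": {"joints": [0.0, -0.8, 0.9, -0.8, 0.9, -0.8, 0.9],
                     "max_forces": [8.0, 8.0, 8.0]},
    "bottle":       {"joints": [0.0, -0.5, 0.6, -0.5, 0.6, -0.5, 0.6],
                     "max_forces": [15.0, 15.0, 15.0]},
}

def grasp_succeeded(category, measured_forces, min_contact=0.5):
    """All fingers must touch the object but stay below the force limit."""
    limits = CATEGORIES[category]["max_forces"]
    return all(min_contact <= f <= lim
               for f, lim in zip(measured_forces, limits))

assert grasp_succeeded("bottle", [4.0, 5.5, 3.2])            # firm contact
assert not grasp_succeeded("bottle", [4.0, 0.0, 3.2])        # one finger missed
assert not grasp_succeeded("small_object", [9.5, 4.0, 4.0])  # limit exceeded
```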
Figure 25: Assisted grasping UI in different states.
4.2 3D ENVIRONMENT MAPPING PLUGINS
The Octomap Control Panel (OCP) combines the essential features for interactive control of the
environment model. For any direct manipulation of the data, an interactive marker object must be
inserted into the scene; the first set of buttons serves this purpose. The “Add selection box”
button inserts the element at the position where it was inserted previously (or at the origin of
the map coordinate system if the box has never been inserted before). The “Hide selection box”
button deletes the interactive element from the scene.
Figure 26: Octomap control panel.
After the interactive marker has been inserted, positioned and scaled, the octomap or the
collision map can be modified directly. In order to make the changes permanent, the relevant
part of the dynamic model should be locked so that new data from the sensors are not added and
the modifications are not overwritten. To do this, the “Pause mapping server” and “Lock
collision map” check boxes should be used. The user can then insert data into the octomap (the
“Insert points at selected position” button), delete the octomap area included in the box
(“Clear point-in-box”), insert an obstacle into the collision map (“Insert obstacle at selected
position”), or delete a part of the collision map (“Clear collision map in box”). The remaining
button in the “3D environment map” group can be used to delete the entire octomap.
Figure 27: Near and far clipping planes set to the default distances.
Figure 28: Near clipping plane slightly moved to hide interfering geometry.
The Camera Control Panel controls one aspect of the Ogre camera in the RViz visualization
window: the near and far clipping planes. In some cases, the displayed scene geometry interferes
with the view of the robot, and it is not possible to rotate the scene so that a direct view is
achieved; as a result, for example, the operator cannot see an object that the robot should
grasp. In such a situation, the clipping planes can be moved so that the obstacles disappear.
Each slider controls the position of one plane. The total visible distance can be set by the
value of the spin box, where 100 means 100 percent of the default value used in RViz.
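The mapping from the spin box percentage and sliders to clip distances can be sketched as follows (illustrative Python; the default distances and the exact mapping are assumptions, not the actual RViz values):

```python
# 100 keeps the assumed defaults; smaller values pull the far plane in,
# and the near slider pushes interfering geometry out of view.

DEFAULT_NEAR, DEFAULT_FAR = 0.01, 1000.0   # assumed base values

def clip_planes(percent, near_slider=0.0):
    """`near_slider` in [0, 1] moves the near plane towards the far plane."""
    far = DEFAULT_FAR * percent / 100.0
    near = DEFAULT_NEAR + near_slider * (far - DEFAULT_NEAR)
    return near, far

near, far = clip_planes(100)               # defaults untouched
assert (near, far) == (0.01, 1000.0)
near, far = clip_planes(50, near_slider=0.5)
assert far == 500.0 and near > 200.0       # geometry near the camera hidden
```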
4.3 STEREOSCOPIC VISUALIZATION
As mentioned above, the stereoscopic display uses the hardware kit developed by NVidia: a PNY
NVIDIA Quadro 4000 graphics card and an Asus VG278H 27" LCD. When properly set up, the OpenGL
scene is displayed alternately for the left and right eye while the shutter glasses cover the
opposite eye. The Ogre library supports this effect, but the library version used in RViz (ROS
Electric) supports it only in a Windows environment. An unfinished patch
(http://www.ogre3d.org/forums/viewtopic.php?f=4&t=70179) was used as the base for our
adjustments. It combines several things:
● It adds new configuration parameters for Ogre which are evaluated and passed to the
OpenGL layer.
● It modifies the graphics initialization function.
● It adds new camera parameters which can be changed to achieve a better 3D effect.
● It adds a method, hooked into the rendering loop, that adjusts the position of the camera
according to the set positions of the right and left eye.
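The per-eye camera placement performed in the rendering loop can be sketched as follows (a simplified parallel-axis model in Python; the eye separation and focal length values are illustrative):

```python
# The camera is shifted half the eye separation to each side, and both
# eyes share a common focal point, giving the left/right images for the
# shutter glasses. Camera looks along +z; eyes are offset along x.

def eye_positions(camera_pos, eye_separation):
    x, y, z = camera_pos
    half = eye_separation / 2.0
    return (x - half, y, z), (x + half, y, z)

def focal_point(camera_pos, focal_length):
    x, y, z = camera_pos
    return (x, y, z + focal_length)        # both eyes converge here

left, right = eye_positions((0.0, 1.5, 0.0), eye_separation=0.06)
assert left == (-0.03, 1.5, 0.0) and right == (0.03, 1.5, 0.0)
assert focal_point((0.0, 1.5, 0.0), focal_length=2.0) == (0.0, 1.5, 2.0)
```

Exposing the eye separation and focal length as user parameters, as the patched RViz does, lets the operator tune the strength of the 3D effect.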
In addition to patching Ogre (finally solved using patches applied automatically when compiling
the stack), it was necessary to modify RViz itself. A parameter was added that can be used to
turn the stereo mode on or off when the program starts. The application initialization method
was changed according to the modified library. Furthermore, support for user-set parameters
(eye positions, focal length) was added.
The resulting stereoscopic effect indeed brings the expected improvement in the perception of
the surrounding space and makes it easier to control the robot. It significantly reduces the
need to constantly rotate the scene and look from different angles in order for the operator to
get a complete idea of the layout of the room and the location of the objects to be manipulated.
5 UI_PRO USER TESTS
To validate the usefulness of the developed solutions, extensive user tests have been performed.
The user tests consist of three experiments. The first two experiments took place in February
2013, where over 50 carefully selected participants were asked to complete the prepared tasks
within two weeks of intensive testing. The user tests were organized by HDM and prepared mainly
in cooperation between the HDM, BUT and IPA partners.
Experiment one was aimed at exposing the advantages and disadvantages of different 3D
environment models. Users tried to solve navigation tasks in a home-like environment, unsolvable
for the existing autonomous system, using either a voxel-based 3D map (BUT’s Environment Model)
or a geometric map (IPA’s plane detection). During the navigation tests, users controlled the
robot using the Space Navigator device.
The second experiment was designed to discover the potential advantages of the stereoscopic user
interface (better depth perception, higher precision, etc.) for navigation and manipulation
tasks. In this case, the navigation tasks were similar to those in experiment one. The
manipulation tasks were based on the assisted arm navigation approach, where users were asked to
perform pick-and-place tasks in a cluttered scene.
The third planned experiment is aimed at the validity of user studies carried out with robot
simulations: users will try to solve exactly the same tasks as they did previously in reality.
This experiment will show whether reliable results can be obtained using simulation, and will
identify in which areas the results are sound and which aspects of the simulation need to be
improved to provide better results. Mainly for this experiment, but also for testing purposes,
the whole testing site has been modelled with very high accuracy in the Gazebo simulator. The
simulated environment is based on CAD models, uses realistic textures, and models all objects
present in the real environment.
A new SRS package (srs_user_tests) has been developed to support all these tests. It contains
all the necessary configuration files and makes launching a particular test under specific
conditions easy with just two commands: one for the robot and one for the operator’s PC. These
commands also start logging of the specified data for later analysis.
Due to the limited amount of time to evaluate the huge amount of log data obtained during the
user tests, the results are not reported in this deliverable. We expect to report the results of
this joint work in another deliverable and to publish papers in journals and at scientific
conferences.
6 PREREQUISITES
The core prerequisites for the software to be used are:
● Linux OS (developed on Ubuntu 11.10),
● Robot Operating System (ROS) (developed on Electric version),
● Care-O-bot stacks installed in ROS,
● Stacks for COB simulation in robot simulator Gazebo,
● srs_public stack.
MANUAL ARM MANIPULATION AND PLANNING
● core prerequisites
● additional stacks: cob_manipulation, arm_navigation, warehousewg, joystick_drivers
DYNAMIC ENVIRONMENT MODULE
● Octomap Library (http://www.ros.org/wiki/octomap_mapping)
● Ogre library - stacks
VIRTUAL 3D DISPLAY
● RViz utility and Interactive Markers installed in ROS [Gos11,Jon11].
The software components are the property of Brno University of Technology. Most of the
components are available under the LGPL open source license, or a license for academic/research
purposes can be granted to any prospective user.
7 DOCUMENTATION OF PACKAGES
The proposed software components are mostly realized as ROS nodes/services in the C++ or Python
programming languages. This section briefly describes the interfaces of the newly created ROS
nodes and their integration with the shadow robotic system that is developed within the SRS
project.
7.1 GUI PRIMITIVES FOR HRI
All GUI primitives are implemented as Interactive Markers [Gos11]. All necessary ROS services,
our own interactive marker server called but_gui_service_server, and the relevant C++ source
files can be found in the packages srs_interaction_primitives, srs_ui_but,
cob_interactive_teleop and cob_velocity_filter. Predefined ROS services can be used to add,
modify or remove GUI primitives.
USAGE:
First you have to run the server:
roslaunch srs_interaction_primitives interaction_primitives.launch
LIST OF AVAILABLE SERVICES:
● /interaction_primitives/add_bounding_box
● /interaction_primitives/add_billboard
● /interaction_primitives/add_plane
● /interaction_primitives/add_plane_polygon
● /interaction_primitives/add_object
● /interaction_primitives/add_unknown_object
● /interaction_primitives/remove_primitive
● /interaction_primitives/change_pose
● /interaction_primitives/change_scale
● /interaction_primitives/change_color
● /interaction_primitives/change_description
● /interaction_primitives/get_update_topic
● /interaction_primitives/change_direction
● /interaction_primitives/change_velocity
● /interaction_primitives/set_pregrasp_position
● /interaction_primitives/remove_pregrasp_position
● /interaction_primitives/set_allow_object_interaction
● /interaction_primitives/clickable_positions
● /interaction_primitives/robot_pose_prediction
Moreover, each primitive publishes special update topics about its internal changes of position,
size or menu interactions.
LIST OF UPDATE TOPICS:
● /interaction_primitives/primitive_name/update/pose_changed
● /interaction_primitives/primitive_name/update/scale_changed
● /interaction_primitives/primitive_name/update/menu_clicked
● /interaction_primitives/primitive_name/update/movement_changed (for Billboard only)
● /interaction_primitives/primitive_name/update/tag_changed (for Plane only)
Example of calling the service add_bounding_box from bash:
rosservice call /interaction_primitives/add_bounding_box '{frame_id: /base_link, name:
bbox, object_name: obj, description: "", pose: { position: { x: -1, y: 0, z: 0 }, orientation: { x: 0,
y: 0, z: 0, w: 1 } }, scale: { x: 1, y: 1, z: 1 }, color: { r: 1, g: 0, b: 0 }}'
PARAMETERS:
● frame_id (string) - fixed frame,
● name (string) - bounding box name,
● object_name (string) - attached object name [optional],
● description (string) - bounding box description [optional],
● pose (geometry_msgs::Pose) - position and orientation of the bounding box,
● scale (geometry_msgs::Vector3) - scale of the BB,
● color (std_msgs::ColorRGBA) - color of the BB.
Example of calling the change_pose service:
rosservice call /interaction_primitives/change_pose '{name: plane, pose: { position: { x: -1,
y: 0, z: 0 }, orientation: { x: 0, y: 0, z: 0, w: 1 } }}'
PARAMETERS:
● name (string) - object name,
● pose (geometry_msgs::Pose) - new pose.
Description of all remaining services can be found on ROS wiki pages.
7.2 ASSISTED ARM MANIPULATION AND TRAJECTORY PLANNING
The assisted arm navigation functionality is divided into several packages. The
srs_assisted_arm_navigation package contains the main node and the configuration for
collision-free arm planning for the Care-O-bot. In addition, srs_assisted_arm_navigation_msgs
contains the API definition and srs_assisted_arm_navigation_ui the user interface
implementation.
USAGE:
roslaunch srs_arm_navigation but_arm_nav_sim.launch
roslaunch srs_assisted_arm_navigation_ui rviz.launch
The main launch file for testing is but_arm_nav_sim.launch. It launches the Care-O-bot
simulation, 2D navigation and everything needed for planning and sensing, namely:
● Mongo database (for storing trajectory related data),
● robot self-filter for point-cloud data,
● environment server (detection of collisions),
● planning scene validity server,
● interpolated IK motion planner,
● RVIZ (display of all needed markers),
● constraint aware kinematics,
● OMPL planning,
● trajectory filter server,
● and Assisted arm navigation node.
After the second launch file is started, the user will see the RViz display as well as the
assisted arm navigation user interface.
The launch files and configuration files for the arm_navigation stack were automatically
generated by the Planning Description Configuration Wizard utility and are COB-specific (the
features and constraints of the COB arm are considered). They were modified to use point-cloud
data from the RGB-D camera (Kinect) to detect collisions of the arm with the environment.
The user can manipulate the arm gripper in the 3D environment via interactive markers or the
SpaceNavigator 6 DOF mouse. The feasibility of the current position (the existence of an IK
solution) is indicated by the colour of the marker representing the arm and by a text message.
When the new position is ready, the user can trigger trajectory planning, filtering and
execution.
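The feasibility check behind the colour indication can be sketched on a planar two-link arm (illustrative Python; the real COB arm uses a full constraint-aware 7-DOF IK solver):

```python
import math

# A goal pose is feasible only if an inverse kinematics solution exists.
# For a planar two-link arm with link lengths l1, l2 this reduces to a
# distance test: reachable points form an annulus around the base.

def ik_feasible(goal, l1, l2):
    d = math.hypot(goal[0], goal[1])
    return abs(l1 - l2) <= d <= l1 + l2

def marker_colour(goal, l1=0.4, l2=0.4):
    return "green" if ik_feasible(goal, l1, l2) else "red"

assert marker_colour((0.5, 0.3)) == "green"    # inside the workspace
assert marker_colour((1.0, 0.2)) == "red"      # too far: no IK solution
```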
There is an actionlib interface which can be used to give a task to the user, and a set of
services, some of which are used for internal communication between the background node and the
user interface.
Complete documentation of parameters, topics and services can be found on ROS wiki:
http://www.ros.org/wiki/srs_assisted_arm_navigation.
7.3 ASSISTED GRASPING
Assisted grasping is divided into three packages. The srs_assisted_grasping package contains the
main node, the tactile data filter and the velocity interface simulation;
srs_assisted_grasping_msgs contains the API definition; and srs_assisted_grasping_ui the user
interface implementation. The main node realizes grasping and commands the SDH gripper in
velocity mode. Because the COB simulation lacks this velocity mode (only position mode is
available), there is also an implementation of a simulated velocity mode which translates the
joint velocities to positions.
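The translation from velocities to position targets can be sketched as follows (illustrative Python; the control period, joint limits and values are made up):

```python
# The real SDH accepts joint velocities; the simulation only positions,
# so commanded velocities are integrated over the control period into
# position targets, clipped to the joint limits.

def velocity_to_positions(current, velocities, dt, limits):
    """One control step: integrate velocities and clip to joint limits."""
    target = []
    for q, v, (lo, hi) in zip(current, velocities, limits):
        target.append(min(hi, max(lo, q + v * dt)))
    return target

limits = [(-1.5, 1.5)] * 3
q = [0.0, 0.5, 1.25]
q = velocity_to_positions(q, [0.25, -0.5, 0.5], dt=0.5, limits=limits)
assert q == [0.125, 0.25, 1.5]       # third joint reaches its limit
q = velocity_to_positions(q, [0.0, 0.0, 0.5], dt=0.5, limits=limits)
assert q[2] == 1.5                   # and stays clipped there
```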
USAGE:
roslaunch srs_assisted_grasping assisted_grasping_test_sim.launch
roslaunch srs_assisted_grasping_ui rviz.launch
where the first command starts the simulation, the grasping node and the tactile filter node,
and the latter starts RViz with the grasping plugin. Then it is necessary to call the actionlib
interface. For testing purposes, this can be done with the following script:
rosrun srs_assisted_grasping grasping-as-test.py
Complete documentation of parameters, topics and services can be found on ROS wiki:
http://www.ros.org/wiki/srs_assisted_grasping.
7.4 DYNAMIC ENVIRONMENT MODEL
The Dynamic Environment Model is realized as an OctoMap server called but_server that provides
the 3D mapping functionality and some additional ROS services. The server processes point-cloud
data from the RGB-D Kinect sensor and gradually builds a voxel-based map of the environment. The
map is published using the common sensor_msgs::PointCloud2 message.
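The way voxel centres end up in such a message can be sketched as follows (illustrative Python packing three float32 fields per point; real PointCloud2 messages carry additional header and field metadata):

```python
import struct

# Three float32 fields (x, y, z) packed point after point into a flat
# byte buffer, the common minimal PointCloud2-style layout.

POINT_STEP = 12                       # 3 fields * 4 bytes (float32)

def pack_centers(centers):
    return b"".join(struct.pack("<fff", *c) for c in centers)

def unpack_centers(data):
    n = len(data) // POINT_STEP
    return [struct.unpack_from("<fff", data, i * POINT_STEP) for i in range(n)]

centers = [(0.5, 0.25, 1.0), (-0.5, 0.0, 2.0)]
buf = pack_centers(centers)
assert len(buf) == 2 * POINT_STEP
assert unpack_centers(buf) == centers
```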
USAGE:
In order to run the environment server, type:
roslaunch srs_env_model only_but_server.launch
or you can run a complete example including COB simulation in Gazebo:
roslaunch srs_env_model only_but_dynmodel_test1.launch
Topics published by but_server_node:
● /but_env_model/binary_octomap
● /but_env_model/pointcloud_centers
● /but_env_model/collision_object
● /but_env_model/marker_array_object
● /but_env_model/map2d_object
● /but_env_model/collision_map
● /but_env_model/visualization_marker
● /but_env_model/visible_pointcloud_centers
LIST OF AVAILABLE SERVICES:
All services are published in the /but_env_model/ namespace.
● server_reset
● server_pause
● server_use_input_color
● get_collision_map
● insert_planes
● reset_octomap
● load_octomap
● load_octomap_full
● save_octomap
● save_octomap_full
● add_cube_to_octomap
● remove_cube_from_octomap
● lock_collision_map
● is_new_collision_map
● add_cube_to_collision_map
● remove_cube_from_collision_map
● set_crawl_depth
● get_tree_depth
OBJECT TREE PLUGIN SERVICES
The services provided by the Object tree plugin can be divided into two groups. The first group
is common to all saved objects; the services in the second group have variants for each
supported object type. All services are published in the /but_env_model/ namespace.
● get_objects_in_box
● get_objects_in_halfspace
● get_objects_in_sphere
● remove_object
● show_object
● show_objtree
The following services are available for these object types: plane, aligned box and bounding
box. All services are published in the /but_env_model/ namespace.
● get_{object_type}
● get_similar_{object_type}
● insert_{object_type}
● insert_{object_type}_by_position
MESSAGES
● srs_env_model/OctomapUpdates
REFERENCES
[Gos11] Gossow, D. and Ferguson, M.: Interactive Markers, ROS Wiki page, stack visualization,
available on the web: http://www.ros.org/wiki/interactive_markers, 2011.
[Jon11] Jones, E. G.: Warehouse Viewer, ROS tutorial, stack arm_navigation, available on the
web: http://www.ros.org/wiki/arm_navigation/Tutorials/tools/Warehouse%20Viewer, 2011.
[Her11] Hershberger, D. and Faust, J.: RViz, ROS Wiki page, stack visualization, available on
the web: http://www.ros.org/wiki/rviz, 2011.
[Wur10] Wurm, K. M., Hornung, A., Bennewitz, M., Stachniss, C. and Burgard, W.: OctoMap: A
Probabilistic, Flexible, and Compact 3D Map Representation for Robotic Systems, Proc. of the ICRA
2010 Workshop on Best Practice in 3D Perception and Modeling for Mobile Manipulation,
Anchorage, AK, USA, 2010. http://octomap.sf.net/
[Tap05] Tapus, A. and Siegwart, R.: Incremental Robot Mapping with Fingerprints of Places,
Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA 2005),
Edmonton, Alberta, Canada, 2005.
[Cum08] Cummins, M. and Newman, P. M.: FAB-MAP: Probabilistic localization and mapping in
the space of appearance, The International Journal of Robotics Research (IJRR), 27(6):647–665,
2008.
[Vas07] Vasudevan, S., Gächter, S., Nguyen, V. and Siegwart, R.: Cognitive maps for mobile
robots - an object based approach, Robotics and Autonomous Systems (RAS), 55(5):359–371, May
2007.
[Tor03] Torralba, A., Murphy, K. P., Freeman, W. T., and Rubin, M. A.: Context-based vision
system for place and object recognition, In Proceedings of the 9th IEEE International Conference on
Computer Vision (ICCV’03), 2003.
[Moz07] Mozos, O. M., Triebel, R., Jensfelt, P., Rottmann, A., and Burgard, W.: Supervised
semantic labeling of places using information extracted from sensor data, Robotics and Autonomous
Systems (RAS), 55(5):391–402, 2007.