An Efficient Pipeline for Pick-and-Place Between Binsfor Warehouse Automation

Rahul Shome1, Wei N. Tang1, Chaitanya Mitash1, Abdeslam Boularias1, Jingjin Yu1, and Kostas Bekris1

Abstract— Advances in sensor technologies, object detection algorithms, motion planning frameworks and manipulator designs have motivated vast leaps in the application of robots in warehouse automation. A variety of such applications, like order fulfillment or sorting tasks, require pick-and-place actions that transfer objects between bins or containers placed in the robot’s reachable workspace. The challenge in designing an effective solution to this problem lies in appropriately integrating multiple components so as to achieve a robust pipeline that maximizes important metrics for the warehouse automation industry, such as the number of successful picks per hour, while minimizing failure conditions. This process involves careful study of workspace design, perception and picking primitives, as well as motion and task planning.

I. INTRODUCTION & METHOD

There has been significant interest in the deployment of robotic systems in warehouse automation [1], [2], [3]. Typically, robots have been used in large-scale industrial setups to perform repetitive tasks in highly structured environments, as in automobile manufacturing. Recently, there has been a push to expand the applications of robotic arms to the less structured settings that arise in order fulfillment as well as warehouse sorting tasks.

Prior work [1], [4] has investigated the importance of careful design choices in terms of end-effector modalities, perception systems and planning methods appropriate for the problem. The current work focuses on a specific challenge involving two bins in the robot’s reachable workspace on a tabletop (Fig. 1), where the objective is to pick up every object from the source bin and transfer all of them to the target bin, as quickly and robustly as possible.

This report describes the considerations that were made to solve the problem, in terms of workspace design, sensing methodology, grasping scheme, planning algorithms and a fast execution pipeline. The demonstration section recounts executions of the pipeline on real-world scenes. The observations made are not specific to this particular setup, but apply to a wide variety of similar applications involving robotic arms for automation.

A. Hardware

The robot used in the setup is the Kuka IIWA14, a 7-DoF arm. A custom-designed end-effector solution extrudes a cylindrical end-effector that ends in a compliant suction cup, to engage vacuum grasps. A high-power compressor

1 The authors are with the Department of Computer Science, Rutgers University, New Brunswick, New Jersey, USA.

*All authors have made equal contributions to the work.
*This work has been undertaken as part of a project funded by jd.com

Fig. 1. The workspace takes into account reachability with overhand picks.

and valve mechanism is used to generate powerful suction forces at the end-effector. Two RealSense RGB-D cameras are mounted on a frame that points them at the respective bins. This frame is attached to the static base of the robot so as to minimize calibration errors when estimating the positions of the cameras in the robot’s coordinate frame. A portable computing device is connected to both cameras and acts as a low-latency interface that publishes the sensing data to the planning and perception machines. An additional machine runs the Kuka drivers and controllers.

B. Workspace Design

An effective workspace design can significantly impact the efficiency of the overall system. A robotic arm with d DoFs has a configuration space C ⊂ R^d defining the space of all possible configurations q ∈ C of the arm. The operation FK(q) returns an end-effector pose p ∈ SE(3). We define the reachable task space as the set of all end-effector poses:

T = {p ∈ SE(3) : ∃ q ∈ C such that FK(q) = p}.

The setup is designed as shown in Fig. 1. The annular blue region represents the subset of the reachable workspace that allows top-down picks for the robot’s end-effector. Experiments indicate that the radial region between 40 cm and 70 cm from the robot center maximizes success for top-down picks, over the range of heights above the tabletop plane. The bins, represented by red rectangles, are placed so that they lie inside this reachable region. This promotes picking strategies that approach the bins from the top.
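The reachable task space T can be approximated in practice by sampling configurations and checking where FK lands. The sketch below illustrates this, assuming closer radial bounds taken from the experiments above; `forward_kinematics` is a toy planar stand-in, not the IIWA14 kinematic model.

```python
import numpy as np

# Hypothetical sketch: approximate the reachable task space T by sampling
# configurations q and testing whether FK(q) lands in the annular
# top-down-pick region (40 cm to 70 cm from the robot center).

R_INNER, R_OUTER = 0.40, 0.70  # meters, from the experiments in the text

def forward_kinematics(q):
    """Placeholder FK for illustration: a 3-link planar chain.
    A real system would use the IIWA14 kinematic model instead."""
    links = [0.36, 0.42, 0.40]  # assumed link lengths (meters)
    x = sum(l * np.cos(np.sum(q[:i + 1])) for i, l in enumerate(links))
    y = sum(l * np.sin(np.sum(q[:i + 1])) for i, l in enumerate(links))
    return np.array([x, y, 0.0])

def in_pick_annulus(p):
    """True if end-effector position p lies in the top-down pick region."""
    r = np.hypot(p[0], p[1])
    return R_INNER <= r <= R_OUTER

rng = np.random.default_rng(0)
samples = rng.uniform(-np.pi, np.pi, size=(1000, 3))
reachable = [q for q in samples if in_pick_annulus(forward_kinematics(q))]
print(f"{len(reachable)} of {len(samples)} samples give top-down picks")
```

Placing the bins so that their footprints fall inside this annulus is what lets most picks succeed with simple overhand approaches.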

C. Pose Estimation

The RGB-D data captured by the sensor is passed through a convolutional neural network trained to perform object segmentation at the image level. This work has explored problems involving multiple object classes, and multiple instances of the same object, utilizing state-of-the-art solutions, such as


Fig. 2. Top: Perception and picking steps. a) Regularly sample the object mesh and store the pickable subset during preprocessing. b) Mask-RCNN reports several instances of detected objects. c) The detection with the highest mean world Z-coordinate is used for 6D pose estimation and point cloud registration. d) A local search from the highest scoring point on the registered, pickable point cloud segment returns the point with the maximum pickable neighborhood. e) This point is attempted with a top-down suction pick. f) Motions of the arm that affect the pick. Bottom: The complete pipeline in terms of control and data flow (gray lines). Red lines represent planned transitions to arm states. Green lines represent precomputed arm motions. The blocks identify detection (light blue), picking (dark blue) and planning (light red).

FCN [5] and MaskRCNN [6]. Out of all the detected segments that pass a confidence threshold, only one is selected: the one that seems heuristically most promising for overhand picks. The heuristic criterion maximizes the mean global Z-coordinate of all the RGB-D pixels in the segment. Pose estimation [7], [8] is performed over the selected segment, as in Fig. 2. The estimation returns a registered point set in the observed data, corresponding to the retrieved object model.
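The segment-selection heuristic above can be sketched as follows. The detection data structure and the confidence threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hedged sketch of the segment-selection heuristic: among detections that
# pass a confidence threshold, pick the segment whose pixels have the
# highest mean world Z-coordinate (i.e., the object most "on top").

CONF_THRESHOLD = 0.8  # assumed value, not specified in the text

def select_segment(detections):
    """detections: list of dicts with 'confidence' and 'world_points'
    (an Nx3 array of world coordinates for the segment's pixels).
    Returns the most promising segment for an overhand pick, or None."""
    candidates = [d for d in detections if d["confidence"] >= CONF_THRESHOLD]
    if not candidates:
        return None
    return max(candidates, key=lambda d: np.mean(d["world_points"][:, 2]))

# Example: the confident segment that sits highest wins.
dets = [
    {"confidence": 0.95, "world_points": np.array([[0.5, 0.1, 0.10]])},
    {"confidence": 0.90, "world_points": np.array([[0.5, 0.2, 0.25]])},
    {"confidence": 0.30, "world_points": np.array([[0.5, 0.3, 0.90]])},
]
top = select_segment(dets)
```

Preferring the topmost confident segment is a natural fit for suction picks from above, since the highest object is the least likely to be occluded or blocked.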

D. Pick Selection

Using the corresponding set of mesh points, the one with the best pick score is chosen. Scoring is based on the distance to the center of the object mesh. A continuous neighborhood of planar pickable points is required to make proper contact between the suction cup and the object surface. A local search is therefore performed around the point with the best grasp score, to maximize the pickable neighborhood.
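A minimal sketch of this selection step follows, assuming that a point closer to the mesh center scores better and that "pickable neighborhood" means nearby points lying close to the candidate's tangent plane. The radius and planarity tolerance are assumed values, not from the paper.

```python
import numpy as np

# Illustrative pick-point selection: score registered points by distance
# to the object center, then locally search around the best-scoring point
# for the largest planar neighborhood the suction cup can seal against.

NEIGHBOR_RADIUS = 0.02    # 2 cm suction-cup footprint (assumed)
PLANAR_TOLERANCE = 0.003  # max deviation from the local plane (assumed)

def pickable_neighborhood(points, normals, idx):
    """Count neighbors of points[idx] lying near its tangent plane."""
    p, n = points[idx], normals[idx]
    near = np.linalg.norm(points - p, axis=1) < NEIGHBOR_RADIUS
    planar = np.abs((points - p) @ n) < PLANAR_TOLERANCE
    return int(np.sum(near & planar))

def select_pick(points, normals):
    """Best-scored point, refined by a local search over its neighbors."""
    center = points.mean(axis=0)
    best = int(np.argmin(np.linalg.norm(points - center, axis=1)))
    # Local search: among the best point's neighbors, maximize the
    # pickable-neighborhood size.
    near = np.flatnonzero(
        np.linalg.norm(points - points[best], axis=1) < NEIGHBOR_RADIUS)
    return max(near, key=lambda i: pickable_neighborhood(points, normals, i))
```

The local search matters because the geometrically central point may sit on a ridge or edge; a nearby flat patch gives the suction cup a far better seal.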

E. Motion Planning

MoveIt! [9] is used for motion planning. Most of the motions are performed using Cartesian control, which guides the arm through end-effector waypoints. Ensuring that motions occur in reachable parts of the space increases the success of Cartesian control and simplifies motion planning. To decrease planning time, the motion between the bins is precomputed (green lines in Fig. 2) using the RRT* [10] algorithm and simply replayed at the appropriate times.
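The execution strategy can be sketched as below. This is a hypothetical interface, not the MoveIt! API: the key idea is simply that the bin-to-bin transfer is planned once offline and replayed, while short motions step through Cartesian waypoints.

```python
# Hypothetical sketch (not the MoveIt! API) of the execution strategy:
# Cartesian control steps through end-effector waypoints, while the
# bin-to-bin transfer is planned once (the text uses RRT* via MoveIt!)
# and replayed instead of being replanned every cycle.

class MotionInterface:
    def __init__(self, plan_bin_transfer):
        # Precompute the source-to-target transfer once at startup
        # (green lines in Fig. 2); planning cost is paid only here.
        self.bin_transfer_path = plan_bin_transfer()

    def cartesian_move(self, waypoints, execute):
        """Guide the arm through a sequence of end-effector waypoints."""
        for pose in waypoints:
            execute(pose)

    def transfer_between_bins(self, execute):
        """Replay the precomputed path at execution time; no replanning."""
        for q in self.bin_transfer_path:
            execute(q)

# Example usage with a stand-in planner and a recording executor:
arm = MotionInterface(plan_bin_transfer=lambda: ["q1", "q2", "q3"])
executed = []
arm.transfer_between_bins(executed.append)  # replays the cached path
```

Caching the transfer motion removes the single largest planning cost from the per-pick loop, since that motion is identical every cycle.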

F. Task Planning

The pipeline described in Fig. 2 shows the sequence of task planning steps undertaken to perform continuous pick-and-place. The gray circles represent the control junctures between which motion and sensing are parallelized. Instead of waiting for sensing and grasp generation, both are invoked asynchronously at the beginning of the motion to the source bin. Once the motion ends, the grasping point is available to plan to. The pipeline keeps trying to detect new grasps until no further segments can be detected in the scene, which indicates that the source bin is empty.
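The overlap of sensing and motion can be sketched with a worker thread, as below. The three callbacks are stand-ins for the real pipeline stages, and the empty-bin signal is modeled as sensing returning None.

```python
import queue
import threading

# Sketch of the parallelization described above: sensing and grasp
# generation are launched asynchronously when the arm starts moving back
# to the source bin, so a grasp point is ready when the motion finishes.

def pick_loop(sense_and_select_grasp, move_to_source_bin, execute_pick):
    """Repeat pick-and-place until sensing reports the source bin empty."""
    while True:
        result = queue.Queue(maxsize=1)
        # Launch sensing while the arm travels to the source bin.
        worker = threading.Thread(
            target=lambda: result.put(sense_and_select_grasp()))
        worker.start()
        move_to_source_bin()   # overlaps with sensing in wall-clock time
        worker.join()
        grasp = result.get()
        if grasp is None:      # no segments left: the source bin is empty
            return
        execute_pick(grasp)
```

Because sensing and motion run concurrently, the slower of the two bounds the cycle time rather than their sum, which is what drives the picks-per-hour metric up.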

II. DEMONSTRATION

Experiments were performed on a real-world setup with multiple objects, either with single or multiple instances. The pipeline achieved picking rates of up to 250 picks per hour. This substantiates the viability of the pipeline as a practical automation framework. Related videos of experimental runs can be found here:
Single-instance: https://youtu.be/LCoUrl0MdJE
Multi-instance: https://youtu.be/ejsmRCVkKsE

REFERENCES

[1] N. Correll, K. E. Bekris, D. Berenson, O. Brock, A. Causo, K. Hauser, K. Okada, A. Rodriguez, J. M. Romano, and P. R. Wurman, “Analysis and observations from the first Amazon Picking Challenge,” IEEE Transactions on Automation Science and Engineering, 2016.

[2] P. Baker and Z. Halim, “An exploration of warehouse automation implementations: cost, service and flexibility issues,” Supply Chain Management: An International Journal, vol. 12, no. 2, 2007.

[3] R. D’Andrea, “Guest editorial: A revolution in the warehouse: A retrospective on Kiva Systems and the grand challenges ahead,” TASE, vol. 9, no. 4, 2012.

[4] Z. Littlefield, S. Zhu, C. Kourtev, Z. Psarakis, R. Shome, A. Kimmel, A. Dobson, A. Ferreira De Souza, and K. E. Bekris, “Evaluating end-effector modalities for warehouse picking: A vacuum gripper vs a 3-finger underactuated hand,” in CASE, 2016.

[5] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[6] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” in ICCV. IEEE, 2017.

[7] C. Mitash, K. E. Bekris, and A. Boularias, “A self-supervised learning system for object detection using physics simulation and multi-view pose estimation,” in IROS, 2017.

[8] C. Mitash, A. Boularias, and K. E. Bekris, “Robust 6D object pose estimation with stochastic congruent sets,” in British Machine Vision Conference (BMVC), 2018.

[9] S. Chitta, I. Sucan, and S. Cousins, “MoveIt!” IEEE Robotics & Automation Magazine, vol. 19, no. 1, 2012.

[10] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” IJRR, vol. 30, no. 7, June 2011.

