RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield

University of Illinois Urbana-Champaign & NVIDIA

Abstract

We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images from an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process reconstructs the 3D geometry of the scene, from which the Euclidean signed distance field (ESDF) is computed. A model predictive control (MPC) algorithm then drives the manipulator to a desired pose while using the ESDF to avoid obstacles. We show results on a real dataset collected and annotated in our lab.
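To make the pipeline concrete, the following is a minimal sketch, not the paper's implementation: it assumes a voxelized occupancy grid has been extracted from the NeRF-like reconstruction, uses SciPy's Euclidean distance transform as a stand-in for the ESDF computation, and evaluates a hinge-style collision cost of the sort an MPC rollout could penalize. The function names, grid resolution, and safety margin are all illustrative assumptions.

import numpy as np
from scipy.ndimage import distance_transform_edt

def esdf_from_occupancy(occ, voxel_size):
    """Signed distance in meters: positive in free space, negative inside obstacles."""
    dist_free = distance_transform_edt(~occ)  # at free voxels: distance to nearest occupied voxel
    dist_occ = distance_transform_edt(occ)    # at occupied voxels: distance to nearest free voxel
    return (dist_free - dist_occ) * voxel_size

def collision_cost(points_w, esdf, origin, voxel_size, margin=0.05):
    """Hinge penalty on ESDF values at robot collision-sphere centers (hypothetical cost term).

    points_w: (N, 3) world-frame points; origin: world position of voxel (0, 0, 0).
    The cost grows linearly once a point comes within `margin` meters of an obstacle.
    """
    idx = np.round((points_w - origin) / voxel_size).astype(int)
    idx = np.clip(idx, 0, np.array(esdf.shape) - 1)  # nearest-voxel lookup, clamped to the grid
    d = esdf[idx[:, 0], idx[:, 1], idx[:, 2]]
    return np.maximum(margin - d, 0.0).sum()

# Toy usage: a 1 m^3 grid at 1 cm resolution containing a box obstacle.
occ = np.zeros((100, 100, 100), dtype=bool)
occ[40:60, 40:60, 0:30] = True
esdf = esdf_from_occupancy(occ, voxel_size=0.01)
query = np.array([[0.5, 0.5, 0.32]])  # just above the box top
print(collision_cost(query, esdf, origin=np.zeros(3), voxel_size=0.01))  # ~0.02

Using a signed rather than unsigned distance lets the cost keep growing inside obstacles, which gives an optimizer a useful gradient even from infeasible states.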

Video

Results for Scene 1
[Reconstruction comparison: ground truth (gt), ours, COLMAP, iSDF, Voxblox]
Results for Scene 2
[Reconstruction comparison: ground truth (gt), ours, COLMAP, iSDF, Voxblox]
Results for Scene 3
[Reconstruction comparison: ground truth (gt), ours, COLMAP, iSDF, Voxblox]

BibTeX

@inproceedings{tang2023icra:rgbonly,
  author = "Zhenggang Tang and Balakumar Sundaralingam and Jonathan Tremblay and Bowen Wen and Ye Yuan and Stephen Tyree and Charles Loop and Alexander Schwing and Stan Birchfield",
  title = "{RGB}-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control",
  booktitle = "Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)",
  year = 2023
}