Chairs and Mugs: A Dataset for Object-Centric Scene Understanding and Equivariance
Introduction
This dataset provides 3D scenes containing repeated objects from the same category (chairs or mugs) under a variety of scene configurations, ranging from the simplest case, with all objects standing upright on a ground plane, to the most challenging cases, with diverse object poses and complex backgrounds. Such scenarios are common in many highly interactive real-world environments. The dataset encourages the development of scene understanding methods that are:
- object-centric, leveraging category-level information on repeating instances
- robust to scene configuration changes
- generalizable to unseen or even out-of-distribution scene configurations
Dataset details
Synthetic scenes: Our synthetic dataset is simulated with SAPIEN. For the synthetic tabletop scenes, we place 4 synthetic depth cameras at the 4 corners of a table and place the objects in a bin at the center of the table, a common setup for tabletop manipulators. We simulate realistic IR sensor depth patterns with IR ray tracing, and the mesh reconstruction is created by integrating the 4 depth views via TSDF fusion. For the chair scenes, we use 8 static cameras with ideal depth (instead of IR ray tracing) because, unlike tabletop scenes, real-world indoor scenes are usually captured with continuous scans, which result in smoother and better reconstructions.
Real scans: Our real dataset contains 240 reconstructions of real scenes with challenging configurations and backgrounds. More data are collected for scenes with more complex configurations or that are harder to create in simulation environments.
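As a rough illustration of the four-corner camera layout used for the synthetic tabletop scenes, the sketch below computes four camera poses that look at the table center. It uses only NumPy and does not call the SAPIEN API. The table size, camera height, OpenGL-style axis convention (camera looks along its -Z axis), and the names `look_at` and `corner_camera_poses` are all assumptions made for illustration, not values or functions from the dataset.

```python
import numpy as np


def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Return a 4x4 camera-to-world pose whose -Z axis points from eye to target."""
    eye = np.asarray(eye, dtype=float)
    target = np.asarray(target, dtype=float)
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    pose = np.eye(4)
    pose[:3, 0] = right      # camera +X
    pose[:3, 1] = true_up    # camera +Y
    pose[:3, 2] = -forward   # camera +Z (viewing direction is -Z)
    pose[:3, 3] = eye        # camera position in world coordinates
    return pose


def corner_camera_poses(table_size=(1.0, 1.0), height=0.6):
    """Place one camera above each table corner, aimed at the table center.

    table_size and height are hypothetical values in meters.
    """
    half_x, half_y = table_size[0] / 2, table_size[1] / 2
    target = np.zeros(3)  # the bin is assumed to sit at the table center
    corners = [(sx * half_x, sy * half_y, height)
               for sx in (-1, 1) for sy in (-1, 1)]
    return [look_at(corner, target) for corner in corners]


poses = corner_camera_poses()
```

Depth maps rendered from poses like these would then be fused into a single mesh with a TSDF integration step, which is not shown here.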
Scene configuration | Real scans
Mugs Z | 10
Mugs SO3 | 10
Mugs Pile | 10
Mugs Tree | 50
Mugs Others | 50
Mugs Wild | 50
Chairs Z | 20
Chairs SO3 | 20
Chairs Pile | 20
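The per-configuration counts listed above can be tallied with a short sketch. It assumes the numbers are real-scan counts per configuration, which is consistent with their sum matching the 240 reconstructions stated earlier. The dictionary and the helper `totals_by_category` are illustrative names, not part of any released tooling.

```python
# Counts copied from the table above (assumed to be real scans per configuration).
real_scan_counts = {
    "Mugs Z": 10, "Mugs SO3": 10, "Mugs Pile": 10,
    "Mugs Tree": 50, "Mugs Others": 50, "Mugs Wild": 50,
    "Chairs Z": 20, "Chairs SO3": 20, "Chairs Pile": 20,
}


def totals_by_category(counts):
    """Sum counts per object category, taken as the first word of each name."""
    totals = {}
    for name, n in counts.items():
        category = name.split()[0]
        totals[category] = totals.get(category, 0) + n
    return totals


category_totals = totals_by_category(real_scan_counts)
total_scans = sum(real_scan_counts.values())

print(category_totals)  # {'Mugs': 180, 'Chairs': 60}
print(total_scans)      # 240
```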
Publication
EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
PDF | Code
Data Samples
Synthetic scenes
- Mugs Z
- Mugs SO3
- Mugs Pile
- Mugs Tree
- Mugs Box
- Mugs Shelf
Real scans
- Mugs Z
- Mugs SO3
- Mugs Pile
- Mugs Tree
- Mugs Wild
- Mugs Others
- Chairs Z
- Chairs SO3
- Chairs Pile
Citation
If you use the dataset or code, please cite:
@inproceedings{Lei2023EFEM,
  title={EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision},
  author={Lei, Jiahui and Deng, Congyue and Schmeckpeper, Karl and Guibas, Leonidas and Daniilidis, Kostas},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  url={https://cis.upenn.edu/~leijh/projects/efem},
  year={2023}
}