In recent years, data-driven approaches have made up a growing share of pose estimation and object localization methods; examples include CullNet and DenseFusion. These methods provide reliable ways to estimate the 6D pose of objects, and many similar approaches exist. With the help of large-scale data, the time needed to learn pose estimation or grasping has been significantly shortened.
II Building Dataset
II-A System Setup
Simulation: We choose PyBullet as our simulator. It provides real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning, and other applications.
II-B Data Generation
In our virtual environment, an empty tray box sits in the middle of a blank plane, with the camera 0.7 meters above the tray box. Our dataset contains 77 different models, all selected from the YCB dataset. We define an empty spawn volume of 0.4 × 0.4 × 0.45 m, positioned 0.05 meters directly above the tray box. For each scene, we randomly select 12 different models and release them from random positions above the box; each object's x, y, and z coordinates are drawn at random within the spawn volume. Figure 2 shows one such scene, in which the 12 objects are sugar-box, g-cup, mug, sponge, a-colored-wood-blocks, c-lego-duplo, g-lego-duplo, scissors, large-marker, fork, h-cups, and tennis-ball.
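The sampling step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' generation script: the tray rim height `TRAY_TOP_Z`, the placeholder model names, the function name `sample_scene`, and the reading that the 0.05 m gap is measured from the tray rim to the bottom face of the spawn volume are all assumptions.

```python
import random

# Dimensions taken from the text; TRAY_TOP_Z is a hypothetical value.
SPAWN_SIZE = (0.4, 0.4, 0.45)  # x, y, z extent of the spawn volume in meters
SPAWN_GAP = 0.05               # gap between the tray and the spawn volume
TRAY_TOP_Z = 0.1               # assumed height of the tray rim (not given in the text)
OBJECTS_PER_SCENE = 12


def sample_scene(model_names, rng, tray_center_xy=(0.0, 0.0)):
    """Pick 12 distinct models and a random spawn position for each one."""
    chosen = rng.sample(model_names, OBJECTS_PER_SCENE)  # sampling without replacement
    cx, cy = tray_center_xy
    z_min = TRAY_TOP_Z + SPAWN_GAP
    scene = []
    for name in chosen:
        x = cx + rng.uniform(-SPAWN_SIZE[0] / 2, SPAWN_SIZE[0] / 2)
        y = cy + rng.uniform(-SPAWN_SIZE[1] / 2, SPAWN_SIZE[1] / 2)
        z = rng.uniform(z_min, z_min + SPAWN_SIZE[2])
        scene.append((name, (x, y, z)))
    return scene


# Placeholder names standing in for the 77 YCB models.
models = [f"model_{i:02d}" for i in range(77)]
scene = sample_scene(models, random.Random(0))
```

Passing an explicit `random.Random` instance keeps each scene reproducible from its seed, which is convenient when a particular scene has to be regenerated later.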
Once gravity is turned on, the objects fall naturally into the tray box. Collisions among the objects and with the tray randomize the final pose of each object, so the resulting stacks closely resemble real-world clutter. Figure 3 shows the scene after the objects have fallen. For each scene, the lighting comes from a point light whose angle changes continually, so the dataset covers a wide range of lighting conditions that can occur in the real world.
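A rough sketch of these two steps follows. The helper names `sample_light_direction` and `settle`, the minimum elevation angle, and the step count are illustrative assumptions; the text does not say how the light angle is varied or how long the simulation runs. The `settle` function assumes a PyBullet client is already connected and the objects are already loaded.

```python
import math
import random


def sample_light_direction(rng, min_elevation_deg=20.0):
    """Return a unit vector pointing from the scene toward a light above the plane."""
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    elevation = math.radians(rng.uniform(min_elevation_deg, 90.0))
    return (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )


def settle(body_ids, steps=480):
    """Enable gravity, step the simulation, and return each body's resting pose.

    Assumes a PyBullet physics client is already connected and the bodies are loaded.
    """
    import pybullet as p  # imported lazily so the helper above works without PyBullet

    p.setGravity(0, 0, -9.81)
    for _ in range(steps):
        p.stepSimulation()  # the default time step is 1/240 s
    # Each value is (position, quaternion), with the quaternion ordered (x, y, z, w).
    return {body: p.getBasePositionAndOrientation(body) for body in body_ids}


light = sample_light_direction(random.Random(1))
```

A sampled direction like this could then be supplied through the `lightDirection` argument of `getCameraImage`, which to our understanding is only honored by PyBullet's software renderer.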
II-C Simulation Result
Thanks to PyBullet's powerful built-in functions, we can easily obtain segmentation, depth, and RGB images of the tray box. Figure 4 shows these three kinds of images and the point cloud we obtain. All images are saved as .png files, and the point clouds are saved as .ply files.
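One detail worth illustrating is that the depth image returned by PyBullet's `getCameraImage` is a normalized, non-linear depth buffer rather than a distance in meters. The sketch below applies the usual conversion for a perspective projection with near and far clipping planes; the function names and the `near`/`far` values are assumptions, since the text does not state the camera's clipping planes.

```python
def depth_buffer_to_meters(depth_buffer_value, near, far):
    """Convert one normalized depth-buffer value in [0, 1] to a metric depth."""
    return far * near / (far - (far - near) * depth_buffer_value)


def convert_depth_image(depth_buffer, near, far):
    """Apply the conversion to a depth image given as a list of rows."""
    return [[depth_buffer_to_meters(v, near, far) for v in row] for row in depth_buffer]


# Hypothetical clipping planes and a tiny 2x2 buffer for illustration.
NEAR, FAR = 0.01, 2.0
metric = convert_depth_image([[0.0, 0.5], [0.9, 1.0]], NEAR, FAR)
```

A buffer value of 0 maps to the near plane and a value of 1 to the far plane, with most of the buffer's resolution concentrated close to the camera.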
Figure 5 showcases part of our simulation results. The 6D poses for each falling case are saved as a .csv file; we describe the orientation with quaternions.
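As an illustration of such a pose file, the sketch below writes one row per object with a position and a quaternion. The column names, the `scene_id` field, and the `write_poses` helper are hypothetical, since the actual layout of the dataset's .csv files is not specified here; the quaternion order (x, y, z, w) follows PyBullet's convention.

```python
import csv
import io

# Hypothetical column layout: position in meters, quaternion ordered (x, y, z, w).
FIELDS = ["scene_id", "object", "x", "y", "z", "qx", "qy", "qz", "qw"]


def write_poses(stream, scene_id, poses):
    """Write one header row, then one row per object with its 6D pose."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    for name, (position, quaternion) in poses.items():
        writer.writerow([scene_id, name, *position, *quaternion])


# Example with a single made-up pose; an in-memory buffer stands in for a file.
buffer = io.StringIO()
write_poses(buffer, 0, {"mug": ((0.01, -0.02, 0.05), (0.0, 0.0, 0.0, 1.0))})
```

Storing the orientation as a unit quaternion uses four numbers per object and avoids the gimbal-lock ambiguity of Euler angles.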
We present a new dataset with point clouds, 6D poses, segmentation, depth, and RGB images, created using PyBullet. The dataset covers 77 YCB models and includes random collisions and lighting variations. It contains 100k groups of data and provides a wide range of parameter variations. In the future, we plan to validate the effectiveness of this dataset using real-world objects. The website for the data generation procedure is available online at cheneating716.github.io
- The YCB object and model set: towards common benchmarks for manipulation research. 2015 International Conference on Advanced Robotics (ICAR), pp. 510–517.
- CullNet: calibrated and pose aware confidence scores for object pose estimation. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 2758–2766.
- DenseFusion: 6D object pose estimation by iterative dense fusion. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3338–3347.