Shreeyak / cleargrasp

Official repository for the paper "ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation"

Home Page: https://arxiv.org/abs/1910.02550


Performance on novel objects

thomastangucb opened this issue

Thank you for sharing this interesting work! I tested the trained model on some novel transparent objects, but the results are not that good. Some transparent bottles are not recognized, so there is no depth reconstruction for them. Other transparent objects are recognized, but the estimated point cloud is not correct.

Do you think this is normal for novel objects, or is there some problem in my implementation? Any suggestions to improve the network's generalization to novel objects? Thanks!

Could you share some examples where the model failed? With only this information, I can just give general suggestions.

Our models were trained on a small variety of objects, only 5 unique shapes in the training dataset. Considering this, the model's ability to generalize is good, but limited at the moment. The only real fix would be to train on a larger dataset. There are other factors that affect performance:

  • Make sure the scene is well lit, but not too bright.
  • Try to reduce clutter. A plain background behind the objects improves performance.
  • Keep the camera approximately 0.8 m away from and above the objects. Placing the camera too far away can lower the quality of the results.
  • The reconstruction quality is highly dependent on the surface normal prediction. The above points should help improve the quality of the output point cloud; the sketch after this list shows one way to sanity-check the depth-to-point-cloud conversion itself.
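
If you want to rule out errors in the conversion from the output depth map to a point cloud (as opposed to errors in the predicted depth itself), you can back-project the depth with your camera intrinsics and compare against what you are seeing. This is just a generic pinhole-camera sketch, not code from this repo; the intrinsics in the usage comment are placeholders you would replace with your camera's values:

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (in meters) into an Nx3 point cloud.

    depth: (H, W) float array, with 0 where depth is invalid/missing.
    fx, fy, cx, cy: pinhole intrinsics of the camera that produced the depth.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Example usage with placeholder intrinsics -- replace with your camera's values:
# depth = np.load('output_depth.npy')
# cloud = depth_to_pointcloud(depth, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
```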

Here is the test sample. The surface normal picture looks very nice, but there seem to be errors in the predicted point cloud: the left box is higher than expected, while the middle and right objects are lower than expected. I have tuned the light direction/intensity but get similar errors. Maybe the objects are too different from your training objects. I am thinking of creating a training dataset for these objects. Could you advise on the steps to create the virtual dataset? It would be nice to have a template file for configuring the Blender rendering. Thank you!

[Screenshots (2020-04-21): predicted surface normals and reconstructed point cloud of the test sample]

These are some of the limitations of our current method. The object on the left has a depth discontinuity (green line) predicted inside the object, telling the model that the top surface of the glass bar is completely disconnected from the rest of the scene. That is why that top surface is floating.

The middle object is missing depth discontinuities on its right edge, telling the model that the right edge is connected to the table, which is why it appears to be lying flat in the reconstructed depth.

I'm not sure what's up with the glass on the right.
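
To make the role of the boundaries concrete: the depth completion solves a global optimization in the spirit of Zhang & Funkhouser's depth completion, roughly of the form below. Treat the exact terms and weights as schematic, not a verbatim copy of the code:

```latex
E   = \lambda_D E_D + \lambda_S E_S + \lambda_N E_N,
E_D = \sum_{p \in \mathcal{T}_{obs}} \lVert D(p) - D_{obs}(p) \rVert^2,
E_S = \sum_{(p,q) \in \mathcal{N}} \lVert D(p) - D(q) \rVert^2,
E_N = \sum_{(p,q) \in \mathcal{N}} B(p)\, \lVert \langle v(p,q),\, N(p) \rangle \rVert^2
```

Here B(p) down-weights the constraints at pixels predicted to lie on occlusion boundaries. A spurious boundary inside an object (the left box) relaxes those constraints, so that surface is only tied to the scene through its normals and can end up floating at the wrong height; a missing boundary (the right edge of the middle object) leaves the constraints at full strength, so the solver pulls the object's depth smoothly into the table.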

Could you advise on the steps to create the virtual dataset?

Unfortunately, the code used to create our datasets is not available. We will be releasing a different version soon for our new project, though. In the meantime, you can try using Nvidia's Dataset Synthesizer or the Mitsuba Renderer.
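
If you want to prototype the Blender side yourself, a very rough starting point for rendering a transparent object with Cycles might look like the sketch below. This is only illustrative (the output path, resolution, sample count, and IOR are placeholder values I chose), not the pipeline we used:

```python
import bpy

# Use the Cycles path tracer, which handles refraction for transparent materials.
scene = bpy.context.scene
scene.render.engine = 'CYCLES'
scene.cycles.samples = 128
scene.render.resolution_x = 640
scene.render.resolution_y = 480

# Create a simple glass material using the Glass BSDF node.
mat = bpy.data.materials.new(name="GlassMaterial")
mat.use_nodes = True
nodes = mat.node_tree.nodes
nodes.clear()
glass = nodes.new(type='ShaderNodeBsdfGlass')
glass.inputs['IOR'].default_value = 1.45  # typical value for glass
output = nodes.new(type='ShaderNodeOutputMaterial')
mat.node_tree.links.new(glass.outputs['BSDF'], output.inputs['Surface'])

# Assign the material to the active object (e.g. an imported CAD model).
obj = bpy.context.active_object
obj.data.materials.clear()
obj.data.materials.append(mat)

# Render an RGB image to a placeholder path.
scene.render.filepath = '/tmp/render_0000.png'
bpy.ops.render.render(write_still=True)
```

Ground-truth depth, normals, and masks would additionally require enabling the corresponding Cycles render passes and compositor outputs, which is where most of the dataset-generation work lies.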

You could, if you have the bandwidth, create a training set of real images of these objects. Using our GUI app and spray-painted copies of the objects, it is possible to collect a fairly accurate set of ground-truth depth images.

Other suggestions: in this image, it's the boundary predictions that are causing the problem. Try to get brighter lighting while avoiding strong directional lights, and use a slightly textured background. If the background is too noisy it will cause noisy output, but some amount of texture results in better predictions. The root problem is that our dataset is limited, and we would definitely benefit from more objects.

Thank you for your feedback! Very useful information. I will try the renderers you mentioned, and I look forward to the new version of your work!