zaiweizhang / H3DNet

Evaluating on custom data/images

ramanpreet9 opened this issue · comments

Hi @zaiweizhang, @GitBoSun,

How can I run the model to detect objects on my custom data/images? The classes can stay the same as in the ScanNet/SUN RGB-D datasets for now. From what I understand from looking at the SUN RGB-D data, for evaluation I need three files:

  1. bbox.npy - contains the 3D bounding boxes of the objects in the scene
  2. pc.npz - contains the point cloud
  3. votes.npz - contains an Nx10 array describing votes (from VoteNet?) that are used for detection
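
For reference, a minimal sketch of loading these three files with numpy, assuming the VoteNet-style SUN RGB-D preprocessing; the scan id, the `pc`/`point_votes` archive keys, and the per-object box layout are my assumptions, not confirmed in this thread:

```python
import numpy as np

scan = "005051"  # hypothetical scan id

# (K, 8) per-object boxes: center (3), size (3), heading angle, class id (assumed layout)
bboxes = np.load(scan + "_bbox.npy")
# (N, 6) point cloud: x, y, z, r, g, b (assumed column order)
point_cloud = np.load(scan + "_pc.npz")["pc"]
# (N, 10) votes: vote mask plus three 3-dim offsets toward the object center
votes = np.load(scan + "_votes.npz")["point_votes"]

print(bboxes.shape, point_cloud.shape, votes.shape)
```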

Let's say I capture an RGB-D image. I can fill in the depth image and get a dense point cloud (along with color).
What do I need to do to run the trained model on this file?
That gives me 2). I assume 1) is only needed for evaluation, not inference.
How do I get 3)?

First of all, for the SUN RGB-D benchmark, the tilt angle is provided with the dataset. We apply it to the point clouds so that every point cloud's axis is aligned with the gravity direction. This tilt angle needs to be calculated with some algorithm and some manual adjustments; see here.
I have not tried training with depth scans that are not aligned with the gravity direction. You can certainly try it; I am also curious about the results.
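
As an illustration only, a minimal sketch of applying such a tilt rotation so the point cloud ends up in the gravity-aligned frame; here `Rtilt` stands for the 3x3 tilt rotation shipped with each SUN RGB-D scan, and the exact axis/transpose convention is an assumption that may differ from the repo's preprocessing:

```python
import numpy as np

def align_to_gravity(points_xyz, Rtilt):
    """Rotate an (N, 3) point cloud into the gravity-aligned (upright) frame.

    points_xyz: (N, 3) x, y, z coordinates in the sensor frame.
    Rtilt:      (3, 3) tilt rotation matrix provided with the SUN RGB-D scan.
    """
    # Apply the rotation to every point; the transpose convention is assumed.
    return np.dot(points_xyz, Rtilt.T)
```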

Now, let's talk about the data. Let's say you have 1) and 2). Our current dataloader takes an Nx9 array describing the object labels. It is organized like this: center (3 dimensions), size (3 dimensions), rotation (1 dimension), instance label (1 dimension), semantic label (1 dimension). To produce these labels, you do need per-point instance labels.
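
A minimal sketch of assembling that Nx9 array, assuming you already have per-point instance ids plus per-instance boxes and semantic classes; the helper and its inputs are hypothetical, and only the column order comes from the description above:

```python
import numpy as np

def build_point_labels(num_points, instance_ids, boxes, semantic_ids):
    """
    num_points   : N
    instance_ids : (N,) instance id per point, 0 for unlabeled points
    boxes        : dict instance_id -> (cx, cy, cz, sx, sy, sz, angle)
    semantic_ids : dict instance_id -> semantic class id
    """
    labels = np.zeros((num_points, 9), dtype=np.float32)
    for i in range(num_points):
        inst = instance_ids[i]
        if inst == 0:  # point does not belong to any annotated object
            continue
        cx, cy, cz, sx, sy, sz, angle = boxes[inst]
        labels[i] = [cx, cy, cz, sx, sy, sz, angle, inst, semantic_ids[inst]]
    return labels
```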

If you only have the object bounding box information, you can use this code to extract the points inside each object bounding box. You need to be careful with overlapping objects, such as a box on a sofa.
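
For illustration, a rough sketch of such a point-in-box test, assuming boxes rotated about the z axis; the helper linked above is the authoritative version, so treat this as an approximation:

```python
import numpy as np

def points_in_box(points_xyz, center, size, angle):
    """points_xyz: (N, 3); center/size: (3,); angle: rotation about the z axis."""
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    local = np.dot(points_xyz - center, rot.T)     # express points in the box frame
    half = np.asarray(size) / 2.0
    mask = np.all(np.abs(local) <= half, axis=1)   # inside test on every axis
    return points_xyz[mask], mask
```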

Thanks,
Zaiwei

Closing this for now. Feel free to reopen it.

Thanks for the info.
May I check whether there are scripts available to convert RGB-D data into the ScanNet or SUN RGB-D format you are using in the model?

For ScanNet, I see two files, e.g.:
'scene0000_00_vert.npy' - 50,000 x 6
'scene0000_00_all_noangle_40cls.npy' - 50,000 x 9

What information is stored here? I believe the _vert file includes the vertices of the points in the scene. Does this represent [X, Y, Z, R, G, B]?
What information is stored in the second file?
How can I generate this format for sample RGB-D data I collect with an RGB-D camera?

The _vert file includes X, Y, Z, R, G, B.

The cls.npy file includes the point-level annotation: bbox center x, bbox center y, bbox center z, bbox size x, bbox size y, bbox size z, bbox rotation angle, point instance label, bbox semantic label.

To generate this information, you will need to manually annotate your RGB-D data. Please refer to this paper for help:
https://arxiv.org/abs/1702.04405
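
Putting the two file descriptions together, a minimal sketch of reading a scene and splitting the columns as listed above; the file names follow the example earlier in the thread, and the slicing just mirrors the stated column order:

```python
import numpy as np

scene = "scene0000_00"  # example scene id from this thread

verts = np.load(scene + "_vert.npy")                # (N, 6): x, y, z, r, g, b
labels = np.load(scene + "_all_noangle_40cls.npy")  # (N, 9) point-level annotation

xyz, rgb = verts[:, 0:3], verts[:, 3:6]
bbox_center = labels[:, 0:3]   # bbox center x, y, z
bbox_size = labels[:, 3:6]     # bbox size x, y, z
bbox_angle = labels[:, 6]      # bbox rotation angle
instance_id = labels[:, 7]     # point instance label
semantic_id = labels[:, 8]     # bbox semantic label
```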

Hi,
Same question here. If I only have the 3D point cloud (x, y, z, r, g, b), can I run inference with your trained model? I assume yes, even though the dataloader takes an Nx9 label array. I guess we can just fill the remaining columns with zeros? Thanks

Yeah, I think you can do that. Make sure to comment out the evaluation code; otherwise it might cause some problems.
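
For anyone following along, a minimal sketch of what that zero-fill could look like for an unlabeled scan; the file names are hypothetical, and the only assumption carried over is the Nx9 label layout discussed above:

```python
import numpy as np

# Hypothetical (N, 6) point cloud from an RGB-D capture: x, y, z, r, g, b.
point_cloud = np.load("my_scan_pointcloud.npy")

# Dummy (N, 9) labels so the unchanged dataloader still gets the shape it expects;
# remember to comment out any evaluation code that reads these columns.
dummy_labels = np.zeros((point_cloud.shape[0], 9), dtype=np.float32)

np.save("my_scan_vert.npy", point_cloud)
np.save("my_scan_all_noangle_40cls.npy", dummy_labels)
```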