apple / ml-hypersim

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

How to obtain a 3D point cloud for 3D object detection

Lizhuoling opened this issue

Hi,
Thanks for your great work. I am not familiar with graphics, and I want to obtain the 3D point cloud for the tonemap images in order to do 3D object detection, but I still cannot figure out how to get the points. Although there are many related issues in this repo, I still do not understand what to do. For 3D object detection I need two items: the point cloud in the camera-view coordinate system, and the camera intrinsics in the form
[[f_u, 0, c_u],
[0, f_v, c_v],
[0, 0, 1]].
For the first item, the point cloud: in issue #57 you mention that it can be obtained from the world_space_position images. But where are these images? I cannot find any images named world_space_position in this dataset. Is it normal_bump_world.hdf5? Or maybe I can get the point cloud from depth_meters.hdf5. But as mentioned in issue #9, that requires a transformation based on the camera intrinsics, and I cannot find the camera intrinsics I need.
As mentioned in issue #44, the intrinsics are given in OpenGL form, which is not a form I understand, and I have not figured out how to convert them into the form above. Is there a code demo for this? For example, one that reads the data from metadata_camera_parameters.csv and converts it into camera intrinsics in the form above.

Hi! Thanks for your detailed question and for including references to other issues. Those will be helpful to future readers. There is a small typo in #57, where I implied that the images with world-space positions have world_space_position in the name. This is incorrect. The images you're looking for just have position in the name, e.g., frame.0000.position.hdf5. See the directory layout in our README. You can use the user-contributed download script to selectively download only the necessary files. Working with these images will be easier for your use case because you won't need to worry about per-scene camera intrinsics or projecting our depth images into world space.
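
For readers who need the points in the camera-view coordinate system rather than in world space, here is a minimal sketch of one way to get there from a position image. It rests on several assumptions that are not confirmed in this thread and should be checked against the README: that each HDF5 file stores its array under the key "dataset", that the per-frame camera orientation is a 3x3 rotation matrix mapping camera space to world space, and that the camera position is given in the same units as the position image. The function names load_hdf5_image and world_to_camera_points are made up for illustration and are not part of the Hypersim toolkit.

```python
import numpy as np


def load_hdf5_image(path, key="dataset"):
    """Read one HDF5 image into a NumPy array.

    ASSUMPTION: the array is stored under the key "dataset".
    """
    import h5py  # imported lazily so the rest of the sketch runs without h5py

    with h5py.File(path, "r") as f:
        return f[key][:]


def world_to_camera_points(position_img, cam_position, cam_orientation):
    """Convert an (H, W, 3) world-space position image to an (N, 3) point cloud
    in camera space.

    ASSUMPTIONS: cam_orientation is a 3x3 rotation matrix R that maps camera
    space to world space, and cam_position is the camera centre t in world
    space, so that p_world = R @ p_cam + t.
    """
    points_world = np.asarray(position_img, dtype=np.float64).reshape(-1, 3)

    # Drop pixels without a valid position (NaN or inf in any coordinate).
    valid = np.isfinite(points_world).all(axis=1)
    points_world = points_world[valid]

    R = np.asarray(cam_orientation, dtype=np.float64)
    t = np.asarray(cam_position, dtype=np.float64)

    # Invert p_world = R @ p_cam + t, giving p_cam = R^T @ (p_world - t).
    # With points stored as rows this is (p_world - t) @ R.
    return (points_world - t) @ R
```

A call would look like world_to_camera_points(load_hdf5_image("frame.0000.position.hdf5"), t, R), with t and R taken from the camera trajectory files for the same frame. The sketch does not rescale units, so if the position values are not in meters, a per-scene scale factor would have to be applied before using the points for detection.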