Yang7879 / 3D-RecGAN-extended

🔥3D-RecGAN++ in Tensorflow (TPAMI 2018)

Home Page: https://arxiv.org/abs/1802.00411

Volumetric conversion

PranjalBiswas opened this issue

How do you convert a raw depth image into the volumetric representation that acts as input to the network? Do you use "binvox" to voxelize both the single-view point cloud and the 3D CAD models? Thanks in advance.

@PranjaLBiswas27

  1. There are two steps to convert a raw depth image to a volumetric grid: (1) depth --> point cloud; (2) find the min/max along the xyz axes of the point cloud and then voxelize it. A sketch of step (2) is given after the depth-to-point-cloud script below.

  2. We use a virtual camera to take raw depth images, and then obtain the single view volumetric grids using the above processing steps.

  3. "binvox" can be used to voxelize the 3D CAD models to get the full ground truth, but it may not deal well with non-watertight CAD models. Therefore, we use our own ray-tracing voxelization algorithm to generate the dataset.

----- Here's the script to convert a depth image to a point cloud.
import numpy as np

def single_depth_2_pc(in_depth_path):
    """Back-project a single depth image (in meters) into a 3D point cloud."""
    depth = np.load(in_depth_path)
    h = depth.shape[0]
    w = depth.shape[1]

    # Focal lengths derived from the virtual camera's field of view.
    fov = 49.124/2  # half field of view in degrees
    fx = w/(2.0*np.tan(fov/180.0*np.pi))
    fy = h/(2.0*np.tan(fov/180.0*np.pi))
    # Intrinsic matrix (kept for reference; the loop below uses fx/fy directly).
    k = np.array([[fx, 0, w/2],
                  [0, fy, h/2],
                  [0, 0, 1]], dtype=np.float32)

    xyz_pc = []
    for hi in range(h):
        for wi in range(w):
            # Discard invalid pixels: zero depth or depth beyond 5 meters.
            if depth[hi, wi] > 5 or depth[hi, wi] == 0.0:
                depth[hi, wi] = 0.0
                continue
            # Back-project the pixel to camera coordinates (meters).
            x = -(wi - w/2)*depth[hi, wi]/fx
            y = -(hi - h/2)*depth[hi, wi]/fy
            z = depth[hi, wi]
            xyz_pc.append([x, y, z])

    print("pc num:", len(xyz_pc))
    xyz_pc = np.asarray(xyz_pc, dtype=np.float16)
    return xyz_pc
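
For step (2), here is a minimal sketch of turning that point cloud into an occupancy grid. The function name pc_2_voxel_grid, the 64^3 resolution, and the uniform-scaling choice are my own assumptions for illustration, not the repo's ray-tracing dataset-generation code.

import numpy as np

def pc_2_voxel_grid(xyz_pc, res=64):
    # Bounding box of the point cloud along x, y, z.
    mins = xyz_pc.min(axis=0)
    maxs = xyz_pc.max(axis=0)
    # One uniform scale so the longest side fits in the grid (keeps aspect ratio).
    scale = (res - 1) / np.max(maxs - mins)
    # Map each point to an integer voxel index and mark that voxel occupied.
    idx = np.floor((xyz_pc - mins) * scale).astype(np.int32)
    idx = np.clip(idx, 0, res - 1)
    grid = np.zeros((res, res, res), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

For example, grid = pc_2_voxel_grid(single_depth_2_pc(in_depth_path)) would give a binary single-view grid; the full ground-truth grids in the paper come from the authors' own ray-tracing voxelization instead.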

Thanks a lot for the help, I will try it out.

Hi,
I tried to implement the above code in MATLAB to convert depth images into point clouds. It kind of works, but the spatial geometry in world coordinates is not preserved: the actual distance between two points does not match the corresponding distance in the point cloud, and the point cloud is tilted. The only thing I did differently was to compute K (the intrinsic matrix) for a Kinect v2 using the Caltech Camera Calibration Toolbox. I get the following K matrix:

K = [367.7006 0 256.7453; 0 368.2318 207.3112; 0 0 1.0000]

What exactly might be the issue, and how can I resolve it?
Kind regards

@PranjaLBiswas27 If your depth image is captured by your Kinect v2, then you need to get the correct K matrix for that Kinect. It seems the K matrix you used is not correct. You may need to manually calibrate your camera.

Thanks for the reply, but how can I verify that my K matrix is correct? After calibration, the pixel error reported by the tool was around 0.1, so I assumed the camera was calibrated correctly.

Also, when calculating x and y in your code (which I guess are w.r.t. the camera reference frame), the values of wi and hi and the elements of the K matrix are in pixels; what is the unit of depth[hi, wi]?

The depth unit is meters. Your K matrix looks ok, and a reprojection error <= 0.1 is fine. Your problem may be in the depth or in the fov.
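
In case it helps, here is a minimal sketch of the same back-projection using a calibrated K (fx, fy, cx, cy) instead of the FOV-derived one, with depth assumed to be in meters as stated above. The function name, the vectorised form, and the 5 m cut-off are my own assumptions, not code from this repo.

import numpy as np

def depth_2_pc_with_K(depth, fx, fy, cx, cy, max_depth=5.0):
    # depth: HxW array in meters; fx, fy, cx, cy in pixels (from calibration).
    h, w = depth.shape
    wi, hi = np.meshgrid(np.arange(w), np.arange(h))
    valid = (depth > 0.0) & (depth <= max_depth)
    z = depth[valid]
    # Use the calibrated principal point (cx, cy) instead of (w/2, h/2).
    # Note: single_depth_2_pc above negates x and y; flip the signs here
    # if you need the same axis convention.
    x = (wi[valid] - cx) * z / fx
    y = (hi[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

With the K matrix quoted above, this would be called as depth_2_pc_with_K(depth, 367.7006, 368.2318, 256.7453, 207.3112).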

@Yang7879, @morpheus1820 Thank you for your inputs. Apparently I made a very trivial error: I calculated the K matrix using IR (depth camera) images, while I was creating the point cloud from depth images registered onto the RGB camera's images, which changes the principal point and focal length, i.e. the K matrix. It also meant the depth image was upsampled to the higher resolution of the RGB image. After registering the RGB image onto the depth image instead, the point clouds came out exactly as expected with the same K matrix as above. Although this reduces the resolution of the RGB image to that of the depth image, it preserves the actual spatial geometry in the point cloud.