zju3dv / Vox-Fusion

Code for "Dense Tracking and Mapping with Voxel-based Neural Implicit Representation", ISMAR 2022

I have a question about Eq. (6) and Eq. (7) of the paper

bobfacer opened this issue · comments

Eq. (6) and Eq. (7) in your paper are not the same as in your code. In the code:

def get_sdf_loss(self, z_vals, depth, predicted_sdf, truncation, loss_type="l2"):
    front_mask, sdf_mask, fs_weight, sdf_weight = self.get_masks(
        z_vals, depth.unsqueeze(-1).expand(*z_vals.shape), truncation
    )
    # this is eq(6)?
    fs_loss = (self.compute_loss(predicted_sdf * front_mask,
               torch.ones_like(predicted_sdf) * front_mask,
               loss_type=loss_type) * fs_weight)
    # this is eq(7)?
    sdf_loss = (self.compute_loss((z_vals + predicted_sdf * truncation) * sdf_mask,
                depth.unsqueeze(-1).expand(*z_vals.shape) * sdf_mask,
                loss_type=loss_type) * sdf_weight)
    # back_loss = (self.compute_loss(predicted_sdf * back_mask, -torch.ones_like(
    #     predicted_sdf) * back_mask, loss_type=loss_type,) * back_weight)

    return fs_loss, sdf_loss

I wonder why, in sdf_loss, predicted_sdf has to be multiplied by truncation, and why fs_loss computes the loss between predicted_sdf and ones; in your paper, fs_loss computes the loss between depth and truncation. Could you please explain these two losses?

Hi,
Thanks for pointing this out. The purpose of these two losses is consistent with Eq. (6) and Eq. (7) in the paper, but we made some adjustments in the implementation.
The truncation distance near the surface can be approximated as a distance along the ray, so "z_vals + predicted_sdf * truncation" is a small expansion around the current sampling point, which widens the constraint range of sdf_loss.
For fs_loss, we directly raise the target truncation value to 1 for faster convergence.
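For anyone else reading along, here is a minimal NumPy sketch of how these two losses fit together. Note that get_masks is not shown in the snippet above, so the mask and weight construction below is my own guess at standard TSDF-style free-space/near-surface masking, not the repo's exact code:

```python
import numpy as np

def get_masks(z_vals, depth, truncation):
    # Free-space samples: more than `truncation` in front of the observed surface.
    front_mask = (z_vals < depth - truncation).astype(np.float32)
    # Samples behind the truncation band (not supervised by these two losses).
    back_mask = (z_vals > depth + truncation).astype(np.float32)
    # Near-surface band: within +/- truncation of the observed depth.
    sdf_mask = (1.0 - front_mask) * (1.0 - back_mask)
    # Weighting scheme is a guess: down-weight whichever region dominates.
    fs_weight = 1.0 - front_mask.sum() / front_mask.size
    sdf_weight = 1.0 - sdf_mask.sum() / sdf_mask.size
    return front_mask, sdf_mask, fs_weight, sdf_weight

def sdf_losses(z_vals, depth, predicted_sdf, truncation):
    front_mask, sdf_mask, fs_weight, sdf_weight = get_masks(z_vals, depth, truncation)
    # Eq. (6) analogue: free-space samples are pushed toward sdf = 1
    # (the target raised to 1, as described above, for faster convergence).
    fs_loss = np.mean((predicted_sdf * front_mask - 1.0 * front_mask) ** 2) * fs_weight
    # Eq. (7) analogue: near the surface, z + sdf * truncation should
    # reconstruct the observed depth along the ray.
    sdf_loss = np.mean(((z_vals + predicted_sdf * truncation) * sdf_mask
                        - depth * sdf_mask) ** 2) * sdf_weight
    return fs_loss, sdf_loss
```

With an ideal prediction, sdf = clip((depth - z) / truncation, -1, 1), both losses vanish: free-space samples predict exactly 1, and inside the band z + sdf * truncation equals the observed depth.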

Thank you for your reply! I have another question, about the RGBDFrame pose. In your code, why must 10 be added to the translation of the first frame's pose? I tried adding 0, but it raises an error about not hitting any voxels. Why must we add 10 here?

class RGBDFrame(nn.Module):
    def __init__(self, fid, rgb, depth, K, pose=None) -> None:
        super().__init__()
        self.stamp = fid
        self.h, self.w = depth.shape
        self.rgb = rgb.cuda() 
        self.depth = depth.cuda() #/ 2
        self.K = K
        # self.register_buffer("rgb", rgb)
        # self.register_buffer("depth", depth)

        if pose is not None:
            pose[:3, 3] += 10  # why must we add 10 here?
            pose = torch.tensor(pose, requires_grad=True, dtype=torch.float32)
            self.pose = OptimizablePose.from_matrix(pose)
        else:
            self.pose = None
        self.precompute()
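A plausible reading (not confirmed by the snippet above, see the linked issue for the authors' answer) is that the sparse voxel structure only handles non-negative voxel indices, so the +10 translation shifts the whole scene into the positive octant. A toy sketch with a hypothetical VOXEL_SIZE:

```python
import numpy as np

VOXEL_SIZE = 0.2  # hypothetical voxel size, for illustration only

def world_to_voxel(points):
    # Integer voxel coordinates via flooring; negative world coordinates
    # produce negative indices, which a non-negative grid would reject.
    return np.floor(points / VOXEL_SIZE).astype(np.int64)

# A camera near the dataset origin sees points on both sides of zero:
pts = np.array([[-0.5, 0.1, 1.2], [0.3, -0.2, 0.8]])
print(world_to_voxel(pts).min())         # -3: negative index, "not hit any voxels"

# Shifting the pose translation by +10 moves the map into the positive octant,
# so every voxel index stays non-negative.
print(world_to_voxel(pts + 10.0).min())  # 47
```

Under this assumption, adding 0 leaves half the scene at negative voxel coordinates, which would explain the "not hit any voxels" error.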

Hi, please refer to #3.