yukitsuji / 3D_CNN_tensorflow

KITTI data processing and 3D CNN for Vehicle Detection

raw_to_voxel function

OneManArmy93 opened this issue · comments

I was trying to understand the code, but I couldn't wrap my mind around the values of x, y, z and resolution (resolution=0.50, x=(0, 90), y=(-50, 50), z=(-4.5, 5.5)).
Can anyone explain what they are and what they are for? Thank you.

x, y, z are the ranges of the scene in which you want to detect vehicles, and resolution is the size of a voxel.

Thank you for your reply.
So you are telling me that x, y, z have nothing to do with the data in the .bin files?

@FouratOueslati x, y, z in the .bin file are the coordinates of each point in 3D space.
The x, y, z arguments of the function are ranges used to select the part of the LiDAR scan you are interested in. For example, X points forward, so (0, 90) means you keep the points from 0 to 90 meters in front of the vehicle. Points outside the ranges are filtered out.
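The range filtering can be sketched roughly as below. This is a minimal illustration, not the repository's exact code: the helper name `filter_to_range`, the half-open ranges, and the sample points are assumptions made for the example.

```python
import numpy as np

def filter_to_range(points, x=(0, 90), y=(-50, 50), z=(-4.5, 5.5)):
    """Keep only the points inside the half-open ranges [lo, hi) on each axis.

    `points` is an (N, 3) array of x, y, z coordinates in meters.
    The helper name and the half-open convention are assumptions.
    """
    logic_x = np.logical_and(points[:, 0] >= x[0], points[:, 0] < x[1])
    logic_y = np.logical_and(points[:, 1] >= y[0], points[:, 1] < y[1])
    logic_z = np.logical_and(points[:, 2] >= z[0], points[:, 2] < z[1])
    keep = np.logical_and(logic_x, np.logical_and(logic_y, logic_z))
    return points[keep]

# Made-up sample points, only for illustration.
points = np.array([
    [10.0, 0.0, 0.0],    # inside all three ranges
    [-5.0, 0.0, 0.0],    # behind the sensor (x < 0)
    [20.0, 60.0, 0.0],   # too far to the side (y >= 50)
    [95.0, 0.0, 0.0],    # too far ahead (x >= 90)
    [30.0, -49.9, 5.0],  # inside all three ranges
])
kept = filter_to_range(points)
print(kept.shape)  # (2, 3)
```

Each `logic_*` array is a boolean mask with one entry per point, so combining them and indexing with the result drops every point that falls outside any of the three ranges.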
So, using x=(0, 90), y=(-50, 50), z=(-4.5, 5.5) and resolution=0.50, you end up with a voxel grid of 180x200x20 voxels (the length of each range divided by the resolution: 90/0.5, 100/0.5, 10/0.5). You can reduce the ranges to fit the voxel grid into your GPU memory.
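The grid size is just the length of each range divided by the resolution. A small sketch of that arithmetic follows; the helper name `grid_shape` and the smaller example range are hypothetical.

```python
def grid_shape(resolution, x, y, z):
    """Number of voxels along each axis for the given ranges (in meters)."""
    return tuple(int(round((hi - lo) / resolution)) for lo, hi in (x, y, z))

shape = grid_shape(0.5, x=(0, 90), y=(-50, 50), z=(-4.5, 5.5))
n_voxels = shape[0] * shape[1] * shape[2]
print(shape)     # (180, 200, 20)
print(n_voxels)  # 720000

# A smaller, made-up range gives a smaller grid and uses less memory.
small = grid_shape(0.5, x=(0, 60), y=(-30, 30), z=(-2.5, 1.5))
print(small)     # (120, 120, 8)
```

Halving the resolution doubles the voxel count along every axis, so the total number of voxels grows by a factor of eight.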

@ansabsheikh9 thank you for your explanation, I can see a bit more clearly now.
But can you elaborate a little on the rest of the function (like the use of:
logic_x,
voxel[velo[:, 0], velo[:, 1], velo[:, 2]] = 1
velo =((velo - np.array([x[0], y[0], z[0]])) / resolution).astype(np.int32) )?
Much appreciated

@FouratOueslati This deep learning architecture is designed to take in a voxel grid, so the first step is to convert the raw point cloud data to a voxel grid representation. That is what this function does.
Here,
velo = ((velo - np.array([x[0], y[0], z[0]])) / resolution).astype(np.int32) shifts every point so that the lower corner of the selected range becomes the origin, then divides by the resolution to get integer voxel indices. The shift is needed because some points in the point cloud have negative coordinates, and the voxel grid can only be indexed with non-negative values.
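To make the shift-and-divide step concrete, here is a small worked example using the formula from the thread. The three sample points are made up and are assumed to lie inside the selected ranges already.

```python
import numpy as np

x, y, z = (0, 90), (-50, 50), (-4.5, 5.5)
resolution = 0.5

# Made-up points inside the ranges (meters).
velo = np.array([
    [0.0, -50.0, -4.5],   # the lower corner of the region
    [12.3, -7.2, 0.4],    # a point with negative y
    [89.9, 49.9, 5.4],    # close to the upper corner
])

# Shift so the lower corner becomes (0, 0, 0), then scale to voxel units.
origin = np.array([x[0], y[0], z[0]])
idx = ((velo - origin) / resolution).astype(np.int32)
print(idx)
# [[  0   0   0]
#  [ 24  85   9]
#  [179 199  19]]
```

After the shift every coordinate is non-negative, so `astype(np.int32)` truncates it to the index of the cell that contains the point.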

@ansabsheikh9 you said that those values of x, y, z determine the size of the voxel grid; does this mean that each point of the point cloud (with its xyz coordinates) is transformed into a single voxel?

@FouratOueslati If at least one point falls inside a 0.5-meter cell (0.5 being the resolution), that cell becomes a single occupied voxel; otherwise it stays empty. Many points can fall inside the same 0.5-meter cell.
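A tiny sketch of this behaviour, assuming a made-up region that starts at the origin and three hypothetical points, two of which share a cell:

```python
import numpy as np

resolution = 0.5
origin = np.array([0.0, 0.0, 0.0])  # assumed lower corner of the region

# Made-up points: the first two fall into the same 0.5 m cell.
points = np.array([
    [1.1, 0.2, 0.3],
    [1.4, 0.4, 0.1],
    [3.0, 0.0, 0.0],
])
idx = ((points - origin) / resolution).astype(np.int32)

# Grid covering 4 m x 1 m x 1 m at 0.5 m resolution.
voxel = np.zeros((8, 2, 2))
voxel[idx[:, 0], idx[:, 1], idx[:, 2]] = 1  # mark occupied cells

occupied = int(voxel.sum())
print(occupied)  # 2
```

The assignment writes 1 into a cell regardless of how many points map to it, so the grid records only occupancy, not the number of points.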

@ansabsheikh9 so basically each voxel gets a binary indicator (1 if it contains a point, 0 if it does not)? And is that what this line of code does: voxel[velo[:, 0], velo[:, 1], velo[:, 2]] = 1 ?
Thank you

@ansabsheikh9 thank you for your help

@ansabsheikh9 hi, can I ask about the relationship between the coordinates (x, y, z) of a point in the point cloud and its position in the 3D voxel grid? I just want to make sure that a particular point has been voxelized correctly. Thank you

@FouratOueslati You can calculate it using this formula:
velo = ((velo - np.array([x[0], y[0], z[0]])) / resolution).astype(np.int32)
This gives you the index of each point in the voxel grid. In this equation x[0], y[0], z[0] are the lower bounds of the selected ranges, i.e. the corner of the region that maps to index (0, 0, 0).
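One way to check that a particular point landed in the right voxel is to map its index back to the metric bounds of that cell and confirm the point lies inside. The sketch below does that for one made-up point; the variable names are chosen for the example.

```python
import numpy as np

x, y, z = (0, 90), (-50, 50), (-4.5, 5.5)
resolution = 0.5
origin = np.array([x[0], y[0], z[0]])

point = np.array([12.3, -7.2, 0.4])  # made-up point inside the ranges

# Forward mapping: metric coordinates -> voxel index.
idx = ((point - origin) / resolution).astype(np.int32)

# Backward mapping: voxel index -> metric bounds of that cell.
lower = origin + idx * resolution
upper = lower + resolution

inside = bool(np.all((lower <= point) & (point < upper)))
print(idx)     # [24 85  9]
print(lower)   # [12.  -7.5  0. ]
print(upper)   # [12.5 -7.   0.5]
print(inside)  # True
```

In index units every voxel has an edge of 1, while in meters its edge equals the resolution, which is why `upper - lower` is 0.5 on each axis here.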

@ansabsheikh9 thank you, that was very helpful. But I am also curious about the size of the voxel edge: it is always equal to 1, no matter what the resolution is. Can you explain why, and how it could be changed based on the resolution?

@ansabsheikh9 hello again. Any info you can provide would really help me get unstuck.
Much appreciated :)

@OneManArmy93 We can say that a voxel is the equivalent of a 3D pixel. The size of each voxel is determined by the resolution you are using.
If you are interested in a point cloud range of (x, y, z) = (10, 10, 10) meters, then with a resolution of 0.5 the 3D voxel grid will have a size of (20, 20, 20) voxels. Each voxel is a 3D pixel, but rather than having a range of 0-255, in this experiment it can only be 0 or 1 (binary).
Here is a sample voxelized representation of a point cloud; I hope it helps to clear things up.
[Image: Capture]

@ansabsheikh9 so the voxel edge, which is equal to 1 (in this picture), cannot be modified because it is a binary indicator of whether the space is occupied or not?
[Image: résolution10]

@ansabsheikh9 thank you! much appreciated