isarandi / metrabs

Estimate absolute 3D human poses from RGB images.

Home Page: https://arxiv.org/abs/2007.07227

Specify the world coordinate

luoww1992 opened this issue · comments

I see in the docs about the result poses3d: each pose is shaped [num_joints, 3] and is given in the 3D world coordinate system in millimeters (or in the camera coordinate frame, if extrinsic_matrix is not specified).

So how do I specify the world coordinate system?

You can use the argument extrinsic_matrix: a float32 Tensor of shape [4, 4], the camera extrinsic matrix, with millimeters as the unit of the translation vector. It's the matrix that transforms points from the world coordinate system to the camera coordinate system.
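
For example, roughly like this (just a sketch: the rotation and translation are placeholders for your own calibration, and model stands for an already loaded MeTRAbs model, e.g. loaded with tf.saved_model.load):

import numpy as np
import tensorflow as tf

# Assumes 'model' is an already loaded MeTRAbs model (see the demos in this repo).
# Placeholder calibration: replace with your real world-to-camera rotation and
# translation (translation in millimeters).
R = np.eye(3, dtype=np.float32)
t = np.array([0.0, 0.0, 3000.0], dtype=np.float32)

extrinsic_matrix = np.eye(4, dtype=np.float32)
extrinsic_matrix[:3, :3] = R
extrinsic_matrix[:3, 3] = t

image = tf.image.decode_jpeg(tf.io.read_file('frame.jpg'))
pred = model.detect_poses(image, extrinsic_matrix=tf.constant(extrinsic_matrix))
# pred['poses3d'] is then expressed in the world frame defined by these extrinsics.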

@isarandi
So if I set the extrinsic_matrix argument in model.detect_poses(...), the result poses3d is in camera coordinates,
and if extrinsic_matrix is None, the result poses3d is in world coordinates?

@isarandi
I also saw your new work in a video on YouTube.
It is beautiful!
Do you have a plan to update it, or to optimize the model for inference?

So if I set the extrinsic_matrix argument in model.detect_poses(...), the result poses3d is in camera coordinates,
and if extrinsic_matrix is None, the result poses3d is in world coordinates?

If you do set the extrinsic_matrix, then the result will be in world coordinates. If you don't set it, then all we can provide is the camera-relative result, so it will be in camera coordinates.
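
In other words, the two outputs differ only by the rigid transform encoded in the extrinsic matrix. As an illustration (not code from this repo), a world-frame pose can be moved into the camera frame like this:

import numpy as np

def world_to_camera(poses_world_mm, extrinsic_matrix):
    # extrinsic_matrix: [4, 4] world-to-camera transform, translation in millimeters
    # poses_world_mm: array of shape [..., 3] in world coordinates (millimeters)
    R = extrinsic_matrix[:3, :3]
    t = extrinsic_matrix[:3, 3]
    # p_cam = R @ p_world + t, applied to each joint
    return poses_world_mm @ R.T + t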

Do you have a plan to update it, or to optimize the model for inference?

I'm currently working on pushing an update to this repo, with some refactoring, better dependency handling, and easier-to-use scripts. I also plan to look into inference optimizations such as TF-Lite. Usually, for best speed one needs to pre-specify image sizes and other tensor shapes, etc. This takes away some of the flexibility and ease of use of the current API. I'll look into these things a bit later.

@isarandi
Now I use the default function model.detect_poses_batched(images) with default arguments (so the extrinsic_matrix is eye(4)) and the model metrabs_eff2l_y4 to run inference on a video. So the result should be in world coordinates, as you say.
This is my poses3d file with the smpl_24 skeleton:
pose3d.zip
When I plot it with Matplotlib, I find the skeleton is lying down, but in PoseViz it is standing.
This is my plotting code:

import random

import matplotlib.pyplot as plt
import numpy as np


def main(file):
    # Joint index chains of the smpl_24 skeleton (left/right leg, spine+head, left/right arm)
    index1 = [0, 1, 4, 7, 10]
    index2 = [0, 2, 5, 8, 11]
    index3 = [0, 3, 6, 9, 12, 15]
    index4 = [9, 13, 16, 18, 20, 22]
    index5 = [9, 14, 17, 19, 21, 23]

    ax = plt.axes(projection='3d')
    positions = np.load(file, allow_pickle=True).reshape((-1, 24, 3))

    for position in positions:
        line1 = position[index1]
        line2 = position[index2]
        line3 = position[index3]
        line4 = position[index4]
        line5 = position[index5]
        for line in [line1, line2, line3, line4, line5]:
            color = random.choice(['r', 'g', 'b'])
            x = line[:, 0]
            y = line[:, 1]
            z = line[:, 2]
            ax.plot(x, y, z, color=color)
        break  # only plot the first frame

    ax.set(xlabel='X',
           ylabel='Y',
           zlabel='Z',
           )
    ax.set_title('3D line plot')
    plt.savefig('smpl.jpg')  # save before show(), otherwise the saved figure is empty
    plt.show()

So I checked the code: I see it uses the poses3d result, and the main data processing is first done by the function set_world_up() in mayavi_util.py,
then the Mayavi pose is obtained with mayavi_util.world_to_mayavi(pose) in mayavi_util.py, and afterwards pointset.add_point() is used to plot the points.
It creates a camera and makes a camera projection in the Mayavi space, and the visualization looks good.
So if we use the Mayavi space as the world, how do we get the world positions in the Mayavi space?

No, if you don't specify the extrinsics then it will be camera coordinates. The extrinsics describe the transformation from world to camera. If you don't specify it, we can't make predictions in world coordinates, only in camera coordinates.

So if you use the default, without specifying the extrinsic_matrix, you will get pose3d results in camera coordinates: x points to the right, y down, z forwards.

With Matplotlib you need to be careful, because it draws the Z axis as the vertical one (upwards), instead of Y. As I said, in the result poses the Y direction points downwards, following the standard convention. This is just a visualization thing.

See https://github.com/isarandi/metrabs/blob/master/demo.py#L55 for how to plot the poses with Matplotlib.
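
The gist is a 90-degree rotation around the X axis before plotting, so that the Y-down camera coordinates become Z-up Matplotlib coordinates. A rough sketch (adapted, not copied from demo.py; the file path is a placeholder):

import matplotlib.pyplot as plt
import numpy as np

# Placeholder path to your saved poses3d array
poses3d = np.load('poses3d.npy', allow_pickle=True).reshape((-1, 24, 3))

# Camera coordinates: x right, y down, z forward. Matplotlib treats z as "up",
# so plot (x, z, -y) instead of (x, y, z); the skeleton then stands upright.
pose = poses3d[0]
ax = plt.figure().add_subplot(projection='3d')
ax.scatter(pose[:, 0], pose[:, 2], -pose[:, 1])
ax.set(xlabel='X', ylabel='Z (forward)', zlabel='-Y (up)')
plt.show()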

@isarandi
What about my other question?
As I said before, it creates a camera and makes a camera projection in the Mayavi space, and the visualization looks good.
So if we use the Mayavi space as the world, how do we get the world positions in the Mayavi space?

The "Mayavi space" concept is an internal implementation detail in PoseViz that is not important from the API-user's perspective.

Do you have an extrinsic matrix? If not, it makes no sense to talk about a world space, as we don't know how the camera is placed in the world.

I know it makes no sense without an extrinsic matrix.
So, for the internal implementation in PoseViz: if there is no extrinsic matrix, could we treat the PoseViz space as the 'world space' and use the projected points as the 'world' points?
Is that an alternative method?

If you don't set the extrinsic matrix, then the world and camera spaces are equal. Set the extrinsic matrix if you want to have a world space that's different from camera space. In that case, also set 'world_up' in detect_poses and in the PoseViz constructor.
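
Something like this (a sketch; joint_names and joint_edges describe the skeleton and can be obtained from the model as shown in the demo scripts, and the extrinsic matrix is your own calibration):

import numpy as np
import poseviz

# Assumes 'model', 'image', 'extrinsic_matrix', 'joint_names' and 'joint_edges'
# are already set up (see the demos in this repo).
world_up = np.float32([0, 0, 1])  # e.g. a Z-up world

# PoseViz and the model should agree on which direction is "up" in the world.
viz = poseviz.PoseViz(joint_names, joint_edges, world_up=world_up)

pred = model.detect_poses(
    image, extrinsic_matrix=extrinsic_matrix, world_up=world_up)
# pred['poses3d'] is then in world coordinates and can be passed to viz.update()
# together with the frame and camera, as in the demo scripts.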

make the PoseViz space the 'world space'

PoseViz is a visualizer; it does not define its own world space.