It seems that joint_image can't be utilized as 2D joints
Javacr opened this issue
Javacr commented
Hi, SMPLer-X already provides 3D information, but I still want to obtain 2D joints for my work (I would rather not add a separate 2D joint detector). Here is my modification (following InterWild).
SMPLer_X.py
def project_to_body_space(part_name, bbox):
    hand_bbox_w = bbox[:, None, 2] - bbox[:, None, 0]
    hand_bbox_h = bbox[:, None, 3] - bbox[:, None, 1]
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] *= (
        (hand_bbox_w / cfg.output_hand_hm_shape[2]))
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] *= (
        (hand_bbox_h / cfg.output_hand_hm_shape[1]))
    joint_img[:, smpl_x.pos_joint_part[part_name], 0] += bbox[:, None, 0]
    joint_img[:, smpl_x.pos_joint_part[part_name], 1] += bbox[:, None, 1]

for part_name, bbox in (('lhand', lhand_bbox), ('rhand', rhand_bbox)):
    project_to_body_space(part_name, bbox)

joint_img[:,smpl_x.pos_joint_part['body'],0] *= (cfg.input_body_shape[1]/ cfg.output_hm_shape[2])
joint_img[:,smpl_x.pos_joint_part['body'],1] *= (cfg.input_body_shape[0]/ cfg.output_hm_shape[1])
joint_img[:,:,0] *= (cfg.input_img_shape[1] / cfg.input_body_shape[1])
joint_img[:,:,1] *= (cfg.input_img_shape[0] / cfg.input_body_shape[0])
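The chain of coordinate spaces in the snippet above (hand heatmap to hand box, body heatmap to body crop, body crop to network input) can be checked in isolation with a small NumPy sketch. The shape constants and the function names `hand_hm_to_body` and `body_hm_to_input_img` below are hypothetical, chosen only for illustration; the real values come from `cfg`, and the real code works on batched tensors.

```python
import numpy as np

# Hypothetical config values, for illustration only (the real ones live in cfg).
OUTPUT_HM_SHAPE = (16, 16, 12)       # (depth, height, width) of the body heatmap
OUTPUT_HAND_HM_SHAPE = (16, 16, 16)  # (depth, height, width) of the hand heatmap
INPUT_BODY_SHAPE = (256, 192)        # (height, width) of the body crop
INPUT_IMG_SHAPE = (512, 384)         # (height, width) of the network input


def hand_hm_to_body(joint_hm, bbox_xyxy):
    """Map (N, 2) hand-heatmap x/y coordinates into the space of the hand box."""
    out = np.asarray(joint_hm, dtype=np.float64).copy()
    x0, y0, x1, y1 = bbox_xyxy
    # Stretch heatmap cells to the box size, then shift to the box origin.
    out[:, 0] = out[:, 0] * (x1 - x0) / OUTPUT_HAND_HM_SHAPE[2] + x0
    out[:, 1] = out[:, 1] * (y1 - y0) / OUTPUT_HAND_HM_SHAPE[1] + y0
    return out


def body_hm_to_input_img(joint_hm):
    """Map (N, 2) body-heatmap x/y coordinates into network-input pixels."""
    out = np.asarray(joint_hm, dtype=np.float64).copy()
    # Body heatmap -> body crop.
    out[:, 0] *= INPUT_BODY_SHAPE[1] / OUTPUT_HM_SHAPE[2]
    out[:, 1] *= INPUT_BODY_SHAPE[0] / OUTPUT_HM_SHAPE[1]
    # Body crop -> network input image.
    out[:, 0] *= INPUT_IMG_SHAPE[1] / INPUT_BODY_SHAPE[1]
    out[:, 1] *= INPUT_IMG_SHAPE[0] / INPUT_BODY_SHAPE[0]
    return out
```

Feeding a few hand-picked points through each function and checking where they land is a quick way to see whether an offset comes from the scaling itself or from the predicted boxes.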
inference.py
joint_img = out['joint_img'].cpu().numpy()[0]
joint_img_xy1 = np.concatenate((joint_img[:,:2], np.ones_like(joint_img[:,:1])),1)
joint_img = np.round(np.dot(bb2img_trans, joint_img_xy1.transpose(1,0)).transpose(1,0)).astype(int)
lhand_bbox = out['lhand_bbox'].cpu().numpy().reshape(2,2)
hand_bbox_xyl = np.concatenate((lhand_bbox, np.ones_like(lhand_bbox[:,:1])),1)
lhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xyl.transpose(1,0)).transpose(1,0)).astype(int)
lhand_bbox = [lhand_bbox[0,0], lhand_bbox[0,1], lhand_bbox[1,0], lhand_bbox[1,1]]
rhand_bbox = out['rhand_bbox'].cpu().numpy().reshape(2,2)
hand_bbox_xyr = np.concatenate((rhand_bbox, np.ones_like(rhand_bbox[:,:1])),1)
rhand_bbox = np.round(np.dot(bb2img_trans, hand_bbox_xyr.transpose(1,0)).transpose(1,0)).astype(int)
rhand_bbox = [rhand_bbox[0,0], rhand_bbox[0,1], rhand_bbox[1,0], rhand_bbox[1,1]]
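The three transforms above repeat the same pattern: append a column of ones, multiply by `bb2img_trans`, and round. A minimal sketch of that pattern as one helper is shown here, assuming `bb2img_trans` is a 2x3 affine matrix; the helper name `apply_affine` and the example matrix are hypothetical.

```python
import numpy as np


def apply_affine(points_xy, trans):
    """Apply a 2x3 affine matrix to an (N, 2) array of x/y points."""
    points_xy = np.asarray(points_xy, dtype=np.float64)
    ones = np.ones((points_xy.shape[0], 1))
    homogeneous = np.concatenate((points_xy, ones), axis=1)  # (N, 3)
    # Same result as np.dot(trans, homogeneous.T).T
    return homogeneous @ np.asarray(trans, dtype=np.float64).T  # (N, 2)


# Hypothetical transform: scale by 2, then shift by (10, 20).
bb2img_trans = np.array([[2.0, 0.0, 10.0],
                         [0.0, 2.0, 20.0]])

# A box given as two (x, y) corners, as after .reshape(2, 2).
corners = np.array([[0.0, 0.0], [5.0, 5.0]])
mapped = np.round(apply_affine(corners, bb2img_trans)).astype(int)
bbox_xyxy = mapped.reshape(-1).tolist()  # [x0, y0, x1, y1]
```

The same helper would serve for the joints and for both hand boxes, which keeps the three code paths from drifting apart.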
With the default parameters, the results look weird: the joints are not accurate.
After modifying the code as below:
# lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
# rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 2).detach()
lhand_bbox = restore_bbox(lhand_bbox_center, lhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
rhand_bbox = restore_bbox(rhand_bbox_center, rhand_bbox_size, cfg.input_hand_shape[1] / cfg.input_hand_shape[0], 1.5).detach()
# bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height)
bbox = process_bbox(mmdet_box_xywh, original_img_width, original_img_height, ratio=1.1)
Things get better, but the joints are still not accurate.
How can this be explained?
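One possible reason the last argument of `restore_bbox` affects the 2D joints: if it acts as a factor that enlarges the box about its center (an assumption here, not checked against the actual implementation), then a larger factor widens the box, and the hand joints, which are multiplied by the box width and height, spread out accordingly. The helper `scale_bbox_xyxy` below is hypothetical and only illustrates that effect.

```python
def scale_bbox_xyxy(bbox, scale):
    """Enlarge an [x0, y0, x1, y1] box about its center by `scale`.

    Hypothetical stand-in for the scaling step; the real restore_bbox
    also takes a center, a size, and an aspect ratio.
    """
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = (x1 - x0) * scale / 2.0
    half_h = (y1 - y0) * scale / 2.0
    return [cx - half_w, cy - half_h, cx + half_w, cy + half_h]


box = [10.0, 10.0, 30.0, 30.0]
wide = scale_bbox_xyxy(box, 2.0)   # factor used by default
tight = scale_bbox_xyxy(box, 1.5)  # factor used in the modification
```

If the network was trained with one factor, decoding with another shifts every hand joint relative to the box center, so a change here can move the joints without making them more correct.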
Javacr commented
Sorry, I misunderstood. These results are actually normal; I should not compare a whole-body method with hand-only joint methods.