caizhongang / SMPLer-X

Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"

Home Page: https://caizhongang.github.io/projects/SMPLer-X/


About training (video/photo) and different sizes of betas in datasets

GhostLate opened this issue · comments

  1. I saw on the main page that the SMPLer-X inference scripts expect video data as input. Is it possible to modify the model to support single images?
  2. Do you train your model on images as video (in strict sequence)?
  3. The BEDLAM dataset has 11 betas and only one neutral gender (according to the base model). AGORA has only 10 betas. How do you combine different numbers of shape parameters during training and in the model's head? Does the SMPL-X regression layer have a dynamic size?

Thank you!

Hi, I can answer the first two questions.
For Concern 1: yes, a single image is possible; just treat it as a single-frame sequence. A more straightforward and simpler-to-use inference script is yet to be merged; please see this issue for a quick solution.
For Concern 2: we use SMPL-X instances when training the model (temporal information is not addressed).
Hope this helps.

Thank you!
After some tests, betas can be extended from 10 to 11 by padding with a zero (a plain .reshape(11) on a 10-element array would fail, since the element count differs).
P.S. It is not possible to convert from 11 to 10 without loss (unless the last one is 0).
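A minimal NumPy sketch of the conversion described above (my own illustration, not repo code; it assumes the 10-to-11 step is plain zero-padding, which leaves the shape unchanged because the extra coefficient contributes nothing):

```python
import numpy as np

betas_10 = np.zeros(10, dtype=np.float32)
betas_10[:3] = [0.5, -0.2, 0.1]      # example shape coefficients

betas_11 = np.pad(betas_10, (0, 1))  # append one zero -> shape (11,)

# The reverse direction is lossless only when the 11th beta is zero:
betas_back = betas_11[:10]
```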


Data converting scripts #23

I tried to convert BEDLAM to HumanData, but it was impossible.
The function keypoints_to_scaled_bbox_bfh used in data/data_converters/bedlam.py is supposed to come from utils/demo_utils.py (from mmhuman3d.utils.demo_utils import keypoints_to_scaled_bbox_bfh), but it does not exist there.

@Wei-Chen-hub, Could you share, please?

@Wei-Chen-hub, Any ideas?

Hi, I'll just put my function here. Currently we are not planning to release the HumanData files due to a shortage of hands and licensing issues.

    def _keypoints_to_scaled_bbox_bfh(self,
                                      keypoints,
                                      occ=None,
                                      body_scale=1.0,
                                      fh_scale=1.0,
                                      convention='smplx'):
        '''Obtain scaled bboxes in xyxy format from keypoints, one per
        part (body, head, left hand, right hand).
        Args:
            keypoints (np.ndarray): Keypoints of shape (1, n, k) or (n, k),
                k = 2 or 3
            occ (np.ndarray, optional): Per-keypoint occlusion flags
            body_scale (float): Bounding-box scale for the body
            fh_scale (float): Bounding-box scale for face and hands
            convention (str): Keypoint convention (unused here)
        Returns:
            list[np.ndarray]: One [xmin, ymin, xmax, ymax, conf] per part
        '''
        bboxs = []

        # supported kps.shape: (1, n, k) or (n, k), k = 2 or 3
        if keypoints.ndim == 3:
            keypoints = keypoints[0]
        if keypoints.shape[-1] != 2:
            keypoints = keypoints[:, :2]  # drop confidence/depth channel

        for body_part in ['body', 'head', 'left_hand', 'right_hand']:
            scale = body_scale if body_part == 'body' else fh_scale
            # (start, end) keypoint index range for this part
            bp = self.kps_body_part[body_part]
            kp_id = list(range(bp[0], bp[1]))
            kps = keypoints[kp_id]

            # mark a face/hand bbox invalid (conf = 0) when >= 10% of its
            # keypoints are occluded; the body bbox is always kept
            conf = 1
            if occ is not None and body_part != 'body':
                occ_p = occ[kp_id]
                if np.sum(occ_p) / len(kp_id) >= 0.1:
                    conf = 0

            # tight bbox from the keypoints, then scale it about its center
            xmin, ymin = np.amin(kps, axis=0)
            xmax, ymax = np.amax(kps, axis=0)

            width = (xmax - xmin) * scale
            height = (ymax - ymin) * scale

            x_center = 0.5 * (xmax + xmin)
            y_center = 0.5 * (ymax + ymin)
            xmin = x_center - 0.5 * width
            xmax = x_center + 0.5 * width
            ymin = y_center - 0.5 * height
            ymax = y_center + 0.5 * height

            bbox = np.stack([xmin, ymin, xmax, ymax, conf],
                            axis=0).astype(np.float32)
            bboxs.append(bbox)

        return bboxs
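For anyone reading along: the core of the function is the scale-about-center step. A self-contained sketch of just that step (my own helper for illustration; it does not depend on the unshared kps_body_part mapping):

```python
import numpy as np

def scaled_bbox(kps, scale=1.0, conf=1.0):
    """Tight xyxy bbox around 2D keypoints, scaled about its center."""
    xmin, ymin = np.amin(kps, axis=0)
    xmax, ymax = np.amax(kps, axis=0)
    w, h = (xmax - xmin) * scale, (ymax - ymin) * scale
    cx, cy = 0.5 * (xmax + xmin), 0.5 * (ymax + ymin)
    return np.array([cx - 0.5 * w, cy - 0.5 * h,
                     cx + 0.5 * w, cy + 0.5 * h, conf], dtype=np.float32)

# keypoints spanning (10,10)-(20,20); scale 1.2 grows the box to 12x12
kps = np.array([[10., 10.], [20., 10.], [10., 20.], [20., 20.]])
bbox = scaled_bbox(kps, scale=1.2)
# bbox -> [9., 9., 21., 21., 1.]
```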

I would also like to share some insights on betas. I accidentally visualized some mocap data with only the first beta (1st of 10 parameters) correct, and the overlay is already quite good. Perhaps only the first few betas matter.
(attached visualization: emdb_else_240110_42_13207)
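A toy NumPy sketch of why this can happen (synthetic numbers, not SMPL-X data): the body shape is a linear function of betas, v = v_template + shapedirs @ betas, so when the trailing coefficients are small their truncation barely moves the vertices.

```python
import numpy as np

rng = np.random.default_rng(0)
shapedirs = rng.normal(size=(100, 3, 10))  # toy blend-shape basis
# hypothetical betas whose magnitude decays, as is common for PCA coeffs
betas = np.array([2.0, 1.0, 0.5, 0.1, 0.05, 0.02, 0.01, 0.01, 0.0, 0.0])

full = np.einsum('vci,i->vc', shapedirs, betas)
first3 = np.einsum('vci,i->vc', shapedirs, np.pad(betas[:3], (0, 7)))

# relative change in the shape offset after keeping only 3 of 10 betas
rel_err = np.linalg.norm(full - first3) / np.linalg.norm(full)
```

With this decay profile the relative error stays small, mirroring the good overlay observed above; whether real fitted betas decay this fast depends on the data.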

Thank you for sharing!

@Wei-Chen-hub, could you also share the kps_body_part mapping used in bp = self.kps_body_part[body_part] inside _keypoints_to_scaled_bbox_bfh()?


Hi, could you kindly share the code for aligning the output SMPL-X mesh to the image? I am confused about how to implement this part. Thanks a lot!