justusschock / shapenet

PyTorch implementation of "Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters", predicting facial landmarks at up to 400 FPS

Home Page: https://shapenet.rtfd.io

inference with no lmk_bounds

shuxp opened this issue

I have tried predict_from_net, which uses lmk_bounds to predict. If I don't have lmk_bounds, how can I predict on a new image?

# bounding box around the ground-truth landmarks
lmk_bounds = sample.get_landmark_bounds(sample.lmk)
min_y, min_x, max_y, max_x = lmk_bounds
range_x = max_x - min_x
range_y = max_y - min_y

# enlarge the longer side by the crop margin (e.g. crop=0.1)
max_range = max(range_x, range_y) * (1 + crop)

# centre of the landmark bounding box
center_x = min_x + range_x / 2
center_y = min_y + range_y / 2

# square crop of size max_range around the centre
tmp = sample.crop(center_y - max_range / 2,
                  center_x - max_range / 2,
                  center_y + max_range / 2,
                  center_x + max_range / 2)

EDIT: Adjusted Code Formatting

You could, for example, replace the values obtained from the landmarks with the values from a face detector. The easiest approach would be to install dlib and use its frontal_face_detector.

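A minimal sketch of that idea, assuming sample.img is a uint8 numpy array (RGB or grayscale) and using dlib's frontal face detector to supply the bounds; the rest just mirrors the snippet above, with crop=0.1 as used for the pretrained models:

import dlib

detector = dlib.get_frontal_face_detector()

# run the detector; the second argument upsamples the image once to find smaller faces
rects = detector(sample.img, 1)
rect = rects[0]  # assume the first detection is the face of interest

# detector box instead of the landmark bounds
min_y, min_x, max_y, max_x = rect.top(), rect.left(), rect.bottom(), rect.right()
range_x = max_x - min_x
range_y = max_y - min_y

max_range = max(range_x, range_y) * (1 + 0.1)

center_x = min_x + range_x / 2
center_y = min_y + range_y / 2

tmp = sample.crop(center_y - max_range / 2,
                  center_x - max_range / 2,
                  center_y + max_range / 2,
                  center_x + max_range / 2)

Any other face detector works as well, as long as its output is mapped to the same (min_y, min_x, max_y, max_x) ordering.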

Does this mean that if I want to get keypoints for an object, I have to get a bbox/lmk_bounds first? If my bbox is not accurate, can shapenet still detect the landmarks well when I set a larger area (crop = 0.2)? On my private datasets it does not seem to fit well...

It definitely can, but you have to retrain it. The pretrained models have been trained with crop=0.1, since this is a common value used for good face detectors.

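To make the crop=0.1 margin concrete, here is a small worked example with made-up box coordinates:

# hypothetical detector box: x in [100, 220], y in [80, 230]
range_x = 220 - 100                              # 120
range_y = 230 - 80                               # 150
max_range = max(range_x, range_y) * (1 + 0.1)    # 150 * 1.1 = 165.0

# the network then sees a 165 x 165 square centred on (x=160, y=155),
# i.e. the face box enlarged by 10% of its longer side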

OK, I will try this. Thanks a lot.

If I have no lmk_bounds, I should set crop=None in SingleShapeDataset.
For inference, the code would then be as follows.
Is this the right way?

# no landmark bounds: use the full image as the bounding box
h, w = sample.img.shape[:2]
min_y, min_x, max_y, max_x = 0, 0, h, w
range_x = max_x - min_x
range_y = max_y - min_y

# enlarge by the crop margin, as in the landmark-based version
max_range = max(range_x, range_y) * (1 + crop)

# image centre
center_x = min_x + range_x / 2
center_y = min_y + range_y / 2

# square crop of size max_range around the centre
tmp = sample.crop(center_y - max_range / 2,
                  center_x - max_range / 2,
                  center_y + max_range / 2,
                  center_x + max_range / 2)

Yes, that's the way I would do this.

You should note that the parameters for the affine transformations are learned by the network, therefore it may not be that good to provide an image with lots of non-face pixels around the face. Also, the PCA might not be that suitable, since the face might be smaller, and this might lead to an ambiguous representation.

That said, it should nevertheless work, but it may not perform quite as well as the cropped version.

OK, I will try it. Thanks a lot.

Hope this works for you. Please report back.

Absolutely. I am going to train now.