isarandi / metrabs

Estimate absolute 3D human poses from RGB images.

Home Page: https://arxiv.org/abs/2007.07227

Inference speed

Harsh-Vavaiya opened this issue

Hello @isarandi ,

I found your work exceptional and really appreciate it. I'm working on live animation for a broadcast using 2D input frames, and I can successfully animate characters using your model. However, inference is not quite real-time: I get around 20 FPS. My machine has an RTX 3080 Ti, 64 GB RAM, and a Xeon CPU.

The model is metrabs_eff2s_y4_384px_800k_28ds.tar.gz

Is there any way to speed this up, for example by converting it to TFLite? I can't switch to a different backbone because I need accurate output. Alternatively, could you update the repo so that I can convert the model from source myself?

One more question: can you tell me how to get 2D poses from crop_model?

The small model should be fast enough for this. I ran a live demo at ECCV last year on an Nvidia 2080 laptop GPU at around 25 fps with 2-3 simultaneous subjects. To make it as fast as possible, turn off test-time augmentation (num_aug=1), constrain the maximum number of detections, and use the lower-resolution model (internally 256 px instead of 384): https://omnomnom.vision.rwth-aachen.de/data/metrabs/metrabs_eff2s_y4_256px_1600k_28ds.tar.gz
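As a rough sketch of the settings above, the helper below passes the speed-oriented options to a model's `detect_poses` method and measures throughput. Only `num_aug=1` comes from this thread; the `max_detections` keyword name and the helper names `fast_settings` and `run_and_time` are assumptions made for illustration, so check them against the version of the model you have loaded.

```python
import time


def fast_settings(max_people=3):
    """Keyword arguments aimed at speed rather than accuracy.

    num_aug=1 disables test-time augmentation (stated in the thread).
    max_detections is an ASSUMED keyword name for capping detections.
    """
    return {"num_aug": 1, "max_detections": max_people}


def run_and_time(model, frames, max_people=3):
    """Call model.detect_poses on every frame; return (predictions, fps).

    `model` is any object exposing detect_poses(frame, **kwargs), for
    example a loaded MeTRAbs model (how it is loaded is left to the caller).
    """
    settings = fast_settings(max_people)
    start = time.perf_counter()
    preds = [model.detect_poses(frame, **settings) for frame in frames]
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very fast stub models.
    fps = len(preds) / elapsed if elapsed > 0 else float("inf")
    return preds, fps
```

When timing a real model, it is worth running a few warm-up frames first, since the first call typically includes one-time graph tracing and GPU initialization that would otherwise skew the FPS figure.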

Some people have experimented with TFLite, but I haven't had time to figure out the engineering necessary to get it working.

It may be of interest to you, though, that I recently added an experimental PyTorch implementation of model inference (and converted the checkpoints from TF to PyTorch). The S model is here: https://omnomnom.vision.rwth-aachen.de/data/metrabs/metrabs_eff2s_256px_800k_28ds_pytorch.tar.gz

You can check out metrabs_pytorch.scripts.demo_image, whose command-line call would be:

python -m metrabs_pytorch.scripts.demo_image --model-dir metrabs_eff2s_256px_800k_28ds_pytorch --image img/test_image_3dpw.jpg

I've verified that the two frameworks give identical results. As I said, this code is a bit less polished, but it should be workable.