A few suggestions.

ioctl-user opened this issue 2 years ago

ioctl-user commented 2 years ago

Hello!

I have a couple of ideas:

1. Could you please add a text description of the differences between the models, especially between the b0 and b2 general types?
2. Please consider adding the hsemotion-onnx package to the pip repository.

ioctl-user commented a year ago

Thanks!

Andrey Savchenko · Answer 1 · Tue Dec 06 2022 22:31:18 GMT+0800 (China Standard Time)

Hello!

EfficientNet-B0 and EfficientNet-B2 are architectures proposed in an excellent paper; I just trained them with my own procedure. The main difference is in scaling: B2 accepts an input image with a higher resolution (260x260 vs. 224x224 for EfficientNet-B0). Moreover, EfficientNet-B2 is deeper and wider, and consequently larger. Hence, it can potentially reach better accuracy, whereas EfficientNet-B0 has lower latency. That said, my internal experiments show that EfficientNet-B0 works better for typical videos. I personally recommend using enet_b0_8_best_vgaf.pt or enet_b0_8_va_mtl.pt in practice. These models are typically more accurate across various datasets if you do not want to fine-tune them on a new domain.

As for the pip package: thanks, I will try to do it in 1 or 2 weeks. I'm overwhelmed with a lot of reports right now :(
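
The input-resolution difference mentioned in this answer (224x224 for B0, 260x260 for B2) can be illustrated with a small sketch. It is not part of the hsemotion code base: the helper names `input_size_for` and `relative_pixel_count` are hypothetical, and so is the assumption that the backbone tag (`b0` or `b2`) appears as an underscore-separated token in the model file name, as it does in `enet_b0_8_best_vgaf.pt`. The B2 file name used below is made up for illustration.

```python
from typing import Tuple

# Input side lengths quoted in the answer above (pixels).
INPUT_SIDE = {"b0": 224, "b2": 260}


def input_size_for(model_name: str) -> Tuple[int, int]:
    """Return the (height, width) a model expects, judged from its name.

    Hypothetical helper: assumes the backbone tag is one of the
    underscore-separated tokens of the file name.
    """
    stem = model_name.rsplit(".", 1)[0]  # drop an extension such as ".pt"
    for token in stem.split("_"):
        if token in INPUT_SIDE:
            side = INPUT_SIDE[token]
            return (side, side)
    raise ValueError(f"cannot infer backbone from {model_name!r}")


def relative_pixel_count(a: str, b: str) -> float:
    """Ratio of input pixels of model a to model b."""
    ha, wa = input_size_for(a)
    hb, wb = input_size_for(b)
    return (ha * wa) / (hb * wb)


if __name__ == "__main__":
    print(input_size_for("enet_b0_8_best_vgaf.pt"))
    # "enet_b2_example.pt" is an invented name used only for this sketch.
    print(relative_pixel_count("enet_b2_example.pt", "enet_b0_8_best_vgaf.pt"))
```

A B2 input carries roughly a third more pixels than a B0 input, which is one reason, besides the greater depth and width, that B2 tends to be slower.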

Andrey Savchenko · Answer 2 · Sat Dec 17 2022 19:40:42 GMT+0800 (China Standard Time)

I created a separate hsemotion-onnx package that is available via pip. Moreover, I slightly refactored the code and moved the Python packages into separate repositories: hsemotion and hsemotion-onnx.