Best quality result without speed / performance concern

Question

MyraBaba opened this issue 3 years ago · comments

What are the best detection model (both accuracy and good landmarks) and recognition model if there is no speed concern? It can run slower as long as it produces the highest-quality results.

For example, RetinaFace (R50) is very good but an old one, and a little bit slower. Is SCRFD superior to RetinaFace, especially for feeding the recognition models?

Is there any past experience or theoretical information?

Best

SthPhoenix · Answer 1 · Sat Mar 12 2022 14:53:35 GMT+0800 (China Standard Time)

From my observations scrfd_10g_bnkps is better in terms of a lower false positive detection rate and better recall, though it produces worse landmarks for faces rotated by 90 degrees or more. Retinaface gives inaccurate landmarks in those conditions too, but they are mostly closer to expected.
Secondly, the bnkps model doesn't work well for large faces; it mostly misses them. The gnkps model was retrained to fix this issue, but it has slightly lower recall and landmark quality.
Though I must admit that faces missed by the gnkps model are mostly useless for recognition, since they are of very low quality.

If you are not concerned about false detections and speed, Retinaface is still the most universal detector IMO.

As for recognition, I am mostly using glintr100 as the most accurate of the published models.
BUT there is also the WebFace-based model w600k_r50 from the official package, which is almost twice as fast as glintr100 and has better accuracy on some benchmarks as reported by the authors. I have tested it for clustering and it seems to produce clusters almost identical to those of the glintr100 model, but I can't tell you if it's really better or not.

MyraBaba · Answer 2 · Sat Mar 12 2022 15:00:15 GMT+0800 (China Standard Time)

@SthPhoenix Thanks for the reply. So the best for accurate recognition is RetinaFace landmarks (R50).

Have you tried yolo5l?

Best

SthPhoenix · Answer 3 · Sat Mar 12 2022 15:47:59 GMT+0800 (China Standard Time)

> @SthPhoenix Thanks for the reply. So the best for accurate recognition is RetinaFace landmarks (R50).

It'll work better in extreme cases, but in general scrfd models should work pretty much the same.

> Have you tried yolo5l?

I cloned the repo weeks ago, but still haven't found time to test it. It seems promising though.

MyraBaba · Answer 4 · Sun Mar 13 2022 19:11:04 GMT+0800 (China Standard Time)

@SthPhoenix What is your preferred metric and threshold for glintr100, in your experience?

Also, the paper says yolo5 landmarks are better than RetinaFace's, so does that mean it produces better accuracy?

Finally, what is your practical preference, arcface or cosface? Or does a plain distance satisfy you?

SthPhoenix · Answer 5 · Sun Mar 13 2022 19:56:11 GMT+0800 (China Standard Time)

I'm using (1. + np.dot(A, B)) / 2. as the similarity metric, which gives a score in the 0.0-1.0 range.
From my experience, scores above 0.8 are exact matches.
0.78-0.8 might contain false positives, though only for really similar people and quite rarely.
0.75-0.78 contains more false positives, though still rarely.
Everything below 0.75 could contain many false positives.

Also note that true positives might be anywhere in the 0.6-1.0 range: poor image quality or side views may greatly decrease the score, and comparing a lot of poor images might give you lots of false positives with high scores.
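
A minimal sketch of the scoring described above, written in plain Python so it runs without dependencies. It assumes the embeddings are L2-normalized before the dot product (the score only stays in the 0.0-1.0 range under that assumption). The function names and band labels are hypothetical, and the thresholds are the rough values from this comment, not calibrated constants.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def similarity(a, b):
    """Compute (1 + dot(a, b)) / 2 on unit vectors: cosine [-1, 1] -> [0, 1]."""
    a, b = l2_normalize(a), l2_normalize(b)
    dot = sum(x * y for x, y in zip(a, b))
    return (1.0 + dot) / 2.0

def match_band(score):
    """Map a score to the rough bands from the comment (labels are hypothetical)."""
    if score > 0.8:
        return "match"
    if score >= 0.78:
        return "likely match"
    if score >= 0.75:
        return "possible match"
    return "unreliable"
```

With NumPy arrays the middle function reduces to the `(1. + np.dot(A, B)) / 2.` expression quoted above. Since true positives can score as low as 0.6 on poor images, the bands are better treated as a starting point to tune on your own data.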

MyraBaba · Answer 6 · Sun Mar 13 2022 22:24:57 GMT+0800 (China Standard Time)

@SthPhoenix so you are using a similarity metric instead of distance?

SthPhoenix · Answer 7 · Sun Mar 13 2022 22:33:04 GMT+0800 (China Standard Time)

Yes, I found it more human readable and convenient for display.
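
For anyone who has thresholds expressed as Euclidean distance, the two views are interchangeable as long as the embeddings are unit length (an assumption here). The sketch below converts between them; the function names are hypothetical.

```python
import math

def score_from_l2(distance):
    """Convert Euclidean distance between unit vectors to the 0-1 score."""
    # For unit vectors d^2 = 2 - 2*cos, so (1 + cos) / 2 = 1 - d^2 / 4.
    return 1.0 - (distance ** 2) / 4.0

def l2_from_score(score):
    """Inverse mapping: the 0-1 score back to Euclidean distance."""
    return 2.0 * math.sqrt(1.0 - score)
```

Under this mapping a score of 0.8 corresponds to a distance of about 0.894 between normalized embeddings.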

MyraBaba · Answer 8 · Mon Mar 14 2022 03:54:02 GMT+0800 (China Standard Time)

@SthPhoenix
I am testing LFW with glintr100.onnx. As you know, it has many small images.
Is there any way to index in batches to increase speed? GPU utilization is no more than 15%.

SthPhoenix · Answer 9 · Mon Mar 14 2022 04:00:55 GMT+0800 (China Standard Time)

You can tweak batch sizes in deploy_trt.sh.
Also you can set multiple workers, though you should tweak your lfw code to support multithreading.
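
On the client side, the two suggestions could look roughly like the sketch below: group the images into batches and send the batches from several threads. `embed_batch` is a hypothetical stand-in for one batched inference request, not this repo's API, and the batch size and worker count are arbitrary example values.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(items, batch_size):
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_batch(batch):
    """Hypothetical placeholder for one batched request to the server."""
    return [f"embedding:{name}" for name in batch]

def embed_all(items, batch_size=64, workers=4):
    """Process batches from several threads, keeping the input order."""
    batches = list(chunked(items, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order the batches were given.
        results = list(pool.map(embed_batch, batches))
    return [emb for batch in results for emb in batch]
```

Threads are enough here under the assumption that each worker spends most of its time waiting on the inference request rather than doing Python-side computation.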

MyraBaba · Answer 10 · Mon Mar 14 2022 22:59:25 GMT+0800 (China Standard Time)

@SthPhoenix

Funny thing: the gender model says Queen_Elizabeth_II is Male :)

Have you tested it?

SthPhoenix · Answer 11 · Mon Mar 14 2022 23:29:17 GMT+0800 (China Standard Time)

I always knew she has some secrets )
I'm not using gender/age detection anywhere; this model was added just for compatibility with the original lib. In my tests it showed bad accuracy too.

MyraBaba · Answer 12 · Fri Mar 25 2022 23:09:30 GMT+0800 (China Standard Time)

Have you tested IR-152 ? Looks better than the IR-50 FYI

SthPhoenix · Answer 13 · Sat Mar 26 2022 00:56:44 GMT+0800 (China Standard Time)

> Have you tested IR-152? Looks better than the IR-50

Could you please provide link to the model you are talking about?

MyraBaba · Answer 14 · Sat Mar 26 2022 01:23:54 GMT+0800 (China Standard Time)

https://drive.google.com/file/d/142hTmfmTiw1MkUdzn5M7dWIMyIBOwFO7/view?usp=sharing Let me know when you download.

SthPhoenix · Answer 15 · Sat Mar 26 2022 01:54:09 GMT+0800 (China Standard Time)

I've downloaded it, but still some more info is needed ) What dataset was used for training? Are there any performance metrics available?
I haven't seen any info about IR-152-based models in the deepinsight repo.

MyraBaba · Answer 16 · Sat Mar 26 2022 17:15:03 GMT+0800 (China Standard Time)

MyraBaba commented 3 years ago

MyraBaba · Answer 17 · Sat Mar 26 2022 21:21:58 GMT+0800 (China Standard Time)

@SthPhoenix is this one or glintr100 better, according to your field practice?

SthPhoenix · Answer 18 · Mon Apr 04 2022 02:26:50 GMT+0800 (China Standard Time)

@MyraBaba unfortunately I can't check your model in the near future, but just from a benchmarks perspective they seem to be pretty much the same.
Though I must admit that for later models the classical LFW, CFP, AgeDB, etc. benchmarks may not be representative enough.
Also, in many use cases slight accuracy gains may be outweighed by the slower inference of the IR-152 backbone.

BTW, I have tested yolov5-face models; they seem to give slightly better landmarks in hard cases, though I haven't performed any speed tests yet.
I think if I'm able to run those models at speeds comparable with at least RetinaFace, I'll add them to this repo.

SthPhoenix · Answer 19 · Sat May 07 2022 03:49:07 GMT+0800 (China Standard Time)

@MyraBaba, I have added support for yolov5-face models, you can check it )

MyraBaba · Answer 20 · Sat May 07 2022 03:50:16 GMT+0800 (China Standard Time)

looking now :) @SthPhoenix

MyraBaba · Answer 21 · Sat May 07 2022 06:28:00 GMT+0800 (China Standard Time)

Did you convert to onnx? I didn't see the model, where is it :)

SthPhoenix · Answer 22 · Sat May 07 2022 14:18:47 GMT+0800 (China Standard Time)

> Did you convert to onnx? I didn't see the model, where is it :)

It's converted to onnx and will be automatically downloaded when you switch the detector to, for example, yolov5s-face in deploy_trt.sh

MyraBaba · Answer 23 · Sat Jun 11 2022 18:38:48 GMT+0800 (China Standard Time)

@SthPhoenix
I am trying to add mqt as an option for your project. Still on it.

Meanwhile I would love to hear your experience regarding realtime face recognition from a video feed.
Detecting faces at 25 fps means 25 faces every second for one person. If there are 3 people, that means 3 x 25 = 75 faces per second have to be feature-extracted and searched against the db, even though there are only 3 people.

What would be the best-performing solution to lower the number of faces sent to the recognition server for processing?

Lowering the fps could cause us to miss the desired face chip. Or, during detection, could a low-dimensional feature (12) be used to tell whether it is still the same person, so that the face is not sent to the recognition server?

What would be your suggestion? Consider that there will be 10 cameras and a lot of faces coming in (i.e. 3 people per camera at 25 fps = 750 faces per second to handle). Meanwhile the recognition server re-detects the face for better alignment.

Would love to hear your opinion and suggestions.

SthPhoenix · Answer 24 · Sat Oct 29 2022 00:09:35 GMT+0800 (China Standard Time)

Hi! I don't have much experience with such a scenario. As I have noted somewhere else, you could try combining detection/feature extraction with some kind of tracker algo, like FastMOT. Also, if your scenario expects people entering a room/building, it might be a good idea to limit face size and/or tune the detection area to exclude unwanted passers-by in the background.
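
One way the tracker idea could cut the load, as a rough sketch only: assume some tracker (FastMOT or any other) assigns a stable `track_id` to each face and that a per-face quality score is available. Both are assumptions, and the `TrackGate` class and its parameters are hypothetical. A face is then sent to recognition only when its track is new, when enough time has passed, or when its quality has clearly improved.

```python
class TrackGate:
    """Decide which tracked faces are worth sending to recognition."""

    def __init__(self, min_interval=1.0, min_gain=0.1):
        self.min_interval = min_interval  # seconds between re-sends per track
        self.min_gain = min_gain          # quality improvement that forces a re-send
        self._last = {}                   # track_id -> (timestamp, quality)

    def should_send(self, track_id, timestamp, quality):
        """Return True if this face should go to the recognition server."""
        previous = self._last.get(track_id)
        if previous is not None:
            last_time, last_quality = previous
            too_soon = timestamp - last_time < self.min_interval
            not_better = quality < last_quality + self.min_gain
            if too_soon and not_better:
                return False
        # New track, interval elapsed, or noticeably better face: send it.
        self._last[track_id] = (timestamp, quality)
        return True
```

With 3 tracked people at 25 fps and constant quality, this gate passes 3 faces in the first second instead of 75. The trade-off is that recognition quality now depends on how reliably the tracker keeps identities apart.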

PS: Wow, this question was posted on Jun 11, totally missed it.