Inference and face comparison time in real-time applications

Question

wareziom opened this issue 3 years ago · comments

Hi. I am doing real-time face recognition with the Python insightface package and pre-trained ONNX models.
(https://github.com/deepinsight/insightface/tree/master/python-package)
I have run into several questions and challenges and would appreciate your help.

cuda: 11.1
mxnet: 1.8.1 (installed from source)
onnxruntime-gpu: 1.7.0
numpy: 1.17.0

I use the following code to identify faces against a database of 1000 extracted face embeddings:

import argparse
import os
import pickle
import sys
import time

import cv2
import numpy as np
import insightface
from insightface.app import FaceAnalysis
from imutils import paths
from numpy.linalg import norm


parser = argparse.ArgumentParser()
parser.add_argument('--ctx', default=0, type=int, help='GPU')
args = parser.parse_args()
app = FaceAnalysis(name='models')
app.prepare(ctx_id=args.ctx, det_size=(640, 640))

database = {}

def compare(feat1, feat2):
    # Cosine similarity: dot product divided by the product of the L2 norms
    distance = np.dot(feat2, feat1) / (norm(feat2) * norm(feat1))
    return distance

verification_threshold = 0.3  # unused; the loop below compares against 0.35

# Load the stored embeddings (iterated below as name -> embedding pairs)
with open('encodings.pickle', 'rb') as f:
    data = pickle.load(f)

encoding = np.empty(512, )
imagePaths = list(paths.list_images('dataset'))

for (i, imagePath) in enumerate(imagePaths):
    name = os.path.splitext(os.path.basename(imagePath))[0]
    image = cv2.imread(imagePath)

    # Detect faces and extract an embedding for every face in the image
    t1 = time.time()
    encodings = app.get(image)
    t2 = time.time()
    print('elapsed time for extract encodings: ', (t2 - t1))

    # Compare each detected face against every stored embedding
    ts = time.time()
    for num, e in enumerate(encodings):
        encoding = e.embedding
        identity = ''
        for (db_name, db_enc) in data.items():
            dist = compare(encoding, db_enc)
            if dist > 0.35:
                identity = db_name
    te = time.time()
    print('elapsed time for  compare face in 1000 data: ', (te - ts))

Here I use different images for identification, and each image contains a different number of faces.
For example, one image has 1 face, another has 6 faces, and another has 15 faces.
After testing these images, I got the following timings:

picture 1 (1 face) >>> elapsed time for extract encodings: 0.019 s, elapsed time for compare face in 1000 data: 0.009 s
picture 2 (6 faces) >>> elapsed time for extract encodings: 0.057 s, elapsed time for compare face in 1000 data: 0.05 s
picture 3 (15 faces) >>> elapsed time for extract encodings: 0.19 s, elapsed time for compare face in 1000 data: 0.22 s

This is far too slow for my purpose, which is real-time detection from multiple cameras with at least 20,000 faces to compare against.
How can I reduce this time?
Am I going about this the right way?
Thank you in advance for your reply.

SthPhoenix · Answer 1 · Sat Aug 07 2021 00:36:13 GMT+0800 (China Standard Time)

This issue isn't related to this repository, so I'll close it for now.

As for your question, you should look at FAISS or Milvus, which are highly optimized ANN search tools capable of searching multiple embeddings in a single request.
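
To illustrate what "searching multiple embeddings in a single request" can look like, here is a minimal NumPy sketch that compares all faces from one frame against the whole database with a single matrix multiplication instead of nested Python loops. It is a brute-force stand-in, not FAISS or Milvus themselves. The names `normalize_rows` and `match_faces`, the synthetic data, and the 0.35 threshold (taken from the question's loop) are assumptions for illustration only.

```python
import numpy as np

def normalize_rows(x):
    """Scale each row to unit L2 norm so a dot product equals cosine similarity."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)

def match_faces(query_embs, db_embs, db_names, threshold=0.35):
    """Return one (name, similarity) pair per query face.

    The name is '' when the best similarity does not exceed the threshold.
    """
    q = normalize_rows(query_embs)
    db = normalize_rows(db_embs)          # in practice, normalize once and reuse
    sims = q @ db.T                       # shape: (num_faces, num_db_entries)
    best = sims.argmax(axis=1)            # index of the closest entry per face
    best_sims = sims[np.arange(len(q)), best]
    return [(db_names[i] if s > threshold else '', float(s))
            for i, s in zip(best, best_sims)]

# Synthetic stand-in for a database of 20,000 512-dimensional embeddings
rng = np.random.default_rng(0)
db_embs = rng.standard_normal((20000, 512)).astype(np.float32)
db_names = [f"person_{i}" for i in range(len(db_embs))]

# Two queries that are noisy copies of known entries, plus one unknown face
queries = np.stack([
    db_embs[3] + 0.1 * rng.standard_normal(512),
    db_embs[17] + 0.1 * rng.standard_normal(512),
    rng.standard_normal(512),
])
results = match_faces(queries, db_embs, db_names)
```

Normalizing the database once at load time, rather than on every call as this sketch does, removes most of the per-frame cost. FAISS offers the same exact inner-product search through its `IndexFlatIP` index, along with approximate indexes for larger databases.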