Image Recognition

Question

JanOwiesniak opened this issue 5 years ago · comments

Will jeelizAR work with 2D images too or is this library designed for 3D object detection use cases only?

Bourry Xavier · Answer 1 · Thu Apr 11 2019 04:32:59 GMT+0800 (China Standard Time)

Sure, we just have to train a neural network for the image.
But for 2D images, keypoint-feature-based computer vision can also do the job.
Best,
Xavier

Jan Owiesniak · Answer 2 · Thu Apr 11 2019 19:57:45 GMT+0800 (China Standard Time)

Thanks for the instant feedback :)

What are the pros and cons of using a neural network for image detection compared to feature-based computer vision? Is there a clear winner when it comes to speed and robustness?

What are the main differences between jeelizAR and tracking.js Object Tracker?

Bourry Xavier · Answer 3 · Thu Apr 11 2019 21:37:13 GMT+0800 (China Standard Time)

Hi,

It is quite a difficult question.
I would say the main drawback of the neural network approach is that it is not deterministic. If a detection fails, for example, we cannot inspect exactly why. We can only add the faulty input to the training dataset, train the network longer, or use a larger structure...

For image keypoint feature matching, if a detection fails we can inspect the keypoint detection, then the matching, and understand why the matching failed in order to change the algorithm. But then the algorithm is more complex, with different steps (keypoint detection, descriptor computation, matching, computing the 3D transform), whereas the neural network approach is more integrated (one step, with an image as input and the 3D transform as output).
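
To illustrate why the keypoint pipeline is easier to inspect, here is a minimal sketch of the matching step alone. It assumes binary descriptors stored as plain integers, a Hamming distance, and a ratio test with a threshold of 0.75; these choices, the function names, and the sample values are illustrative assumptions and are not taken from jeelizAR or tracking.js.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(query, train, ratio=0.75):
    """Brute-force matching with a ratio test (assumed threshold: 0.75).

    Returns (query_index, train_index) pairs whose best distance is
    clearly smaller than the second-best distance.
    """
    matches = []
    for qi, q in enumerate(query):
        # Rank every reference descriptor by its distance to this query descriptor.
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        # Keep the match only if it is unambiguous.
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches


# Hypothetical 8-bit descriptors: reference image vs. current camera frame.
reference = [0b11110000, 0b00001111, 0b10101010]
frame = [0b11110001, 0b01010101]
print(match_descriptors(frame, reference))
```

Every intermediate value here (the distances, the ranking, the rejected candidates) can be printed and checked, which is the kind of step-by-step inspection that a single end-to-end network does not offer.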

In terms of speed I would bet on the neural network approach, but this is only an opinion. For robustness I would bet on the other approach.

JeelizAR is based on a neural network running on the GPU in a highly optimized way (with our own deep learning engine), whereas tracking.js object detection is based on Haar cascades (https://docs.opencv.org/3.4.3/d7/d8b/tutorial_py_face_detection.html), an algorithm suited to the CPU. Haar cascades are also less flexible: they are made for detecting rectangular areas only (otherwise using the integral image is hard), and we cannot easily add an output to estimate the angle of the detected object.
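
As a rough illustration of the integral image point, the sketch below builds a summed-area table and uses it to get the sum of any axis-aligned rectangle from four lookups. This is a generic textbook construction, not code from tracking.js or OpenCV; the function names and the tiny 3x3 image are made up for the example.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left, inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the axis-aligned rectangle at (x, y), size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)

# A two-rectangle Haar-like feature: left column minus right column.
feature = rect_sum(ii, 0, 0, 1, 3) - rect_sum(ii, 2, 0, 1, 3)
print(feature)
```

The four-lookup trick only works when the rectangle edges are aligned with the image axes, which is why a rotated or non-rectangular region does not fit this scheme easily.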

Jan Owiesniak · Answer 4 · Thu Apr 11 2019 21:47:26 GMT+0800 (China Standard Time)

Thanks for the incredibly detailed explanations. I will give jeelizAR a try to see how image recognition with a neural network (jeelizAR) performs against feature-based computer vision image recognition (tracking.js).

Marcus5234 · Answer 5 · Tue Jan 21 2020 23:09:52 GMT+0800 (China Standard Time)

I don't understand. I want to build an augmented reality experience based on image tracking, much like shown here. How do I do it with this library?