jeeliz / jeelizAR

JavaScript object detection lightweight library for augmented reality (WebXR demos included). It uses convolutional neural networks running on the GPU with WebGL.

Home Page: https://jeeliz.com

Image Recognition

JanOwiesniak opened this issue · comments

Will jeelizAR work with 2D images too or is this library designed for 3D object detection use cases only?

Sure, we just have to train a neural network for the image.
But for 2D images, keypoint feature-based computer vision can also do the job.
Best,
Xavier

Thanks for the instant feedback :)

What are the pros and cons of using a neural network for image detection compared with feature-based computer vision? Is there a clear winner when it comes to speed and robustness?

What are the main differences between jeelizAR and tracking.js Object Tracker?

Hi,

It is quite a difficult question.
I would say the main drawback of the neural network approach is that it is a black box. If a detection fails, for example, we cannot inspect exactly why. We can only add the faulty input to the training dataset, train the network longer, or train a larger structure.

With keypoint feature matching, if a detection fails we can inspect the keypoint detection, then the matching, and understand why the matching failed in order to change the algorithm. But the algorithm is then more complex, with different steps (keypoint detection, descriptor computation, matching, computing the 3D transform), whereas the neural network approach is more integrated (one step, with an image as input and the 3D transform as output).
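To illustrate why the matching step is easy to inspect, here is a minimal sketch of that step alone. It assumes binary descriptors stored as `Uint8Array`, compared by Hamming distance and filtered with a nearest-neighbor ratio test. These are common choices, not something specified in this thread, and the names `hammingDistance`, `matchDescriptors` and the `ratio` parameter are hypothetical.

```javascript
// Hamming distance between two binary descriptors of equal length
// (assumption: descriptors are Uint8Array bit strings).
function hammingDistance(a, b) {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let v = a[i] ^ b[i];          // bits that differ in this byte
    while (v) { dist += v & 1; v >>= 1; }
  }
  return dist;
}

// Brute-force matching: for each query descriptor, find the best and
// second-best reference descriptors. A match is accepted only if the best
// distance is clearly smaller than the second best (ratio test).
// Every candidate is returned, accepted or not, so failures can be inspected.
function matchDescriptors(query, reference, ratio = 0.8) {
  const matches = [];
  query.forEach((q, queryIndex) => {
    let best = Infinity, second = Infinity, bestIndex = -1;
    reference.forEach((r, refIndex) => {
      const d = hammingDistance(q, r);
      if (d < best) { second = best; best = d; bestIndex = refIndex; }
      else if (d < second) { second = d; }
    });
    const accepted = best < ratio * second;
    matches.push({ queryIndex, refIndex: bestIndex, distance: best, secondDistance: second, accepted });
  });
  return matches;
}

// Toy data: the first query is close to reference 0, the second is
// equally far from both references and therefore ambiguous.
const reference = [
  Uint8Array.of(0b11110000, 0b00001111),
  Uint8Array.of(0b10101010, 0b01010101),
];
const query = [
  Uint8Array.of(0b11110001, 0b00001111),
  Uint8Array.of(0b11111111, 0b00000000),
];
const matches = matchDescriptors(query, reference);
console.log(matches);
```

Because each match carries its distances, a rejected match can be traced back to a specific pair of descriptors, which is the kind of inspection a single end-to-end network does not offer.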

In terms of speed I would bet on the neural network approach, but this is only an opinion. For robustness I would bet on the other approach.

JeelizAR is based on a neural network running on the GPU in a highly optimized way (with our own deep learning engine), whereas tracking.js object detection is based on Haar cascades (https://docs.opencv.org/3.4.3/d7/d8b/tutorial_py_face_detection.html), an algorithm suited to the CPU. Haar cascades are also less flexible: they are made for detecting rectangular areas only (otherwise using the integral image is hard), and we cannot easily add an output to estimate the angle of the detected object.
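For readers unfamiliar with the integral image mentioned above, here is a minimal sketch of the general technique (not code from tracking.js or jeelizAR). It assumes a grayscale image stored row-major in a flat array; the function names `buildIntegralImage` and `rectSum` are hypothetical.

```javascript
// Build an integral image (summed-area table). The table has one extra
// row and one extra column of zeros, so entry (y, x) holds the sum of all
// pixels above and to the left of (y, x), exclusive.
function buildIntegralImage(pixels, width, height) {
  const stride = width + 1;
  const integral = new Float64Array(stride * (height + 1));
  for (let y = 0; y < height; y++) {
    let rowSum = 0;
    for (let x = 0; x < width; x++) {
      rowSum += pixels[y * width + x];
      integral[(y + 1) * stride + (x + 1)] = integral[y * stride + (x + 1)] + rowSum;
    }
  }
  return integral;
}

// Sum of the pixels in the axis-aligned rectangle [x0, x1) x [y0, y1),
// using four table lookups whatever the rectangle size.
function rectSum(integral, width, x0, y0, x1, y1) {
  const stride = width + 1;
  return integral[y1 * stride + x1] - integral[y0 * stride + x1]
       - integral[y1 * stride + x0] + integral[y0 * stride + x0];
}

// Toy 3x3 image.
const pixels = [1, 2, 3,
                4, 5, 6,
                7, 8, 9];
const integral = buildIntegralImage(pixels, 3, 3);
const total = rectSum(integral, 3, 0, 0, 3, 3);       // whole image
const bottomRight = rectSum(integral, 3, 1, 1, 3, 3); // 5 + 6 + 8 + 9
console.log(total, bottomRight); // 45 28
```

The four-lookup trick only works for axis-aligned rectangles, which is why rotated or non-rectangular regions are awkward for this family of detectors.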

Thanks for the incredibly detailed explanation. I will give jeelizAR a try to see how neural network image recognition (jeelizAR) performs against feature-based computer vision image recognition (tracking.js).

I don't understand. I want to build an augmented reality experience based on image tracking, much like the one shown here. How do I do it with this library?