EGO-YOLO

We retrain the YOLO-series detection framework on the EgoObjects dataset to obtain a more complete visual toolchain for the egocentric perspective. The detector backbone comes from YOLOv5. The dataset is available from Meta: https://ai.meta.com/datasets/egoobjects-downloads/. In this work we do not use YOLO's object classification branch; the detector only distinguishes foreground (object) from background.

We freeze the classification decoder and reduce the classification head to a binary structure: foreground and background.
We start from the COCO-pretrained backbone and fine-tune it on the EgoObjects dataset.
We convert all annotations from the Detectron2 format to the COCO format.
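
The annotation conversion and class collapsing described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual conversion script. It assumes records shaped like Detectron2's standard dataset dicts (file_name, height, width, image_id, and annotations with bbox and bbox_mode), and it assumes boxes use either the XYXY_ABS (0) or XYWH_ABS (1) box mode. The function name to_single_class_coco is hypothetical.

```python
# Integer values of Detectron2's BoxMode enum for the two absolute box layouts.
XYXY_ABS = 0
XYWH_ABS = 1


def to_single_class_coco(records):
    """Convert Detectron2-style dataset dicts to a COCO-style dict with one class.

    Hypothetical helper: every original category is collapsed into a single
    'object' category, which mirrors the foreground/background setup.
    """
    images, annotations = [], []
    for record in records:
        images.append({
            "id": record["image_id"],
            "file_name": record["file_name"],
            "height": record["height"],
            "width": record["width"],
        })
        for ann in record.get("annotations", []):
            x0, y0, a, b = ann["bbox"]
            mode = ann["bbox_mode"]
            if mode == XYXY_ABS:
                # Corner format: derive width and height from the two corners.
                w, h = a - x0, b - y0
            elif mode == XYWH_ABS:
                # Already in COCO's [x, y, width, height] layout.
                w, h = a, b
            else:
                raise ValueError(f"unsupported bbox_mode: {mode}")
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": record["image_id"],
                "category_id": 1,  # all classes become 'object'
                "bbox": [x0, y0, w, h],
                "area": w * h,
                "iscrowd": ann.get("iscrowd", 0),
            })
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": 1, "name": "object"}],
    }
```

The returned dict can be written out with json.dump to produce a COCO-style annotation file. Anything beyond bounding boxes (segmentation masks, keypoints) is ignored in this sketch.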

How to use it

Download the pretrained YOLO: the pretrained model is available at https://drive.google.com/drive/folders/1j6z27hA8vNA_oCB8aZcYrNG2JDFEJrlu?usp=drive_link . Download either last.pt or best.pt.
Install the dependencies: pip install -r requirements.txt
Run detection: python detect.py --weights best.pt --source $Your Image$ (replace $Your Image$ with the path to your input image)
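
If you prefer to call the detector from a Python script, the command above can be wrapped in a small helper. This is a sketch under assumptions: build_detect_command and run_detection are hypothetical names, not part of this repository. The wrapper passes only the --weights and --source flags shown above and assumes it is run from the repository root, where detect.py lives.

```python
import subprocess
import sys
from pathlib import Path


def build_detect_command(weights, source, script="detect.py"):
    """Build the argument list for the detect.py invocation shown above."""
    return [sys.executable, script, "--weights", str(weights), "--source", str(source)]


def run_detection(weights, source, script="detect.py"):
    """Run detect.py as a subprocess; raise if the weights are missing or the run fails."""
    if not Path(weights).is_file():
        raise FileNotFoundError(f"weights file not found: {weights}")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_detect_command(weights, source, script), check=True)
```

For example, run_detection("best.pt", "frame.jpg") would run the downloaded checkpoint on a single image, where frame.jpg is a placeholder path.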

Some experimental results

Here are the mAP-50 results for YOLOv5 without and with pretraining:

The comparison on the validation set is shown below:

The pretrained results:

The original YOLO results:

Next, the comparison on real-world data (real headset videos) is shown below:

The pretrained results:

The original YOLO results:
