ultralytics / yolov5

📚 This guide explains how to use YOLOv5 🚀 model ensembling during testing and inference for improved mAP and Recall. UPDATED 25 September 2022.

From https://www.sciencedirect.com/topics/computer-science/ensemble-modeling:

Ensemble modeling is a process where multiple diverse models are created to predict an outcome, either by using many different modeling algorithms or using different training data sets. The ensemble model then aggregates the prediction of each base model and results in one final prediction for the unseen data. The motivation for using ensemble models is to reduce the generalization error of the prediction. As long as the base models are diverse and independent, the prediction error of the model decreases when the ensemble approach is used. The approach seeks the wisdom of crowds in making a prediction. Even though the ensemble model has multiple base models within the model, it acts and performs as a single model.

Before You Start

Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7. Models and datasets download automatically from the latest YOLOv5 release.

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Test Normally

Before ensembling we want to establish the baseline performance of a single model. This command tests YOLOv5x on COCO val2017 at image size 640 pixels. yolov5x.pt is the largest and most accurate model available. Other options are yolov5s.pt, yolov5m.pt and yolov5l.pt, or your own checkpoint from training a custom dataset, ./weights/best.pt. For details on all available models please see our README table.

python val.py --weights yolov5x.pt --data coco.yaml --img 640 --half

Output:

val: data=./data/coco.yaml, weights=['yolov5x.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True
YOLOv5 🚀 v5.0-267-g6a3ee7c torch 1.9.0+cu102 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 476 layers, 87730285 parameters, 0 gradients

val: Scanning '../datasets/coco/val2017' images and labels...4952 found, 48 missing, 0 empty, 0 corrupted: 100% 5000/5000 [00:01<00:00, 2846.03it/s]
val: New cache created: ../datasets/coco/val2017.cache
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 157/157 [02:30<00:00,  1.05it/s]
                 all       5000      36335      0.746      0.626       0.68       0.49
Speed: 0.1ms pre-process, 22.4ms inference, 1.4ms NMS per image at shape (32, 3, 640, 640)  # <--- baseline speed

Evaluating pycocotools mAP... saving runs/val/exp/yolov5x_predictions.json...
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.504  # <--- baseline mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.688
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.546
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.351
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.551
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.382
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.628
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.681  # <--- baseline mAR
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.524
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.735
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.826

Ensemble Test

Multiple pretrained models may be ensembled together at test and inference time by simply appending extra models to the --weights argument in any existing val.py or detect.py command. This example tests an ensemble of 2 models together:

YOLOv5x
YOLOv5l6

python val.py --weights yolov5x.pt yolov5l6.pt --data coco.yaml --img 640 --half

Output:

val: data=./data/coco.yaml, weights=['yolov5x.pt', 'yolov5l6.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True
YOLOv5 🚀 v5.0-267-g6a3ee7c torch 1.9.0+cu102 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 476 layers, 87730285 parameters, 0 gradients  # Model 1
Fusing layers... 
Model Summary: 501 layers, 77218620 parameters, 0 gradients  # Model 2
Ensemble created with ['yolov5x.pt', 'yolov5l6.pt']  # Ensemble notice

val: Scanning '../datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100% 5000/5000 [00:00<00:00, 49695545.02it/s]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 157/157 [03:58<00:00,  1.52s/it]
                 all       5000      36335      0.747      0.637      0.692      0.502
Speed: 0.1ms pre-process, 39.5ms inference, 2.0ms NMS per image at shape (32, 3, 640, 640)  # <--- ensemble speed

Evaluating pycocotools mAP... saving runs/val/exp3/yolov5x_predictions.json...
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.515  # <--- ensemble mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.699
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.557
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.356
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.563
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.668
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.387
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.638
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.689  # <--- ensemble mAR
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.526
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.743
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.844
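The baseline and ensemble runs can be compared directly. The sketch below only reuses numbers printed in the two logs above (pycocotools mAP and mAR, and per-image inference time); the dictionary layout and variable names are illustrative, not part of YOLOv5.

```python
# Metrics copied from the two val.py logs above.
baseline = {"mAP": 0.504, "mAR": 0.681, "inference_ms": 22.4}  # YOLOv5x alone
ensemble = {"mAP": 0.515, "mAR": 0.689, "inference_ms": 39.5}  # YOLOv5x + YOLOv5l6

map_gain = ensemble["mAP"] - baseline["mAP"]  # absolute mAP improvement
mar_gain = ensemble["mAR"] - baseline["mAR"]  # absolute mAR improvement
slowdown = ensemble["inference_ms"] / baseline["inference_ms"]  # inference time ratio

print(f"mAP gain: {map_gain:+.3f}")
print(f"mAR gain: {mar_gain:+.3f}")
print(f"inference time ratio: {slowdown:.2f}x")
```

On this hardware the ensemble trades roughly 1.76x the inference time per image for about one point of mAP.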

Ensemble Inference

Append extra models to the --weights argument to run ensemble inference:

python detect.py --weights yolov5x.pt yolov5l6.pt --img 640 --source data/images

Output:

detect: weights=['yolov5x.pt', 'yolov5l6.pt'], source=data/images, imgsz=640, conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False
YOLOv5 🚀 v5.0-267-g6a3ee7c torch 1.9.0+cu102 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 476 layers, 87730285 parameters, 0 gradients
Fusing layers... 
Model Summary: 501 layers, 77218620 parameters, 0 gradients
Ensemble created with ['yolov5x.pt', 'yolov5l6.pt']

image 1/2 /content/yolov5/data/images/bus.jpg: 640x512 4 persons, 1 bus, 1 tie, Done. (0.063s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 3 persons, 2 ties, Done. (0.056s)
Results saved to runs/detect/exp2
Done. (0.223s)

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

Can I use it in version 1?

what's influence of model ensemble

compare to test-time augmentation? which one will be better?

Well, I see, the model ensembling method is actually more like using a poor model to find missed detections for a good model. In contrast, TTA can also find missed detections by changing the input, while maintaining using the best model.

@Zzh-tju ensembling and TTA are not mutually exclusive. You can TTA a single model, and you can ensemble a group of models with or without TTA:

python detect.py --weights model1.pt model2.pt --augment

@Zzh-tju ensembling runs multiple models, while TTA tests a single model with different augmentations. Typically I've seen the best result when merging output grids directly (i.e. ensembling YOLOv5l and YOLOv5x), rather than simply appending boxes from multiple models for NMS to sort out. This is not always possible however. For example, when ensembling an EfficientDet model with YOLOv5x you cannot merge grids; you must use NMS or WBF (or Merge NMS) to get a final result.


How can I ensemble EfficientDet D7 with YOLO V5x?

@Blaze-raf97 with the right amount of coffee anything is possible.

How to solve this problem?
COCO mAP with pycocotools... saving detections_val2017__results.json...
ERROR: pycocotools unable to run: invalid literal for int() with base 10: 'Image_20200930140952222'

@LokedSher pycocotools is only intended for mAP on COCO data using coco.yaml. https://pypi.org/project/pycocotools/
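The error text is the message Python raises when int() receives a non-numeric string. A minimal sketch of why COCO file names work and custom ones do not, assuming the evaluation code derives an integer image id from each file stem; the helper name `to_image_id` is hypothetical and not part of YOLOv5 or pycocotools.

```python
from pathlib import Path


def to_image_id(path):
    """Return an integer id for numeric COCO-style stems, else the stem unchanged."""
    stem = Path(path).stem
    return int(stem) if stem.isnumeric() else stem


coco_id = to_image_id("val2017/000000397133.jpg")  # numeric stem, converts cleanly
custom_id = to_image_id("images/Image_20200930140952222.jpg")  # stays a string

# Calling int() directly on the custom stem reproduces the reported message.
try:
    int("Image_20200930140952222")
    message = None
except ValueError as err:
    message = str(err)

print(coco_id, custom_id)
print(message)
```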


Thanks for your reply!

@LokedSher I also encountered the same problem as you, but after I read your Q&A, I still don't know how to improve to get the picture given by the author.

I want to ensemble yolov3-spp and yolov5x, which were trained using your excellent yolov3 and yolov5 repos. A few months ago I got an ensemble result, but when I tried it again now I encountered an error. Can you help me? Thanks!

python detect.py --weights runs/train/exp13/weights/best.pt /home/work/pretrained_weights/best.pt --source /home/work/data

Fusing layers...
Model Summary: 484 layers, 88390614 parameters, 0 gradients
Traceback (most recent call last):
  File "detect.py", line 172, in <module>
    detect()
  File "detect.py", line 33, in detect
    model = attempt_load(weights, map_location=device)  # load FP32 model
  File "/home/work/xunuo/yolov5/yolov5/models/experimental.py", line 137, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'float'

@PromiseXu1 we are currently updating YOLOv3 models and should have them available for autodownload within this repo at the end of the week. In the meantime, if you have an existing YOLOv3 model that runs inference correctly with this repo, you can modify the ensemble type to NMS ensemble to allow it to ensemble with the YOLOv5 models. v3 and v5 models have different heads (FPN and PANet), meaning the current ensemble method will not work, as it expects every output to have the same size and shape. You can modify the ensemble type in the Ensemble() class:

yolov5/models/experimental.py

Lines 117 to 130 in 201bafc

class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super(Ensemble, self).__init__()

    def forward(self, x, augment=False):
        y = []
        for module in self:
            y.append(module(x, augment)[0])
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.cat(y, 1)  # nms ensemble
        y = torch.stack(y).mean(0)  # mean ensemble
        return y, None  # inference, train output


WOw ! thanks for your prompt response, i will try and wish you all the best in your work ~


Hi, I understand how the NMS ensemble works. But, I have not understood how exactly the mean and max ensemble types work in the context of ensembling two Yolov5 networks (Is it just the mean or max of the final predictions from each model?). Could you please provide more insight. Thanks a lot!

@AJ-RR see https://docs.ultralytics.com/yolov5/tutorials/model_ensembling#issuecomment-732088097 for forward method of each.


Thanks! I understand it now

I have a self-trained model, trained on pictures of a different size, which works fine when I apply it without model ensembling. I want to detect 2 objects: persons, for which I use the COCO-pretrained YOLOv5 model, and another object, for which I use a model trained on my custom dataset. Unfortunately, it fails with the following when the two are applied together:

Fusing layers...
Model Summary: 232 layers, 7459581 parameters, 0 gradients, 17.5 GFLOPS
Fusing layers...
Model Summary: 232 layers, 7246518 parameters, 0 gradients, 16.8 GFLOPS
Ensemble created with ['yolov5s.pt', 'object.pt']

Traceback (most recent call last):
  File "detect.py", line 174, in <module>
    detect()
  File "detect.py", line 61, in detect
    _ = model(img.half() if half else img) if device.type != 'cpu' else None  # run once
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/src/app/models/experimental.py", line 109, in forward
    y = torch.stack(y).mean(0)  # mean ensemble
RuntimeError: stack expects each tensor to be equal size, but got [1, 25200, 85] at entry 0 and [1, 25200, 6] at entry 1

A short research of that message tells me the different sizes of the pictures of the 2 datasets might be the problem. Is that really a problem?

My training command was the following:
python train.py --data object.yml --cfg yolov5s.yaml --weights 'yolov5s.pt' --batch-size 64

I am still not sure if that's the best way or even a good way to combine these two models but it seemed to be the easiest way.

@philippneugebauer the recent v4.0 release updated the default ensembling method from mean to nms, which should allow dissimilar models to ensemble together more easily. You may want to git pull to receive this update and retry.

yolov5/models/experimental.py

Line 109 in 69be8e7

y = torch.cat(y, 1) # nms ensemble

In any case ensembling is intended for models that share the same classes. It looks like your method will intersect custom classes with COCO classes at those same indices, leading to incorrect results (silent failure mode).
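The two tensor shapes in the traceback above differ only in their last dimension. Assuming the usual YOLOv5 output layout of 4 box coordinates, 1 objectness score, and one score per class, 85 corresponds to the 80 COCO classes and 6 to a single custom class. The helper names below are hypothetical, and the placeholder class names are not the real COCO names.

```python
def output_width(num_classes):
    """Values per prediction: x, y, w, h, objectness, then one score per class."""
    return 5 + num_classes


def can_ensemble(names_a, names_b):
    """Ensembling assumes both models use the same classes at the same indices."""
    return list(names_a) == list(names_b)


coco_names = [f"coco_class_{i}" for i in range(80)]  # placeholder names, 80 classes
custom_names = ["object"]  # single custom class

coco_width = output_width(len(coco_names))
custom_width = output_width(len(custom_names))
compatible = can_ensemble(coco_names, custom_names)

print(coco_width, custom_width, compatible)
```

A check like this before loading the weights turns the silent index collision into an explicit refusal.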

I pulled the latest state from dockerhub and receive now the following error:

Traceback (most recent call last):
  File "detect.py", line 173, in <module>
    detect()
  File "detect.py", line 61, in detect
    _ = model(img.half() if half else img) if device.type != 'cpu' else None  # run once
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/src/app/models/experimental.py", line 109, in forward
    y = torch.cat(y, 1)  # nms ensemble
RuntimeError: Sizes of tensors must match except in dimension 2. Got 6 and 85 (The offending index is 0)

Ah, so do I understand that correctly, that both of my datasets have a class with index 0 and that's why it crashes?

Yeah, I want to detect different classes from 2 datasets. Is there a better way to do? I was thinking about training all of them together but then I wanted to avoid that effort to confirm it works together first.

@philippneugebauer you can train on multiple datasets by passing a list of directories or txt files, though this is only possible for datasets that share the same classes.

yolov5/data/coco.yaml

Lines 12 to 13 in fda8df7

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco/train2017.txt  # 118287 images

@glenn-jocher sorry, I'm confused.
If I have two datasets:
dataset1: cat, dog and python, labeled as 0, 1 and 2
dataset2: cat, fox and fish, also labeled as 0, 1 and 2
then in this case I cannot do model ensembling, right?

I have to merge them into:
dataset: cat, dog, python, fox and fish, relabeled as 0, 1, 2, 3, 4
and train them separately:
dataset1_new: cat, dog and python, with classes 0, 1, 2, 3, 4 (although there are no instances of 3 and 4, I have to list them in classes.txt so that both share the same classes)
dataset2_new: cat, fox and fish, with classes 0, 1, 2, 3, 4 (although there are no instances of 1 and 2, I have to list them in classes.txt so that both share the same classes)

Do I understand correctly?

@ryan994 only models with identical classes may be ensembled. How you produce those models is up to you.
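One way to produce models with identical classes, following the cat/dog/python and cat/fox/fish example in the question above, is to build a single class list and remap each dataset's label indices onto it before training. This is a sketch under that assumption; the function names are hypothetical, and rewriting the label files themselves is not shown.

```python
def build_unified_names(*name_lists):
    """Concatenate class lists in order, keeping the first occurrence of each name."""
    unified = []
    for names in name_lists:
        for name in names:
            if name not in unified:
                unified.append(name)
    return unified


def index_map(names, unified):
    """Map each old class index to its index in the unified class list."""
    return {old: unified.index(name) for old, name in enumerate(names)}


dataset1 = ["cat", "dog", "python"]
dataset2 = ["cat", "fox", "fish"]

unified = build_unified_names(dataset1, dataset2)
map1 = index_map(dataset1, unified)
map2 = index_map(dataset2, unified)

print(unified)
print(map1)
print(map2)
```

With these maps, dataset2's "fox" label 1 becomes 3 and "fish" label 2 becomes 4, while both datasets declare the same five classes.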


@glenn-jocher, I feel I still lack a clear understanding. Are there any references that I can use (and might have to cite) to exactly understand how the mean is computed?

@AJ-RR you can view the Ensemble() module source here:

yolov5/models/experimental.py

Lines 98 to 111 in f59f801

class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super(Ensemble, self).__init__()

    def forward(self, x, augment=False):
        y = []
        for module in self:
            y.append(module(x, augment)[0])
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 1)  # nms ensemble
        return y, None  # inference, train output
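To make the three options in forward() concrete, here is a small sketch that uses nested Python lists as stand-ins for the model output tensors. Each model output is a list of predictions and each prediction is a list of values; the batch dimension of the real tensors is dropped for brevity. The function names and the numbers are illustrative only.

```python
def mean_ensemble(outputs):
    """Element-wise mean across models; all outputs must have the same shape."""
    return [[sum(vals) / len(vals) for vals in zip(*rows)] for rows in zip(*outputs)]


def max_ensemble(outputs):
    """Element-wise max across models; all outputs must have the same shape."""
    return [[max(vals) for vals in zip(*rows)] for rows in zip(*outputs)]


def nms_ensemble(outputs):
    """Concatenate predictions from every model and leave the merging to NMS."""
    return [row for out in outputs for row in out]


# Two models, two predictions each, three values per prediction (x, y, score).
model_a = [[10.0, 10.0, 0.75], [50.0, 50.0, 0.25]]
model_b = [[12.0, 14.0, 0.25], [50.0, 54.0, 0.5]]

mean_out = mean_ensemble([model_a, model_b])
max_out = max_ensemble([model_a, model_b])
nms_out = nms_ensemble([model_a, model_b])

print(mean_out)
print(max_out)
print(len(nms_out))
```

The mean and max variants combine values position by position, which is why they need outputs of identical size and shape, whereas the NMS variant only grows the list of candidate predictions.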

@glenn-jocher Is it possible to ensemble 2 different models together? I want to ensemble the yolov5l.pt (80 classes) with my own custom model which has 40 classes. I can do it by loading them separately and go through the models with a separate NMS and loop for each one!

@Auth0rM0rgan yes you can ensemble as many different models as you want, but they should have intersecting or complementary classes. For example a custom COCO trained model can be ensembled with a pretrained model, or alternatively a custom model with class indices 80-90 can be ensembled with a COCO model. You can't ensemble the same custom 10 class model if the indices are denoted 0-9 with a COCO model, as the classes will conflict.
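The index ranges mentioned in this reply can be illustrated with a small sketch. It assumes detections are already available as (x1, y1, x2, y2, confidence, class_index) tuples from two models run separately, and it shifts the custom model's indices past the 80 COCO classes so the two label spaces do not collide. This is post-hoc merging of separate results, not the built-in Ensemble() module, and every name in it is hypothetical.

```python
COCO_CLASSES = 80  # number of classes in the COCO-pretrained models


def shift_classes(detections, offset):
    """Return detections with every class index increased by offset."""
    return [(x1, y1, x2, y2, conf, cls + offset) for x1, y1, x2, y2, conf, cls in detections]


coco_dets = [(10, 20, 110, 220, 0.90, 0)]  # class 0 of the COCO model
custom_dets = [(300, 50, 400, 150, 0.80, 0)]  # class 0 of a custom model

# Without the shift both detections would claim class index 0.
merged = coco_dets + shift_classes(custom_dets, COCO_CLASSES)

print(merged)
```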

@glenn-jocher
I have two models trained on two datasets:
model1-dataset1: ['eye', 'nose', 'mouth', 'head']
model2-dataset2: ['eye', 'nose', 'mouth']
According to what you said, these two models cannot be ensembled, right? (Actually, I have tried this but failed.)

I am also unsure about this: if I have a model trained on ['eye', 'nose', 'mouth'] and I want to make a new model with ['eye', 'nose', 'mouth', 'head'], can I achieve this by training only the ['head'] part?
Thank you!!

@letty825 ok looking back at this ensembling is intended for models with identical class vectors. Any other combination is too complicated to support easily. If there's concise modifications to the codebase that supports your use case feel free to submit a PR, thanks!

@glenn-jocher Does this ensemble method alleviate false positive detections or does it just prevent false negatives?

@connorlee77 the current ensemble technique (NMS ensemble) simply appends detections to a list from each model. The mechanism for increasing mAP is increasing recall, which will increase TPs and FPs, and reduce FNs.
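The recall mechanism described here can be sketched with a plain-Python greedy NMS over the concatenated detections of two models. This is a simplified, class-agnostic illustration with made-up boxes, not the YOLOv5 non_max_suppression implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(dets, iou_thres=0.45):
    """Greedy NMS over (x1, y1, x2, y2, conf) tuples, highest confidence first."""
    kept = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(det[:4], k[:4]) <= iou_thres for k in kept):
            kept.append(det)
    return kept


model_a = [(100, 100, 200, 200, 0.90)]  # finds object 1 only
model_b = [(102, 98, 198, 202, 0.85), (400, 400, 480, 480, 0.60)]  # finds objects 1 and 2

single = nms(model_a)
ensemble = nms(model_a + model_b)  # append detections from both models, then NMS

print(len(single), len(ensemble))
```

The duplicate box for object 1 is suppressed, while the box that only the second model found survives. That extra box raises recall if it is a true positive and adds a false positive if it is not.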

@glenn-jocher I want to ensemble a model that I custom trained in yolov5 with one of the pretrained model in yolov5 . If yes how am I supposed to do it??

@pragyan430 if the classes are the same, you ensemble normally per the above tutorial.

@glenn-jocher Thanks for the quick reply, Is there no way to use the pretrained model yolov5s.pt that is available??

@pragyan430 you can ensemble any YOLOv5 models that share the same classes.

Hi @glenn-jocher

Que 1)

How can we combine more than 1 model together for detecting multiple classes together ?

Eg :-

Model 1 (classes : A , B , C)
Model 2 (classes : P , Q)

while detection can we use ,
( Model 1 and Model 2) for detecting (A,B,C,P,Q)
Is it possible ?

Que 2:

How to handle class imbalance ?

Class A : 100 images
Class B : 500 images

Do we have data augmentation for images and annotations, like Roboflow?

Can you share any technique to handle this issue.

Thanks ,

@VinayChaudhari1996 only models trained on the same dataset may be ensembled, i.e. all YOLOv5 models may be ensembled together since they were all trained on COCO.

This is a Model Ensembling Tutorial. For unrelated questions, please raise a new issue or start a new discussion.

@glenn-jocher : You are doing really a good job! :)

I was wondering how to do ensembling of models from torch hub for yolov5?

Regards

@gauravgund Torch Hub models are individual models with no ensemble capability currently. So far only test.py and detect.py have built-in ensembling capability. I'll keep this feature request in mind going forward.

When ensembling two models, I'm getting following error:

fatal: not a git repository (or any parent up to mount point /kaggle)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
detect: weights=['/kaggle/input/weights-of-yolov5-150-epochs/best.pt', '/kaggle/input/siim-cov19-yolov5-train/yolov5/runs/train/exp/weights/best.pt'], source=/kaggle/tmp/test/image, imgsz=512, conf_thres=0.005, iou_thres=0.5, max_det=1000, device=, view_img=False, save_txt=True, save_conf=True, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=True, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=True, line_thickness=3, hide_labels=False, hide_conf=False, half=False
Ensemble created with ['/kaggle/input/weights-of-yolov5-150-epochs/best.pt', '/kaggle/input/siim-cov19-yolov5-train/yolov5/runs/train/exp/weights/best.pt']
image 1/2 /kaggle/tmp/test/image/51759b5579bc_image.png: Traceback (most recent call last):
  File "detect.py", line 228, in <module>
    main(opt)
  File "detect.py", line 223, in main
    run(**vars(opt))
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "detect.py", line 106, in run
    visualize=increment_path(save_dir / 'features', mkdir=True) if visualize else False)[0]
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'visualize'

and code is:

weights_dir = '/kaggle/input/weights-of-yolov5-150-epochs/best.pt'
weights_dir_1 = '/kaggle/input/siim-cov19-yolov5-train/yolov5/runs/train/exp/weights/best.pt'
!python detect.py --weights $weights_dir $weights_dir_1\
--img 512\
--conf 0.005\
--iou 0.5\
--source $test_dir\
--augment\
--save-txt --save-conf --exist-ok

It works fine when I use either model on its own.

@Muhammad4hmed your code is out of date, update your code:

Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
Notebooks – View updated notebooks
Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Hi,

I have a question. Can this ensemble method be used from the PyTorch API?
I want to load multiple weights as on the command line, but through the API.

for example;
model = torch.hub.load('ultralytics/yolov5', 'custom', path=['path/to/best1.pt','path/to/best2.pt'])

I can't find any reference. Is there any other way around this?
thanks

@silenus092 currently ensembling is only available in val.py and detect.py:

python detect.py --weights yolov5m6.pt yolov5l6.pt

I have some questions
When ensembling two models, must the models' classes be the same?
I want to use two models with different classes, but I can't find out how to do it.

@JongWooBAE yes classes must be identical.

@glenn-jocher, is this functionality available when loading from torch hub?

@davidbroberts no, currently torch hub can only load one model per command, though you can load multiple models into a workspace at the same time, i.e.:

yolov5s = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # load one model
yolov5m = torch.hub.load('ultralytics/yolov5', 'yolov5m')  # load a second model

That's what I thought. Thank you for all the great work!

I have a question.
I want to know the proper number of models to use
when ensembling pre-trained models (e.g. yolov5x.pt, yolov5l6.pt, etc.).

How to ensemble models with different input sizes

@glenn-jocher Can it be done with two models trained on different datasets?

Hello, I have received your email. I will look at it soon and reply to you as soon as possible.

@myasser63 model ensembling is intended for multiple models trained on the same dataset.

You can train one model on multiple datasets though:
https://community.ultralytics.com/t/how-to-combine-weights-to-detect-from-multiple-datasets

@glenn-jocher Thanks for your help. It's very helpful

@glenn-jocher
I'm trying to test my trained model by using torch like this
=> model = torch.hub.load('./utils','custom',path='/opt/ml/yolov5/runs/train/exp/weights/best.pt',source='local')
is there any way to ensemble more than 2 models ??

@kimkihoon0515 model ensembling is available with detect.py and val.py. PyTorch Hub will load individual models for you given your command.

@glenn-jocher oic btw tensor.split() now works pretty well on torch ver 1.7.1 also thx for that :)

How can I run detection on images with two models: the pretrained yolov5s model for human detection and a custom model to detect trolleys? @glenn-jocher

@arunmack789 👋 Hello! Thanks for asking about handling inference results. YOLOv5 🚀 PyTorch Hub models allow for simple model loading and inference in a pure python environment without using detect.py.

Simple Inference Example

This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference. 'yolov5s' is the YOLOv5 'small' model. For details on all available models please see the README. Custom models can also be loaded, including custom trained PyTorch models and their exported variants, i.e. ONNX, TensorRT, TensorFlow, OpenVINO YOLOv5 models.

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5m, yolov5l, yolov5x, etc.
# model = torch.hub.load('ultralytics/yolov5', 'custom', 'path/to/best.pt')  # custom trained model

# Images
im = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, URL, PIL, OpenCV, numpy, list

# Inference
results = model(im)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.

results.xyxy[0]  # im predictions (tensor)
results.pandas().xyxy[0]  # im predictions (pandas)
#      xmin    ymin    xmax   ymax  confidence  class    name
# 0  749.50   43.50  1148.0  704.5    0.874023      0  person
# 2  114.75  195.75  1095.0  708.0    0.624512      0  person
# 3  986.00  304.00  1028.0  420.0    0.286865     27     tie

See YOLOv5 PyTorch Hub Tutorial for details.

Good luck 🍀 and let us know if you have any other questions!

That does not solve my problem. I used the pretrained yolov5s.pt model and a custom-trained model, but I get a size error:
Sizes of tensors must match except in dimension 1. @glenn-jocher

@arunmack789 ensembling is restricted to models trained on the same dataset, your problem is due to user error.

@glenn-jocher is it possible to ensemble YOLOv5s with other models other than YOLOv5

@myasser63 sure, any custom models trained with this repo should work. i.e. you can also go grab yolov3.pt from https://github.com/ultralytics/yolov3 and ensemble it with YOLOv5 models from this repo.

@glenn-jocher Can the ensemble be done using onnx?


@myasser63 no

Can I ensemble models with weights that were trained on different classes?

@akshay-ast no

What is the mechanism behind ensemble in YOLOv5?

I just try this example and found an interesting result which is not easy to be explained!

In the following image, I tried to perform detection using yolov5l6, yolov5x, and yolov5l6 + yolov5x on the same image.
In the left image, the leftmost truncated person is detected with an imprecise bounding box (probability=0.58) which tried to enclose the unseen head. In the center image, the bounding box of this person is more precise and the probability is 0.71.
However, in the right image, the bounding box corresponding to the same person has 0.80 probability which is higher than the ones, 0.58 & 0.71 respectively, in the left and center images.

I can't figure out why, because if we put the detection results for the same truncated person in the left and right images together, the one from the left image should be eliminated by the one in the center image. However, after tracing how NMS works in YOLOv5, I can't find any code that increases the probability after merging.

Besides, for the detected bus, I also can't understand the ensemble result.
The bus in the right image has probability of 0.92. However, since the same bus in the center image has higher probability, the merged result should have a probability of 0.92 instead of 0.93.

Could anyone explain why the results are counter-intuitive?

@AlexofNTU the YOLOv5 Ensemble() module is hard coded to an NMS ensembling method. All model outputs are concatenated before being passed to NMS. The other two methods are possible when all models share the same size outputs.

yolov5/models/experimental.py

Lines 61 to 72 in 5774a15

    class Ensemble(nn.ModuleList):
        # Ensemble of models
        def __init__(self):
            super().__init__()

        def forward(self, x, augment=False, profile=False, visualize=False):
            y = [module(x, augment, profile, visualize)[0] for module in self]
            # y = torch.stack(y).max(0)[0]  # max ensemble
            # y = torch.stack(y).mean(0)  # mean ensemble
            y = torch.cat(y, 1)  # nms ensemble
            return y, None  # inference, train output
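To make the difference between the three reductions concrete, here is a minimal, dependency-free sketch. It uses plain Python lists in place of tensors, and the helper names (`max_ensemble`, `mean_ensemble`, `nms_ensemble`) and the toy values are illustrative assumptions, not part of YOLOv5. It only shows why the max and mean variants need identically shaped outputs while concatenation does not.

```python
# Toy per-model outputs: each row is [x, y, w, h, conf].
# Real YOLOv5 outputs are tensors; lists are used here only to keep the
# sketch dependency-free. All values are made up for illustration.
model_a = [[10.0, 10.0, 4.0, 4.0, 0.6], [30.0, 30.0, 8.0, 8.0, 0.9]]
model_b = [[11.0, 10.0, 4.0, 4.0, 0.8], [30.0, 31.0, 8.0, 8.0, 0.7]]
model_c = [[50.0, 50.0, 6.0, 6.0, 0.5]]  # a different number of candidates


def _check_same_shape(outputs):
    """Raise if the models do not produce identically shaped outputs."""
    shapes = {(len(out), len(out[0])) for out in outputs}
    if len(shapes) != 1:
        raise ValueError(f"outputs must share one shape, got {sorted(shapes)}")


def max_ensemble(outputs):
    """Element-wise maximum across models (analogous to stack + max)."""
    _check_same_shape(outputs)
    return [[max(vals) for vals in zip(*rows)] for rows in zip(*outputs)]


def mean_ensemble(outputs):
    """Element-wise mean across models (analogous to stack + mean)."""
    _check_same_shape(outputs)
    return [[sum(vals) / len(vals) for vals in zip(*rows)] for rows in zip(*outputs)]


def nms_ensemble(outputs):
    """Pool every candidate from every model; NMS is applied afterwards."""
    return [row for out in outputs for row in out]


pooled = nms_ensemble([model_a, model_b, model_c])  # works with unequal sizes
merged = max_ensemble([model_a, model_b])           # needs equal sizes
```

Concatenation simply hands NMS a longer candidate list, which is why it is the only option of the three that works when the models produce different numbers of candidates.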

Thanks for the reply!
Even though outputs from different models are concatenated before NMS, I still don't understand how the probability of a surviving bounding box can increase. In the NMS function, the surviving bounding box gets updated coordinates because they are averaged with those of the eliminated bounding boxes, but the probability is not updated!

It's interesting indeed. I can't state the cause with any certainty.

Is it possible to join two trained frozen models into a single model with different classes?

@brunopatricio2012 no, you can't ensemble two models trained on different datasets, but you can train one model on two different datasets. See https://community.ultralytics.com/t/how-to-combine-weights-to-detect-from-multiple-datasets

Hi,
I am working on object detection using YOLOv5 with a custom dataset I designed, and I have several problems. I wish I could contact someone and send them my object detection errors directly: it always gives the wrong object name or a low confidence, and sometimes it gives 2 bounding boxes for the same object. Please help.

@Zzh-tju ensembling runs multiple models, while TTA tests a single model with different augmentations. Typically I've seen the best results when merging output grids directly (i.e. ensembling YOLOv5l and YOLOv5x), rather than simply appending boxes from multiple models for NMS to sort out. This is not always possible, however. For example, when ensembling an EfficientDet model with YOLOv5x you cannot merge grids, so you must use NMS or WBF (or Merge NMS) to get a final result.
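As a rough illustration of the WBF option mentioned above, the sketch below shows only the fusion step for one group of boxes that are already known to overlap. It is a simplified assumption-laden example: the function name `fuse_cluster` is hypothetical, and the full Weighted Boxes Fusion method also groups boxes by IoU and rescales confidence by the number of contributing models, which is omitted here.

```python
def fuse_cluster(boxes):
    """Fuse overlapping boxes (x1, y1, x2, y2, conf) into a single box.

    Coordinates are averaged with each box's confidence as its weight;
    the fused confidence is the plain mean of the input confidences.
    """
    total_conf = sum(b[4] for b in boxes)
    coords = [sum(b[i] * b[4] for b in boxes) / total_conf for i in range(4)]
    return (*coords, total_conf / len(boxes))


# Two models report slightly different boxes for the same object (toy values).
fused = fuse_cluster([(0.0, 0.0, 10.0, 10.0, 0.5), (2.0, 2.0, 12.0, 12.0, 0.5)])
```

Unlike NMS, which discards all but one box per group, this keeps information from every contributing box in the final coordinates.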

I need to combine 3 types of YOLOv5 models in an ensemble so that aggregation voting gives the final decision. How can I do that?

@glenn-jocher Do you have any ensemble method for Faster R-CNN and YOLOv5? I trained Faster R-CNN with TensorFlow and got a .pb model file.
Should the Faster R-CNN model be converted from .pb to .pt, or something else?

@Thanapong-khajon Ensemble() only works with PyTorch models, but yes if you have a PyTorch Faster RCNN model it might work.

Do you have weights of Faster RCNN ?

I want to ensemble two yolov5x6 models trained on the same data with some variation
I saw the Ensemble() class mentioned in above issues but I was a bit confused on how to implement it in a python script

In other words, how do I exactly use that Ensemble in order to create an ensemble of my 2 models within a python script/function just by passing the models?

@pathikg YOLOv5 ensembling is automatically built into detect.py and val.py, so simply pass two weights:

python detect.py --weights yolov5s.pt yolov5m.pt

Thanks @glenn-jocher for quick reply but I want to do this in a python script
At present, I am loading model from torch hub with my custom weights and then doing the inference on respective images.
At present I've two such models, and I want to make ensemble of the same so is there any way I can do that as well?

I’d follow the code in detect.py and use the Ensemble() module from models/common.py

You mean from models/experimental.py?
cuz there's no Ensemble() module present in common.py :/

yes sorry in experimental.py
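A minimal sketch of what that could look like in a script follows. It assumes the code runs from the root of a cloned yolov5 repository with PyTorch installed, that `attempt_load` in models/experimental.py accepts a list of weight paths and returns an `Ensemble` when given more than one, and that `non_max_suppression` is available in utils/general.py, as in detect.py at the time of this thread. The function name `run_ensemble` and its parameters are hypothetical, and signatures may differ between repository versions.

```python
def run_ensemble(weights, im, conf_thres=0.25, iou_thres=0.45):
    """Run several YOLOv5 checkpoints as one ensemble on a preprocessed batch.

    weights: list of checkpoint paths, e.g. two custom best.pt files.
    im: float tensor of shape (batch, 3, H, W) scaled to 0-1, with H and W
        assumed to be multiples of the largest model stride.
    """
    # Imports are local so this sketch can be defined outside the repo;
    # calling it requires torch and the yolov5 repo on the import path.
    import torch
    from models.experimental import attempt_load
    from utils.general import non_max_suppression

    device = torch.device("cpu")  # assumption: switch to CUDA if available
    model = attempt_load(weights, device)  # Ensemble when several weights are given
    model.eval()
    with torch.no_grad():
        pred = model(im.to(device))[0]  # concatenated candidates from all members
    # One detections tensor per image in the batch
    return non_max_suppression(pred, conf_thres, iou_thres)
```

Image loading, letterboxing and rescaling boxes back to the original image are left out; detect.py shows how the repository handles those steps.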

@glenn-jocher I really need your help for one of my problems.
I have a dataset trained on cat, dog and pen, and after training, I have got dataset1.pt best file
Now, I trained another model with new data, for example I have taken just images of cow, and got dataset2.pt best file.
Both of the trained model can detect the images separately. But I want to make them one model, so that it will detect all the images (cat, dog, pen and cow) using a single weight file.
Can I do it using ensemble techniques?
will this work?
python detect.py --weights dataset1.pt dataset2.pt --img 640 --source data/images/cat
Or how else can I do this? Is there any way other than creating a new dataset with all the images and training again? Please reply, thanks.

@Zzh-tju ensembling and TTA are not mutually exclusive. You can TTA a single model, and you can ensemble a group of models with or without TTA:

python detect.py --weights model1.pt model2.pt --augment

Hi @glenn-jocher , what would be the best practice for deploying an ensemble model like this? I know we can export the individual models for different deployment frameworks, but how would I export an ensemble?

Hello, I read a previous discussion about ensembling multiple trained networks by simply passing more than 1 weight file during inference.

My question is what technique is being used for fusing the predictions? Is it majority voting on the bounding boxes? Some weighted averaging?

I ended up predicting each image with all of the ensemble members separately, and then combining the bounding box predictions together and doing a second stage NMS to generate a final combined prediction. Seems to work OK. Another possibility is to average the weights of the ensemble members into an averaged model like they do in federated learning, but I have not properly evaluated that method yet.
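The two-stage approach described above can be sketched as follows. This is a plain greedy, class-aware NMS written for illustration, not the YOLOv5 implementation; the function names, the `(x1, y1, x2, y2, conf, cls)` tuple layout and the sample detections are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thres=0.45):
    """Greedy NMS over (x1, y1, x2, y2, conf, cls) tuples, applied per class."""
    keep = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        # Keep a box unless a higher-scoring kept box of the same class overlaps it
        if all(det[5] != k[5] or iou(det[:4], k[:4]) <= iou_thres for k in keep):
            keep.append(det)
    return keep


def combine_members(per_model_dets, iou_thres=0.45):
    """Pool the final detections of every ensemble member, then run NMS again."""
    pooled = [d for dets in per_model_dets for d in dets]
    return nms(pooled, iou_thres)


# Toy detections from two ensemble members for the same image.
member_1 = [(100.0, 100.0, 200.0, 200.0, 0.58, 0)]
member_2 = [(105.0, 100.0, 200.0, 205.0, 0.71, 0), (300.0, 50.0, 500.0, 250.0, 0.93, 5)]
final = combine_members([member_1, member_2])
```

In this plain greedy form each surviving box keeps the score and coordinates its own model assigned to it; the second stage only decides which boxes are dropped.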

@michael-mayo that's a good approach! Typically, the ensembling technique involves averaging the model weights and biases instead of the predictions. Here, you are combining the predictions which can be achieved using the NMS algorithm. Keep in mind that it's important to experiment and choose the best method based on the specific problem and the performance of each approach.

I should also add that for each ensemble member I trained using a different global random seed, and a different (5/6) subset of the training data, to improve ensemble diversity.
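One way to generate such splits is sketched below. The comment does not say how the 5/6 subsets were chosen, so the leave-one-fold-out scheme, the function name `make_member_splits` and its parameters are assumptions made for illustration.

```python
import random


def make_member_splits(items, n_members=6, base_seed=0):
    """Return one (seed, training_subset) pair per ensemble member.

    The items are shuffled once and cut into n_members folds; member m trains
    on every fold except fold m, i.e. on (n_members - 1) / n_members of the data.
    """
    order = list(items)
    random.Random(base_seed).shuffle(order)  # reproducible shuffle
    folds = [order[i::n_members] for i in range(n_members)]
    splits = []
    for m in range(n_members):
        subset = [x for i, fold in enumerate(folds) if i != m for x in fold]
        splits.append((base_seed + m, subset))  # distinct seed for each member
    return splits


# Toy example: 12 image identifiers, 6 members, each seeing 10 of them.
splits = make_member_splits([f"img_{i:02d}.jpg" for i in range(12)])
```

Each subset could then be written to its own image list file and used with the matching seed for one training run.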

@michael-mayo That's a great technique to improve ensemble diversity. It can help reduce the chances of overfitting (which can happen if all ensemble members are trained on exactly the same data) and increase the robustness of the final predictions.

	class Ensemble(nn.ModuleList):
	    # Ensemble of models
	    def __init__(self):
	        super(Ensemble, self).__init__()

	    def forward(self, x, augment=False):
	        y = []
	        for module in self:
	            y.append(module(x, augment)[0])
	        # y = torch.stack(y).max(0)[0]  # max ensemble
	        # y = torch.cat(y, 1)  # nms ensemble
	        y = torch.stack(y).mean(0)  # mean ensemble
	        return y, None  # inference, train output

	# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
	train: ../coco/train2017.txt # 118287 images