Dets should have the same type as scores
Alan01252 opened this issue · comments
Hi,
Firstly thanks for the excellent work on this!
I've been trying the examples this morning ( and I realise this repository is all very new ) but am unable to get the class's/scores out of the image.
Is this expected with the current state of the project? My assumption is not because otherwise you'd also be seeing this error.
If I "fix" the error
valid_box = valid_box.float()
valid_cls = valid_cls.float()
Then I end up with an image like the below ( which I assume isn't what's expected):
python examples/lazy.py device=cpu task=inference task.source=./demo/images/inference/image.png
I am wondering if it's something to do with device=cpu? As I think it's directly from the output of the model.
I am new'ish to machine learning computer vision and have tried to dig into what's going wrong, but am lacking skills, so any point in the right direction as how I could diagnose further/help fix would be greatly appreciated too.
Hi Alan01252,
Thank you for pointing out the issues!
The Error
The exception is indeed caused by the CPU. I suspect it's related to the Torch JIT or AMP. When running on a GPU, Torch might automatically ensure datatype consistency, but this doesn't seem to be the case in CPU mode. To address this, I will investigate further and include a fix in the upcoming push. In the meantime, you can try modifying bounding_box_utils.py
from:
https://github.com/WongKinYiu/yolov9mit/blob/7dd7b628a945e37ec494b3278f432c194da1aa96/yolo/utils/bounding_box_utils.py#L300
to:
valid_cls = cls_idx[valid_mask].float()
as you mentioned. This should resolve the issue directly.
The Prediction Result
Regarding the prediction results, I believe they are correct after your code modification. The numerous strange bounding boxes on the image are likely due to the NMS confidence, which is the prediction confidence rate. To improve this, you can try adding task.nms.min_confidence=0.5
to your command line. This adjustment should make the predictions more robust.
$python yolo/lazy.py device=cpu task=inference task.nms.min_confidence=0.75
Please let me know if you encounter any further issues or have additional questions.
Best regards,
Hao-Tang Tsui
Thanks @henrytsui000,
The boundary boxes actually look good, it's the numbers that are confusing me, they are all 0, and I suspect that's why the .float is required?
Hi,
In the MS COCO dataset, the class index of people is 0, so we follow their settings here. I understand this may cause misunderstanding. So if I have time, I will change the annotation to the object name and confidence appearance.
Hao-Tang, Tsui
Oh that makes perfect sense! Thank you for clarifying, I feel stupid :)
Thank you for sharing your code! I have briefly reviewed it, and I must say that most of the code is clean and well-structured. Great job!
I have a suggestion for a minor revision: specifically, it would be beneficial to include the class names in coco.yaml
and modify inference.yaml
to load COCO's classes by default.
# yolo/config/task/dataset/coco.yaml
path: ...
class_name: [the class list]
autodownload: ...
During the inference stage, we could check if the inference.yaml
configuration has a class list and draw the image with the class name accordingly. If the class list is not provided, we can default to showing the class index.
Would you be able to open a pull request for this change? Alternatively, would it be alright if I borrow some of your code to make these adjustments?
Looking forward to your response!
I am happy for you to use the code however you want.
I did attempt to make the changes suggested but got confused by the requirements sorry.
Moving the class names into # yolo/config/task/dataset/coco.yaml like this
path: data/coco
train: train2017
class_names:
person
bicycle
car
makes sense to me, but I got confused by the requirement to:
modify inference.yaml to load COCO's classes by default.
I'm not sure what structure inference.yaml would take
Hi,
Sorry for the late reply; I've been quite busy these days. I wanted to let you know that I have implemented your suggestion in commit a0976c9. You can try visualizing it if you'd like!
I deeply appreciate your suggestion and had to refer to your code to make this happen. Thank you for your valuable input!
I also apologize for the misleading information in my last reply. After some trials, I believe that placing the class names in general.yaml may be the best approach.
Best regards,
Hao-Tang, Tsui
Thank you for the kind words! I have checked out what you've done and it all looks great .
Also thanks for all your hardwork on this, I am learning a TON following the progress of this repository!