Explanation of the YoloX Body-Head-Hand output dimension?

Question

kho-bluefrogrobotics opened this issue 6 months ago

kho-bluefrogrobotics commented 6 months ago

Issue Type

Documentation Feature Request

OS

Other

OS architecture

armv7

Programming Language

Other

Framework

TensorFlowLite

Model name and Weights/Checkpoints URL

YoloX Body-Head-Hand
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/426_YOLOX-Body-Head-Hand

Description

First of all, thanks and congratulations on your excellent work on YOLO.

I have a question regarding the output size, though.
@PINTO0309 You say you limit the output to 20 boxes. How come we obtain a float32[60,7] output then?
Shouldn't we get something like 20 x [class, score, x1, y1, x2, y2] = 20 x 6 floats?

Relevant Log Output

No response

URL or source code for simple inference testing code

No response

Katsuya Hyodo · Answer 1 · Tue Dec 12 2023 07:19:26 GMT+0800 (China Standard Time)

As detailed in the README.

max output boxes per class

20 (Body) + 20 (Head) + 20 (Hand) = 60 boxes

[batch_num, classid, score, x1, y1, x2, y2] * 60 = 7 x 60 floats
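To make the arithmetic above concrete, here is a minimal sketch of how a [60, 7] output could be split back into per-class detections. It is not taken from the repository: the function name `split_detections`, the class-id mapping (0 = body, 1 = head, 2 = hand, following the order in the answer), the 0.5 score threshold, and the idea that unused rows can be dropped by score are all assumptions for illustration. The dummy data stands in for a real model output.

```python
# Assumed class-id order, following the Body/Head/Hand order in the answer above.
CLASS_NAMES = {0: "body", 1: "head", 2: "hand"}
MAX_BOXES_PER_CLASS = 20
FIELDS_PER_BOX = 7  # batch_num, classid, score, x1, y1, x2, y2


def split_detections(rows, score_threshold=0.5):
    """Group rows of [batch_num, classid, score, x1, y1, x2, y2] by class name.

    `rows` can be any iterable of 7-element rows, for example a list of lists
    or a 2-D array with shape [60, 7].
    """
    grouped = {name: [] for name in CLASS_NAMES.values()}
    for batch_num, class_id, score, x1, y1, x2, y2 in rows:
        name = CLASS_NAMES.get(int(class_id))
        # Skip unknown class ids and rows below the (assumed) score threshold.
        if name is None or score < score_threshold:
            continue
        box = (float(x1), float(y1), float(x2), float(y2))
        grouped[name].append((float(score), box))
    return grouped


# 20 boxes per class x 3 classes = 60 rows, each holding 7 floats.
total_rows = MAX_BOXES_PER_CLASS * len(CLASS_NAMES)

# Dummy stand-in for the model output: two filled rows, the rest zeros.
dummy = [[0.0] * FIELDS_PER_BOX for _ in range(total_rows)]
dummy[0] = [0, 0, 0.9, 10, 20, 110, 220]  # one body box
dummy[1] = [0, 2, 0.8, 30, 40, 60, 80]    # one hand box

result = split_detections(dummy)
for name, detections in result.items():
    print(name, detections)
```

With a real interpreter output, the same function should work as long as the tensor is passed in as 60 rows of 7 values; the threshold would need tuning for the actual model.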

kho-bluefrogrobotics · Answer 2 · Tue Dec 12 2023 16:46:03 GMT+0800 (China Standard Time)

OK Thanks!