YueLiao / CDN

Thank you for the nice work on HOI. I'm currently following your work.

I have a question about args.use_matching. It seems that you did not use the matching_embed when training the model.
Is it true that all the reported models were trained with args.use_matching=False?

I also have another question about the matching_embed, or the interactive_score.
In HICO-DET, it seems that "interactiveness" is not compatible with the verb classifier when the classifier is 117-d, because "no_interaction" is one of the 117 verb classes.

I wonder whether anyone has tried a 116-d verb classifier supplemented by an "interactiveness" classifier (i.e., a 2-d classifier).
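To make the idea concrete, here is a minimal sketch of how a multi-hot verb label could be split into a verb vector without "no_interaction" plus a binary interactiveness label. The function name `split_verb_label` and the parameter `no_interaction_idx` are hypothetical and do not come from the CDN code; the actual index of "no_interaction" should be looked up in the dataset's verb list. The toy example uses 5 classes instead of 117.

```python
from typing import List, Tuple


def split_verb_label(verb_multi_hot: List[int],
                     no_interaction_idx: int) -> Tuple[List[int], int]:
    """Split a multi-hot verb label into (verbs without no_interaction, interactiveness).

    Hypothetical helper for illustration only; not part of the CDN repository.
    """
    if not 0 <= no_interaction_idx < len(verb_multi_hot):
        raise ValueError("no_interaction_idx is out of range")
    # Drop the no_interaction column, leaving one class fewer (117 -> 116 on HICO-DET).
    verbs = verb_multi_hot[:no_interaction_idx] + verb_multi_hot[no_interaction_idx + 1:]
    # Assumption: a pair counts as interactive if at least one remaining verb is active.
    interactive = 1 if any(verbs) else 0
    return verbs, interactive


# Toy example: 5 verb classes, index 2 plays the role of "no_interaction".
verbs, interactive = split_verb_label([0, 0, 1, 0, 0], no_interaction_idx=2)
```

Under this scheme a pair labeled only with "no_interaction" would end up with an all-zero verb vector and an interactiveness label of 0.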

In addition, compared to HOI, the similar task of scene graph generation (SGG) has no relation class named "no_interaction" (e.g., in the VisualGenome dataset), and therefore SGG does not use mAP as a metric.

I wonder whether it makes sense to treat "no_interaction" as one of the relation classes and compute mAP over it. After all, enumerating and annotating all possible relations between subject-object pairs is usually impossible, and annotating every no_interaction subject-object pair is also impossible.

In addition, according to the dataloader, here:

CDN/datasets/hico.py, line 120 in 71c5702:

target['matching_labels'] = torch.ones_like(target['obj_labels'])

The matching_labels are all ones, regardless of whether the verb label is "no_interaction" or not.

Yes, all models were trained with args.use_matching=False

HICO-DET annotates only a few non-interactive human-object pairs per image with a 'no-interaction' label. To be precise, we treat the 'no-interaction' label as an ordinary 'interactive' HOI label and do not treat it as a negative tag. Here, we define the human-object pairs that carry no annotation at all (not even 'no-interaction') as the non-interactive pairs.
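One way to read this definition is sketched below, assuming each annotated pair has already been assigned to one prediction query (for example by bipartite matching). Every query assigned to an annotated pair, including pairs labeled 'no-interaction', gets target 1, and every remaining query gets target 0. The function name `build_matching_targets` and its arguments are hypothetical; this is an illustration, not the repository's code.

```python
from typing import Iterable, List


def build_matching_targets(num_queries: int,
                           matched_queries: Iterable[int]) -> List[int]:
    """Return per-query binary targets: 1 for queries assigned to an annotated pair.

    Hypothetical helper for illustration only; not part of the CDN repository.
    """
    # Every query starts as non-interactive (no annotation assigned).
    targets = [0] * num_queries
    for q in matched_queries:
        if not 0 <= q < num_queries:
            raise IndexError(f"query index {q} is out of range")
        # Any annotated pair counts as positive, even if its verb is 'no-interaction'.
        targets[q] = 1
    return targets


# Toy example: 6 queries, annotations assigned to queries 1 and 4.
targets = build_matching_targets(6, [1, 4])
```

This is consistent with the dataloader setting matching_labels to all ones: the negatives come from queries left without an annotation, not from the verb label.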

Okay, I see. Thank you for your answer!

questions about `args.use_matching`