impiga / Plain-DETR

[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design

Questions about content-related query and hybrid matching.

Chuck-Yu opened this issue

Thanks for your great work. After reading your paper and code, I have some questions about Content-related Query and Hybrid Matching.

  1. When two_stage is enabled, the number of queries equals two_stage_num_proposals, which is not equal to one2one + one2many. It seems that hybrid matching is disabled by two_stage.
  2. If one2one + one2many is passed as two_stage_num_proposals, should the one2many queries also be initialized from the encoder proposals, and why?

if self.two_stage:
    (reference_points, max_shape, enc_outputs_class,
     enc_outputs_coord_unact, enc_outputs_delta, output_proposals) \
        = self.get_reference_points(memory, mask_flatten, spatial_shapes)
    init_reference_out = reference_points
    pos_trans_out = torch.zeros((bs, self.two_stage_num_proposals, 2*c), device=init_reference_out.device)
    pos_trans_out = self.pos_trans_norm(self.pos_trans(self.get_proposal_pos_embed(reference_points)))
    if not self.mixed_selection:
        query_embed, tgt = torch.split(pos_trans_out, c, dim=2)
    else:
        # query_embed here is the content embed for deformable DETR
        tgt = query_embed.unsqueeze(0).expand(bs, -1, -1)
        query_embed, _ = torch.split(pos_trans_out, c, dim=2)
else:
    query_embed, tgt = torch.split(query_embed, c, dim=1)
    query_embed = query_embed.unsqueeze(0).expand(bs, -1, -1)
    tgt = tgt.unsqueeze(0).expand(bs, -1, -1)
    reference_points = self.reference_points(query_embed).sigmoid()
    init_reference_out = reference_points

Thanks for your interest!

  1. As shown below, two_stage_num_proposals is set to one2one + one2many

    two_stage_num_proposals=args.num_queries_one2one + args.num_queries_one2many,

  2. We initialize the one2many queries by proposals, following the practice of the original hybrid matching code. In my view, since the one2many branch has a decoding procedure similar to the one2one branch, the two-stage trick should yield similar effects for both branches.
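
To make the query bookkeeping in the two answers concrete, here is a minimal, framework-free sketch. It is not taken from the repository: the helper names `split_hybrid_queries` and `build_group_attn_mask` are hypothetical, the sizes are made up, and it assumes (as in the original hybrid matching design, to my understanding) that the one2one queries occupy the first indices of the selected proposals and that self-attention between the two groups is blocked.

```python
from typing import List, Tuple


def split_hybrid_queries(
    num_queries_one2one: int, num_queries_one2many: int
) -> Tuple[int, slice, slice]:
    """Return the total proposal count and the index range of each branch."""
    # This sum is the value passed as two_stage_num_proposals.
    total = num_queries_one2one + num_queries_one2many
    one2one = slice(0, num_queries_one2one)        # assumed: first block
    one2many = slice(num_queries_one2one, total)   # assumed: remaining block
    return total, one2one, one2many


def build_group_attn_mask(
    num_queries_one2one: int, num_queries_one2many: int
) -> List[List[bool]]:
    """Build a mask where True marks a blocked (query, key) pair.

    A pair is blocked when the two queries belong to different branches,
    so the one2one and one2many groups cannot attend to each other.
    """
    total = num_queries_one2one + num_queries_one2many
    return [
        [(i < num_queries_one2one) != (j < num_queries_one2one) for j in range(total)]
        for i in range(total)
    ]


# Hypothetical sizes, chosen only for illustration.
total, one2one, one2many = split_hybrid_queries(3, 5)
ranked_proposals = list(range(total))  # stand-in for the selected encoder proposals
mask = build_group_attn_mask(3, 5)
print(total, ranked_proposals[one2one], ranked_proposals[one2many])
```

Under these assumptions both groups draw from the same pool of encoder proposals, which matches the reply that the one2many queries are initialized by proposals just like the one2one queries; only the matching applied to each slice differs.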