chenhaoxing / DiffusionInst

This repo contains the code for the paper "DiffusionInst: Diffusion Model for Instance Segmentation" (ICASSP'24).

Typo: dirname `utli` -> `util`

vadimkantorov opened this issue

Also, it would be nice to add pointers to the model component sources (especially the decoder) to the README, since they are not discussed much in the paper.

E.g. could you please comment on the inference path `ddim_sample` and the `preds, outputs_class, outputs_coord, outputs_kernel, mask_feat = self.model_predictions(backbone_feats, images_whwh, img, time_cond, self_cond, clip_x_start=clip_denoised)` call, which seems to be the decoder call? Counterintuitively, it seems that the noisy boxes are stored in the `img` variable, right? And the dynamic mask kernels are produced in a deterministic way, right?
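For concreteness, here is how I currently read that inference path, written as a rough sketch. Only the `model_predictions` call signature is taken from the code above; the `pred_x_start`, `alphas_cumprod`, and `num_proposals` names below are my guesses, not necessarily the repo's actual attributes:

```python
import torch

@torch.no_grad()
def ddim_sample_sketch(model, backbone_feats, images_whwh, num_proposals=500, steps=4):
    """Rough reading of the DiffusionDet-style DDIM loop; not the repo's exact code."""
    device = images_whwh.device
    batch = images_whwh.shape[0]
    # Counterintuitively, `img` holds noisy *boxes* (cx, cy, w, h), not an image.
    img = torch.randn(batch, num_proposals, 4, device=device)

    times = torch.linspace(-1, 999, steps + 1).long().flip(0)   # 999 -> ... -> -1
    time_pairs = list(zip(times[:-1].tolist(), times[1:].tolist()))

    x_start = None
    for time, time_next in time_pairs:
        time_cond = torch.full((batch,), time, device=device, dtype=torch.long)
        # The decoder call: denoises the boxes and, in the same pass, predicts
        # classes and the dynamic mask kernels from the box-pooled features.
        preds, outputs_class, outputs_coord, outputs_kernel, mask_feat = \
            model.model_predictions(backbone_feats, images_whwh, img,
                                    time_cond, x_start, clip_x_start=True)
        x_start = preds.pred_x_start            # denoised boxes at this step

        if time_next < 0:
            img = x_start
            continue
        # DDIM update: re-noise the denoised boxes for the next (smaller) timestep.
        alpha = model.alphas_cumprod[time]
        alpha_next = model.alphas_cumprod[time_next]
        eps = (img - alpha.sqrt() * x_start) / (1. - alpha).sqrt()
        img = alpha_next.sqrt() * x_start + (1. - alpha_next).sqrt() * eps

    return outputs_class[-1], outputs_coord[-1], outputs_kernel[-1], mask_feat
```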

Thanks!

@vadimkantorov
Thanks for your advice!
The dirname has been changed.
The decoder follows the code of DiffusionDet, including its variable and function names.
Finally, the mask kernel filters are currently generated from the bounding boxes. We have fixed the training and inference equations in the latest arXiv version. Also see #1.
We are still working on directly denoising the filters and will clean up and revise the code in the future.
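For readers unfamiliar with dynamic mask heads, here is a hypothetical sketch of what "kernels generated from boxes" can look like (CondInst-style dynamic convolution; the layer and variable names below are illustrative, not the actual DiffusionInst code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicMaskHeadSketch(nn.Module):
    """Illustrative sketch: per-box features -> per-instance 1x1 conv kernels
    applied to a shared mask feature map. Not the repo's actual head."""
    def __init__(self, feat_dim=256, mask_dim=8):
        super().__init__()
        self.mask_dim = mask_dim
        # Predict a small conv kernel (weights + bias) per box, deterministically.
        self.kernel_fc = nn.Linear(feat_dim, mask_dim + 1)

    def forward(self, box_feats, mask_feat):
        # box_feats: (N, feat_dim) features pooled inside each predicted box
        # mask_feat: (mask_dim, H, W) shared mask-branch output for one image
        params = self.kernel_fc(box_feats)                # (N, mask_dim + 1)
        weights = params[:, :self.mask_dim]               # (N, mask_dim)
        bias = params[:, self.mask_dim]                    # (N,)
        # 1x1 dynamic convolution: one kernel per instance.
        logits = F.conv2d(mask_feat.unsqueeze(0),
                          weights.view(-1, self.mask_dim, 1, 1),
                          bias=bias)
        return logits.squeeze(0).sigmoid()                 # (N, H, W) instance masks
```

In such a setup, the kernels are a deterministic function of the box features; only the boxes themselves go through the diffusion/denoising process.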

It's also a bit disappointing that, according to your results in the README, mask AP barely improves when going from 1 step to 4 steps :(

Yes, it is. That is why we are trying different denoising strategies and mask representations in further research.