Question about TCBlock
HYUNJS opened this issue
HYUN Jeongseok commented
- As I understand it, TCBlock just returns the clustered data info from the output of CTM. Could you share the context/background for using it in your implementation?
  https://github.com/PKU-YuanGroup/Chat-UniVi/blob/main/ChatUniVi/model/cluster.py#L259-L287
- Also, in a recent commit you moved the mm_projector into a separate builder. Does this indicate that you are running ablation experiments on its design (e.g., linear, MLP, residual)?
Peng Jin commented
- In the original version, cross-attention was performed within the TCBlock. However, our experiments showed that this operation significantly compromised the stability of model training, so we removed it.
- We did not run a full ablation of the projector design, but LLaVA reports that an MLP projector works better than a linear one.
HYUN Jeongseok commented
I see. Thank you for your reply!