0wj0's starred repositories
Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
Book4_Power-of-Matrix
Book 4, "Power of Matrix" | Iris Book series: From Basic Arithmetic to Machine Learning; now available!
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
HITSZ-OpenCS
Guidance for courses in the Department of Computer Science, Harbin Institute of Technology (Shenzhen)
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Pytorch-Memory-Utils
PyTorch memory tracking code
pytorch_graph-rel
A PyTorch implementation of GraphRel
LM4VisualEncoding
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
language_modeling_via_stochastic_processes
Language modeling via stochastic processes. Oral @ ICLR 2022.
Emotion-Recognition-in-Conversations
User Emotion Recognition and Response Generation in Dialogue Text
MultiEMO-ACL2023
MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations (ACL 2023)
Color4Dial
Code and data for "Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue" (ACL Findings 2023).
masters-of-our-EMNLP2023-papers
PyTorch code for the EMNLP 2023 main-conference paper "How to Enhance Causal Discrimination of Utterances: A Case on Affective Reasoning" and the TKDE paper "Learning a Structural Causal Model for Intuition Reasoning in Conversation"
MEMEX_Meme_Evidence
Official repo for the ACL'23 (main) paper - MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization