0wj0's starred repositories

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Book4_Power-of-Matrix

Book 4, "Power of Matrix" | Iris Book series: from basic arithmetic to machine learning; now on sale!

nebuly

The user analytics platform for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 8365 | Issues: 93 | Issues: 202

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2729 | Issues: 32 | Issues: 156

HITSZ-OpenCS

Guidance for courses in Department of Computer Science, Harbin Institute of Technology (Shenzhen)

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 1169 | Issues: 14 | Issues: 119

Pytorch-Memory-Utils

PyTorch memory tracking code

MovieChat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

Language: Python | License: BSD-3-Clause | Stargazers: 496 | Issues: 10 | Issues: 75

pytorch_graph-rel

A PyTorch implementation of GraphRel

Language: Python | License: MIT | Stargazers: 268 | Issues: 6 | Issues: 31

LM4VisualEncoding

[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"

Language: Python | License: MIT | Stargazers: 218 | Issues: 4 | Issues: 9

MISA

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis

Language: Python | License: MIT | Stargazers: 192 | Issues: 5 | Issues: 0

language_modeling_via_stochastic_processes

Language modeling via stochastic processes. Oral @ ICLR 2022.

AdaShare

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

ACM-GNN

NeurIPS 2022, Revisiting Heterophily For Graph Neural Networks, official PyTorch implementation for Adaptive Channel Mixing (ACM) GNN framework

Language: Python | License: MIT | Stargazers: 68 | Issues: 7 | Issues: 0

c-sts

[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity

GCNet

GCNet, official pytorch implementation of our paper "GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation"

MEmoR

Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.

Emotion-Recognition-in-Conversations

User Emotion Recognition and Response Generation in Dialogue Text

tdlm

Implements several positional encoding schemes used in Transformers

Language: Python | Stargazers: 36 | Issues: 2 | Issues: 0

MECPE

[TAFFC 2022] Multimodal Emotion-Cause Pair Extraction in Conversations

MultiEMO-ACL2023

MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations (ACL 2023)

UniS-MMC

Code for UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning (ACL 2023)

Language: Python | License: MIT | Stargazers: 29 | Issues: 3 | Issues: 7

Color4Dial

Code and data for "Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue" (ACL Findings 2023).

Language: Python | License: MIT | Stargazers: 21 | Issues: 2 | Issues: 4

MMSD2.0

[ACL2023] Code and dataset for paper "MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System"

masters-of-our-EMNLP2023-papers

Pytorch code for EMNLP 2023 accepted-main paper "How to Enhance Causal Discrimination of Utterances: A Case on Affective Reasoning" and paper "Learning a Structural Causal Model for Intuition Reasoning in Conversation" (TKDE)

Language: Python | License: Apache-2.0 | Stargazers: 14 | Issues: 0 | Issues: 0

VSTAR

[ACL2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information

Language: Python | Stargazers: 12 | Issues: 1 | Issues: 0
Language: Python | Stargazers: 7 | Issues: 1 | Issues: 0

MEMEX_Meme_Evidence

Official repo for ACL'23 (main) paper - MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization

ECPEC

Code for JointEC model

Language: Python | Stargazers: 3 | Issues: 0 | Issues: 0