Aaron Han's starred repositories

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Language: Python, License: Apache-2.0, Stargazers: 40316, Issues: 393, Issues: 1292

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

Language: HTML, License: MIT, Stargazers: 10679, Issues: 266, Issues: 46

annotated-transformer

An annotated implementation of the Transformer paper.

Language: Jupyter Notebook, License: MIT, Stargazers: 5520, Issues: 64, Issues: 85

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python, License: MIT, Stargazers: 1708, Issues: 17, Issues: 79

awesome-stable-diffusion

Curated list of awesome resources for the Stable Diffusion AI Model.

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

atlas

Code repository supporting the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models" (https://arxiv.org/abs/2208.03299)

Language: Python, License: NOASSERTION, Stargazers: 506, Issues: 13, Issues: 18

ChineseNMT

ChineseNMT: Translate English to Chinese with PyTorch Implementation of Transformer

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

awesome-text-to-image-studies

A collection of awesome text-to-image generation studies.

Language: TeX, License: MIT, Stargazers: 309, Issues: 12, Issues: 0

LongVA

Long Context Transfer from Language to Vision

Language: Python, License: Apache-2.0, Stargazers: 283, Issues: 8, Issues: 12

MA-LMM

(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Language: Python, License: MIT, Stargazers: 208, Issues: 4, Issues: 32

hands-on-research-tutorial

"Hands-On Research": a step-by-step guide that shows research beginners how to get started in artificial intelligence research

Language: Jupyter Notebook, Stargazers: 157, Issues: 0, Issues: 0

diffusion-models-class-CN

Materials for the Hugging Face Diffusion Models Course

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 153, Issues: 0, Issues: 0

Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

Language: Python, License: BSD-3-Clause, Stargazers: 99, Issues: 4, Issues: 7

VideoAgent

This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

Language: Python, License: Apache-2.0, Stargazers: 74, Issues: 3, Issues: 5

VideoTree

Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"

Language: Python, License: MIT, Stargazers: 60, Issues: 2, Issues: 4

Flash-VStream

Please refer to our official repo at https://github.com/IVGSZ/Flash-VStream.

Language: Python, License: Apache-2.0, Stargazers: 43, Issues: 2, Issues: 4

Soda

Search, organize, discover anything!

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 43, Issues: 0, Issues: 0

LangRepo

Language Repository for Long Video Understanding

Language: Python, License: MIT, Stargazers: 27, Issues: 2, Issues: 1
Language: Python, License: BSD-3-Clause, Stargazers: 27, Issues: 1, Issues: 5

mvu

Multimodal Video Understanding Framework (MVU)

Language: Python, License: MIT, Stargazers: 22, Issues: 0, Issues: 0

explore-eqa

Public release for "Explore until Confident: Efficient Exploration for Embodied Question Answering"

Sealing

[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"

Language: Python, License: MIT, Stargazers: 9, Issues: 4, Issues: 0

Paper-Writing-Tips

Paper Writing Tips: a repository curated by the MLNLP community to help authors avoid small mistakes in paper submissions.

Stargazers: 3, Issues: 0, Issues: 0
Language: Python, License: BSD-3-Clause, Stargazers: 1, Issues: 0, Issues: 0