Yuanhan Zhang (ZhangYuanhan-AI)

Company: Nanyang Technological University

Location: Singapore

Home Page: https://zhangyuanhan-ai.github.io/

Twitter: @zhang_yuanhan

Yuanhan Zhang's starred repositories

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49427 | Issues: 562 | Issues: 209

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26212 | Issues: 217 | Issues: 238

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 9327 | Issues: 73 | Issues: 1108
Language: Python | License: Apache-2.0 | Stargazers: 2449 | Issues: 32 | Issues: 225

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 2030 | Issues: 31 | Issues: 84

LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Language: Python | License: NOASSERTION | Stargazers: 1458 | Issues: 27 | Issues: 84

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python | License: NOASSERTION | Stargazers: 1359 | Issues: 4 | Issues: 145

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Language: Python | License: NOASSERTION | Stargazers: 1172 | Issues: 12 | Issues: 27

ToMe

A method to increase the speed and lower the memory footprint of existing vision transformers.

Language: Python | License: NOASSERTION | Stargazers: 936 | Issues: 111 | Issues: 38

VideoMamba

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Language: Python | License: Apache-2.0 | Stargazers: 786 | Issues: 12 | Issues: 87

ring-flash-attention

Ring attention implementation with flash attention

Language: Python | License: MIT | Stargazers: 536 | Issues: 10 | Issues: 32

ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

Language: Python | License: MIT | Stargazers: 453 | Issues: 11 | Issues: 14

prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Language: Python | License: MIT | Stargazers: 415 | Issues: 12 | Issues: 38

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

ttt-lm-jax

Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

scaling_on_scales

When do we not need larger vision models?

Language: Python | License: MIT | Stargazers: 316 | Issues: 7 | Issues: 14

LongVA

Long Context Transfer from Language to Vision

Language: Python | License: Apache-2.0 | Stargazers: 295 | Issues: 8 | Issues: 21

Dataset

News: the 10k dataset is ready for download.

Language: HTML | License: NOASSERTION | Stargazers: 274 | Issues: 13 | Issues: 31

TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 267 | Issues: 5 | Issues: 45
Language: Python | License: NOASSERTION | Stargazers: 113 | Issues: 1 | Issues: 8

video_captioning_datasets

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

Language: Jupyter Notebook | Stargazers: 109 | Issues: 3 | Issues: 1

PSG4D

4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight)

Genixer

(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator

Language: Python | Stargazers: 77 | Issues: 3 | Issues: 0

MATH-V

MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.

Language: Python | License: MIT | Stargazers: 54 | Issues: 1 | Issues: 2

LongVideoBench

Official Dataloader and Evaluation Scripts for LongVideoBench.

Language: Python | Stargazers: 51 | Issues: 0 | Issues: 0

MMLongBench-Doc

Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations

Language: Python | License: Apache-2.0 | Stargazers: 48 | Issues: 0 | Issues: 0

CVRR-Evaluation-Suite

Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".

Language: Python | License: CC-BY-4.0 | Stargazers: 39 | Issues: 0 | Issues: 0