Zeyuan Chen's starred repositories
Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
HMT-pytorch
Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"
EasySpider
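A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.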
ECCV2022-RIFE
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
VQA-With-Multimodal-Transformers
Exploring multimodal fusion-type transformer models for visual question answering (on DAQUAR dataset)
open_flamingo
An open-source framework for training large multimodal models.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
NeuScraper
[ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".
GaussianObject
Code for "GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting"
magvit2-pytorch
Implementation of MagViT2 Tokenizer in Pytorch
single-video-curation-svd
Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper.
MagicDance
[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion