Zeyuan Chen (zeyuanchen23)

Company: Salesforce Research

Zeyuan Chen's starred repositories

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 846 | Issues: 0

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | Stargazers: 562 | Issues: 0

HMT-pytorch

Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"

Language: Python | License: Apache-2.0 | Stargazers: 47 | Issues: 0

EasySpider

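A visual no-code web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawling tasks graphically, without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.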

Language: JavaScript | License: NOASSERTION | Stargazers: 28059 | Issues: 0

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 285 | Issues: 0

ScoreHMR

ScoreHMR: Score-Guided Diffusion for 3D Human Recovery (CVPR 2024)

Language: Python | License: MIT | Stargazers: 360 | Issues: 0

ECCV2022-RIFE

ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Language: Python | License: MIT | Stargazers: 4158 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 22355 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: Apache-2.0 | Stargazers: 10802 | Issues: 0

Inter4K

Official repository for downloading and using Inter4K video interpolation dataset

Language: Python | License: NOASSERTION | Stargazers: 22 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 19212 | Issues: 0

VQA-With-Multimodal-Transformers

Exploring multimodal fusion-type transformer models for visual question answering (on DAQUAR dataset)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 30 | Issues: 0

VidProM

VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models

Stargazers: 81 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3530 | Issues: 0

Language: Python | Stargazers: 5 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7256 | Issues: 0

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Language: Python | Stargazers: 423 | Issues: 0

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 1050 | Issues: 0

EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1943 | Issues: 0

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 763 | Issues: 0

multi-hmr

PyTorch demo code and models for Multi-HMR

Language: Python | License: NOASSERTION | Stargazers: 137 | Issues: 0

NeuScraper

[ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".

Language: Python | License: MIT | Stargazers: 194 | Issues: 0

GaussianObject

Code for "GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting"

Language: Python | Stargazers: 726 | Issues: 0

magvit2-pytorch

Implementation of MagViT2 Tokenizer in PyTorch

Language: Python | License: MIT | Stargazers: 467 | Issues: 0

VisFusion

[CVPR 2023] Code for "VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos"

Language: Python | License: Apache-2.0 | Stargazers: 176 | Issues: 0

FiT

[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model

License: Apache-2.0 | Stargazers: 336 | Issues: 0

lida

Automatic Generation of Visualizations and Infographics using Large Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 2536 | Issues: 0

single-video-curation-svd

Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 78 | Issues: 0

ChartVLM

Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Language: Python | License: CC-BY-4.0 | Stargazers: 187 | Issues: 0

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python | Stargazers: 596 | Issues: 0