Zeyuan Chen (zeyuanchen23)

Company: Salesforce Research

Zeyuan Chen's starred repositories

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

Stargazers: 89 | Issues: 0

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 1856 | Issues: 0

Unique3D

Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Language: Python | License: MIT | Stargazers: 2135 | Issues: 0

MeshAnything

From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"

Language: Python | License: NOASSERTION | Stargazers: 1536 | Issues: 0

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 901 | Issues: 0

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | Stargazers: 596 | Issues: 0

HMT-pytorch

Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"

Language: Python | License: Apache-2.0 | Stargazers: 50 | Issues: 0

EasySpider

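A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.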

Language: JavaScript | License: NOASSERTION | Stargazers: 29156 | Issues: 0

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 296 | Issues: 0

ScoreHMR

ScoreHMR: Score-Guided Diffusion for 3D Human Recovery (CVPR 2024)

Language: Python | License: MIT | Stargazers: 367 | Issues: 0

ECCV2022-RIFE

ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Language: Python | License: MIT | Stargazers: 4187 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 22895 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: Apache-2.0 | Stargazers: 10883 | Issues: 0

Inter4K

Official repository for downloading and using Inter4K video interpolation dataset

Language: Python | License: NOASSERTION | Stargazers: 23 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 20287 | Issues: 0

VQA-With-Multimodal-Transformers

Exploring multimodal fusion-type transformer models for visual question answering (on DAQUAR dataset)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 31 | Issues: 0

VidProM

VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models

Stargazers: 83 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3560 | Issues: 0
Language: Python | Stargazers: 5 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7402 | Issues: 0

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Language: Python | Stargazers: 438 | Issues: 0

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 1290 | Issues: 0

EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1971 | Issues: 0

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 783 | Issues: 0

multi-hmr

Pytorch demo code and models for Multi-HMR

Language: Python | License: NOASSERTION | Stargazers: 146 | Issues: 0

NeuScraper

[ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".

Language: Python | License: MIT | Stargazers: 196 | Issues: 0

GaussianObject

Code for "GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting"

Language: Python | Stargazers: 729 | Issues: 0

magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch

Language: Python | License: MIT | Stargazers: 480 | Issues: 0

VisFusion

[CVPR 2023] Code for "VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos"

Language: Python | License: Apache-2.0 | Stargazers: 177 | Issues: 0

FiT

[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model

License: Apache-2.0 | Stargazers: 341 | Issues: 0