Beast code in Giters

Zeyuan Chen's starred repositories

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

8900

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python, License: MIT, 185600

Unique3D

Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Language: Python, License: MIT, 213500

MeshAnything

From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"

Language: Python, License: NOASSERTION, 153600

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python, License: Apache-2.0, 90100

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python, License: Apache-2.0, 59600

HMT-pytorch

Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"

Language: Python, License: Apache-2.0, 5000

EasySpider

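A visual no-code/code-free web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically, without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.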

Language: JavaScript, License: NOASSERTION, 2915600

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook, License: NOASSERTION, 29600

ScoreHMR

ScoreHMR: Score-Guided Diffusion for 3D Human Recovery (CVPR 2024)

Language: Python, License: MIT, 36700

ECCV2022-RIFE

ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Language: Python, License: MIT, 418700

llama3

The official Meta Llama 3 GitHub site

Language: Python, License: NOASSERTION, 2289500

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python, License: Apache-2.0, 1088300

Inter4K

Official repository for downloading and using the Inter4K video interpolation dataset

Language: Python, License: NOASSERTION, 2300

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python, License: Apache-2.0, 2028700

VQA-With-Multimodal-Transformers

Exploring multimodal fusion-type transformer models for visual question answering (on the DAQUAR dataset)

Language: Jupyter Notebook, License: Apache-2.0, 3100

VidProM

VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models

8300

open_flamingo

An open-source framework for training large multimodal models.

Language: Python, License: MIT, 356000

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++, License: Apache-2.0, 740200

zeyuanchen23

mistral-7b-tensorrt-llm-truss

TensorRT-LLM

Panda-70M

OpenDiT

EfficientSAM

Bunny

multi-hmr

NeuScraper

GaussianObject

magvit2-pytorch

VisFusion

FiT