Beast code in Giters

CHENGY12's starred repositories

Tree-Transformer

Implementation of the paper Tree Transformer

Language:Python21000

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Language:PythonMIT698100

CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language:PythonApache-2.0601000

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language:Jupyter NotebookNOASSERTION48100

Structured-Diffusion-Guidance

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Language:Jupyter NotebookNOASSERTION30400

awesome-english-ebooks

经济学人(含音频)、纽约客、卫报、连线、大西洋月刊等英语杂志免费下载,支持epub、mobi、pdf格式, 每周更新

Language:CSS2063200

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language:PythonMIT30000

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language:Jupyter NotebookApache-2.062700

DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Language:Python62200

datacomp

DataComp: In search of the next generation of multimodal datasets

Language:PythonNOASSERTION63000

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的开源多模态对话模型

Language:PythonMIT524700

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language:PythonApache-2.0103500

ALIP

[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption

Language:Python8700

Pandora

Pandora: Towards General World Model with Natural Language Actions and Video States

Language:Python45100

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language:PythonApache-2.0194200

LLaVA-NeXT

Language:PythonApache-2.0214700

i-stylegan

Multi-domain image generation and translation with identifiability guarantees

Language:PythonNOASSERTION600

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language:PythonApache-2.0314800

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language:PythonApache-2.0464900

Chain-of-Spot

Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models

Language:PythonApache-2.08100

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language:PythonMIT394900

CHENGY12

CHENGY12's starred repositories

Tree-Transformer

dowhy

CogVideo

LaVIT

Structured-Diffusion-Guidance

awesome-english-ebooks

LaVi-Bridge

AlphaCLIP

DeCLIP

datacomp

InternVL

ELLA

ALIP

Pandora

MambaOut

LLaVA-NeXT

i-stylegan

MGM

sglang

Chain-of-Spot

VAR

tapnet

Open-Sora

Mora

SoraReview

honeybee

howto100m

unmasked_teacher

Revisiting-Contrastive-SSL

VILA