CHENGY12

Company: Tsinghua University

Location: Beijing

CHENGY12's starred repositories

Tree-Transformer

Implementation of the paper Tree Transformer

Language: Python | Stargazers: 210 | Issues: 0

dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Language: Python | License: MIT | Stargazers: 6981 | Issues: 0
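
For context, a minimal sketch of the model-identify-estimate-refute workflow the DoWhy description refers to, assuming a synthetic dataset; the toy variables (w, t, y) and the chosen estimator/refuter names below are illustrative, not part of the starred repository itself.

import numpy as np
import pandas as pd
from dowhy import CausalModel

# Toy data (hypothetical): confounder w drives both treatment t and outcome y.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
t = (w + rng.normal(size=1000) > 0).astype(int)
y = 2.0 * t + w + rng.normal(size=1000)
df = pd.DataFrame({"w": w, "t": t, "y": y})

# 1. Model: encode the assumed causal structure (the graphical-model side).
model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])

# 2. Identify: derive an estimand (e.g. backdoor adjustment) from that structure.
estimand = model.identify_effect()

# 3. Estimate: compute the effect with a concrete estimator (the potential-outcomes side).
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")

# 4. Refute: stress-test the estimate against a deliberately violated assumption.
refutation = model.refute_estimate(estimand, estimate, method_name="random_common_cause")
print(estimate.value)
print(refutation)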

CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language: Python | License: Apache-2.0 | Stargazers: 6010 | Issues: 0

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 481 | Issues: 0

Structured-Diffusion-Guidance

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 304 | Issues: 0

awesome-english-ebooks

ē»ęµŽå­¦äŗŗ(含音频)态ēŗ½ēŗ¦å®¢ć€å«ęŠ„态čæžēŗæć€å¤§č„æę“‹ęœˆåˆŠē­‰č‹±čÆ­ę‚åæ—å…č“¹äø‹č½½,ę”Æꌁepub态mobi态pdfę ¼å¼, ęƏå‘Øꛓꖰ

Language: CSS | Stargazers: 20632 | Issues: 0

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 300 | Issues: 0

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 627 | Issues: 0

DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Language: Python | Stargazers: 622 | Issues: 0

datacomp

DataComp: In search of the next generation of multimodal datasets

Language: Python | License: NOASSERTION | Stargazers: 630 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching the performance of GPT-4o.

Language: Python | License: MIT | Stargazers: 5247 | Issues: 0

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python | License: Apache-2.0 | Stargazers: 1035 | Issues: 0

ALIP

[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption

Language: Python | Stargazers: 87 | Issues: 0

Pandora

Pandora: Towards General World Model with Natural Language Actions and Video States

Language: Python | Stargazers: 451 | Issues: 0

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | Stargazers: 2147 | Issues: 0

i-stylegan

Multi-domain image generation and translation with identifiability guarantees

Language: Python | License: NOASSERTION | Stargazers: 6 | Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3148 | Issues: 0

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 4649 | Issues: 0

Chain-of-Spot

Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models

Language: Python | License: Apache-2.0 | Stargazers: 81 | Issues: 0

VAR

[GPT beats diffusionšŸ”„] [scaling laws in visual generationšŸ“ˆ] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | Stargazers: 3949 | Issues: 0

tapnet

Tracking Any Point (TAP)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1237 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 21426 | Issues: 0

Mora

Mora: More like Sora for Generalist Video Generation

Language: Python | Stargazers: 1467 | Issues: 0

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

Stargazers: 485 | Issues: 0

honeybee

Official implementation of project Honeybee (CVPR 2024)

Language: Python | License: NOASSERTION | Stargazers: 408 | Issues: 0

howto100m

Code for the HowTo100M paper

Language: Python | License: Apache-2.0 | Stargazers: 248 | Issues: 0

unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Language: Python | License: MIT | Stargazers: 278 | Issues: 0

Revisiting-Contrastive-SSL

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]

Language: Python | License: NOASSERTION | Stargazers: 86 | Issues: 0

VILA

VILA - a multi-image visual language model with a training, inference, and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 1664 | Issues: 0