JulianJuaner

CUHK, SmartMore

Hong Kong SAR

julianjuaner.github.io

Yuechen's starred repositories

U-ViT

A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".

Language: Jupyter Notebook · MIT · 86300

LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Language: Python · MIT · 50900

PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Language: Python · AGPL-3.0 · 153300

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook · NOASSERTION · 34800

ID-Animator

Language: Python · 32300

evolutionary-model-merge

Official repository of Evolutionary Optimization of Model Merging Recipes

Language: Python · Apache-2.0 · 112600

MiraData

Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"

Language: Python · GPL-3.0 · 31100

FRESCO

[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

Language: Jupyter Notebook · NOASSERTION · 69400

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python · MIT · 390800

Mira

Language: Python · GPL-3.0 · 32200

SLM

Language: Python · 1800

ComfyUI-Tripo

Custom nodes for using Tripo in ComfyUI.

Language: Python · MIT · 8200

TripoSR

Language: Python · MIT · 413800

InstantStyle

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥

Language: Jupyter Notebook · 152000

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language: Python · Apache-2.0 · 87100

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python · Apache-2.0 · 22900

AnimateDiff

AnimateDiff with training support

Language: Jupyter Notebook · Apache-2.0 · 11100

LivePhoto

Official implementations for paper: LivePhoto: Real Image Animation with Text-guided Motion Control

MIT · 17100

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python · MIT · 1108600

Visual-CoT

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Language: Python · Apache-2.0 · 8100

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python · Apache-2.0 · 311300

GeoWizard

[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Language: Python · 66400

StreamingT2V

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Language: Python · 114100

clarity-upscaler

Clarity AI | AI Image Upscaler & Enhancer - free and open-source Magnific Alternative

Language: Python · AGPL-3.0 · 336400

M2Chat

Language: Python · 2900

pexels-crawler

The web crawler for pexels

Language: Python · MIT · 600

HD-VG-130M

The HD-VG-130M Dataset

grok-1

Grok open release

Language: Python · Apache-2.0 · 4923600

GroupContrast

[CVPR 2024] GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

MIT · 4200

stable-diffusion-webui-wd14-tagger

Labeling extension for Automatic1111's Web UI

Language: Python · 55800