HuaZheLei's starred repositories

books

E-book list collected by 编程随想 (multiple disciplines, with download links)

CC0-1.0 | 17659 952 130

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML | Apache-2.0 | 7628 107 436

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | NOASSERTION | 5330 45 73

MiniCPM

MiniCPM-2B: An end-side LLM that outperforms Llama2-13B.

Language: Jupyter Notebook | Apache-2.0 | 4082 52 111

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4V.

Language: Python | MIT | 2876 34 182

PySceneDetect

Python and OpenCV-based scene cut/transition detection program & library.

Language: Python | BSD-3-Clause | 2863 71 300

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | AGPL-3.0 | 2322 390

vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Language: Python | MIT | 1984 30 97

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | Apache-2.0 | 1517 21 83

Latte

Latte: Latent Diffusion Transformer for Video Generation.

Language: Python | Apache-2.0 | 1354 25 74

style-aligned

Official code for "Style Aligned Image Generation via Shared Attention"

Language: Python | Apache-2.0 | 1094 23 23

minisora

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Language: Python | Apache-2.0 | 1059 16 62

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | Apache-2.0 | 1029 20 51

improved-aesthetic-predictor

CLIP+MLP Aesthetic Score Predictor

Language: Python | Apache-2.0 | 732 6 10

Bunny

A family of lightweight multimodal models.

Language: Python | Apache-2.0 | 699 19 75

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

460 8 2

taesd

Tiny AutoEncoder for Stable Diffusion

Language: Python | MIT | 429 10 15

aesthetic-predictor

A linear estimator on top of CLIP to predict the aesthetic quality of pictures

Language: Jupyter Notebook | MIT | 394 12 6

svd-temporal-controlnet

Language: Python | 361 9 15

edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Language: Python | NOASSERTION | 333 8 3

HPT

HPT - Open Multimodal LLMs from HyperGAI

Language: Python | Apache-2.0 | 289 6 6

Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

Language: Python | Apache-2.0 | 262 5 28

DriveDreamer

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

243 23 4

WorldDreamer

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

MIT | 148 17 3

VL-GPT

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Apache-2.0 | 83 19 2

MotionInversion

Language: Python | 61 9 4

INF-MLLM

Language: Python | 43 1 7

Uniaa

Unified Multi-modal IAA Baseline and Benchmark

42 1 1

AIGCBench

Official repo for AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI

Language: Python | Apache-2.0 | 2400

TransCore-M

Large Multimodal Model

Language: Python | 15 2 5