Beast code in Giters

CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".

Language: Python | 26000

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | 791100

HiDiffusion

[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!

Language: Jupyter Notebook | License: Apache-2.0 | 68400

[ECCV 2024] This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"

License: Apache-2.0 | 39100

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | 1111900

MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Language: Python | License: Apache-2.0 | 441200

MoMA

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Language: Jupyter Notebook | 16000

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python | License: NOASSERTION | 272800

PuLID_ComfyUI

PuLID native implementation for ComfyUI

Language: Python | License: Apache-2.0 | 43600

stylus

Language: Jupyter Notebook | License: MIT | 11400

PairCustomization

Language: Python | 8200

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python | License: NOASSERTION | 62000

StoryDiffusion

Create Magic Story!

Language: Jupyter Notebook | License: Apache-2.0 | 545700

Vitron

A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Language: Python | 24400

comfyui-portrait-master-zh-cn

Portrait Master, Chinese edition (comfyui-portrait-master)

Language: Python | License: GPL-3.0 | 149800

Fooocus

Focus on prompting and generating

Language: Python | License: GPL-3.0 | 3816800

maia

Official implementation of MAIA, A Multimodal Automated Interpretability Agent

Language: Python | 3900

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

1054800

ComfyUI-YoloWorld-EfficientSAM

Unofficial implementation of YOLO-World + EfficientSAM for ComfyUI

Language: Python | License: GPL-3.0 | 49000

Autonomous-Agents

Autonomous Agents (LLMs) research papers. Updated Daily.

License: MIT | 27000

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | 1414200

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook | License: NOASSERTION | 29900

xyxxmb

xyxxmb's starred repositories

GLM-4

PAE

Omost

aesthetic-predictor-v2-5

Pandora

ReVideo

CraftsMan

CameraCtrl

CLoT