Beast code in Giters

Bin Zhu's repositories

MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]

Apache-2.0

Dlink_Parse

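Parses download addresses for Youku, Tencent, Bilibili, Douyin, Mango TV, iQIYI, PP Video, Migu Video, AcFun, Kuaishou, Sohu Video, QQ Music, NetEase Cloud Music, Kuwo Music, Tencent Classroom, Xigua Video, and others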

MIT

lux

👾 Fast and simple video download library and CLI tool written in Go

MIT

LLMBind

LLMBind: A Unified Modality-Task Integration Framework

Apache-2.0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.

Language: Python

MIT

MagicTime

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Apache-2.0

tmp-123

Video-Bench

A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!

LanguageBind

MIT

Latte

Latte: Latent Diffusion Transformer for Video Generation.

Apache-2.0

AnimateDiff

Official implementation of AnimateDiff.

Apache-2.0

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Apache-2.0

SEED-Bench

A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

NOASSERTION

opencompass

OpenCompass is an LLM evaluation platform that supports a wide range of models (LLaMA, LLaMA 2, ChatGLM2, ChatGPT, Claude, etc.) across 50+ datasets.

Apache-2.0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.

Apache-2.0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Apache-2.0

fastmoe

A fast MoE impl for PyTorch

Apache-2.0

LaVIN

[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"

minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

MIT

Video-LLaMA

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

BSD-3-Clause

VALOR

Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

MIT

ControlNet

Let us control diffusion models!

Apache-2.0

TaiSu

TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)

NOASSERTION

chatgpt-on-wechat

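WeChat bot based on ChatGPT, using the OpenAI API and the itchat library. Built on the GPT-3.5/4.0 API, it supports deployment to personal WeChat accounts, official accounts, and WeCom (enterprise WeChat), handles text, voice, and images, and can access the operating system and the internet.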

MIT

Grounded-Segment-Anything

Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment, and Generate Anything with Image, Text, and Audio Inputs

Apache-2.0

denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in PyTorch

MIT

consistency_models

Official repo for consistency models.

MIT

hfai-models

HFAI deep learning models

MIT

TCLIP

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

MIT