Wang-Xiaodong1899

Peking University

https://wang-xiaodong1899.github.io/

Xiaodong Wang's starred repositories

DragGAN

Official Code for DragGAN (SIGGRAPH 2023)

Language: Python | License: NOASSERTION | 35567 1003 186

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | 29128 341 267

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | 23279 249 277

dalle-mini

DALL·E Mini - Generate images from a text prompt

Language: Python | License: Apache-2.0 | 14699 112 155

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | 13860 159 169

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | 11085 162 217

al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | 9905 23 527

cupy

NumPy & SciPy for GPU

Language: Python | License: MIT | 7973 128 2210

llama-recipes

Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps to showcase Llama2 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | 7850 68 227

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 4301 48 395

img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Language: Python | License: MIT | 3442 30 250

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | 2871 37 188

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | 2589 31 152

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python | License: MIT | 2112 23 158

gigagan-pytorch

Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs

Language: Python | License: MIT | 1705 72 49

Llama-X

Open Academic Research on Improving LLaMA to SOTA LLM

Language: Python | License: Apache-2.0 | 1582 42 20

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | 1568 21 85

robotics_transformer

Language: Python | License: Apache-2.0 | 1247 25 24

magvit

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

Language: Python | License: Apache-2.0 | 902 71 22

mmc4

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.

Language: Python | License: MIT | 884 9 17

SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Language: Python | License: NOASSERTION | 518 14 43

ReVersion

ReVersion: Diffusion-Based Relation Inversion from Images

Language: Python | License: NOASSERTION | 434 20 7

GigaGAN

Language: JavaScript | 397 260

robotic-transformer-pytorch

Implementation of RT1 (Robotic Transformer) in Pytorch

Language: Python | License: MIT | 358 10 6

Instruct2Act

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

Language: Python | 303 3 19

Visual-LLaMA

Open LLaMA Eyes to See the World

Language: Python | 165 6 4

SceneScape

Official Pytorch Implementation for "SceneScape: Text-Driven Consistent Scene Generation"

Language: Python | 124 7 1

NUWA-LIP

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN

Language: Python | 30 2 3

ORES

ORES: Open-vocabulary Responsible Visual Synthesis

Language: Python | License: MIT | 12 20

SSHT-plus-plus

SSHT++

Language: Python | 2 10