zhyang's repositories
awesome-align
A neural word aligner based on multilingual BERT
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
ColossalAI
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training
ControlNet
Let us control diffusion models!
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
FasterTransformer
Transformer-related optimization, including BERT and GPT
gemma
Open weights LLM from Google DeepMind.
generative-recommenders
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152, ICML'24).
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
MiniCPM
MiniCPM-2B: An end-side LLM that outperforms Llama2-13B.
NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.
QQMusicSpider
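A Scrapy-based QQ Music crawler (QQ Music Spider) that scrapes song information, lyrics, and featured comments, and shares a music corpus of more than 490,000 entries from the top 6,400 mainland Chinese and Hong Kong/Taiwan singers on QQ Music.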
segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
torchscale
Transformers at any scale
trpc
A multi-language, pluggable, high-performance RPC framework
Vary-toy
Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)