binmakeswell's repositories
ColossalChat
ColossalChat is a project that implements LLMs with RLHF, powered by the Colossal-AI project.
ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
LARS-ImageNet-PyTorch
LARS, a large-batch deep learning optimizer, for ImageNet with PyTorch and ResNet, using Horovod for distributed training. Reaches 77% accuracy. Optional gradient accumulation and NVIDIA DALI dataloader.
awesome-deeplearning-resources
Deep learning and deep reinforcement learning research papers, with some code
awesome-mlops
A curated list of references for MLOps
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome-System-for-Machine-Learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
ColossalAI
Colossal-AI: A Unified Deep Learning System for Big Model Era
ColossalAI-Benchmark
Performance benchmarking with ColossalAI
FastFold
Optimizing Protein Structure Prediction Model Training and Inference on GPU Clusters
metaseq
Repo for external large-scale work
Open-Sora
Building your own video generation model like OpenAI's Sora
OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
PaLM-colossalai
Scalable PaLM implementation in PyTorch
pytorch-lamb
PyTorch implementation of LAMB for ImageNet/ResNet-50 training
SkyComputing
Sky Computing: Accelerating Geo-distributed Computing in Federated Learning
TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe