sneaxiy's repositories
OpAccStableFramework
The accuracy and stability test framework.
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
CINN
Compiler Infrastructure for Neural Networks
cutlass
CUDA Templates for Linear Algebra Subroutines
DeepLearningExamples
Deep Learning Examples
flash-attention
Fast and memory-efficient exact attention
logging
MLPerf™ logging library
models
Model configurations
nccl
Optimized primitives for collective multi-GPU communication
NVBug
NVIDIA Bug
NVIDIA-MxNet
NVIDIA-optimized MXNet framework for MLPerf
PaddleFleetX
Paddle Distributed Training Examples: ResNet, BERT, GPT, MoE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, ZeRO, Sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep
PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.
PaddleNLP
Easy-to-use and powerful NLP library with an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including end-to-end systems for Neural Search, Question Answering, Information Extraction and Sentiment Analysis.
PaddleScience
PaddleScience is an SDK and library for developing AI-driven scientific computing applications based on PaddlePaddle.
PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
training_results_v2.0
MLPerf™ Training v2.0 results
triton
Development repository for the Triton language and compiler