Armand McQueen's repositories
tensorpack-mask-rcnn
Fork of Tensorpack with breaking (backward-incompatible) performance improvements to the Mask RCNN example. Training is approximately 2x faster than the original implementation.
ec2-cluster
Simple CLI and Python library to spin up clusters of EC2 instances and run shell commands on them using boto3 and fabric.
horovod-utils
Tools for performance tuning of Horovod training
centrality
Centrality is a toolkit for managing GPU clusters and training workflows.
amazon-sagemaker-examples
Example notebooks that show how to apply machine learning, deep learning and reinforcement learning in Amazon SageMaker
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
determined
Determined: Deep Learning Training Platform
ec2-cluster-test
Code for testing PyPI releases of ec2-cluster
environments
Determined AI public environments
mpi-operator
Kubernetes Operator for Allreduce-style Distributed Training
sagemaker-containers
This support code is used for making machine learning frameworks (e.g. MXNet, TensorFlow) run on Amazon SageMaker.
sm-training-jui
Jupyter UI for SageMaker Training focused on watching logs, infrastructure metrics, and science metrics in real-time
tensorflow
Computation using data flow graphs for scalable machine learning
tensorpack
A Neural Net training Interface on TensorFlow, with focus on speed + flexibility
training_results_v0.6
Training v0.6 results