Vassili's repositories
amazon-a2i-sample-jupyter-notebooks
Sample Jupyter Notebooks for Amazon Augmented AI (A2I)
amazon-sagemaker-architecting-for-ml
Materials for a 3-day instructor-led course on applying machine learning
amazon-sagemaker-mlops-workshop
Machine Learning Ops Workshop with SageMaker: lab guides and materials.
autogluon
AutoGluon: AutoML Toolkit for Deep Learning
awesome-ml-courses
Awesome free machine learning and AI courses with video lectures.
aws-data-wrangler
Utility belt to handle data on AWS.
aws-glue-data-catalog-replication-utility
Replication utility for AWS Glue Data Catalog
aws-sagemaker-build
Creates a CloudFormation template that uses AWS Step Functions to automate the building and training of SageMaker custom models based on S3 and GitHub events
bookmark-utils
This repository contains the code for a utility that provides bookmark functionality in AWS Glue Python jobs
dask-tutorial
Dask tutorial
data-scientists-guide-apache-spark
Best practices for using Spark, aimed at practicing data scientists and framed around a data scientist’s standard workflow.
datawig
Imputation of missing values in tables.
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
end-to-end-transformers
End-to-end Transformers workflow with SageMaker
fzf.aws
Using fuzzy finder to perform AWS operations on the command line
handson-ml2
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
koalas
Koalas: pandas API on Apache Spark
Machine-Learning-with-Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
Mastering-Big-Data-Analytics-with-PySpark
Mastering Big Data Analytics with PySpark, published by Packt
modin
Modin: Speed up your Pandas workflows by changing a single line of code
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
serverless-sagemaker-orchestration
This example shows how to build a serverless pipeline to orchestrate the continuous training and deployment of a linear regression model for predicting housing prices using Amazon SageMaker, AWS Step Functions, AWS Lambda, and Amazon CloudWatch Events.
Spark-Programming-In-Python
Apache Spark 3 - Spark Programming in Python for Beginners
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
workshop
AI and Machine Learning with Kubeflow, Amazon EKS, and SageMaker