Arun's starred repositories
gpu-python-tutorial
GPU Development in Python 101 tutorial
the-algorithm
Source code for Twitter's Recommendation Algorithm
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
skunkworks-synthetic-data
A pipeline for generating and evaluating synthetic data generation models. Currently using SynthVAE to demonstrate functionality. Read more about the project here: https://nhsx.github.io/skunkworks/synthetic-data-pipeline
NeuralNetworks-for-Quantum
Tutorials for the paper: "Neural network in quantum many-body physics: a hands-on tutorial"
label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
pyspark-style-guide
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
pyspark.test
Example unit tests for Apache Spark Python scripts using the py.test framework
pre-commit-hooks
Some out-of-the-box hooks for pre-commit
machine-learning-interview
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
machine-learning-systems-design
A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"
dist-keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
spark-df-profiling
Create HTML profiling reports from Apache Spark DataFrames
feature-selector
Feature selector is a tool for dimensionality reduction of machine learning datasets
datalab-ml-training
Machine Learning Training
academic_advisory
Collected opinions and advice for academic programs focused on data science skills.
docker-images
Official source of container configurations, images, and examples for Oracle products and projects