Ankur Nayyar's repositories
cluster-policy-sdk
cluster-policy-sdk
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow-on-kubernetes
Bare minimal Airflow on Kubernetes (Local, EKS, AKS)
drunken-data-quality
Spark package for checking data quality
great-expectation
great expectation
gtc2017-numba
Numba tutorial for GTC 2017 conference
hyperas
Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization
ide-best-practices
Best practices for working with Databricks from an IDE
Mapreduce-Custom-Input-format
EBCDIC to ASCII
mlflow
Open source platform for the machine learning lifecycle
mlops
mlops slack
my-mlops-project
my-mlops-project
simple-keras-rest-api
A simple Keras REST API using Flask
spark
Mirror of Apache Spark
spark-pandas
Koala: Pandas APIs on Apache Spark
Twitter-Sentiment-Analysis-using-Apache-Spark-
Accessed the Twitter API to stream live tweets. Performed feature extraction and transformation on the tweets' JSON using PySpark's machine learning package, pyspark.mllib. Experimented with three classifiers (Naïve Bayes, Logistic Regression, and Decision Tree Learning) and used k-fold cross-validation to determine the best.
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Flink and DataFlow