sachinyar's repositories
ipython_notebooks
Code snippets for reference
ML-mastery
Code from Jason Brownlee's course on mastering machine learning
flight-data-analysis
US Flight Data Analysis from January 2016
Simple-k-Means-Clustering-Python
Simple k-means clustering (centroid-based) using Python
movie-freak
A small movie recommendation system using the OMDB APIs
SparkCourse
Taming Big Data with Apache Spark and Python - Hands On - Udemy
spark_airline_delays
Rehash of the popular HDP Predicting Airline Delays project
Udemy---Machine-Learning
Notebooks for Course
awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
awesome-datascience
An awesome Data Science repository to learn and apply for real world problems.
Machine-Learning-Tutorials
machine learning and deep learning tutorials, articles and other resources
DataSciencePython
Common data analysis and machine learning tasks using Python
Udemy-notes
My Udemy notebooks
Axa-Insurance-Telematics-Kaggle
I developed this case study in only 7 days with PySpark (Spark 1.6.0) SQL and MLlib, using a Databricks cluster and AWS. A Random Forest achieved 90% AUC without the Trip Matching (Repeated Trips) feature. Ensembles of RF, GBT, and Logistic Regression, together with outlier elimination, could improve this result. There are two versions of my code: test execution and full execution. Because AWS costs exceeded my budget, I stopped training my model(s) on the full dataset. There is also a PPT that presents my outputs from the test execution. The full-data execution code is a more production-ready, slightly different version; in it I had to apply Databricks table caching to the TRAIN and TEST data tables to obtain acceptable performance.
DAT8
General Assembly's 2015 Data Science course in Washington, DC
mGalarnyk.github.io
Simple website for now using GitHub.
ds-for-telco
Source material for Data Science for Telecom Tutorial at Strata Singapore 2015
PCF-demo
System.exit
nyc-flights-analysis
Exploratory analysis bringing to bear all of my new skills in data manipulation and visualization in Python.
building-spark-applications-live-lessons
Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable Spark applications for predictive analytics in the context of a data scientist's standard workflow.
DataScienceCourse
This holds iPython notebooks and lecture slides for the Intro to Data Science Master's course I teach at NYU.
Statistics-Notes
iPython notebooks on stats