bchoivw / python-data-science-study

Python 3 data science projects from online courses

python-data-science-study

For the past 8 weeks, I have been focusing every day on my company's internal DS competition and have put some of the items below on hold. During that period I actually learned far more from Stack Overflow, Kaggle kernels, and other useful websites while working on the competition. I have been using AWS EC2/EMR/S3, Bitbucket/Sourcetree, and machine learning algorithms such as regularized linear regression, generalized additive models (pyGAM), XGBoost, CatBoost, and convolutional neural networks (Keras) with real company data to build models that could save the company money and be used in practice. Unfortunately, I can't put that code or data in a public repo.

This repo is for data science projects and guided exercises in Python 3 from online courses and books.
Projects I independently work on will have their own repo and/or be on Kaggle.

I'm using the resources listed below to build a good understanding of both the application and the theory of machine learning.

1. Udemy - Python for Data Science and Machine Learning Bootcamp (Progress: 100%)

This online course covers a broad range of machine learning applications.

2. Stanford Online - Statistical Learning (Progress: 100%)

This course covers the logic and theory behind ML, which I think are very important to understand in order to become a good data scientist.
The course was originally taught in R, but I learned it with a Python version of the material.

3. Book - An Introduction to Statistical Learning (Progress: 95%)

This is the companion book to the Stanford Online course above. I read it on my iPad during my commute.

4. Coursera - Machine Learning by Andrew Ng (Progress: 70%)

Covers similar topics to the resources above (plus neural networks and ML best practices), but uses linear algebra for vectorization, which the courses above tried to avoid. The course was originally taught in Octave, but I am learning it with a Python version.
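As a small illustration of what vectorization means here, the sketch below computes a linear regression cost twice: once with explicit loops and once with a single matrix product. It is not taken from the course. The function names, the synthetic data, and the choice of NumPy are my own assumptions for the example.

```python
import numpy as np

def cost_loop(X, y, theta):
    """Squared-error cost computed one training example at a time."""
    m = len(y)
    total = 0.0
    for i in range(m):
        # Prediction for example i: dot product written out by hand.
        prediction = sum(X[i][j] * theta[j] for j in range(len(theta)))
        total += (prediction - y[i]) ** 2
    return total / (2 * m)

def cost_vectorized(X, y, theta):
    """Same cost, with the per-example loop replaced by linear algebra."""
    m = len(y)
    residuals = X @ theta - y          # all predictions and errors at once
    return float(residuals @ residuals) / (2 * m)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # bias column + 2 features
y = X @ np.array([0.5, 1.5, 0.0])
theta = np.array([1.0, 2.0, -0.5])

print(cost_loop(X, y, theta), cost_vectorized(X, y, theta))
```

Both functions return the same value up to floating-point rounding. The vectorized version is shorter and usually much faster on large arrays.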

5. Coursera - Deep Learning Specialization by deeplearning.ai (Progress: 0%)

I'm not sure how long it will take me to get here, but I'm very excited to learn more advanced techniques like neural networks.

6. Kaggle - Machine Learning Explainability (Progress: 80%)

This short course on Kaggle goes over a few techniques that attempt to give more insight into machine learning models, which are sometimes considered black boxes. I personally found it very interesting and have been using these techniques in practice. It covers permutation importance, partial dependence plots, and SHAP values. I highly recommend this interactive course, as these relatively new techniques are often not included in a typical ML course.
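To give a feel for the first of these techniques, here is a minimal hand-rolled sketch of permutation importance using only NumPy. It is not the course's code, and in practice a library implementation would usually be the better choice. The function name `permutation_importance_sketch`, the synthetic data, and the least-squares model are all assumptions made for the example.

```python
import numpy as np

def permutation_importance_sketch(predict, X, y, n_repeats=10, seed=0):
    """Average increase in validation MSE when one column is shuffled.

    `predict` is any callable mapping a feature matrix to predictions.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the target
            # while keeping its marginal distribution intact.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Synthetic data (hypothetical): feature 0 matters a lot,
# feature 1 a little, feature 2 not at all.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit on the first 400 rows, measure importance on the held-out 100.
X_train, X_val = X[:400], X[400:]
y_train, y_val = y[:400], y[400:]
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def predict(data):
    return data @ coef

importances = permutation_importance_sketch(predict, X_val, y_val)
print(importances)
```

A feature the model relies on heavily shows a large increase in error when shuffled, while an irrelevant feature stays close to zero. Measuring on held-out data keeps the scores from rewarding features the model has merely memorized.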

Languages

Language:Jupyter Notebook 100.0%