AhmedAchraf2001 / Data-Science-Resources

Python

Python + Algorithm Programming Exercises

General Data Science

Data Science Books

Pandas

Pandas EDA

Feature Engineering

https://elitedatascience.com/feature-engineering-best-practices
https://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/
https://www.quora.com/What-are-some-best-practices-in-Feature-Engineering
http://www.jmlr.org/

Statistics

Distributions

https://blog.cloudera.com/blog/2015/12/common-probability-distributions-the-data-scientists-crib-sheet/

Modeling

Probabilistic Modeling

https://github.com/jmschrei/pomegranate

Statistics

https://students.brown.edu/seeing-theory/ - has baseball data ('homerun' and 'hitter')

Bayes

https://github.com/AllenDowney/BayesSeminar

PyMC3

https://www.youtube.com/watch?v=VVbJ4jEoOfU&t=1151s&list=PL0eRwZHmE_S_vLkhXktls0PXSZIHJZ3Fb&index=2
https://www.youtube.com/watch?v=rZvro4-nFIk&index=3&list=PL0eRwZHmE_S_vLkhXktls0PXSZIHJZ3Fb

Linear Regression

Logistic Regression

SVM

Random Forest

Boosting and Bagging

PCA / ICA

Ensemble

Clustering

Time Series

https://github.com/ultimatist/ODSC17.git
https://www.youtube.com/watch?v=JNfxr4BQrLk&t=3012s
http://earthpy.org/pandas-basics.html
https://github.com/agconti/trading/blob/master/GOOG%20V.%20AAPL%20Correlation%20Arb.ipynb

Winters

https://grisha.org/blog/2016/01/29/triple-exponential-smoothing-forecasting/

Model Tuning

http://machinelearningmastery.com/how-to-tune-algorithm-parameters-with-scikit-learn/

More complex: http://fa.bianp.net/blog/2016/hyperparameter-optimization-with-approximate-gradient/

Text / NLP

https://github.com/kwartler/text_mining
https://github.com/diegonogare/DataScience/tree/master/Text%20Mining
http://juliasilge.com/blog/
https://www.good.is/articles/can-yelp-help-independent-restaurants-drive-chains-out-of-business
https://www.springboard.com/blog/eat-rate-love-an-exploration-of-r-yelp-and-the-search-for-good-indian-food/
http://www.theatlantic.com/business/archive/2011/10/how-yelp-helps-steer-people-away-fast-food-chains/337181/
http://cs109.joeong.com/ - cool MIT project
https://www.canva.com/design/DACJbaSfIMY/jf93l6bhZr1WO1CgVXX0DA/edit
https://blog.insightdatascience.com/super-donor-detecting-hidden-matches-in-a-public-sperm-donor-registry-a687fe6e05a0#.rvxktifgh
http://www.dailydot.com/layer8/fake-news-sites-list-facebook/
https://journals.agh.edu.pl/csci/article/viewFile/1339/1311
https://priceonomics.com/our-fixation-on-terrorism/
http://dlab.berkeley.edu/blog/scraping-new-york-times-articles-python-tutorial
https://aqibsaeed.github.io/2016-07-26-text-classification/
http://people.cs.vt.edu/naren/papers/sdm2016.pdf
http://www.kdnuggets.com/2015/01/text-analysis-101-document-classification.html
http://blog.christianperone.com/2011/09/machine-learning-text-feature-extraction-tf-idf-part-i/
https://pdfs.semanticscholar.org/aa96/9114cf6e4d77c5bb3dd62a20bee3446f33ab.pdf
http://nlp.stanford.edu/courses/cs224n/2011/reports/nccohen-aatreya-jameszjj.pdf
http://nlp.stanford.edu/courses/cs224n/2012/reports/kat_busch_writeup.pdf
https://www.cs.sfu.ca/~anoop/papers/pdf/anoop_maryam-canvas-2013.pdf
http://hint.fm/papers/wordtree_final2.pdf
https://bl.ocks.org/mbostock/4339083
http://bbengfort.github.io/tutorials/2016/05/19/text-classification-nltk-sckit-learn.html - Teresa's most-used NLTK tutorial for the capstone
https://rud.is/b/2013/03/12/visualizing-risky-words-part-4-d3-word-trees/
http://peekaboo-vision.blogspot.de/2012/11/a-wordcloud-in-python.html
http://blancosilva.github.io/post/2016/08/24/bokeh.html
http://streamhacker.com/2010/06/16/text-classification-sentiment-analysis-eliminate-low-information-features/
http://streamhacker.com/2010/05/10/text-classification-sentiment-analysis-naive-bayes-classifier/
http://fjavieralba.com/basic-sentiment-analysis-with-python.html
http://www.nytimes.com/interactive/2012/09/06/us/politics/convention-word-counts.html?_r=0
http://people.csail.mit.edu/azar/wp-content/uploads/2011/09/thesis.pdf

Deep Learning

https://github.com/danromuald?tab=repositories

Design Algorithms

https://dspace.mit.edu/handle/1721.1/9044
https://www.smashingmagazine.com/2017/01/algorithm-driven-design-how-artificial-intelligence-changing-design/#comments-algorithm-driven-design-how-artificial-intelligence-changing-design

Generative Design + Optimization (3D)

https://medium.com/generative-design/design-optimization-2ec2ba3b40f7
https://www.datadvance.net/
https://www.formtrends.com/algorithms-design/
https://www.grasshopper3d.com/page/download-1
https://www.rhino3d.com/download

Artificial Intelligence

https://people.eecs.berkeley.edu/~russell/aima1e/chapter01.pdf
http://dilab.gatech.edu/test/wp-content/uploads/2014/11/AI-GoelDavies2011-Final.pdf
http://courses.csail.mit.edu/6.034f/ai3/rest.pdf

Knowledge-Based AI

https://classroom.udacity.com/courses/ud409

Visualizations

Chord

http://www.delimited.io/blog/2013/12/8/chord-diagrams-in-d3

Matplotlib

https://github.com/WeatherGod/interactive_mpl_tutorial
https://github.com/matplotlib/AnatomyOfMatplotlib

Seaborn

https://elitedatascience.com/python-seaborn-tutorial

Bokeh

D3

https://github.com/morganecf/imdb-odsc

Data Science Project Management

http://www.datasciencemanifesto.org/
https://drivendata.github.io/cookiecutter-data-science/#cookiecutter-data-science
https://www.slideshare.net/srikanthps/scrum-in-15-minutes-presentation
https://www.slideshare.net/joelhorwitz/agile-data-science-36258963
https://www.slideshare.net/srogers74/agile-software-development-overview-presentation/11-Introduction_to_Agile_Methodologies_contd
https://www.slideshare.net/katemats/manage-datascience-2013strata/17-After_For_the_top_search
gm-spacagna/datasciencemanifesto-copy#1

Production

http://www.kdnuggets.com/2017/06/dataiku-checklist-data-science-implemented-production.html

Blogs

http://vrl.cs.brown.edu/color - generates categorical color palettes
http://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3 - generates sequential, diverging, and qualitative color palettes
http://algorithms-tour.stitchfix.com/#data-platform - storytelling with D3
https://students.brown.edu/seeing-theory/regression/index.html#first - visual explanations of statistical concepts

Open Source

http://www.kdnuggets.com/2016/11/top-20-python-machine-learning-open-source-updated.html

Matrix Factorization w/ deep learning

http://www1.cmc.edu/pages/faculty/BHunter/

Industries

Finance

https://github.com/CaptainKanuk
https://songyao21.github.io/Research_Papers/Risk%20Transfer%20versus%20Cost%20Reduction.pdf

Kaggle

https://github.com/jdwittenauer/kaggle
https://www.slideshare.net/markpeng/general-tips-for-participating-kaggle-competitions

Tools

Command Line

Git

SQL

Debugging

http://kawahara.ca/how-to-debug-a-jupyter-ipython-notebook/

Jupyter Notebooks

https://github.com/drivendata/data-science-is-software
https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/

scikit-learn

https://github.com/amueller/advanced_training

Apache Drill

https://github.com/cgivre

Skills

https://blog.udacity.com/data-analyst-skills-checklist-eguide
