chansigit / Data-Science-Projects

Repository containing data science projects.

Data Science Projects

Repository containing data science projects I completed for academic and self-learning purposes. They are presented as Jupyter Notebooks with accompanying datasets (CSV files).

Content:

  • Machine Learning

    • Principal Components Analysis with numpy: In this project, I apply PCA to a dataset without using any of the popular machine learning libraries such as scikit-learn and statsmodels. The goal is to gain a deeper understanding of PCA fundamentals using only functions from the numpy library.

    • Shopper Segmentation (Unsupervised Learning): The objective of this project is to segment the shoppers in a given dataset. Three unsupervised machine learning algorithms are used: K-Means, Agglomerative Clustering and DBSCAN. At the end of the notebook, the models are evaluated by comparing metrics such as ARS (Adjusted Rand Score), NMI (Normalized Mutual Information) and Average Score.

    • Online News Popularity Prediction (Supervised Learning): The objective of this project is to predict the popularity of articles published on the Mashable website. The machine learning algorithms used were Random Forest, Support Vector Classification and K-Nearest Neighbors (KNN).

    • Predictions of Admissions to Master's Degree (Supervised Learning): Using a linear regression algorithm, this project predicts the chance of admission of foreign students to Master's degree programs at American colleges.

Tools: Python 3, Scikit-learn, pandas, numpy, matplotlib and seaborn
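
The numpy-only PCA project listed above can be illustrated with a minimal sketch. This is not the notebook's code: the function name `pca`, the synthetic data and the choice of an eigendecomposition of the covariance matrix are assumptions made for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components.

    Hypothetical helper: returns the projected data, the component
    vectors (as columns) and the explained variance ratio of each one.
    """
    # Center each feature so the covariance is computed around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is suited to symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the largest-variance directions come first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return X_centered @ components, components, explained_ratio


# Synthetic data (an assumption): 200 samples, 3 correlated features.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

scores, components, ratio = pca(X, n_components=2)
print(scores.shape)  # (200, 2)
print(ratio)         # share of total variance captured by each component
```

The projected columns are uncorrelated by construction, which is a quick sanity check when implementing PCA by hand.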

  • Data Analytics, Visualization and miscellaneous

    • A/B Test Analysis - email Campaign: In this project, I analyze the results of an email campaign experiment whose main objective is to influence customers to make a decision. I apply a test of means to verify whether the results of the campaign occurred by chance or because the email strategy is working as expected.

    • Women Legal Rights in the World: This is an analytic report on legal gender differentiation around the world. The analysis of data collected in 187 countries from 2009 to 2018 highlights the inequity in laws and regulations.

    • Creating my own Dataset of Boston Apartments Leasing (Web Scraping): The goal of this project is to create my own dataset for future analysis. Data was extracted from the RentHop site and stored in a CSV file (apartments_leasing.csv).

Tools: Python 3, pandas, matplotlib, BeautifulSoup
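
The test of means mentioned in the A/B test project can be sketched as follows. This is a hedged illustration, not the notebook's analysis: the function name `two_sample_z_test`, the made-up conversion data and the large-sample normal approximation are all assumptions, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(control, treatment):
    """Two-sided large-sample test for a difference in means.

    Hypothetical helper: returns the difference in means
    (treatment - control), the z statistic and the p-value under a
    normal approximation, which is reasonable for large samples.
    """
    diff = mean(treatment) - mean(control)
    # Standard error of the difference, using unpooled sample variances.
    se = sqrt(variance(control) / len(control)
              + variance(treatment) / len(treatment))
    z = diff / se
    # Probability of a statistic at least this extreme if the means are equal.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# Made-up outcomes (1 = customer converted, 0 = did not), for illustration only.
control = [1] * 100 + [0] * 900      # group that did not receive the email
treatment = [1] * 150 + [0] * 850    # group that received the email

diff, z, p_value = two_sample_z_test(control, treatment)
print(f"difference={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely to be due to chance alone. For small samples, a t-test would be the more appropriate choice.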

Author

Wendy Navarrete


Languages

Jupyter Notebook 86.8%, HTML 13.2%