Elicilla's repositories

ds_study_cases

In this repository I will post reproductions of data science case studies and other material in R, and possibly in Python.

Stargazers: 0 | Issues: 0

MIT-xPRO-DSxCase-Study-2.2-Gender-Wage-Gap

Case Study 2.2: Gender Wage Gap
Instructor: Victor Chernozhukov
Activity Type: Optional Case Study

Description: Estimate the difference in predicted wages between men and women with the same job characteristics.

Why this Case Study? Participants can pose an economic question and investigate it using a linear regression model.

Self-Help Package Contents: The video that covers this case study is in Module 2, Segment 1.6. Self-help-package.zip contains:
Codebook.txt: the description of job-relevant worker characteristics.
pay.discrimination.Rdata: the CPS (2012) data on wages and job-relevant worker characteristics, such as experience (exp), gender, and education.
Regression1.6.CaseStudy.R: estimates the gender wage gap, i.e., the difference in predicted wages between men and women with the same job-relevant characteristics. The gap is estimated in two steps: (1) residualizing the outcome (wages) and the covariate of interest (gender) by taking residuals from the corresponding regressions on worker characteristics, and (2) regressing the residualized wages on the residualized gender. Both linear and quadratic specifications are tried at the residualizing step.
Regression.1.6.pdf: the set of slides that describes the estimation technique and presents the results.
.Rhistory
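The two-step estimate described for Regression1.6.CaseStudy.R can be illustrated with a minimal Python sketch. This is not the course's R script: the data are made up, there is only one control (experience), and the names `ols_simple`, `residuals`, `exper`, `female`, and `log_wage` are hypothetical. The toy wages are built with a gap of exactly -0.10 so the result is easy to check.

```python
from statistics import fmean

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Residuals of y after regressing it on the single control x."""
    a, b = ols_simple(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Made-up toy data (an assumption, not the CPS 2012 sample).
exper = [1, 2, 3, 4, 5, 6, 7, 8]       # years of experience
female = [0, 1, 0, 1, 1, 0, 0, 1]      # gender indicator
log_wage = [2.0 + 0.05 * e - 0.10 * f for e, f in zip(exper, female)]

# Step 1: residualize the outcome and the covariate of interest.
res_wage = residuals(exper, log_wage)
res_female = residuals(exper, female)

# Step 2: regress residualized wages on residualized gender.
_, gap = ols_simple(res_female, res_wage)
print(round(gap, 6))  # -0.1, the gap built into the toy data
```

With real data the residualizing regressions would include all job-relevant characteristics (and, for the quadratic specification, their squares and interactions), which needs a multiple-regression routine rather than the one-variable fit used here.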

Stargazers: 0 | Issues: 0

MIT-xPRO-DSxCase-Study-2.1-Predicting-Wage-1

Case Study 2.1: Predicting Wage I
Instructor: Victor Chernozhukov
Activity Type: Optional Case Study

Description: Predict wages using various characteristics of workers and assess predictive performance.

Why this Case Study? Prediction is increasingly important in the age of big data. Participants can apply a simple model from this class and assess its prediction performance.

Self-Help Package Contents: The video that covers this case study is in Module 2, Segment 1.4. Self-Help-Package.zip contains:
Codebook.rtf: the description of job-relevant worker characteristics.
pay.discrimination.Rdata: the CPS (2012) data on wages and job-relevant worker characteristics, such as experience, gender, and education.
Regression1.4.CaseStudy.R: predicts the expected wage given worker characteristics using a linear model with linear and quadratic specifications. It also evaluates the performance of the predictor by R-squared and mean squared error, with and without sample splitting.
Regression.1.4.pdf: the set of slides describing the wage prediction model.
.Rhistory

Time Required: The time required for this activity varies with your experience in the required programming background. We suggest planning between 1 and 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
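As a rough illustration of evaluating a predictor "with and without sample splitting", the sketch below fits a one-variable linear model on half of a made-up data set and reports mean squared error and R-squared both on the training half and on the held-out half. It is a Python sketch under stated assumptions, not the course's R script; the data and the names `fit_line`, `mse`, `r_squared`, `exper`, and `wage` are hypothetical.

```python
from statistics import fmean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(y, y_hat):
    """Mean squared prediction error."""
    return fmean([(yi - pi) ** 2 for yi, pi in zip(y, y_hat)])

def r_squared(y, y_hat):
    """1 - SS_res / SS_tot, computed on whichever sample is passed in."""
    my = fmean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up toy data (an assumption): experience in years, hourly wage.
exper = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
wage = [10.2, 10.9, 12.1, 12.8, 14.3, 14.9, 16.2, 16.8, 18.1, 19.0]

# Sample splitting: fit on every other observation, test on the rest.
train_x, train_y = exper[0::2], wage[0::2]
test_x, test_y = exper[1::2], wage[1::2]

intercept, slope = fit_line(train_x, train_y)
train_pred = [intercept + slope * x for x in train_x]
test_pred = [intercept + slope * x for x in test_x]

in_sample_mse = mse(train_y, train_pred)
out_of_sample_mse = mse(test_y, test_pred)
out_of_sample_r2 = r_squared(test_y, test_pred)

print("in-sample MSE:", in_sample_mse)
print("out-of-sample MSE:", out_of_sample_mse)
print("out-of-sample R^2:", out_of_sample_r2)
```

In-sample error tends to understate true prediction error because the same observations are used to fit and to score the model; the held-out half gives a more honest estimate.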

Stargazers: 0 | Issues: 0

Spectral-Clustering---Grouping-News-Stories

Case Study 1.3.2: Spectral Clustering - Grouping News Stories
Instructor: Stefanie Jegelka
Activity Type: Optional Case Study

Description: Automatically clustering news stories.

Why this Case Study? Build your own clustering of news stories on the web, similar to how Google News organizes stories by auto-generated topics and groupings.

Self-Help Documentation: This document walks through some helpful tips to get you started with building your own application for automating the clustering of news stories using Spectral Clustering. The tutorial provides examples and some pseudo-code for the following programming environment: Python.
Download Self-Help Documentation

Time Required: The time required for this activity varies with your experience in the required programming background. We suggest planning between 1 and 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.

Have questions? Feel free to discuss the case study with other participants in the Discussion Forum under Module 2 - Case Studies Section.

Stargazers: 0 | Issues: 0

MITXpro-DSx-PCA---Identifying-Faces

DO IT YOURSELF
Case Study 1.2.1: PCA - Identifying Faces
Instructor: Stefanie Jegelka
Activity Type: Optional Case Study

Description: Classifying and identifying human faces.

Why this Case Study? Build your own implementation of an image classification algorithm that helps classify new photos of people. This can help you understand how Facebook is able to suggest, very accurately, whom to tag in a given photo based on people's photos.

Self-Help Documentation: This document walks through some helpful tips to get you started with building your own application for classifying faces in photo images using Principal Component Analysis (PCA). The tutorial provides examples and some pseudo-code for the following programming environment: Matlab.
Download Self-Help Documentation
Download Pictures DataSet

Time Required: The time required for this activity varies with your experience in the required programming background. We suggest planning between 1 and 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.

Stargazers: 0 | Issues: 0

MITXpro-DSx-Genetic-Code

Do It Yourself
Case Study 1.1.1: Genetic Codes
Instructor: Tamara Broderick
Activity Type: Optional Case Study

Description: Using K-means to help figure out that DNA is composed of 3-letter words.

Self-Help Documentation: From this document, you will learn how data visualization can help in genomic sequence analysis. You will start with a fragment of genetic text from a bacterial genome and analyze its structure.
Download Self-Help Documentation

Time Required: The time required for this activity varies with your experience in the required programming background. We suggest planning between 1 and 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.

Have questions? Feel free to discuss the case study with other participants in the Discussion Forum under Module 1 - Case Studies Section.
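One plausible way to prepare genetic text for K-means is to cut it into fragments and describe each fragment by the frequencies of its non-overlapping words of a chosen length. The Python sketch below shows only that feature-building step, for 3-letter words. It is an assumption about the workflow, not a transcription of the self-help document; the sequence is made up, and `word_counts` and `frequency_vector` are hypothetical helper names.

```python
from collections import Counter
from itertools import product

def word_counts(fragment, word_len):
    """Count non-overlapping words of length word_len, ignoring any leftover tail."""
    usable = len(fragment) - len(fragment) % word_len
    return Counter(fragment[i:i + word_len] for i in range(0, usable, word_len))

def frequency_vector(fragment, word_len, alphabet="ACGT"):
    """Relative frequency of every possible word, in a fixed alphabetical order."""
    words = ["".join(p) for p in product(alphabet, repeat=word_len)]
    counts = word_counts(fragment, word_len)
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in words]

# Made-up fragment (an assumption, not real genome data).
fragment = "ATGGCCATTGTAATGGGCCGCTGA"
vec = frequency_vector(fragment, 3)

print(len(vec))                         # 64 possible 3-letter words over A, C, G, T
print(word_counts(fragment, 3)["ATG"])  # 2
```

Each fragment's vector would then be one input point for K-means; repeating the exercise for several word lengths lets you compare how clearly the fragments separate into clusters at each length.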

Stargazers: 0 | Issues: 0

MITIE

MITIE: library and tools for information extraction

Stargazers: 0 | Issues: 0

python-goose

HTML Content / Article Extractor, web scraping library in Python

License: Apache-2.0 | Stargazers: 0 | Issues: 0

DSx

Hands-on tutorials demonstrating the concepts of prediction engineering, feature engineering, and automation in data science.

Stargazers: 0 | Issues: 0

stochasticLDA

Python implementation of Stochastic Variational Inference for LDA

License: GPL-3.0 | Stargazers: 0 | Issues: 0