gonzaferreiro

Gonza Ferreiro Volpi's repositories

Networks_Analysis_plus_Recommendations_system

This project explores the classic MovieLens dataset, first from a network perspective, analyzing the relationships between users and movies. Then, in the main part of the project, we build and evaluate several recommendation systems.

Language: Jupyter Notebook

SQL_practice_and_application

A few simple notebooks that demonstrate basic SQL skills through Pandas.

Language: Jupyter Notebook

NLP_with_20newsgroups

In this brief project we explore a few NLP tools using a scikit-learn dataset and the following vectorization techniques: bag of words, hashing, and TF-IDF.

Language: Jupyter Notebook
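
As a rough illustration of the three techniques named in the description above, here is a minimal sketch in plain Python. It is not taken from the repository: the toy corpus, the bucket count, the CRC32 hash, and the unsmoothed `log(N / df)` weighting are all assumptions chosen to keep the example short.

```python
import math
import zlib
from collections import Counter

# Made-up toy corpus (an assumption, not the 20 newsgroups data).
docs = ["the cat sat", "the dog sat", "the cat ran home"]
tokenized = [d.split() for d in docs]

# Bag of words: one column per vocabulary word, values are raw counts.
vocab = sorted({w for doc in tokenized for w in doc})
bow = [[Counter(doc)[w] for w in vocab] for doc in tokenized]

# Hashing: no vocabulary is stored; each word is hashed into one of
# n_buckets columns, so different words may collide in the same column.
n_buckets = 8  # arbitrary size picked for the illustration


def hashed(doc):
    row = [0] * n_buckets
    for w in doc:
        row[zlib.crc32(w.encode("utf-8")) % n_buckets] += 1
    return row


hashed_rows = [hashed(doc) for doc in tokenized]

# TF-IDF: term counts are down-weighted for words that appear in many
# documents. Plain log(N / df) is used here, with no smoothing.
n_docs = len(tokenized)
df = {w: sum(w in doc for doc in tokenized) for w in vocab}
idf = {w: math.log(n_docs / df[w]) for w in vocab}
tfidf = [[Counter(doc)[w] * idf[w] for w in vocab] for doc in tokenized]
```

In this sketch a word that occurs in every document, such as "the", gets a TF-IDF weight of zero. scikit-learn's own vectorizers apply additional smoothing and normalization by default, so their numbers will differ from these.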

Simple_web_scraper

A simple web scraper that uses a little regex to obtain a list of books and authors.

Language: Jupyter Notebook

Dealing_with_class_imbalance

In this repository you'll find a theoretical introduction to the problem of class imbalance, as well as a notebook with examples of how to use some of the algorithms mentioned in the theoretical guide.

Language: Jupyter Notebook

Market_value_football_players

Final project of my immersive course in Data Science at General Assembly. It combined extensive web scraping with several regression techniques to predict the current market value of football players. Conclusions inside.

Language: Jupyter Notebook

Spark_theoretical_practical_application

In this repository I explore in depth three Spark labs from my immersive course in Data Science, covering some basic MapReduce and Spark SQL operations, as well as a bit of modelling with Spark.

Language: Jupyter Notebook

Basic_skills_with_python

My first Python project, covering data structures, functions, and some statistics/probability to describe and refine a Pokemon gameplay.

Language: Jupyter Notebook

From_job_posts_to_salaries_classification

For this project I explored different machine learning classification models to predict four salary categories for Data Science job posts, using listings from Indeed.co.uk. The goal was to reach an accuracy of 0.8 or higher on both the training and test sets, which means correctly classifying at least 80% of the job posts in each set.

Language: Jupyter Notebook
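
To make the 0.8 accuracy target in the description above concrete, here is a minimal sketch of how accuracy is computed. The `accuracy` helper, the category names, and the ten labels are hypothetical and only serve the illustration; they are not the project's data or code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for ten job posts across four salary categories.
y_true = ["low", "mid", "high", "top", "low", "mid", "high", "top", "low", "mid"]
y_pred = ["low", "mid", "high", "top", "low", "mid", "high", "low", "low", "high"]

score = accuracy(y_true, y_pred)  # 8 of the 10 predictions are correct
meets_goal = score >= 0.8         # the target stated in the description
```

The same check would be run separately on the training set and on the test set, since the stated goal applies to both.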

Random_forest_theoretical_practical_application

This brief project first explores the theoretical background behind Random Forest, then applies it to the Boston Housing dataset.

Language: Jupyter Notebook

gonzaferreiro.github.io

Repository for website creation

Language: CSS, License: MIT

Predicting_house_prices_with_ames_dataset

This project aimed to estimate the sale price of properties based on their "fixed" characteristics, such as neighborhood, lot size, and number of stories. Second, I tried to estimate the value of possible changes and renovations to properties from the variation in sale price not explained by the fixed characteristics. The goal was to estimate the potential return on investment when making specific improvements to properties. This project uses the Ames housing data made available on Kaggle.

Language: Jupyter Notebook