erstome / MSc-Data-Science-Reports

A repo with reports for the assignments of the master's degree in data science

Reports

A repo with reports for the assignments of the master's degree in data science

Courses

1 - Introduction to Data Science
Assignment: Assignment 2020/2021 - Introduction to Data Science
Co-authored with: Klismam Pereira and Luís Reis
Keywords: Health dataset, EDA, ETL, Machine Learning, Classification

2 - Time Series and Forecasting
Assignment: Modelling and forecasting of a bridge bearing displacement time series
Keywords: Time series, Missing values, ARIMA models, Anomaly detection.

3 - Parallel Computing
Assignment 1: All-pairs shortest path problem
Keywords: Distributed memory environment, C, MPI, Graphs, Shortest path problem.

Assignment 2: Ecosystem Simulation
Keywords: Shared-memory environment, C, OpenMP, Ecosystem Simulation.

4 - Computer Vision
Assignment: Computer Vision: The basics of a self-driving car
Co-authored with: Eduardo Morgado
GitHub repo: https://github.com/eamorgado/Car-Self-driving-Simulator
Keywords: Computer Vision, line detection, object detection, self-driving, self-steering.

5 - Machine Learning
Assignment 1: Machine Learning 2020/2021 Assignment 1
Co-authored with: Klismam Pereira, Luís Reis and Vânia Guimarães
Keywords: Machine Learning, Bayes Decision Boundary, Classification, Model Selection, Grid Search

Assignment 2: Machine Learning 2020/2021 Assignment 2
Co-authored with: Klismam Pereira, Luís Reis and Vânia Guimarães
Keywords: Machine Learning, Classification, Toy datasets

6 - Statistics and Data Analysis
Assignment 1: Practical Assignment I: Wine Dataset
Co-authored with: João Ferreira
Keywords: Multivariate Data Analysis, Principal Component Analysis, Factor Analysis, Multidimensional Scaling, Wine Dataset.

Assignment 2: Practical Assignment II: Wine Dataset
Co-authored with: João Ferreira
Keywords: Clustering methods, hierarchical clustering, k-means, Gaussian Mixture Models, Multinomial Logistic Regression, Linear Discriminant Analysis, Lasso, Quadratic Discriminant.

7 - Big Data and Cloud Computing
Assignment 1: Big Data and Cloud Computing: Assignment 1
Co-authored with: Vânia Guimarães
Description: Creation of a web application hosted in a cloud computing platform (Google App Engine).
Keywords: Cloud computing, Web application, BigQuery, Python, Flask, Docker, AutoML, TensorFlow, Spark.

Assignment 2: Dask vs PySpark vs Koalas vs Modin
Co-authored with: João Ferreira, Ricardo Faria and Vânia Guimarães
Keywords: Parallel computing, Dask, PySpark, Koalas, Modin, Joblib, RAPIDS, Performance comparison.

8 - Data Stream Mining
Assignment 1: AutoML for Stream k-Nearest Neighbours - Single-pass Self Parameter Tuning
Co-authored with: Maria Ferreira
Keywords: AutoML, Data Streams, Python implementation, SSPT, kNN.

Assignment 2: Automated Machine Learning for Data Streams
Co-authored with: Maria Ferreira
Description: State of the art of AutoML techniques for Data Streams.
Keywords: AutoML, Data Streams, Survey.

9 - Data-Driven Decision Making
Assignment 1: Facility Location
Co-authored with: Joana Almeida
Keywords: Optimisation, Operational Research, AMPL, Python, k-Center, k-Cover.

Assignment 2: Evaluation Risk Factors in Health Care
Co-authored with: João Carvalho
Keywords: Machine Learning, Classification, Feature Importance

Assignment 3: Improving Kidney Exchange Programs
Co-authored with: João Ferreira and Ricardo Faria
Keywords: Optimisation, Operational Research, Machine Learning.

10 - Advanced Topics in Data Science
Assignment: A classification system to detect questionable information
Co-authored with: Hélder Vieira, Klismam Pereira and Vânia Guimarães
Keywords: NLP, Machine Learning, misinformation, fake news, Twitter, social media.