Kauvin Lucas's repositories
maven-unicorn-challenge
A Python web app consisting of a dashboard, submitted to the "Maven Unicorn Challenge", a visualization challenge by Maven Analytics
spark-kubernetes
This repository contains files used to build images to deploy Spark clusters on Kubernetes
Optimizing-a-Pipeline-in-Azure
The main goal of this project was to build and optimize an Azure ML pipeline using the Python SDK and a provided Scikit-learn logistic regression model to solve a classification problem. HyperDrive was used to tune the model's hyperparameters, and the result was compared to an Azure AutoML run to see which approach produced the better-tuned model.
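HyperDrive's random parameter sampling is conceptually a random search over a hyperparameter space. A minimal plain-Python sketch of that idea follows; the search space, objective function, and trial budget are illustrative assumptions, not taken from the project:

```python
import random

# Hypothetical search space mimicking logistic-regression hyperparameters
# one might tune with HyperDrive (names and ranges are assumptions).
search_space = {
    "C": lambda: 10 ** random.uniform(-3, 2),    # inverse regularization strength
    "max_iter": lambda: random.choice([50, 100, 200]),
}

def evaluate(params):
    # Stand-in for training the Scikit-learn model and returning a metric;
    # this toy objective simply peaks near C = 1.0.
    return 1.0 / (1.0 + abs(params["C"] - 1.0))

def random_search(n_trials=20, seed=42):
    random.seed(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one configuration at random, score it, keep the best.
        params = {name: sample() for name, sample in search_space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best_params, best_score = random_search()
print(best_params, best_score)
```

HyperDrive parallelizes these trials across compute targets and supports early termination, but the core sampling loop is the same.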
Spark-StudyClub
#DataEngineeringLATAM
big-data-science-notes
My notes for each module of Big Data Science, an online course offered by Semantix Brasil
DataCamp-Projects
Notebooks of DataCamp projects
dio-analise-de-dados-com-pandas
This repository presents notebooks of exploratory data analysis and data visualization done in Python with the Pandas and Matplotlib libraries. It answers a challenge from the Digital Innovation One platform.
dio-google-cloud-dataproc
This repository contains the word-count files generated on Google Cloud by a Python script, inside a cloud-managed Big Data ecosystem called Google Dataproc. It answers a challenge from the Digital Innovation One platform.
docker-bigdata
Big Data Ecosystem Docker
fifa18-all-player-statistics
A complete catalog of all the players in FIFA 18 and their full statistics.
jupyter-spark-enem-2019
In this project, I analyzed the scores of ENEM 2019, a standardized test used for admission to Brazilian universities, in the context of socioeconomic disparities between participants. PySpark was used for data ingestion and transformation; Pandas, Statsmodels, Matplotlib/Seaborn/Folium, and Scikit-learn were used for descriptive analysis and data visualization.
Predicting_car_accident_severity
Final project submission for the IBM Data Science Professional Certificate specialization
pyspark-stateful-processing-with-twitter-kafka
A simple project consisting of a stream-processing pipeline built with Apache Kafka, PySpark, and the Twitter Streaming API. It is meant to illustrate the concepts behind stateful processing and event-time processing with Spark Streaming.
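The two concepts this project explores can be sketched without Spark: stateful processing keeps running state (here, per-word counts) across micro-batches, and event-time processing uses a watermark to drop events that arrive too late. The batch contents and watermark lag below are hypothetical stand-ins for the Kafka/Twitter stream:

```python
from collections import defaultdict

# Hypothetical micro-batches of (event_time_seconds, word) pairs,
# standing in for tweets consumed from a Kafka topic.
batches = [
    [(10, "spark"), (12, "kafka"), (11, "spark")],
    [(20, "spark"), (5, "late"), (22, "kafka")],  # (5, "late") arrives late
]

WATERMARK_LAG = 10  # seconds behind the max event time seen so far

def run(batches):
    counts = defaultdict(int)  # state carried across batches, as Spark does
    max_event_time = 0
    for batch in batches:
        # Advance the watermark from the newest event time observed.
        for event_time, _ in batch:
            max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - WATERMARK_LAG
        for event_time, word in batch:
            if event_time >= watermark:
                counts[word] += 1   # within the watermark: update state
            # else: too late, dropped, like a watermarked streaming query
    return dict(counts)

print(run(batches))  # → {'spark': 3, 'kafka': 2}
```

In Spark Streaming the same behavior comes from a watermarked, windowed aggregation; the loop above just makes the bookkeeping explicit.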