Jose Cruzado's repositories
plagiarism_detection
In this repository I use the dataset created by Clough and Stevenson to train a plagiarism detection model. The dataset contains around 100 data points and includes 4 types of plagiarism, ranging from near-copy to heavy revision. A Support Vector Machine is used to classify each text as plagiarised or not.
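The description does not say which features feed the classifier. As a rough sketch only, one feature often used for this kind of task is n-gram containment: the share of n-grams in an answer that also appear in the source text. The function names and the toy sentences below are hypothetical and are not taken from the repository.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the word n-grams in a text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def containment(answer, source, n):
    """Fraction of the answer's n-grams that also occur in the source."""
    answer_counts = ngram_counts(answer, n)
    source_counts = ngram_counts(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer_counts & source_counts).values())
    return shared / total


# Made-up example texts, for illustration only.
answer_text = "the cat sat on the mat"
source_text = "the cat sat on a mat"

unigram_score = containment(answer_text, source_text, 1)
bigram_score = containment(answer_text, source_text, 2)
print(unigram_score, bigram_score)
```

A score near 1 suggests a near-copy, while a heavily revised text should score lower. Scores like these, computed for a few values of n, could serve as inputs to a classifier such as an SVM.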
scraping-SBS
In this repository I use Beautiful Soup to automate information retrieval from the website of the Peruvian financial regulator. The final goal is to calculate and compare the most recent returns and assets of the 4 pension funds that provide services in Peru.
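Once the figures have been scraped, the comparison step might look something like the sketch below. It assumes each fund is represented by a start value and an end value for the period. The fund names and numbers are invented placeholders, not data from the regulator, and the repository may compute returns differently.

```python
def period_return(start_value, end_value):
    """Simple return over a period: (end - start) / start."""
    return (end_value - start_value) / start_value


# Hypothetical (start, end) values for four funds; not real data.
fund_values = {
    "Fund A": (10.0, 10.5),
    "Fund B": (20.0, 20.4),
    "Fund C": (15.0, 15.6),
    "Fund D": (12.0, 11.88),
}

returns = {
    name: period_return(start, end)
    for name, (start, end) in fund_values.items()
}

# Rank the funds from highest to lowest return.
ranking = sorted(returns, key=returns.get, reverse=True)

for name in ranking:
    print(f"{name}: {returns[name]:.2%}")
```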
credit_card_fraud_detection
This project is about detecting fraudulent credit card transactions. The dataset is highly imbalanced, with less than 0.2% of the observations labelled as fraudulent. To address this issue we have to take into account the bank's objective (maximizing precision or recall) and its constraints. The performance and efficiency of several classification algorithms (Logistic Regression, XGBoost, Random Forests) were tested and compared.
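To illustrate why precision and recall matter more than accuracy on data this imbalanced, here is a small self-contained sketch. The labels are synthetic (15 frauds among 10,000 transactions, i.e. 0.15%) and the two "models" are made up for the example. None of it comes from the project's actual data or results.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Synthetic labels: 15 frauds followed by 9,985 legitimate transactions.
y_true = [1] * 15 + [0] * 9985

# Hypothetical baseline that never flags anything.
always_legit = [0] * 10000

# Hypothetical model: catches 12 of 15 frauds and raises 8 false alarms.
model_pred = [1] * 12 + [0] * 3 + [1] * 8 + [0] * 9977

baseline_accuracy = accuracy(y_true, always_legit)
baseline_precision, baseline_recall = precision_recall(y_true, always_legit)
model_precision, model_recall = precision_recall(y_true, model_pred)

print(baseline_accuracy, baseline_recall)
print(model_precision, model_recall)
```

The baseline scores very high on accuracy while catching no fraud at all, which is why the choice between precision and recall has to follow the bank's objective.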
dashboard-gamebox
Dashboard for the gamebox site.
employability-estimation
This project analyzes the employability of students from online bootcamps. The main goal is to predict the number of days it takes students to find a job after completing the program.
gamebox-website
Video game e-commerce site built with JavaScript, Node.js, SQL, and React.