This repository contains the final assignment for the Data Mining course with Professor Dr. Fortuna. It contains the .csv file and the machine learning models that my group mates (Tiger and Ryan) and I developed for a classification task of our choosing. We chose the publicly available breast cancer dataset from the UCR repository.
We evaluated the classification accuracy and error of a linear model, an SVM classifier, and a neural network. We also performed a limited grid search over the hyperparameters of each model and reported the time these tasks took in the final report write-up.
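The comparison described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the code in this repository: it assumes scikit-learn, uses scikit-learn's bundled breast cancer dataset as a stand-in for our .csv, uses logistic regression as the linear model, and searches small, made-up hyperparameter grids. The names `candidates` and `results` are hypothetical.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): scikit-learn's bundled breast cancer dataset,
# which may differ from the .csv shipped in this repository.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical, deliberately small grids; not the grids used in the project.
# Parameter names are prefixed with the pipeline step name that
# make_pipeline derives from the lowercase class name.
candidates = {
    "linear": (
        LogisticRegression(max_iter=1000),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "svm": (
        SVC(),
        {"svc__C": [0.1, 1.0, 10.0], "svc__kernel": ["linear", "rbf"]},
    ),
    "neural_net": (
        MLPClassifier(max_iter=1000, random_state=0),
        {"mlpclassifier__hidden_layer_sizes": [(16,), (32,)]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Scale inside the pipeline so each CV fold is scaled using only its own training part.
    pipeline = make_pipeline(StandardScaler(), model)
    search = GridSearchCV(pipeline, grid, cv=5, scoring="accuracy")

    start = time.perf_counter()
    search.fit(X_train, y_train)  # runs the grid search and refits the best model
    elapsed = time.perf_counter() - start

    accuracy = search.score(X_test, y_test)  # accuracy on the held-out test split
    results[name] = {
        "best_params": search.best_params_,
        "accuracy": accuracy,
        "error": 1.0 - accuracy,
        "seconds": elapsed,
    }
    print(f"{name}: accuracy={accuracy:.3f}, error={1.0 - accuracy:.3f}, "
          f"search time={elapsed:.2f}s, best={search.best_params_}")
```

Timing the whole `fit` call measures the full grid search, including cross-validation and the final refit, which is one reasonable way to compare the cost of tuning each model.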
We used the publicly available Breast Cancer classification dataset from the UCR repository. We noted that it is a relatively clean dataset, so it needed very little preprocessing before ingestion into the classification models.
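To give a sense of how light that preprocessing can be, here is a minimal sketch. It is an illustration under assumptions rather than our exact pipeline: it uses scikit-learn's bundled breast cancer dataset as a stand-in for our .csv, and the steps shown (a missing-value check, a stratified split, and standardization) are typical choices, not a record of what we did.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): scikit-learn's bundled breast cancer dataset.
X, y = load_breast_cancer(return_X_y=True)

# Sanity checks: count missing values and look at the class balance.
n_missing = int(np.isnan(X).sum())
labels, counts = np.unique(y, return_counts=True)
class_counts = {int(label): int(count) for label, count in zip(labels, counts)}
print(f"shape={X.shape}, missing values={n_missing}, class counts={class_counts}")

# Stratified split keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training data only, then apply it to both parts,
# so no information from the test split leaks into training.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Standardization matters mostly for the SVM and the neural network, which are sensitive to feature scale.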
This was one of the first machine learning-based projects my group mates and I got to work on outside of MOOC assignments, and the first of its kind in Python.