About

Data Mining 4MD3 Final Project

This repository contains the final assignment for the Data Mining course taught by Professor Dr. Fortuna. It contains the .csv data file and the machine learning models that my group mates (Tiger and Ryan) and I developed for a classification task of our choosing. We chose the publicly available breast cancer dataset from the UCR repository.

Models

We evaluated the classification accuracy and error of three models: a linear model, an SVM classifier, and a neural network. We also performed a limited grid search over the hyperparameters of each model and reported, in the final write-up, the time these searches took.
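As a rough illustration of a timed grid search, the sketch below uses only the Python standard library. It is not the code from this repository: `timed_grid_search`, `ThresholdClassifier`, and the toy data are hypothetical stand-ins, and the real models would replace the toy classifier.

```python
import itertools
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def timed_grid_search(make_model, param_grid, X_train, y_train, X_val, y_val):
    """Try every combination in param_grid.

    Returns (best_params, best_accuracy, elapsed_seconds).
    """
    names = sorted(param_grid)
    best_params, best_acc = None, -1.0
    start = time.perf_counter()
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        model = make_model(**params)
        model.fit(X_train, y_train)
        acc = accuracy(y_val, model.predict(X_val))
        if acc > best_acc:  # keep the first combination on ties
            best_params, best_acc = params, acc
    elapsed = time.perf_counter() - start
    return best_params, best_acc, elapsed


class ThresholdClassifier:
    """Toy stand-in model: predicts 1 when one feature exceeds a threshold."""

    def __init__(self, feature=0, threshold=0.5):
        self.feature = feature
        self.threshold = threshold

    def fit(self, X, y):
        return self  # nothing to learn in this toy model

    def predict(self, X):
        return [1 if row[self.feature] > self.threshold else 0 for row in X]


# Made-up toy data, not the breast cancer dataset.
X_train = [[0.1, 0.9], [0.4, 0.2], [0.6, 0.8], [0.9, 0.1]]
y_train = [0, 0, 1, 1]
X_val = [[0.2, 0.5], [0.8, 0.5], [0.3, 0.9], [0.7, 0.1]]
y_val = [0, 1, 0, 1]

grid = {"feature": [0, 1], "threshold": [0.25, 0.5, 0.75]}
best_params, best_acc, seconds = timed_grid_search(
    ThresholdClassifier, grid, X_train, y_train, X_val, y_val
)
error_rate = 1.0 - best_acc
print(best_params, best_acc, error_rate, f"{seconds:.4f}s")
```

Any object with `fit` and `predict` methods can be passed as `make_model`, so the same loop could time the search for each of the three model types in turn.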

Dataset

We used the publicly available Breast Cancer classification dataset from the UCR repository. The dataset was already relatively clean, so very little preprocessing was needed before feeding it into the classification models.
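A quick way to confirm that a CSV needs little preprocessing is to count missing cells and class frequencies. The sketch below is a minimal example under stated assumptions: the column names (`id`, `radius`, `texture`, `label`) and the rows in `SAMPLE_CSV` are made up for illustration and are not taken from the dataset in this repository.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in rows; the real file's columns may differ.
SAMPLE_CSV = """id,radius,texture,label
1,14.2,20.1,M
2,11.5,,B
3,12.9,18.4,B
"""


def summarize(csv_file, label_column):
    """Count rows, missing cells per column, and class frequencies."""
    reader = csv.DictReader(csv_file)
    missing = Counter()
    classes = Counter()
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name, value in row.items():
            # Treat absent or blank cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
        classes[row[label_column]] += 1
    return n_rows, dict(missing), dict(classes)


n_rows, missing, classes = summarize(io.StringIO(SAMPLE_CSV), "label")
print(n_rows, missing, classes)
```

To check a file on disk, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.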

Conclusions and Notes

This was one of the first machine learning-based projects my group mates and I got to work on outside of MOOC assignments, and the first of its kind in Python.

About

This was the final project for a master's-level course at McMaster University, Data Mining 4MD3.
