stacyxixi / Supervised-Learning

## Spring 2018 CS7641 Assignment 1: Supervised Learning



### Description


The code reads two CSV data sets, "breast-cancer-wisconsin.csv" and "HTRU_2.csv".
It writes cross-validation curve/heatmap and learning curve images to the directories "exported figures/HTRU_2" and "exported figures/bcdt".
It prints the prediction metrics (AUC, F1 and accuracy scores, and learning and prediction times) for the different learning algorithms to the console.
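
As a rough illustration of the console metrics described above, the sketch below fits one classifier and reports AUC, F1, accuracy, and wall-clock learning and prediction times. It is not taken from the analysis scripts: the `evaluate` helper, the decision tree settings, and the synthetic data from `make_classification` are all assumptions made so that the example runs without the CSV files.

```python
from __future__ import print_function

import time

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate(clf, X_train, y_train, X_test, y_test):
    """Fit clf, then return AUC, F1, accuracy and fit/predict times (hypothetical helper)."""
    start = time.time()
    clf.fit(X_train, y_train)
    learn_time = time.time() - start

    start = time.time()
    y_pred = clf.predict(X_test)
    predict_time = time.time() - start

    # AUC is computed from scores, so use the positive-class probability
    # rather than the hard predicted labels.
    y_score = clf.predict_proba(X_test)[:, 1]
    return {
        "auc": roc_auc_score(y_test, y_score),
        "f1": f1_score(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
        "learn_time": learn_time,
        "predict_time": predict_time,
    }


# Synthetic stand-in data (assumption): the real scripts read the CSV files instead.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

metrics = evaluate(
    DecisionTreeClassifier(max_depth=3, random_state=0),
    X_train, y_train, X_test, y_test,
)
for name in sorted(metrics):
    print("{}: {:.4f}".format(name, metrics[name]))
```

Timing with `time.time()` around `fit` and `predict` is only one simple way to measure learning and prediction cost; the actual scripts may measure it differently.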


### Software Requirements
Python 2.7, matplotlib 2.2.2, numpy 1.15.1, pandas 0.23.4, scikit-learn 0.20rc1, scipy 1.0.0
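
One quick way to compare an environment against the versions listed above is to import each package and print its `__version__`. The snippet below is a minimal sketch, not part of the repository: the `LISTED` table and the `installed_versions` helper are hypothetical names, and "0.20rc1" is an assumed reading of the scikit-learn entry.

```python
from __future__ import print_function

import importlib

# Versions listed above, keyed by import name (scikit-learn imports as "sklearn").
# "0.20rc1" is an assumed reading of the scikit-learn entry.
LISTED = {
    "matplotlib": "2.2.2",
    "numpy": "1.15.1",
    "pandas": "0.23.4",
    "sklearn": "0.20rc1",
    "scipy": "1.0.0",
}


def installed_versions(names):
    """Map each import name to its installed version, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


report = installed_versions(LISTED)
for name in sorted(LISTED):
    print("{:<12} listed {:<8} installed {}".format(name, LISTED[name], report[name]))
```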

### Code
"analyze_pulsar.py": analyzes "HTRU_2.csv", the HTRU2 pulsar data set

"analyze_bcdt.py": analyzes "breast-cancer-wisconsin.csv", the Breast Cancer Wisconsin (Diagnostic) data set


### Other Files
"README.txt": instructions for running the code

"xwang738-analysis.pdf": report file

"breast-cancer-wisconsin.csv": data for the Breast Cancer Wisconsin (Diagnostic) Data Set

"breast-cancer-wisconsin.names": information regarding the Breast Cancer Wisconsin (Diagnostic) Data Set

"HTRU_2.csv": data for the HTRU2 dataset

"HTRU_2_info.txt": information regarding the HTRU2 dataset

"/exported figures": directory for output images

