Epileptic seizures classification based on iEEG signals using machine learning techniques

This repository contains all the content related to my Bachelor's Degree Final Project at AGH University of Science and Technology. The dissertation with a comprehensive description of used methods can be found here.

Overview

The aim of the project is to predict epileptic seizure based on iEEG signal using classical Machine Learning as well as Deep Learning techniques. The challenge is to distinguish between 10 min long data clips covering an hour prior to a seizure (preictal state), and 10 min iEEG clips of interictal activity (interictal state). The problem boils down to binary classification on imbalanced data.

Description of the challenge in details: Kaggle competition

Data

The data is provided by Melbourne University and can be accessed via website https://www.epilepsyecosystem.org/.

Publication regarding NeuroVista Seizure Prediction Data collection: https://doi.org/10.26188/5b6a999fa2316

Once you have access to dataset, put MATLAB train and test data for each Patient into data/raw folder.

Test set labels and public test names can be updated in data/labels folder.

Project structure

The directories are listed below.

.
├── config_dir          # configuration files directory 
├── data 
│   ├── labels          # test set labels
│   ├── processed       # processed data files
│   └── raw             # original data files (.mat)
├── results
│   ├── logs            # TensorBoard logs for ConvNet
│   └── plots           # saved plots
└── scripts
    ├── EDA             # Exploratory Data Analysis scripts e.g. corruption detection
    ├── models          # proposed models for data classification
    ├── preprocessors   # preprocessing methods and features extraction
    └── thesis          # scripts to generate plots for dissertation

Installation

The code was written in Python 3.8. Use the package manager pip to install requirements. Virtual environment is recommended.

pip install -r requirements.txt

Usage in Linux

Customize project configuration settings that are stored in config.py.
Find corrupted files:

cd scripts/EDA
python3 corruption_detect.py --cfg config_dir.config

Filenames to drop before further analysis will be saved in result directory.

Preprocess raw data.

preprocess data:

cd scripts/preprocessors
python3 preprocess_to_specgram.py --cfg config_dir.config

Preprocessed data will be saved in data/processed directory.

Make classification

train CNN model:

cd scripts/models
python3 CNN_train.py --cfg config_dir.config

run K-nearest neighbors classifier:

cd scripts/models
python KNN_train.py --cfg config_dir.config

Results of classification are displayed in the terminal.

Optimization mode

CNN hyperparameters and KNN parameter K can be optimized using Hyperopt library. For this purpose change 'optim_mode' to True in configuration file.

Optimization is implemented in CNN_optim.py and KNN_optim.py with predefined search space.

Author

Aleksandra Pestka

Faculty of Physics and Applied Computer Science
AGH UST University of Science and Technology

Cracow, January 2021

AlexandraZajdel / Epilepsy-seizure-prediction-BSc