Convert CAP Dataset from EDF to CSV and reduce its size to 10% of original data
Steps:
- Download the dataset from https://physionet.org/content/capslpdb/1.0.0/ (~40.1 GB).
- Extract the ZIP.
- Put minifier.py, files.py and requirements.txt in the dataset folder (inside /cap-sleep-dataset-1.0.0/).
- Install the requirements (pip install -r requirements.txt).
- Run minifier.py.
- Wait for 1000 years.
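
Since a run takes so long, it may be worth checking the setup before starting. The following is only a minimal sketch and not part of the project: `preflight`, `REQUIRED_FILES` and `MIN_FREE_BYTES` are hypothetical names, and it assumes the .edf files sit directly in the dataset folder next to the three scripts. The 100 GB threshold is the figure from the Notes.

```python
import shutil
from pathlib import Path

# Files the steps above say must be placed in the dataset folder.
REQUIRED_FILES = ("minifier.py", "files.py", "requirements.txt")
# Assumption: 100 GB (decimal), taken from the Notes section.
MIN_FREE_BYTES = 100 * 10**9


def preflight(dataset_dir):
    """Return a list of problems found; an empty list means nothing was flagged."""
    folder = Path(dataset_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (folder / name).is_file():
            problems.append(f"missing {name}")
    # Assumption: the EDF recordings live directly in this folder.
    if not any(folder.glob("*.edf")):
        problems.append("no .edf files found")
    free = shutil.disk_usage(folder).free
    if free <= MIN_FREE_BYTES:
        problems.append(f"only {free / 10**9:.1f} GB free")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("Problem:", problem)
```

Run it from inside the dataset folder; if it prints nothing, the layout and free space look fine under the assumptions above.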
Notes:
- Make sure you have more than 100 GB of free disk space.
- Edit files.py if you want to work with only part of the dataset. By default, it converts all EDF files.
- Processing takes a very long time. If you leave the computer running, make sure it doesn't go to sleep automatically after a while.
- This code truncates each CSV to 10% of the original EDF data. For example, brux1.edf converted to brux1.csv has 7342592 rows, but the minified file includes only the first 734259 rows.
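
The 10% cut in the last note can be illustrated as follows. This is a rough sketch, not the actual code in minifier.py: `rows_to_keep` and `truncate_csv` are hypothetical helpers, every line of the file (including any header line) is counted as a row, and it assumes no field contains an embedded newline.

```python
from itertools import islice


def rows_to_keep(total_rows, percent=10):
    """Number of leading rows to keep, rounded down to a whole row."""
    return total_rows * percent // 100


def truncate_csv(src_path, dst_path, total_rows, percent=10):
    """Copy only the first `percent` of lines from src_path to dst_path."""
    keep = rows_to_keep(total_rows, percent)
    # newline="" leaves line endings untouched on both read and write.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        dst.writelines(islice(src, keep))
    return keep


if __name__ == "__main__":
    # The brux1 example from the note above.
    print(rows_to_keep(7342592))  # 734259
```

Integer arithmetic is used on purpose so the row count is always rounded down, which matches the 7342592 to 734259 figure in the note.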