soukudom / PerQoDA

Dataset Quality Assessment with Permutation Testing


PerQoDA


Contact person: Katarzyna Wasielewska, email: k.wasielewska@ugr.es

Last update: 11.09.2022


Description

The PerQoDA software tests the quality of a dataset using permutation testing. The method is based on the network classification problem and requires a labeled dataset.

You can use different supervised ML techniques and performance metrics.
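The general idea of a label permutation test can be sketched as follows. This is a minimal illustration and not the PerQoDA implementation: the nearest-centroid classifier, the accuracy metric, the even/odd holdout split, the toy data, and the function names `nearest_centroid_accuracy` and `permutation_test` are all assumptions chosen to keep the example dependency-free. The score obtained with the true labels is compared against scores obtained after shuffling the labels; a small p-value suggests that the labels carry real information about the data.

```python
import random

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit a nearest-centroid classifier (stand-in model) and return test accuracy."""
    sums, counts = {}, {}
    for x, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    hits = sum(predict(x) == label for x, label in zip(X_test, y_test))
    return hits / len(y_test)

def permutation_test(X, y, n_permutations=200, seed=0):
    """Return (true_score, permuted_scores, p_value) for a label permutation test."""
    rng = random.Random(seed)
    train = list(range(0, len(X), 2))  # even indices: training set (assumed split)
    test = list(range(1, len(X), 2))   # odd indices: test set

    def score(labels):
        return nearest_centroid_accuracy(
            [X[i] for i in train], [labels[i] for i in train],
            [X[i] for i in test], [labels[i] for i in test])

    true_score = score(y)
    permuted_scores = []
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)          # break the link between samples and labels
        permuted_scores.append(score(shuffled))

    # Share of permutations scoring at least as well as the true labels,
    # with the usual +1 correction so the p-value is never exactly zero.
    at_least = sum(s >= true_score for s in permuted_scores)
    p_value = (at_least + 1) / (n_permutations + 1)
    return true_score, permuted_scores, p_value

# Toy data (hypothetical): two well-separated Gaussian clusters.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)] + \
    [[data_rng.gauss(6, 1), data_rng.gauss(6, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

true_score, permuted_scores, p_value = permutation_test(X, y)
print(f"true accuracy: {true_score:.3f}, p-value: {p_value:.4f}")
```

Any supervised model and any performance metric could replace the stand-ins used here, as long as the same scoring procedure is applied to both the true and the shuffled labels.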

Output examples

Score table visualisation

p-value table

The slope

Papers

Camacho, J., Wasielewska, K., Dataset Quality Assessment in Autonomous Networks with Permutation Testing. 7th IEEE/IFIP International Workshop on Analytics for Network and Service Management (AnNet), Budapest, 2022.

Dataset Quality Assessment with Permutation Testing Showcased on Network Traffic Datasets. TechRxiv, 2022.

More info

We used the Weles tool published at https://github.com/w4k2/weles. Data shuffling requires a slight modification of the original code by adding a protocol that supports shuffling methods. Contact us if you need help.

About


License: GNU General Public License v3.0


Languages

Python 67.4%, Jupyter Notebook 32.6%