micaelCZ / Encrypted-Traffic-Classification-with-Deep-Learning

A repository with models for encrypted traffic classification.

Encrypted-Traffic-Classification-with-Deep-Learning

Abstract: This work studies how to classify encrypted network traffic using artificial intelligence techniques, with a focus on deep learning. Three experimental scenarios were defined, and three models were tested in each: a convolutional neural network (CNN), Random Forest (RF), and a support vector machine (SVM). The models were trained on the ISCXTor2016 and UNSW-NB15 datasets, chosen because they contain data relevant to traffic classification, such as IP addresses, traffic type, and whether the traffic is legitimate or comes from an attack. Data collection, preprocessing, and feature extraction were performed before modeling and training. Finally, the models were evaluated and their performance was compared. In scenario 1, CNN obtained the best score, with an accuracy of 99.91%. In scenario 2, CNN reached an accuracy of 76.38%, again surpassing the other models. In scenario 3 the results differ considerably: RF and SVM obtain relatively better accuracy and precision scores than CNN. The work concludes that CNN is the most suitable model for Tor/non-Tor traffic classification and for multiclass (network traffic type) classification, but has slight shortcomings in classifying traffic as legitimate or non-legitimate. This may be due to a limitation of the proposed CNN model, or to factors such as overfitting in the traditional models.
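The abstract compares models on both accuracy and precision. As a minimal sketch of how those two metrics are computed from predictions, the snippet below uses plain Python and small hand-made label lists. The names `model_a` and `model_b` and all label values are hypothetical and are not taken from the repository's notebooks or its datasets.

```python
from typing import Dict, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """TP / (TP + FP) for the positive class; 0.0 if nothing was predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels for a binary scenario: 1 = Tor traffic, 0 = non-Tor traffic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from two placeholder models.
predictions: Dict[str, Sequence[int]] = {
    "model_a": [1, 0, 1, 1, 0, 0, 1, 1],
    "model_b": [1, 0, 0, 1, 0, 0, 0, 0],
}

scores: Dict[str, Tuple[float, float]] = {
    name: (accuracy(y_true, y_pred), precision(y_true, y_pred))
    for name, y_pred in predictions.items()
}

for name, (acc, prec) in scores.items():
    print(f"{name}: accuracy={acc:.3f} precision={prec:.3f}")
```

In this toy data the model with the higher accuracy has the lower precision, which is one reason to report both metrics when comparing classifiers.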


Languages

Language:Jupyter Notebook 100.0%