jeromeku / networktraffic-KDD-tutorial

KDD 2022 Tutorial

Deep Learning for Network Traffic Data

Abstract

Network traffic data is key to addressing several important cybersecurity problems, such as intrusion and malware detection, and network management problems, such as application and device identification. However, it poses several challenges for building machine learning models. Two main challenges are the reliance on manual feature engineering and the scarcity of training data due to privacy and security concerns. In this tutorial, we provide a comprehensive review of recent advances that address these challenges through the use of deep learning. Network traffic data can be cast as multivariate time-series (sequential) data, attributed graph data, or image data to leverage the representation learning architectures available in deep learning. To preserve data privacy, generative methods, such as GANs and autoregressive neural architectures, can be used to synthesize realistic network traffic data.
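As a rough illustration of the three representations mentioned above, the following minimal sketch casts a handful of packet records as per-flow time series, as an attributed graph, and as a small byte image. The record layout, the sample values, and the helper names (`to_time_series`, `to_graph`, `to_image`) are assumptions made for illustration only and are not taken from the tutorial material.

```python
from collections import defaultdict

# Hypothetical packet records: (timestamp_seconds, src_host, dst_host, size_bytes).
packets = [
    (0.00, "10.0.0.1", "10.0.0.2", 60),
    (0.05, "10.0.0.2", "10.0.0.1", 1500),
    (0.07, "10.0.0.1", "10.0.0.3", 40),
    (0.30, "10.0.0.1", "10.0.0.2", 52),
]

def to_time_series(packets):
    """Per-flow sequence of (size, inter-arrival time) pairs."""
    series = defaultdict(list)
    last_seen = {}
    for ts, src, dst, size in sorted(packets):
        flow = (src, dst)
        gap = ts - last_seen.get(flow, ts)  # 0.0 for the first packet of a flow
        series[flow].append((size, round(gap, 6)))
        last_seen[flow] = ts
    return dict(series)

def to_graph(packets):
    """Directed graph: hosts are nodes, edges carry packet and byte counts."""
    edges = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for _, src, dst, size in packets:
        edges[(src, dst)]["packets"] += 1
        edges[(src, dst)]["bytes"] += size
    return dict(edges)

def to_image(payload, width=4):
    """Zero-pad raw bytes and reshape into rows of `width` pixel intensities (0-255)."""
    padded = payload + bytes(-len(payload) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]
```

Each output could then feed a matching architecture family (for example, a sequence model, a graph neural network, or a convolutional network); which features to keep per packet or per edge is a design choice this sketch does not attempt to settle.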

In particular, our tutorial is organized into three parts: 1) we describe network traffic data, its applications to security and network management, and the associated challenges; 2) we present different deep learning architectures used for representation learning in place of feature engineering of network traffic data; and 3) we describe the use of generative neural models for synthetic generation of network traffic data.

Summary: KDD2022_Tutorial.pdf
Slides: DLNetworkTrafficTutorial.pdf

Tutors' Bio

Manish Marwah is a principal research scientist at Micro Focus. His main research interests are in the broad areas of AI and data science, and their applications to cybersecurity and cyber-physical systems. His research has led to over 65 refereed papers, several of which have won awards, including at AAAI, KDD, and IGCC. He has twice co-organized Data Mining for Sustainability (SustKDD), a workshop at KDD. He has been granted 53 patents. Manish received his Ph.D. in Computer Science from the University of Colorado, Boulder. He has taught graduate data science courses at Santa Clara University as adjunct faculty.

Martin Arlitt is a principal research scientist and research team manager at Micro Focus. His general interests are workload characterization of computer servers, performance evaluation of distributed computer systems, and analyzing network traffic to improve IT security. His ~100 research papers have been cited over 13,000 times (per Google Scholar). He has 46 granted patents. He is an ACM Distinguished Scientist, a senior member of the IEEE, and an adjunct assistant professor at the University of Calgary.
