emilhf / MIDAS

MIDAS: Real-Time Streaming Anomaly Detection in Dynamic Graphs

MIDAS

Microcluster-Based Detector of Anomalies in Edge Streams

Features

  • Finds Anomalies in Dynamic/Time-Evolving Graphs
  • Detects Microcluster Anomalies (suddenly arriving groups of suspiciously similar edges e.g. DoS attack)
  • Theoretical Guarantees on False Positive Probability
  • Constant Memory (independent of graph size)
  • Constant Update Time (real-time anomaly detection to minimize harm)
  • Up to 48% more accurate and up to 644 times faster than state-of-the-art approaches

For more details, please read the paper: "MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams" by Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, and Christos Faloutsos (AAAI 2020).
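To make the constant-memory and constant-update-time claims concrete, here is a minimal Python sketch of the scoring idea. It is not the repository's C++ implementation. It assumes the basic variant without relations (no temporal decay), integer timestamps that start at 1, and an illustrative hash function. The names CountMinSketch and midas_scores are hypothetical. Each edge updates two fixed-size count-min sketches (counts in the current timestamp and counts overall), and the score is a chi-squared statistic comparing the current count with the historical mean.

```python
class CountMinSketch:
    """Fixed-size counter table: memory is rows * buckets, whatever the graph size."""

    def __init__(self, rows=2, buckets=769):
        self.rows = rows
        self.buckets = buckets
        self.table = [[0.0] * buckets for _ in range(rows)]

    def _bucket(self, row, src, dst):
        # Illustrative hash (an assumption): Python's tuple hash, salted by the row index.
        return hash((row, src, dst)) % self.buckets

    def add(self, src, dst, amount=1.0):
        for row in range(self.rows):
            self.table[row][self._bucket(row, src, dst)] += amount

    def estimate(self, src, dst):
        # Collisions can only inflate a counter, so the minimum is the best estimate.
        return min(self.table[row][self._bucket(row, src, dst)]
                   for row in range(self.rows))

    def clear(self):
        for row in self.table:
            for i in range(len(row)):
                row[i] = 0.0


def midas_scores(edges, rows=2, buckets=769):
    """Score each (source, destination, time) edge of a time-sorted stream."""
    current = CountMinSketch(rows, buckets)  # counts within the current timestamp
    total = CountMinSketch(rows, buckets)    # counts over the whole stream so far
    current_time = None
    scores = []
    for src, dst, t in edges:
        if t != current_time:
            # A new timestamp begins: forget the per-timestamp counts.
            current.clear()
            current_time = t
        current.add(src, dst)
        total.add(src, dst)
        a = current.estimate(src, dst)
        s = total.estimate(src, dst)
        if t <= 1:
            scores.append(0.0)  # no history to compare against yet
        else:
            # Chi-squared statistic: current count a versus historical mean s / t.
            scores.append((a - s / t) ** 2 * t * t / (s * (t - 1)))
    return scores
```

In this sketch, an edge that appears once per timestamp keeps a score of 0, while a sudden burst of identical edges within one timestamp produces scores that grow with the size of the burst.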

Use Cases

  1. Intrusion Detection
  2. Fake Ratings
  3. Financial Fraud

Getting Started

  1. Run make to compile the code and create the executable.
  2. Run ./midas -i followed by the path to the input file.

Demo

  1. Run ./demo.sh to compile the code and run it on an example dataset.

Command-Line Options

  • -h --help: produce help message
  • -i --input: input file name
  • -o --output: output file name (default: scores.txt)
  • -r --rows: number of hash functions (default: 2)
  • -b --buckets: number of buckets (default: 769)
  • -a --alpha: temporal decay factor (default: 0.6)
  • --norelations: run MIDAS instead of MIDAS-R
  • --undirected: treat graph as undirected instead of directed
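As an illustration of how these options combine, the following Python sketch assembles an argument list for the executable. The helper name build_midas_command and its keyword arguments are hypothetical; only the flags themselves come from the list above. The sketch assumes that each value-taking flag is followed by its value as a separate argument, and it leaves out any option that is not set so that the program's own defaults apply.

```python
def build_midas_command(input_path, output_path=None, rows=None, buckets=None,
                        alpha=None, norelations=False, undirected=False,
                        executable="./midas"):
    """Return the argument list for one run of the compiled executable."""
    command = [executable, "-i", str(input_path)]
    if output_path is not None:
        command += ["-o", str(output_path)]
    if rows is not None:
        command += ["-r", str(rows)]
    if buckets is not None:
        command += ["-b", str(buckets)]
    if alpha is not None:
        command += ["-a", str(alpha)]
    if norelations:
        command.append("--norelations")
    if undirected:
        command.append("--undirected")
    return command


# Example use (requires the executable built with make):
#   import subprocess
#   subprocess.run(build_midas_command("edges.csv", alpha=0.5), check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when file paths contain spaces.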

Input File Format

MIDAS expects the input edge stream to be stored in a single file containing the following three columns in order:

  1. source (int): source ID of the edge
  2. destination (int): destination ID of the edge
  3. time (int): timestamp of the edge

Thus, each line represents an edge. Edges should be sorted in non-decreasing order of their timestamps, and the column delimiter should be a comma (,).
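The format rules above can be checked before running the detector. The following Python sketch is one way to do that under stated assumptions: the function name read_edge_stream is hypothetical, blank lines are skipped, and any violation is reported as a ValueError. It is a convenience check, not part of MIDAS itself.

```python
import csv


def read_edge_stream(lines):
    """Parse 'source,destination,time' lines and verify they are time-sorted."""
    edges = []
    last_time = None
    for line_number, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # assumption: blank lines are ignored
        if len(row) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns, got {len(row)}")
        # int() raises ValueError itself if a field is not an integer.
        src, dst, t = (int(field) for field in row)
        if last_time is not None and t < last_time:
            raise ValueError(f"line {line_number}: timestamp {t} is earlier than {last_time}")
        last_time = t
        edges.append((src, dst, t))
    return edges
```

The function accepts any iterable of text lines, so an open file can be passed directly, for example `with open("edges.csv", newline="") as f: edges = read_edge_stream(f)`, where edges.csv is a placeholder file name.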

Datasets

  1. DARPA: Original Format, MIDAS format
  2. TwitterWorldCup2014
  3. TwitterSecurity

MIDAS in Other Languages

  1. Rust and Python by Scott Steele
  2. Ruby by Andrew Kane
  3. R by Tobias Heidler

Online Articles

  1. KDnuggets: Introducing MIDAS: A New Baseline for Anomaly Detection in Graphs
  2. Towards Data Science: Controlling Fake News using Graphs and Statistics
  3. Towards Data Science: Anomaly detection in dynamic graphs using MIDAS
  4. Towards AI: Anomaly Detection with MIDAS

Citation

If you use this code for your research, please consider citing our paper.

@article{bhatia2019midas,
  title={MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams},
  author={Bhatia, Siddharth and Hooi, Bryan and Yoon, Minji and Shin, Kijung and Faloutsos, Christos},
  journal={arXiv preprint arXiv:1911.04464},
  year={2019}
}


Webpage https://www.comp.nus.edu.sg/~sbhatia/  ·  Email siddharth@comp.nus.edu.sg  ·  Twitter @siddharthb_

License: Apache License 2.0

