idc9 / NetworkChangePoint

Nonparametric change point detection for growing trees and code to reproduce figures in Banerjee, Bhamidi and Carmichael, 2018

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Nonparametric change point detection in growing networks

This repository contains code to compute the change point estimator developed in Fluctuation bounds for continuous time branching processes and nonparametric change point detection in growing networks and reproduces Figure 1 for this paper. Using the code.ipynb shows how to run the code on your own data. The network model is described briefly below, but see the paper for more details.

Network model

Consider the following change point detection problem where a tree is grown to size n. The parameters of the problem are 0 < gamma < 1 (the change point), and f, g which are two, unknown attachment functions. The tree starts with say one node (v_1). At each time step a new vertex is added to the tree by selecting an existing vertex to attach to (where the edge points from the selected parent vertex to the new vertex). Let out-deg(v_i) denote the out-degree of vertex v_i. For vertices v_k, 2 <= k <= n*gamma, arriving in the first phase of the process, vertex v_k connects to existing vertex v_i (1 <= i < k) with probability proportional to f(out-deg(v_i)). For vertices v_k, n gamma < k <= n, arriving in the second phase of the process, vertex v_k connects to existing vertex v_i with probability proportional to g(out-deg(v_i)).

To summarize, the tree evolves according to (unknown) function f, for the first n gamma steps then according to (unknown) function g for the remaining steps. The goal is then to estimate the unknown change point gamma.

When there is no change point, this model referred to as a nonuniform random recursive tree. A famous special case of this model is preferential attachment.

Figure 1

First run

sh run_simulation

to simulate the trees and compute d_n(m) (the results are saved in data/). Then run Figure 1.ipynb to create the figures. The first part is a bit slow and I suggest parallelizing the simulation across a bunch of cluster nodes.

About

Nonparametric change point detection for growing trees and code to reproduce figures in Banerjee, Bhamidi and Carmichael, 2018


Languages

Language:Jupyter Notebook 97.8%Language:Python 2.2%Language:Shell 0.0%