Private repo for my thesis
- As long as all nodes are not activated randomly select a node and simulate spread.
- Count number of randomly selected nodes
- Repeat 1. and 2. 10,000 times
- Average number of randomly selected nodes is an approximation of k = |S*|, size of optimal seed set
- Pre-processing
- Compute inf. score of all nodes using MC simulations
- Compute inf. threshold as inf. score of top 10% of influencers
- Live
- update activation probability
- update influence threshold
- Pure random
- Randomly select any user in G = (V, E) with repetition until |V| users were randomly selected
- Random decay
- Randomly select any user in G = (V, E), with repetition but previously selected user cannot be chosen again until x number of other users have been selected. Stop when |V| users were randomly selected
Date time is displayed in red when program starts and ends.
Choose a dataset (nodes, edges):
[test] (20, 30])
[nethept] (15_229, 62_752)
[dblp] (317_080, 2.09M)
[youtube] (1.13M, 5.97M)
[orkut] (3.07M, 234.37M)
[livejournal] (3.99M, 69.36M)
[twitter] (41.65M, 1.46B)
> dataset name: __
Choose a stage:
[1] pre-process
[2] live
[3] IMM seed set test
[4] test
[5] EXIT PROGRAM
> choice: __
Depends on selected stage. Input requests are displayed one at a time and only disappear after a short delay once all inputs are in.
Input pre_process arguments
> depth (<current default>): __
> minimum weight (<default>): __
Input live arguments
> seed size (<default>): __
> reach (<default>): __
> activation probability threshold (<default>): __
> stream model [rand_repeat, rand_no_repeat](<default>): __
> stream version [1, 2, 3](<default>): __
> stream size (<default>): __
Not yet implemented
When a process finishes running this menu is displayed to offer the user a chance to run another operation without having to reload the dataset.
Continue:
[1] Repeat previous stage with same arguments
[2] Repeat previous stage with new arguments
[3] Change stage
[4] Change dataset
[5] End Program
> choice: __
(example)
nethept/
. imm/
. . basic/
. . . NE_k50_e0,5_ss50.txt
. . live
. . . NE_k50_e0,5_urr_v1_s15229_ss25.txt
. . . progress/
. . . . NE_k50_e0,5_urr_v1_s15229_prg.txt
. . common
. . . NE_k50_e0,5_k50_r5_ap0,8_urr_v1_s15229_ss50.txt
. rtim/
. . pre_process/
. . . NE_d2_m0_infS.txt
. . . NE_infS.txt
. . live/
. . . NE_r1_ap0,8_urr_v1_s15229_ss[seed_set_size].txt
. . . NE_r5_ap0,8_urr_v1_s15229_ss50.txt
. . . progress/
. . . . NE_r1_ap0,8_urr_v1_s15229_prg.txt
. streams/
. . uniform_rand_repeat/
. . . v1/
. . . . NE_urr_v1_s15229_st.txt
. . . v2/
. . . . NE_urr_v2_s15229_st.txt
. logs/
. . imm.log
. . rtim_pre.log
. . rtim_live.log
. . seed_set_scores.log
. . common.log
. . imm.csv
. . rtim_pre.csv
. . rtim_live.csv
. . seed_set_scores.csv
dataset | abbreviation | nodes | edges |
---|---|---|---|
test | TS | 20 | 38 |
nethept | NE | 15_229 | 62_752 |
DBLP | DB | 317_080 | 2_099_731 |
Youtube | YO | 1_134_962 | 5_975_247 |
LiveJournal | LJ | 3_997_441 | 69_362_377 |
Orkut | OR | 3_072_441 | 234_370_165 |
TW | 41_652_230 | 1_468_365_182 |
keyword | abbreviation | description |
---|---|---|
infScores | infS | Associated file contains user influence scores |
seedSet | ss | Associated file contains list of user of a given seed set |
stream | st | Associated file is a list of users simulating a stream of users connecting to the Internet |
uniformily random repeat | urr | Stream was generated by choosing users uniformily at random with possible repitition |
uniformily random no repeat | urnr | Stream was generated by choosing users uniformily at random without possible repitition |
parameter | abbreviation | description |
---|---|---|
seed set size | k[value] | size of seed set we want to generate |
epsilon | e[value] | parameter to determine precision of result |
parameter | abbreviation | description |
---|---|---|
depth | d[value] | Maximum exploration with which we computed the influence score during the first stage |
minimum weight | m[value] | minimum edge weight necessary to consider random walk |
reach | r[value] | the top % of users we would like to target during the live stage (helps determine the influence score threshold) |
activation probability | ap[value] | the activation probability threshold at which we consider a user to be activated during the live stage |
max seed set size | k[value] | the maximum seed set size we want to generate during the live stage |