matenure / FastGCN

The sample code for our ICLR18 paper "FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling"

Running on Reddit dataset is extremely slow

cai-lw opened this issue · comments

I downloaded the processed Reddit dataset from #8 (comment) and ran train_batch_multiRank_inductive_reddit_Mixlayers_sampleA.py with default parameters. A single epoch takes about 10 minutes, whereas the paper reports 638.6 seconds for the WHOLE training process. That makes my run ~200x slower than your reported speed.

I am running on an AWS m5.2xlarge instance with the same CPU spec as your machine (8 vCPUs = 4 cores, 8 threads, 2.5GHz). All dependencies were installed with pip.

The default parameter does not do any sampling: main(None).
Change the "None" to 100 or 200.
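
To illustrate why the None default is so much slower, here is a minimal sketch of the difference between a None and an integer sample size. It is not taken from the FastGCN code: the helper name pick_support_nodes, its parameters, and the uniform default weights are all assumptions made for illustration, and the actual script may handle its argument differently.

```python
import random


def pick_support_nodes(num_nodes, rank, weights=None, seed=0):
    """Return the node indices one layer would aggregate over.

    Hypothetical helper, NOT part of the FastGCN repository.
    rank=None -> no sampling: every node is used (full cost per batch).
    rank=int  -> draw `rank` nodes, weighted by `weights` if given.
    """
    nodes = list(range(num_nodes))
    if rank is None:
        return nodes
    rng = random.Random(seed)
    if weights is None:
        # Assumption: uniform importance when no weights are supplied.
        weights = [1.0] * num_nodes
    # Sampling with replacement, proportional to the given weights.
    return rng.choices(nodes, weights=weights, k=rank)


full = pick_support_nodes(10_000, None)    # analogous to main(None)
sampled = pick_support_nodes(10_000, 100)  # analogous to main(100)
print(len(full), len(sampled))
```

With sampling enabled, each layer touches only a fixed number of nodes per batch instead of the whole graph, which is where the speedup reported in the paper comes from.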

@matenure It works. Thank you.
Could you change the default behavior of this code, or explain in the README how to change it? The README says this is "the final model", but with the default parameter it is not, since it does no sampling.

Thanks. Your update has been merged.