Parallelized Embedding
bruriah1999 opened this issue
Hey,
I'm trying to process a directed graph with roughly 5 million nodes and 100 million edges.
I've managed to load the graph from a CSV file, and I get a very nice Graph object (within 5 minutes).
I'm now trying to embed the graph with grape.embedders.Node2VecSkipGramEnsmallen, but it doesn't seem to finish; I've let it run for over 10 hours.
To make it faster, I enabled the Graph's vector_source, vector_cumulative_node_degree and vector_reciprocal_sqrt_degrees.
From reading your paper, it seems that the embedding process can be parallelized, but I can't find a way to do that.
I'd appreciate it if you could describe which parts of the embedding process are parallelized, and how I can make it run in parallel.
Thank you,
Bruria.
Hi! I am not sure I understand the issue you have encountered. Would you be available to do a short call to investigate this?
If so, I am available on the GRAPE Telegram group and Discord channel to set it up.
I'm trying the same with 5.24M nodes and 13.89M edges.
Where can I find the Telegram group or Discord?
Both are available in the README.
Hi @bruriah1999, were you able to solve your issue?