PageRank vs your formulation of PageRank
mdanb opened this issue · comments
Hi,
Thank you for this interesting work! I just have a few clarification questions about the formulation of the (original) PageRank that you presented in the paper. You mention that PageRank can be calculated via r = Ar, but from what I understand from PageRank, this isn't the complete formulation of PageRank. PageRank also includes terms that take into account what they call "rank sinks." I'm wondering why this is missing in your equations.
Thank you!
Indeed, regular PageRank includes a globally uniform teleport vector to counteract so-called spider traps (or rank sinks). However, this is only needed for directed graphs. In this work we only consider undirected graphs. Furthermore, PPNP is based on personalized PageRank (PPR), for which the teleport vector uses only the root node. This already counteracts spider traps and thus no global teleport is needed.
Does this answer your question?
APPNP leverages that we don't need one PPR vector for every node, but only one PPR vector per class. This is just a matter of interpretation, though. The main difference between PPNP and APPNP is that APPNP doesn't exactly calculate the PPR vectors, but instead approximates them via power iterations.
You can actually go even further with this PPR approximation, which then leads to the massively scalable PPRGo model: https://arxiv.org/abs/2007.01570