Tiiiger / SGC

official implementation for the paper "Simplifying Graph Convolutional Networks"


node embeddings from SGC

titaniumrain opened this issue · comments

commented

Quick question. It looks like SGC does not generate node embeddings. Instead, SGC outputs an N*C matrix, where C is the number of classes. Does this imply that SGC cannot generate node embeddings?

hi @titaniumrain

you can understand the features after propagation, and before the classifier, as the node embeddings. This is A^K X, where A is the (normalized) adjacency matrix and X is the input feature matrix.
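In case it helps, here is a minimal sketch of that precomputation with SciPy (the normalization adds self-loops as in the paper; the name `sgc_embeddings` and the default `k=2` are just illustrative):

```python
import numpy as np
import scipy.sparse as sp

def sgc_embeddings(adj, features, k=2):
    """Propagate features k times with the symmetrically normalized
    adjacency (self-loops added), i.e. return S^K X."""
    adj = adj + sp.eye(adj.shape[0])            # add self-loops
    deg = np.asarray(adj.sum(axis=1)).flatten()
    d_inv_sqrt = sp.diags(np.power(deg, -0.5))  # D^{-1/2}
    s = d_inv_sqrt @ adj @ d_inv_sqrt           # S = D^{-1/2} (A + I) D^{-1/2}
    x = features
    for _ in range(k):
        x = s @ x                               # one propagation step
    return x                                    # N x d "node embeddings"
```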

commented

Hey Tianyi, thank you so much for answering the question promptly, really appreciated. I did consider A^K X as the embedding; however (not trying to be a pain here), it does not fit the classical definition of a node embedding, since the output dimension is fixed to the feature dimension. For example, if the node feature dimension is 128, I cannot get a 64-d node embedding from SGC unless there is an additional step in place. See if this makes sense on your side.

hi @titaniumrain

that is correct.

I am curious: why do you want a lower-dimensional embedding?

Is it because the input features are too high-dimensional (and sparse)? In that case, could you apply PCA or another dimensionality reduction approach after the propagation? Not sure, just throwing out some thoughts.
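For example, a quick sketch with scikit-learn (assuming `propagated` holds the precomputed A^K X from the sketch above; 64 is an arbitrary target dimension, and for sparse inputs `TruncatedSVD` would avoid densifying):

```python
from sklearn.decomposition import PCA

# `propagated` is the precomputed A^K X (N x 128 in your example)
pca = PCA(n_components=64)                  # arbitrary target dimension
embeddings = pca.fit_transform(propagated)  # N x 64 node embeddings
```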

commented

Hi Tianyi, indeed it was because of the feature size. We plan to reduce the dimensionality after the propagation. Since I wanted to follow the equation precisely, I am not 100% sure: are dimensionality reduction methods (applied after the propagation) theoretically solid?

hi @titaniumrain

I think whether the dimensionality reduction step is theoretically solid really depends on the problem you want to solve. It is difficult for me to comment without knowing the problem structure.

commented

Oh... my bad for being brief. We are considering the link prediction (LP) problem, for which it is natural (or even inevitable) to leverage node embeddings. The paper by you and your colleagues focused on the classification experiments and didn't touch on LP performance, hence the motivation at our end. Hope this clarifies things.

In addition, I noticed that StellarGraph (a great lib!) implemented SGC; it loosely treats the learned N*C matrix as the node embedding matrix, which is more or less debatable.

No worries. Again, just throwing out some ideas. In this case, could you learn a low-dimensional embedding layer? For example, learn a single linear layer that projects from the (large) feature dimension to the (low) embedding dimension, using the supervision you have from link prediction.

Unfortunately, I am not very familiar with link prediction, but I assume you are taking the inner product between node embeddings, which is why you are concerned about the computational cost. Learning this linear layer should be very fast because you can still precompute the propagation, which is usually the computational bottleneck. A sketch is below.
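A minimal PyTorch sketch of that idea (everything here, including the `LinkScorer` name, the toy shapes, and the random positive/negative edges, is illustrative rather than code from this repo):

```python
import torch
import torch.nn as nn

class LinkScorer(nn.Module):
    """One linear layer projecting the precomputed S^K X down to a
    low embedding dimension; edge scores are inner products."""
    def __init__(self, feat_dim, emb_dim=64):
        super().__init__()
        self.proj = nn.Linear(feat_dim, emb_dim)

    def forward(self, x, src, dst):
        z = self.proj(x)                      # N x emb_dim node embeddings
        return (z[src] * z[dst]).sum(dim=-1)  # inner-product logit per edge

# toy stand-ins: the propagated features and sampled edge pairs
x = torch.randn(1000, 128)              # precomputed S^K X (done once)
pos = torch.randint(0, 1000, (2, 500))  # observed (positive) edges
neg = torch.randint(0, 1000, (2, 500))  # sampled negative edges

model = LinkScorer(feat_dim=128, emb_dim=64)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(100):
    logits = torch.cat([model(x, *pos), model(x, *neg)])
    labels = torch.cat([torch.ones(500), torch.zeros(500)])
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Since the propagation is precomputed, only the single linear layer is trained, so each epoch is just one matrix multiply plus the edge scoring.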

commented

Yep, what you mentioned is likely what we will end up doing. It is still in the spirit of SGC without totally breaking away from it. Many thanks for your assistance!!!