node embeddings from SGC
titaniumrain opened this issue · comments
Quick question. It looks like SGC does not generate node embeddings. Instead, SGC produces an N×C matrix, where C is the number of classes. Does that imply SGC cannot generate node embeddings?
You can understand the features after propagation, and before the classifier, as the node embeddings. This is A^k X, where A is the (normalized) adjacency matrix and X is the input feature matrix.
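For concreteness, here is a minimal sketch of that propagation step (using NumPy and a dense adjacency matrix purely for illustration; a real implementation would use sparse matrices):

```python
import numpy as np

def sgc_embeddings(adj, features, k=2):
    """Propagate features k times with the symmetrically normalized
    adjacency (with self-loops), i.e. compute S^k X as in SGC."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    # S = D^{-1/2} A_hat D^{-1/2}
    s = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    h = features
    for _ in range(k):
        h = s @ h
    return h                                      # shape N x F, unchanged
```

Note the output keeps the input feature dimension F, which is exactly the point raised below.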
Hey Tianyi, thank you so much for answering the question promptly. Really appreciate it. I did consider A^k X as the embedding; however (not trying to be a pain here), the result does not fit the classical definition of a node embedding, since the output dimension is fixed to the feature dimension. For example, if the node feature dimension is 128, I cannot get a 64-d node embedding from SGC unless there is an additional step in place. See if that makes sense on your side.
That is correct.
I am curious: why do you want a lower-dimensional feature?
Is it because the input features are too high-dimensional (and sparse)? In that case, can you do PCA or another dimensionality-reduction approach after the propagation? Not sure, just throwing out some thoughts.
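A quick sketch of that idea, running PCA (via SVD, to stay in plain NumPy) on the propagated features to reach an arbitrary embedding dimension:

```python
import numpy as np

def pca_reduce(h, d=64):
    """Reduce propagated features h (N x F) to N x d via PCA:
    center the data, then project onto the top-d right singular vectors."""
    h_centered = h - h.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(h_centered, full_matrices=False)
    return h_centered @ vt[:d].T
```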
Hi Tianyi, indeed it was because of the feature size. We plan to reduce the dimensionality after the propagation. Since I was dogmatic about following the equation precisely, I am not 100% sure whether dimensionality-reduction methods (applied after the propagation) are theoretically solid.
I think whether the dimensionality-reduction step is theoretically solid really depends on the problem you want to solve. It is difficult for me to comment without knowing the problem structure.
Oh... my bad for being brief. We are considering the link prediction (LP) problem, where it is natural (or inevitable) to leverage node embeddings. The paper by you and your colleagues focused on classification experiments and didn't touch LP performance, hence our motivation. See if that clarifies things.
In addition, I noticed that StellarGraph (a great lib!) implements SGC, and it loosely treats the learned N×C matrix as the node embedding matrix, which is somewhat debatable.
No worries. Again, just throwing out some ideas: in this case, can you learn a low-dimensional embedding layer? For example, learn a linear layer that projects from the (large) feature dimension to the (low) embedding dimension, using the supervision you have from link prediction.
Unfortunately, I am not very familiar with link prediction, but I assume you are taking the inner product between node embeddings, which is why you are concerned about the computational cost. Learning this linear layer should be very fast, because you can still precompute the propagation, which is usually the computational bottleneck.
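A hypothetical sketch of that suggestion in plain NumPy: learn a single projection matrix W on top of the precomputed propagated features, scoring node pairs by the inner product of their projected embeddings and training with logistic loss on positive/negative edges. The function name, hyperparameters, and toy gradient-descent loop are all illustrative assumptions, not the authors' method:

```python
import numpy as np

def train_link_scorer(h, pos_edges, neg_edges, emb_dim=16, lr=0.1, epochs=200):
    """Learn W (F x emb_dim) so that inner products of projected
    features z = h @ W separate positive from negative edges.
    h: precomputed propagated features (N x F), already S^k X."""
    rng = np.random.default_rng(0)
    n, f = h.shape
    w = rng.normal(scale=0.1, size=(f, emb_dim))
    edges = np.vstack([pos_edges, neg_edges]).astype(int)
    labels = np.concatenate([np.ones(len(pos_edges)), np.zeros(len(neg_edges))])
    for _ in range(epochs):
        z = h @ w                                           # N x emb_dim
        scores = (z[edges[:, 0]] * z[edges[:, 1]]).sum(axis=1)
        p = 1.0 / (1.0 + np.exp(-scores))                   # sigmoid
        g = p - labels                                      # dLoss/dscore
        grad = np.zeros_like(w)
        for (u, v), gi in zip(edges, g):
            # d(z_u . z_v)/dW = h_u z_v^T + h_v z_u^T
            grad += gi * (np.outer(h[u], z[v]) + np.outer(h[v], z[u]))
        w -= lr * grad / len(edges)
    return w
```

Since the propagation is precomputed once, each training step only touches the small W, which matches the point about the propagation being the bottleneck.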
Yep, what you mentioned is likely what we will end up doing. It is still in the spirit of SGC without totally breaking away from it. Many thanks for your assistance!