PyGCL: A PyTorch Library for Graph Contrastive Learning

Home Page: https://PyGCL.readthedocs.io

The augmentation "Node Dropping"

HoytWen opened this issue

Hi, I'm curious about the "Node Dropping" augmentation. Both your implementation and the code published by the authors of GraphCL only isolate the selected nodes; neither removes them from the node feature matrix. In a graph classification task that uses a readout such as summation, the isolated nodes will therefore still affect the final learned representation. Shouldn't we remove the selected nodes from the feature matrix as well, or is isolating them the standard practice for graph augmentation?

Thanks for your interest in our work! For graph-level tasks, we usually aggregate the node-level embeddings using a readout function. As the selected nodes have already been masked, they will not contribute to the node-level embeddings. That said, we usually do not feed node features directly into the final embeddings.

Thanks for your reply. I understand that the selected nodes do not participate in message passing and thus do not affect the other node-level embeddings. However, they still go through the transformation and produce their own node embeddings, because they are not removed from the initial feature matrix. If I understand correctly, for an isolated node the GNN is essentially an MLP layer, and the final readout function sums the embeddings of all nodes, including the selected ones. So I am still unsure how the dropped nodes' features are excluded from the final embeddings.
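To make the concern concrete, here is a minimal sketch in plain Python. It is not PyGCL or GraphCL code: `gnn_layer`, the scalar features, the weight, and the bias are all made up for illustration. It compares a sum readout on a graph where the dropped node is only isolated against one where the node is actually removed.

```python
def gnn_layer(x, edges, weight=0.5, bias=0.1):
    """Toy message-passing layer (hypothetical, scalar features):
    h_i = relu(weight * (x_i + sum of neighbour features) + bias)."""
    agg = list(x)                      # start with each node's own feature
    for src, dst in edges:
        agg[dst] += x[src]             # add incoming neighbour features
    return [max(0.0, weight * a + bias) for a in agg]

# Path graph 0 - 1 - 2 - 3, stored as directed edge pairs.
x = [1.0, 2.0, 3.0, 4.0]
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
dropped = 3                            # the node picked by "node dropping"

# Isolation only: edges touching the node are removed, its feature row stays.
isolated_edges = [(s, d) for s, d in edges if dropped not in (s, d)]
h_isolated = gnn_layer(x, isolated_edges)

# True removal: the feature row is deleted too. Slicing works here only
# because the dropped node is the last one, so no re-indexing is needed.
h_removed = gnn_layer(x[:dropped], isolated_edges)

readout_isolated = sum(h_isolated)     # sum readout over all 4 nodes
readout_removed = sum(h_removed)       # sum readout over the 3 kept nodes

print("isolated node embedding:", h_isolated[dropped])
print("readout gap:", readout_isolated - readout_removed)
```

The embeddings of the kept nodes are identical in both cases; the two readouts differ by exactly the embedding of the isolated node, which is computed from its own feature alone, as an MLP would.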

I understand your concern. It is also possible to mask the dropped nodes completely, for example by masking the feature matrix as well, so that they do not contribute to the graph-level embeddings, but I suspect it would not make much difference. Here we apply node dropping randomly, which means that different nodes are masked in every epoch.
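A rough sketch of what "completely masking" could look like, again in plain Python with made-up helpers (`gnn_layer`, `drop_nodes`, and `masked_sum_readout` are hypothetical, not PyGCL functions). It shows two options: zeroing the dropped rows of the feature matrix, and skipping the dropped nodes in the readout. One caveat under these assumptions: if the layer has a bias term, a zeroed feature row still yields a nonzero embedding, so only the readout mask removes the contribution entirely.

```python
def gnn_layer(x, edges, weight=0.5, bias=0.1):
    """Toy layer (hypothetical): h_i = relu(weight * (x_i + sum_j x_j) + bias)."""
    agg = list(x)
    for src, dst in edges:
        agg[dst] += x[src]
    return [max(0.0, weight * a + bias) for a in agg]

def drop_nodes(x, edges, dropped, zero_features=False):
    """Isolate the dropped nodes; optionally zero their feature rows as well."""
    kept_edges = [(s, d) for s, d in edges
                  if s not in dropped and d not in dropped]
    if zero_features:
        x = [0.0 if i in dropped else v for i, v in enumerate(x)]
    return list(x), kept_edges

def masked_sum_readout(h, dropped):
    """Sum readout that skips the dropped nodes."""
    return sum(v for i, v in enumerate(h) if i not in dropped)

x = [1.0, 2.0, 3.0, 4.0]
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
dropped = {3}

# Option A: zero the feature rows, then use a plain sum readout.
x_zeroed, e_kept = drop_nodes(x, edges, dropped, zero_features=True)
readout_zeroed = sum(gnn_layer(x_zeroed, e_kept))

# Option B: keep the features, but mask the dropped nodes in the readout.
x_iso, e_iso = drop_nodes(x, edges, dropped)
readout_masked = masked_sum_readout(gnn_layer(x_iso, e_iso), dropped)

# With bias = 0 the two options agree; with a bias, option A leaks relu(bias).
readout_zeroed_nobias = sum(gnn_layer(x_zeroed, e_kept, bias=0.0))
readout_masked_nobias = masked_sum_readout(
    gnn_layer(x_iso, e_iso, bias=0.0), dropped)

print("zeroed features:", readout_zeroed)
print("masked readout: ", readout_masked)
```

In a random augmentation, `dropped` would be resampled every epoch, so the set of masked nodes changes from one pass to the next.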