AnacletoLAB / grape

🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations

Why does get_all_node_embedding return a list of embedding matrices?

chansigit opened this issue

Could you:

  1. Provide the code you are using
  2. Mention the embedding method you are using
  3. Describe the behaviour you expect, and why the behaviour you see is unexpected.

I am working with a graph g_dtw:

from grape.embedders import Node2VecGloVeEnsmallen

embedding = Node2VecGloVeEnsmallen(embedding_size=50, walk_length=10, max_neighbours=20).fit_transform(g_dtw)

and trying to get the embedding results:

z1 = embedding.get_all_node_embedding()

only to find that the returned z1 is a list containing two embedding matrices. Why?
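
A minimal sketch of inspecting z1, assuming the returned matrices expose a shape attribute (as both numpy arrays and pandas DataFrames do):

print(type(z1), len(z1))                            # list of length 2 for this embedder
for i, matrix in enumerate(z1):
    print(f"matrix {i}: shape {matrix.shape}")      # (number of nodes, embedding_size)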

I am not sure what else you would have expected - let me break it down for you.

  • Node2Vec is a node embedding approach based on sampling random walks that are then used as input for NLP models such as Word2Vec CBOW and SkipGram, or GloVe, which is the option you chose.
  • All three models, CBOW, SkipGram and GloVe, are characterized by TWO word embeddings. These node embeddings have different interpretations depending on the selected model, which I would characterize as follows:
    • In CBOW, the first embedding is the 'context representation' of a node, while the second embedding is the 'central representation' of a node. The model learns to bind the two embeddings in such a way that the dot product between true contextual nodes and central nodes is maximal.
    • In SkipGram, the situation is similar but inverted: the first embedding is the 'central representation' and the second is the 'context representation' of a node.
    • In GloVe, the first node embedding is interpretable as the source node embedding and the second as the destination node embedding. This is because the model tunes the dot product of the source and destination node embeddings so as to estimate the co-occurrence of the two nodes in the random walks.

I hope this clarifies why these models have two distinct node embeddings with different characteristics.
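
A minimal sketch of how the two matrices could be used downstream, assuming they behave like numpy arrays or pandas DataFrames; keeping only the first matrix or averaging the two are common heuristics for Word2Vec/GloVe-style models, not a requirement of GRAPE:

import numpy as np

first_embedding, second_embedding = embedding.get_all_node_embedding()

# Option 1: keep only the first matrix (e.g. the source/central representation).
z = np.asarray(first_embedding)

# Option 2: average the two representations into a single matrix,
# a frequent heuristic for Word2Vec/GloVe-style embeddings.
z_avg = (np.asarray(first_embedding) + np.asarray(second_embedding)) / 2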

If you are referring to the choice of several other libraries not to make these or other features available, you should ask them.

Thank you so much for the explanation!

I thought only one embedding would be returned.

No worries, I understand that there is some confusion regarding these topics. Could you take a minute to describe to me how the library experience could be improved to help you more intuitively understand what was happening?

A docstring!
I tried to understand the output by myself, but this function does not seem to have a docstring. A description of the returned list would clarify the function's design.
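
Purely as an illustration of the request, a hypothetical docstring for get_all_node_embedding might read as follows (a sketch, not the actual GRAPE source):

def get_all_node_embedding(self):
    """Return all node embedding matrices produced by the fitted model.

    Returns
    -------
    A list of embedding matrices. Models such as CBOW, SkipGram and GloVe
    learn TWO embeddings per node (context/central or source/destination),
    so the list contains one matrix per role.
    """
    ...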

You are so devoted to GRAPE. I will gladly introduce your work to my colleagues.