benedekrozemberczki / karateclub

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Home Page: https://karateclub.readthedocs.io

GL2Vec docs may be inconsistent with the code

dmbaker opened this issue

The docs state:
“The procedure assumes that nodes have no string feature present and the WL-hashing defaults to the degree centrality. However, if a node feature with the key "feature" is supported for the nodes the feature extraction happens based on the values of this key.”

However, the code is:
documents = [
    WeisfeilerLehmanHashing(
        graph, self.wl_iterations, False, self.erase_base_features
    )
    for graph in graphs
]

The code has the WL parameter “attributed” set to False. Set this way, WL will not use “feature” even if it exists. Are the docs incorrect, or does the code need to be changed to allow access to the “feature” of the graph edges? Further, the code in _create_line_graph does not harvest any “feature” from the edges.

Hi, I'm looking into this

So, due to the conversion of the graphs to line graphs, any node feature is lost. I am relatively certain that the text in the class documentation was mistakenly copied from the description of the Graph2Vec class.
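
To illustrate the point above, here is a minimal sketch using plain NetworkX rather than the Karate Club code itself. The toy path graph and its "feature" values are assumptions made up for the example. It shows that the nodes of a line graph are the edges of the source graph and that they carry no attributes, so a "feature" key set on the original nodes is not available afterwards.

```python
import networkx as nx

# Toy input (an assumption for this example): the path 0 - 1 - 2,
# with a "feature" attribute set on every node.
graph = nx.path_graph(3)
nx.set_node_attributes(graph, {0: "a", 1: "b", 2: "c"}, name="feature")

# Each node of the line graph corresponds to one edge of the source graph.
line_graph = nx.line_graph(graph)

# Collect the attribute dictionaries of the line-graph nodes.
node_data = dict(line_graph.nodes(data=True))
print(node_data)

# No line-graph node carries the "feature" key of the original nodes.
has_feature = any("feature" in attrs for attrs in node_data.values())
print(has_feature)
```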

Just so you know: I have updated the documentation accordingly (see pull request #135).

If you change _create_line_graph to:

def _create_line_graph(self, graph):
    r"""Creating the line graph of a graph, keeping edge attributes as node attributes.
    Arg types:
        * **graph** *(NetworkX graph)* - The graph transformed to be a line graph.
    Return types:
        * **line_graph** *(NetworkX graph)* - The line graph of the source graph.
    """
    lgraph = nx.line_graph(graph)
    # Each line-graph node is an edge of the source graph; copy that edge's attributes onto it.
    lgraph.add_nodes_from((node, graph.edges[node]) for node in lgraph)
    # Map each line-graph node to a pair of (integer index, attribute dict).
    node_mapper = {node[0]: (i, node[1]) for i, node in enumerate(lgraph.nodes(data=True))}
    edges = [[node_mapper[edge[0]][0], node_mapper[edge[1]][0]] for edge in lgraph.edges]
    line_graph = nx.from_edgelist(edges)
    # Re-attach the attributes under the integer node labels.
    line_graph.add_nodes_from(node_mapper.values())
    return line_graph

And add the parameter use_node_attribute: Optional[str] = None to the GL2Vec init, like in Graph2Vec; then GL2Vec would support user-defined features.
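
For anyone who wants to try the idea outside the class, here is a standalone sketch of the same transformation. The function name line_graph_with_edge_features and the toy graph are hypothetical and not part of Karate Club. It assumes an undirected NetworkX graph whose edge attribute keys are strings.

```python
import networkx as nx


def line_graph_with_edge_features(graph):
    """Build an integer-labelled line graph whose nodes keep the source edge attributes.

    Hypothetical helper for illustration only; not part of the Karate Club API.
    """
    lgraph = nx.line_graph(graph)
    # Give every line-graph node (an edge of the source graph) an integer index.
    index = {node: i for i, node in enumerate(lgraph.nodes)}

    relabeled = nx.Graph()
    for node, i in index.items():
        # Copy the attributes of the source edge onto the new node.
        relabeled.add_node(i, **graph.edges[node])
    relabeled.add_edges_from((index[u], index[v]) for u, v in lgraph.edges)
    return relabeled


# Toy input (an assumption for this example): the path 0 - 1 - 2 with edge features.
graph = nx.path_graph(3)
nx.set_edge_attributes(graph, {(0, 1): "x", (1, 2): "y"}, name="feature")

lg = line_graph_with_edge_features(graph)
features = sorted(nx.get_node_attributes(lg, "feature").values())
print(features)
```

With the edge values stored as node attributes of the line graph, an attribute-aware WL hashing step could read them from the "feature" key, which is what the proposed use_node_attribute parameter would control.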

Could you please add this to a pull request, and I'll integrate it further as necessary?