THUDM / CogDL

CogDL: A Comprehensive Library for Graph Deep Learning (WWW 2023)

Home Page: https://cogdl.ai

About custom datasets

hhtttyy opened this issue

❓ Questions & Help

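Hello, I'm a GNN beginner and I have a fairly basic question.
I wanted to build my dataset by following the custom dataset guide, but the x in that example carries no node names. My data comes as key-value pairs of the form <node name, feature vector> and <node name, node name>, which I use to create the nodes and edges. How can I include the node names in the input x?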

Hi @hhtttyy,

Thanks for your interest in CogDL. You can encode the <node name> values, for example by using a dict to map the node names to the n integer ids 0 to n-1.
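
A minimal sketch of that encoding, assuming the data arrives as a `{name: feature_vector}` mapping plus a list of `(name, name)` pairs. The helper `encode_graph` and the sample names are hypothetical and not part of CogDL:

```python
def encode_graph(features_by_name, edges_by_name):
    """Map node names to ids 0..n-1 and re-express features and edges with those ids."""
    # Assign ids in the insertion order of the feature mapping.
    name_to_id = {name: idx for idx, name in enumerate(features_by_name)}

    # Row i of x holds the feature vector of the node whose id is i.
    x = [list(features_by_name[name]) for name in name_to_id]

    # Edges as two parallel lists (sources, targets).
    # A node name that has no feature vector raises KeyError here.
    src = [name_to_id[u] for u, _ in edges_by_name]
    dst = [name_to_id[v] for _, v in edges_by_name]
    return name_to_id, x, [src, dst]


features = {"alice": [0.1, 0.2], "bob": [0.3, 0.4], "carol": [0.5, 0.6]}
edges = [("alice", "bob"), ("bob", "carol")]

name_to_id, x, edge_index = encode_graph(features, edges)
# Inverse mapping, to translate per-node results back to names later.
id_to_name = {idx: name for name, idx in name_to_id.items()}
print(name_to_id)
print(edge_index)
```

The node names never enter x itself; x and edge_index would then be converted to tensors before building the graph, and `id_to_name` is kept alongside to map embeddings or predictions back to the original names.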

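Thanks for the quick reply! I managed to build the graph and save it, but I ran into trouble when running experiment.
As I said, my data has only node features and edges, with no node labels. When I train with unsup_graphsage, the loss function cannot be obtained and a NotImplementedError is raised. Is there something wrong with the way I'm using it?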

dataset = MyNodeClassificationDataset(nodefeature=result, edges=edge_index)  # success
experiment(dataset=dataset, model="unsup_graphsage", checkpoint_path="mypaper_unsup_graphsage.pt")
*** Running (`mypaper_data.pt`, `unsup_graphsage`, `node_classification_dw`, `unsup_graphsage_mw`)

Traceback (most recent call last):
  File "/home/hantianyi/Graph/cogdl_dataset.py", line 124, in <module>
    experiment(dataset=dataset, model="unsup_graphsage",checkpoint_path="aminer_unsup_graphsage.pt")
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/experiments.py", line 358, in experiment
    return raw_experiment(args)
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/experiments.py", line 262, in raw_experiment
    results = [train(args) for args in variant_args_generator(args, variants)]
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/experiments.py", line 262, in <listcomp>
    results = [train(args) for args in variant_args_generator(args, variants)]
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/experiments.py", line 144, in train
    dataset_wrapper = dw_class(dataset, **data_wrapper_args)
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/wrappers/data_wrapper/node_classification/node_classification_dw.py", line 7, in __init__
    super(FullBatchNodeClfDataWrapper, self).__init__(dataset)
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/wrappers/data_wrapper/base_data_wrapper.py", line 15, in __init__
    self.loss_fn = dataset.get_loss_fn()
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/datasets/customized_data.py", line 99, in get_loss_fn
    return _get_loss_fn(self.metric)
  File "/home/hantianyi/anaconda3/lib/python3.8/site-packages/cogdl/datasets/customized_data.py", line 27, in _get_loss_fn
    raise NotImplementedError
NotImplementedError

Hi @hhtttyy,

The experiment API runs evaluation by default, which requires node labels. If you only want to generate embeddings, you can try the usage shown here: https://github.com/THUDM/cogdl/blob/master/examples/generate_emb.py
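
For reference, a sketch of that label-free route, modelled on the `generate-emb` pipeline usage in the CogDL README (`pipeline("generate-emb", model="prone")`). The helper names and the toy edge list are hypothetical, and the call signature should be checked against the installed CogDL version:

```python
def to_edge_pairs(src, dst):
    """Turn parallel source/target id lists into [[u, v], ...] rows, one row per edge."""
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    return [[u, v] for u, v in zip(src, dst)]


def generate_embeddings(edge_pairs, model="prone"):
    """Return node embeddings without using labels (assumes cogdl and numpy are installed)."""
    # Imported lazily so to_edge_pairs stays usable without CogDL.
    import numpy as np
    from cogdl import pipeline

    # Build an embedding generator for the chosen model, then feed it the edge list.
    generator = pipeline("generate-emb", model=model)
    return generator(np.array(edge_pairs))


edge_pairs = to_edge_pairs([0, 0, 0, 1, 2], [1, 2, 3, 2, 3])
print(edge_pairs)
# embeddings = generate_embeddings(edge_pairs)  # uncomment where CogDL is installed
```

Since no evaluation step runs here, no y labels and no loss function derived from them are needed.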

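Yes, I've just realized that the NodeDataset I inherited from needs the y labels in order to determine the loss function. But I have no labels, only the embeddings of all the nodes and the connections between them. I'd like to do link prediction with model="unsup_graphsage", dw='embedding_link_prediction_dw'. Can CogDL build a dataset from this kind of data?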