dmlc / dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Home Page: http://dgl.ai

[GraphBolt] TorchData Pytorch support

mfbalin opened this issue · comments

🔨Work Item

IMPORTANT:

  • This template is only for dev team to track project progress. For feature request or bug report, please use the corresponding issue templates.
  • DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.

Project tracker: https://github.com/orgs/dmlc/projects/2

Description

pytorch/pytorch#124907 (comment)

Here, the PyTorch developers say that future versions of PyTorch may not support torchdata properly. That could make it hard for us to support later PyTorch versions.

Previously we were trying to replace torchdata with torch.utils.data for datapipe-related operations, since active development and releases of torchdata have been paused (mentioned here).

So for now, both the PyTorch and torchdata teams are deprecating torchdata?

I don't know the exact details. We need to look into it as it is a crucial dependency.
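One way to start looking into it is to probe which datapipe module is actually importable in a given environment. The snippet below is only a minimal sketch: the helper names `module_available` and `pick_datapipe_backend` are hypothetical, and the candidate order (preferring `torch.utils.data.datapipes` over `torchdata.datapipes`) is an assumption, not what GraphBolt does today.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if module `name` can be located.

    Note: for a dotted name, find_spec imports the parent packages
    (but not the leaf module itself).
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. `torch` not installed) raises
        # ModuleNotFoundError, which is a subclass of ImportError.
        return False


def pick_datapipe_backend(
    candidates=("torch.utils.data.datapipes", "torchdata.datapipes"),
):
    """Return the first importable candidate module name, or None.

    The default candidate order is an assumption made for illustration.
    """
    for name in candidates:
        if module_available(name):
            return name
    return None


if __name__ == "__main__":
    # Prints the chosen module name, or None if neither is installed.
    print("datapipe backend:", pick_datapipe_backend())
```

A check like this could run at import time to emit a clear warning instead of failing deep inside the data loading code when a future PyTorch release drops or changes datapipe support.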

The way we implement DataLoader (https://github.com/dmlc/dgl/blob/658b2086b09bbd76c3d3f488af2b155a1c921052/python/dgl/graphbolt/dataloader.py#L79C7-L79C17) right now isn't perfect. It makes a lot of assumptions that might cause problems later. Once those problems hit, we should redesign it. We held off because torch.data already does a good job, but if we have to, we'll tackle it then.