pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch

Home Page: https://pytorch.org/text

'NoneType' object has no attribute 'Lock'

keremnymn opened this issue · comments

πŸ› Bug

Describe the bug

Looping through the split dataset raises AttributeError: 'NoneType' object has no attribute 'Lock'. This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe).

To Reproduce

Steps to reproduce the behavior:

You can try to execute the function from the official documentation, which is on this page:

# import datasets
from torchtext.datasets import IMDB

train_iter = IMDB(split='train')

def tokenize(label, line):
    return line.split()

tokens = []
for label, line in train_iter:
    tokens += tokenize(label, line)
  • PyTorch Version (e.g., 1.0): torch==2.0.1+cpu
  • OS (e.g., Linux): Linux
  • How you installed PyTorch: pip
  • Python version: 3.11
  • torchtext version: 0.15.2
  • Any other relevant information: I had to install portalocker>=2.0.0; otherwise I got an error telling me to install this package.

I tried both on my local environment and Google Colab. Same error.
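For readers wondering how a missing package turns into this particular message, here is a minimal sketch of the guarded-import pattern. It assumes (not confirmed in this thread) that torchdata falls back to `portalocker = None` when the import fails. The helper `optional_import` and the module name `portalocker_missing_for_demo` are hypothetical and exist only for illustration.

```python
import importlib


def optional_import(name):
    """Return the module if it can be imported, otherwise None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module name that is not installed; it stands in for a
# portalocker that could not be imported when the library was loaded.
locker = optional_import("portalocker_missing_for_demo")

message = None
try:
    # Attribute access on None fails here, before any file is opened.
    locker.Lock("promise.txt", "a+")
except AttributeError as exc:
    message = str(exc)

print(message)
```

If that assumption holds, the error means the lock module was not importable when the data pipeline code was loaded, rather than that anything is wrong with the dataset itself.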

Hi, I have run into the same problem. Have you solved it yet?

Updates?

torch.__version__ = 2.1.0+cpu
torchtext.__version__ = 0.16.0+cpu
sys.version = 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 8 2023, 10:42:25) [MSC v.1916 64 bit (AMD64)]

I got the same problem when running this code:

train, test = IMDB(split=('train', 'test'))
counter = Counter()
for (label, line) in train:
    print(label, line)
    counter.update(tokenizer(line))
vocab = Vocab(counter, min_freq=10, specials=('', '', '', ''))

Then I changed my code to:

imdb_root = r'T:\DeepLearningwithPyTorch_Code\DLwithPyTorch-master\Chapter06\aclImdb'
train, test = IMDB(root=imdb_root, split=('train', 'test'))
counter = Counter()
for (label, line) in train:
    print(label, line)
    counter.update(tokenizer(line))
vocab = Vocab(counter, min_freq=10, specials=('', '', '', ''))

Now the error no longer occurs: 'NoneType' object has no attribute 'Lock'

I resolved this in torchtext 0.16.0 by installing portalocker==2.8.2 and then restarting the kernel of my Jupyter notebook.
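A quick way to confirm that the install landed in the interpreter the notebook is actually running is sketched below. It uses only the standard library; the helper name `package_status` is made up for this example.

```python
import importlib.metadata
import importlib.util
import sys


def package_status(name):
    """Return (importable, installed_version) for the current interpreter."""
    importable = importlib.util.find_spec(name) is not None
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return importable, version


# The interpreter the kernel runs on; pip must install into this one.
print("interpreter:", sys.executable)
print("portalocker:", package_status("portalocker"))
```

If this reports the package as not importable, installing it with the same interpreter (for example `python -m pip install portalocker`) and then restarting the kernel is worth trying. The restart is presumably needed because modules imported earlier in the session keep whatever they saw at import time.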

IMO portalocker should be an explicit dependency, as in #2182.

I just tried following this tutorial:
https://pytorch.org/text/stable/tutorials/sst2_classification_non_distributed.html#data-transformation

I am still getting the same error when iterating over the dataloader:

AttributeError Traceback (most recent call last)
in <cell line: 1>()
----> 1 for i in train_dataloader:
2 print(i)

75 frames
/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in _cache_check_fn(data, filepath_fn, hash_dict, hash_type, extra_check_fn, cache_uuid)
261 os.makedirs(dirname)
262
--> 263 with portalocker.Lock(promise_filepath, "a+", flags=portalocker.LockFlags.EXCLUSIVE) as promise_fh:
264 promise_fh.seek(0)
265 data = promise_fh.read()

AttributeError: 'NoneType' object has no attribute 'Lock'
This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe)

I tried restarting the kernel, but that didn't work!