coastalcph / lex-glue

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

scotus: ValueError: expected sequence of length 64 at dim 2 (got 128)

cooelf opened this issue

Hi, thanks for the awesome repo!

I have encountered an issue when running the scripts for scotus.

[INFO|trainer.py:1164] 2022-05-31 13:09:08,068 >> ***** Running training *****
[INFO|trainer.py:1165] 2022-05-31 13:09:08,068 >> Num examples = 100
[INFO|trainer.py:1166] 2022-05-31 13:09:08,068 >> Num Epochs = 10
[INFO|trainer.py:1167] 2022-05-31 13:09:08,068 >> Instantaneous batch size per device = 8
[INFO|trainer.py:1168] 2022-05-31 13:09:08,068 >> Total train batch size (w. parallel, distributed & accumulation) = 64
[INFO|trainer.py:1169] 2022-05-31 13:09:08,068 >> Gradient Accumulation steps = 1
[INFO|trainer.py:1170] 2022-05-31 13:09:08,068 >> Total optimization steps = 20
0%| | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/cooelf/lex-glue-main/scotus.py", line 490, in <module>
    main()
  File "/home/cooelf/lex-glue-main/scotus.py", line 439, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/cooelf/.local/lib/python3.7/site-packages/transformers/trainer.py", line 1254, in train
    for step, inputs in enumerate(epoch_iterator):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/cooelf/.local/lib/python3.7/site-packages/transformers/data/data_collator.py", line 81, in default_data_collator
    batch[k] = torch.tensor([f[k] for f in features])
ValueError: expected sequence of length 64 at dim 2 (got 128)
0%| | 0/20 [00:00<?, ?it/s]

Process finished with exit code 1

This looks like a data-processing problem. I checked the dimensions of the features but could not find anything unusual.

Could you give me some hints on how to solve it?

Thanks!

Hi @cooelf, can you please make sure that you are using the latest version of the code (I just made a minor change to scotus.py), and let me know if there is still an issue?

Thanks for your prompt update. It works now!
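
For anyone who hits the same error: the traceback shows `default_data_collator` passing a nested list to `torch.tensor`, which fails when the innermost lists in a batch do not all have the same length (here, 64 versus 128 at dim 2). The sketch below is a generic way to spot and repair such ragged features in plain Python. It is not the change that was made to scotus.py; the helper names `segment_lengths` and `pad_segments`, the `pad_id` value, and the toy data are all hypothetical, and it assumes each feature stores `input_ids` as a list of token-id lists, one per segment.

```python
from collections import Counter


def segment_lengths(features, key="input_ids"):
    """Count how often each innermost (token-level) length occurs.

    `features` is assumed to be a list of dicts whose `key` entry is a
    list of segments, each segment being a list of token ids.
    """
    return Counter(len(segment) for f in features for segment in f[key])


def pad_segments(doc, max_segments, max_seg_length, pad_id=0):
    """Pad or truncate one document to exactly max_segments x max_seg_length."""
    fixed = []
    for segment in doc[:max_segments]:
        segment = segment[:max_seg_length]
        fixed.append(segment + [pad_id] * (max_seg_length - len(segment)))
    # Add fully padded segments if the document has too few of them.
    while len(fixed) < max_segments:
        fixed.append([pad_id] * max_seg_length)
    return fixed


# Hypothetical toy batch: one segment has 64 tokens instead of 128.
features = [
    {"input_ids": [[1] * 128, [2] * 128]},
    {"input_ids": [[3] * 128, [4] * 64]},
]

# More than one distinct length here means the batch cannot be stacked.
lengths_before = segment_lengths(features)

repaired = [
    {"input_ids": pad_segments(f["input_ids"], max_segments=2, max_seg_length=128)}
    for f in features
]
lengths_after = segment_lengths(repaired)
```

If `segment_lengths` reports more than one distinct length on the tokenized dataset, the preprocessing step is the place to look; once every segment has the same length, stacking the batch into a tensor no longer raises this `ValueError`.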