Error while generating CoNLL dataset objects for training the model
sanchit-ahuja opened this issue
Hi,
I get an assertion error when trying to generate the data for CoNLL training.
Converting data/conll05/train.prop to conllu format
0%| | 49/989860 [00:01<9:20:53, 29.41it/s]
Traceback (most recent call last):
File "scripts/prop2conllu.py", line 76, in <module>
process(args.prop, args.file)
File "scripts/prop2conllu.py", line 61, in process
sentences.append(prop2conllu(lines[start:i]))
File "scripts/prop2conllu.py", line 41, in prop2conllu
assert len(prds) == len(args)
AssertionError
Can you please have a look at this?
The command that I used to run the script: bash scripts/conll05.sh PTB=ptb SRL=data
@sanchit-ahuja Hi, can you show me some sentence examples?
Here are the first 20 lines of the train.prop file downloaded by the bash script.
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
- * * * *
revitalize (V*) (A0* * *
- (A1*) *) * *
take * (V*) * *
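To narrow down which sentence trips the assertion, a small standalone check can help. The sketch below is not part of the repository. It assumes that train.prop separates sentences with blank lines, that the first column holds the predicate lemma (or "-" for non-predicate tokens), and that each remaining column belongs to one predicate. It also assumes that `assert len(prds) == len(args)` compares those two counts, which the traceback does not confirm. The helper names `split_sentences` and `find_mismatches` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def split_sentences(lines: Iterable[str]) -> Iterator[Tuple[int, List[List[str]]]]:
    """Yield (first_line_number, rows) for each blank-line-separated block."""
    start, rows = 1, []
    for lineno, line in enumerate(lines, 1):
        cols = line.split()
        if cols:
            if not rows:
                start = lineno  # remember where this sentence begins
            rows.append(cols)
        elif rows:
            yield start, rows
            rows = []
    if rows:  # last block when the input does not end with a blank line
        yield start, rows


def find_mismatches(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Return (first_line, n_predicates, n_arg_columns) for inconsistent blocks.

    Assumption: a well-formed block has exactly one argument column per
    predicate, and every row in the block has the same number of columns.
    """
    bad = []
    for start, rows in split_sentences(lines):
        n_prds = sum(1 for cols in rows if cols[0] != "-")
        widths = {len(cols) - 1 for cols in rows}
        n_args = max(widths)
        if len(widths) > 1 or n_prds != n_args:
            bad.append((start, n_prds, n_args))
    return bad


# Possible usage against the file named in the log above:
# with open("data/conll05/train.prop") as f:
#     print(find_mismatches(f)[:5])
```

If the first reported block starts near the top of the file, posting that whole block (up to the next blank line) would show whether the downloaded data or the conversion step is at fault.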
@sanchit-ahuja Sorry for the late reply. I cannot figure out what's wrong without further details.
If you wish, please email me privately so that I can send you the data I've processed.
Sure @yzhangcs, I will email you. Closing this issue.