yzhangcs / crfsrl

[COLING'22] Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".

Home Page:https://aclanthology.org/2022.coling-1.370/

Error while generating CoNLL dataset objects for training the model

sanchit-ahuja opened this issue

Hi,

I get an assertion error when generating the data for CoNLL training.

Converting data/conll05/train.prop to conllu format
  0%|                                                                          | 49/989860 [00:01<9:20:53, 29.41it/s]
Traceback (most recent call last):
  File "scripts/prop2conllu.py", line 76, in <module>
    process(args.prop, args.file)
  File "scripts/prop2conllu.py", line 61, in process
    sentences.append(prop2conllu(lines[start:i]))
  File "scripts/prop2conllu.py", line 41, in prop2conllu
    assert len(prds) == len(args)
AssertionError

Could you please take a look at this?
The command I used to run the script was: bash scripts/conll05.sh PTB=ptb SRL=data

@sanchit-ahuja Hi, can you show me some sentence examples?

Here are the first 20 lines of the train.prop file downloaded by the bash script.

        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        -                       *               *               *               *      
        revitalize            (V*)           (A0*               *               *      
        -                    (A1*)              *)              *               *      
        take                    *             (V*)              *               *  
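
The failing check compares the number of predicates with the number of argument columns, so one way to narrow the problem down is to scan the props file for blocks where the two disagree. The following is a minimal sketch, not the logic of scripts/prop2conllu.py itself. It assumes the standard CoNLL-2005 props layout: sentences separated by blank lines, the first column holding the predicate lemma or "-", and one further column per predicate. The helper names split_sentences and find_mismatches are hypothetical. The excerpt above is truncated, so no conclusion about it is drawn here.

```python
from typing import Iterable, Iterator, List, Tuple


def split_sentences(lines: Iterable[str]) -> Iterator[Tuple[int, List[List[str]]]]:
    """Yield (first_line_number, rows) for each blank-line-separated block."""
    start, rows = None, []
    for lineno, line in enumerate(lines, 1):
        cols = line.split()
        if cols:
            if start is None:
                start = lineno
            rows.append(cols)
        elif rows:
            # A blank line closes the current block.
            yield start, rows
            start, rows = None, []
    if rows:
        # The last block may not be followed by a blank line.
        yield start, rows


def find_mismatches(lines: Iterable[str]) -> List[Tuple[int, int, List[int]]]:
    """Return (first_line_number, n_predicates, column_counts) for suspicious blocks.

    Assumption: a well-formed block has as many argument columns on every row
    as it has rows whose first column is not "-".
    """
    bad = []
    for start, rows in split_sentences(lines):
        n_prds = sum(1 for cols in rows if cols[0] != "-")
        widths = sorted({len(cols) - 1 for cols in rows})
        if widths != [n_prds]:
            bad.append((start, n_prds, widths))
    return bad
```

Passing an open file handle for data/conll05/train.prop to find_mismatches returns the starting line of each block that would trip the assertion under these assumptions. A missing blank line between sentences, for example, would show up as a block with more predicates than argument columns.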

@sanchit-ahuja Sorry for my late reply. I cannot figure out what's wrong without further details.
If you wish, please email me privately so that I can send you the data I've processed.

Sure @yzhangcs, I will email you. Closing this issue.