hpcaitech / ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI

wikiextractor raise BdbQuit

RenyunLi0116 opened this issue

πŸ› Describe the bug

Hi all,
When I run the extraction command from the language BERT example (the extractmodule step):
wikiextractor --json enwiki-latest-pages-articles.xml.bz2
it raises BdbQuit. This seems to be solved here by changing the wikiextractor version to 3.0.4.
However, after that change the example code no longer works, because 3.0.4 does not support --json.

Environment

No response

Hi, I have not encountered this issue before. Which versions of wikiextractor did you try?
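
For anyone reporting versions in a thread like this, one way to read the installed version is sketched below. It only uses the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution name `wikiextractor` is assumed to match what pip installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not present in the current environment.
        return None


if __name__ == "__main__":
    # "wikiextractor" is the assumed distribution name as installed by pip.
    found = installed_version("wikiextractor")
    print(f"wikiextractor: {found or 'not installed'}")
```

Pasting the printed line into the issue, together with the Python version, should make it easier to tell whether the failure is tied to a specific release.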

Hi, the original, default version is 3.0.6, which raises BdbQuit.

With 3.0.4 the error goes away, but --json cannot be used because 3.0.4 does not support that option.

Hi @saleelirenyun, we have updated the preprocessing. Please give it a try, thanks.