erre-quadro / spikex

SpikeX - SpaCy Pipes for Knowledge Extraction

Exception: invalid data, magic number is not correct

MLAlex1 opened this issue

  • spikex version: 0.5.0
  • Python version: 3.6
  • Operating system: Windows 10

Description

Hi, I installed spikex and downloaded enwiki_core. However, when I try to load enwiki_core:

from spikex.wikigraph import load as wg_load
from spikex.pipes import WikiPageX

# load a WikiGraph
wg = wg_load('enwiki_core')

I am getting the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\local\Pathways\nicepaths\lib\site-packages\spikex\wikigraph\wikigraph.py", line 41, in load
    return WikiGraph.load(data_path, meta)
  File "C:\local\Pathways\nicepaths\lib\site-packages\spikex\wikigraph\wikigraph.py", line 81, in load
    wg._wpd = WikiPageDetector.load(data_path)
  File "C:\local\Pathways\nicepaths\lib\site-packages\spikex\wikigraph\wikigraph.py", line 180, in load
    wpd._trie = Trie.from_buff(mmap(bf.fileno(), 0), copy=False)
  File "lib\cyac\trie.pyx", line 1086, in cyac.trie.Trie.from_buff
  File "lib\cyac\trie.pyx", line 1103, in cyac.trie.trie_from_buff
Exception: invalid data, magic number is not correct

The cyac version is 1.3 (the latest one).
Any ideas, please?

How was 'enwiki_core' built? Maybe it was built with an old version of cyac.

There is a bug in from_buff in old versions of cyac.
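
To check whether an old cyac is the one actually being imported, it can help to print the versions installed in the active environment. The snippet below is only a minimal sketch: the helper name installed_versions is hypothetical, and it assumes Python 3.8 or newer, where importlib.metadata is part of the standard library (on the Python 3.6 and 3.7 interpreters mentioned in this thread, the importlib_metadata backport offers the same interface).

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it is not part of spikex or cyac.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages discussed in this thread.
    for name, version in installed_versions(["cyac", "spikex", "spacy", "Cython"]).items():
        print(f"{name}: {version or 'not installed'}")
```

If the reported cyac version is older than expected, the interpreter is probably picking up a stale install from another environment.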

@MLAlex1, a new enwiki_core version has been released. Could you please try to see if that error happens again?

Hi, I am getting the exact same error. I have tried the latest enwiki_core and also built my own enwiki_core using the spikex create-wikigraph command.

cyac version 1.4
Cython version 0.29.28
Python version 3.7.3
spaCy version 3.2.3
spikex version 0.5.2

Edit: A fresh venv with only the above libraries fixed the problem.
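
For anyone wanting to reproduce that fix, here is a rough sketch of building such an environment with Python's standard venv module. The helper names (venv_python, install_command, create_fresh_env) and the directory name are assumptions made for illustration. The pinned versions are the ones listed in the comment above, and whether they install cleanly depends on the platform and Python version.

```python
import os
import subprocess
import venv
from pathlib import Path

# Versions reported in the comment above; adjust as needed.
PINNED = ["cyac==1.4", "Cython==0.29.28", "spacy==3.2.3", "spikex==0.5.2"]


def venv_python(env_dir):
    """Return the path of the interpreter inside a venv directory."""
    env_dir = Path(env_dir)
    if os.name == "nt":  # Windows layout
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"  # POSIX layout


def install_command(env_dir, requirements=PINNED):
    """Build the pip command that installs the pinned packages into the venv."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", *requirements]


def create_fresh_env(env_dir):
    """Create a clean venv (wiping any existing one) and install the pins."""
    venv.create(env_dir, with_pip=True, clear=True)
    subprocess.run(install_command(env_dir), check=True)


# Example usage (hypothetical directory name):
# create_fresh_env("spikex-env")
```

The wikigraph data would still need to be downloaded or rebuilt from inside the new environment before calling wg_load.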