petermr / dictionary

Collection of Wikidata-based dictionaries for scientific annotation and searching

Dictionary search using Python

petermr opened this issue · comments

Python code to read fulltext.xml or the sections and match words using one or more dictionaries.

Approximate approach:

  • glob.glob("/*_methods//*.xml")

  • ElementTree (xml.etree.ElementTree) to get the text of the XML files

  • Search the text in Python after stemming, lowercasing and stopword removal, and return the Wikidata ID and context for each match
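
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not the notebook's code: the function names, the tiny stopword set, the crude plural-stripping "stemmer", the term-to-ID dictionary format and the placeholder IDs are all assumptions made for the example. It matches single-word terms only; a real stemmer (for example from NLTK) could replace normalize().

```python
import glob
import re
import xml.etree.ElementTree as ET

# Assumption: a tiny illustrative stopword set, not a complete list.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "was", "were", "with"}


def normalize(word):
    """Lowercase a word and strip a plural 's' (a crude stand-in for stemming)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def xml_text(source):
    """Return all text in an XML file (path or file object), whitespace-collapsed."""
    root = ET.parse(source).getroot()
    return " ".join(" ".join(root.itertext()).split())


def search_text(text, dictionary, window=3):
    """Return (matched word, ID, context) tuples for single-word dictionary terms.

    `dictionary` is a hypothetical {term: wikidata_id} mapping.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    index = {normalize(term): qid for term, qid in dictionary.items()}
    hits = []
    for i, token in enumerate(tokens):
        key = normalize(token)
        if key in STOPWORDS or key not in index:
            continue
        # Context is the matched token plus `window` tokens on either side.
        context = " ".join(tokens[max(0, i - window): i + window + 1])
        hits.append((token, index[key], context))
    return hits


def search_files(pattern, dictionary):
    """Apply search_text to every XML file matched by a glob pattern."""
    return {
        path: search_text(xml_text(path), dictionary)
        for path in glob.glob(pattern, recursive=True)
    }
```

Usage might look like search_files("**/*_methods/**/*.xml", {"cell": "Q-EXAMPLE-1"}), where the recursive pattern is a guess at the intended directory layout and "Q-EXAMPLE-1" is a placeholder rather than a real Wikidata ID.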

Peter wrote a Jupyter notebook (probably about a month ago) that does something similar. Here is the link to the notebook: https://github.com/petermr/ami3/blob/master/src/ipynb/text.ipynb