ELchem / chemdataextractor2

ChemDataExtractor Version 2.0

ChemDataExtractor

ChemDataExtractor v2 is a toolkit for extracting chemical information from the scientific literature. It supports Python 3.5 through 3.8.

Installation

python3 -m pip install chemdataextractor2 --use-feature=2020-resolver

Features

  • HTML, XML and PDF document readers
  • Chemistry-aware natural language processing pipeline
  • Chemical named entity recognition
  • Rule-based parsing grammars for property and spectra extraction
  • Table parser for extracting tabulated data
  • Document processing to resolve data interdependencies
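To give a feel for what "rule-based parsing" of properties means, here is a toy sketch using only Python's standard `re` module. It does not use ChemDataExtractor or reproduce its API; the pattern, the function name `extract_melting_points`, and the sample sentence are all assumptions made for illustration. The toolkit's own grammars are far more general than a single regular expression.

```python
import re

# Hypothetical rule (illustration only, not ChemDataExtractor's grammar):
# "<Name> has/shows a melting point of <number> <unit>"
MELTING_POINT = re.compile(
    r"(?P<name>[A-Z][\w\-]*)\s+(?:has|shows)\s+a\s+melting\s+point\s+of\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*°?\s*(?P<unit>C|K)\b"
)


def extract_melting_points(text):
    """Return one record (a dict) per sentence fragment matching the rule."""
    return [
        {
            "compound": match.group("name"),
            "value": float(match.group("value")),
            "units": match.group("unit"),
        }
        for match in MELTING_POINT.finditer(text)
    ]


sample = (
    "Aspirin has a melting point of 135 °C. "
    "Naphthalene shows a melting point of 353.4 K."
)
records = extract_melting_points(sample)
for record in records:
    print(record)
```

Each match becomes a small structured record (compound, value, units), which is the general shape of output a property-extraction rule aims for.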

Documentation & Development

Please read the documentation for instructions on contributing to the project.

https://cambridgemolecularengineering-chemdataextractor-development.readthedocs-hosted.com/en/latest/

License

ChemDataExtractor v2 is licensed under the MIT license, a permissive, business-friendly license for open source software.

MIT license: https://github.com/CambridgeMolecularEngineering/ChemDataExtractor/blob/master/LICENSE
