Mblakey / wiswesser

Wiswesser Line Notation Project

The Wiswesser Line Notation (WLN) Project

  • WLN Parser - read and write WLN to and from SMILES, InChI, MOL files and other chemical line notations.
  • WLN FSM - extract chemical terms from documents. This finite-state machine uses greedy matching to return the WLN sequences it finds.
  • WLN Compressor - compress WLN strings using Markov decision processes.
  • WLN Generator - a reinforcement learning approach for fast generation of target molecule descriptors without any deep learning.
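
To make the greedy matching idea concrete, here is a minimal sketch. It is not the project's FSM: the character set in is_wln_char is a simplified assumption, and greedy_matches and min_len are hypothetical names. The real machine encodes the WLN grammar rather than a single character class.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed, simplified alphabet for illustration only: uppercase letters,
// digits, and the symbols '&', '-' and '/'.
bool is_wln_char(char c) {
  unsigned char u = static_cast<unsigned char>(c);
  return std::isupper(u) || std::isdigit(u) || c == '&' || c == '-' || c == '/';
}

// Greedy scan: from each start position, extend the match as far as the
// alphabet allows, keep it if it has at least min_len characters, then
// resume scanning directly after the match.
std::vector<std::string> greedy_matches(const std::string &text,
                                        std::size_t min_len = 2) {
  assert(min_len >= 1 && "min_len must be positive");
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < text.size()) {
    if (!is_wln_char(text[i])) {
      ++i;
      continue;
    }
    std::size_t j = i;
    while (j < text.size() && is_wln_char(text[j])) ++j;  // longest run
    if (j - i >= min_len) out.push_back(text.substr(i, j - i));
    i = j;
  }
  return out;
}
```

Called on "The compound QVR and L6TJ were tested.", this sketch returns QVR and L6TJ. The lone capital T is dropped by the minimum length, which is a crude stand-in for the grammar checks a real matcher would apply.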

This software runs on Linux and macOS only.

Note: This project was created solely by Michael as part of his PhD work. If you are interested in using the project, or find any bugs or issues, reporting them would be extremely helpful.

Requirements

git, cmake, make and a C++ compiler are all essential.
graphviz is an optional install for viewing WLN graphs (not needed for the build).

OpenBabel (see repo) will be installed as an external dependency.

Build

Run ./bootstrap.sh from the project directory. This clones and builds OpenBabel and links the library to the parser in cmake. Babel files are installed to external. Building the project places all executables into build/.

Project Structure

This repository covers a broad range of functionality built on WLN. Please read the individual README.txt file for the area you need.

Unit Testing

All unit tests are contained in the /test directory.
These include:

  1. compare.sh
  2. reading.sh
  3. writing.sh
  4. file.sh

Unit tests 1-3 operate on the data files in /data. For comparisons against the old parser in OpenBabel, run 1; for reading count tests, run 2; for writing round-trip tests, run 3. To parse a file of WLN strings, use file.sh, which attempts a conversion on every line.
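
The per-line behaviour described for file.sh can be sketched as below. This is an illustration and not the script's contents: ConversionTally, Converter and convert_lines are hypothetical names, and the converter callback stands in for whichever WLN reader is used.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Counts of lines that converted successfully and lines that did not.
struct ConversionTally {
  std::size_t ok = 0;
  std::size_t failed = 0;
};

// Hypothetical converter signature: fills `out` and returns true on success.
using Converter = std::function<bool(const std::string &wln, std::string &out)>;

// Attempt a conversion on every non-empty line of `in` and tally the results.
ConversionTally convert_lines(std::istream &in, const Converter &convert) {
  assert(convert && "a converter callback is required");
  ConversionTally tally;
  std::string line, out;
  while (std::getline(in, line)) {
    if (line.empty()) continue;  // blank lines are skipped, not counted
    if (convert(line, out)) {
      ++tally.ok;
    } else {
      ++tally.failed;
    }
  }
  return tally;
}

// Convenience overload for text that is already in memory.
ConversionTally convert_lines(const std::string &text, const Converter &convert) {
  std::istringstream in(text);
  return convert_lines(in, convert);
}
```

Counting failures instead of stopping at the first one matches the "attempt conversions on every line" wording, so a single bad string does not hide the results for the rest of the file.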

License: MIT License

Languages: C++ 93.9%, C 3.1%, Shell 1.5%, ANTLR 1.0%, CMake 0.4%