Kohulan / DECIMER-Java

Deep Learning for Chemical Image Recognition (DECIMER)

Home Page: https://cheminf.uni-jena.de

DECIMER

The project aims to develop deep learning methods that recognize and interpret chemical structures in images from the printed and online literature, with the goal of rediscovering scientific facts about natural products and their metadata. The method should be generically applicable to most organic chemistry publications.
The project is co-supervised by Prof. Christoph Steinbeck (Jena) and Prof. Achim Zielesny (Westphalian University of Applied Sciences, Recklinghausen).

  • This repository contains scripts written in Java that use the CDK libraries as the back end.
    • Scripts that we used to explore various ways of depicting chemical structures as images.
    • Scripts that generate SMILES and InChIs from SDF files, and vice versa.
    • Scripts that curate large chemical databases to extract the data relevant to our own use case.
    • Tools to validate the output of our machine learning algorithm, which is written in Python and hosted in the DECIMER-I2S and DECIMER-Python repositories.
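
One common way to score a predicted structure against its reference is the Tanimoto coefficient of their fingerprints: the number of bits set in both, divided by the number of bits set in either. The sketch below only illustrates that formula with java.util.BitSet from the standard library. It is not the code in TanimotoCalculator.java; the class name TanimotoSketch and the toy bit positions are made up for this example, and in practice the fingerprints would come from a CDK fingerprinter.

```java
import java.util.BitSet;

/** Hypothetical illustration; not the repository's TanimotoCalculator.java. */
final class TanimotoSketch {

    /** Tanimoto coefficient: |a AND b| / |a OR b|. */
    static double tanimoto(BitSet a, BitSet b) {
        // Clone first, because BitSet.and/or modify the receiver in place.
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);

        int unionCount = union.cardinality();
        // Assumption: two empty fingerprints are treated as identical.
        return unionCount == 0 ? 1.0 : (double) intersection.cardinality() / unionCount;
    }

    public static void main(String[] args) {
        // Toy fingerprints standing in for a predicted and a reference structure.
        BitSet predicted = new BitSet();
        predicted.set(1);
        predicted.set(4);
        predicted.set(7);

        BitSet reference = new BitSet();
        reference.set(1);
        reference.set(4);
        reference.set(9);

        // Shared bits {1, 4}, combined bits {1, 4, 7, 9}: prints 0.5
        System.out.println(tanimoto(predicted, reference));
    }
}
```

A value of 1.0 means the two fingerprints are identical; values closer to 0 mean the structures share fewer features.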

DECIMER source directory layout

DECIMER/src/org/openscience/
└── decimer/
    ├── ConformerGeneration.java
    ├── DescriptorCalculator.java
    ├── HIsotopeFinder.java
    ├── Image2Array.java
    ├── Image2ArrayRandom.java
    ├── ImageResizer.java
    ├── IndividualAtomCount.java
    ├── LabelCreator.java
    ├── MolWeightCalculator.java
    ├── MoleculeFilters.java
    ├── MoleculeSelector.java
    ├── RejectChargedMols.java
    ├── SmilesDepictor.java
    ├── StrcutureDepitor8bit.java
    ├── StructureDepictor.java
    ├── StructureDepictorGrayscale.java
    └── TanimotoCalculator.java

How to use the scripts

  • To use the Java programs, clone the repository and compile them with the CDK JAR on the classpath.

    • Change the input and output directories in each source file before you use it.
    • On Windows, use ; instead of : as the classpath separator.
    • For example, to generate images, use SmilesDepictor.java:
    javac -cp cdk-2.3.jar:. SmilesDepictor.java   # Compile the script in your local directory.
    java -cp cdk-2.3.jar:. SmilesDepictor         # Run the compiled class.
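
Because the input and output locations are set inside each source file, one possible alternative to editing the code is to read them from the command line. The following is a minimal sketch of that idea, not something the repository provides: the class name IoPathsSketch, the file name smiles.txt, and the mol_NNNNNN.png naming scheme are all assumptions made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical helper: takes input and output locations from the command line. */
final class IoPathsSketch {
    final Path input;
    final Path outputDir;

    IoPathsSketch(Path input, Path outputDir) {
        this.input = input;
        this.outputDir = outputDir;
    }

    /** Expects exactly two arguments: an input file and an output directory. */
    static IoPathsSketch fromArgs(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("usage: <input-file> <output-dir>");
        }
        return new IoPathsSketch(
                Paths.get(args[0]).toAbsolutePath().normalize(),
                Paths.get(args[1]).toAbsolutePath().normalize());
    }

    /** Output path for the record at the given index, zero-padded to six digits. */
    Path imageFor(int index) {
        return outputDir.resolve(String.format(Locale.ROOT, "mol_%06d.png", index));
    }

    public static void main(String[] args) {
        IoPathsSketch io = fromArgs(args);
        System.out.println("Reading from: " + io.input);
        System.out.println("First image:  " + io.imageFor(0));
    }
}
```

It could be run as, for example, `java IoPathsSketch smiles.txt out`. A depiction script would then read from `io.input` and write each image to `io.imageFor(i)` instead of using paths fixed in the source.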

License:

  • This project is licensed under the MIT License; see the LICENSE file for details.
