kmansouri / OPERA

Free and open-source application (command line and GUI) providing QSAR model predictions as well as applicability domain and accuracy assessment for physicochemical properties, environmental fate and toxicological endpoints. Download the latest compiled version from the "releases" tab and run the executable installer.

OPERA 2.6 command line parallel version crashing

rvaidya opened this issue

Hi Kamel,

The command line parallel version of OPERA 2.6 is crashing for an input that works with the normal version:

INFO: Adding explicit H false
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: Will evaluate 50 descriptors
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: Got 50 descriptor instances
Exception in thread "main" java.lang.NullPointerException
        at net.guha.apps.cdkdesc.CDKDescUtils.isSMILESFormat(CDKDescUtils.java:74)
        at net.guha.apps.cdkdesc.CDKdescBatch.batchDescriptor(CDKdescBatch.java:227)
        at net.guha.apps.cdkdesc.CDKdesc.main(CDKdesc.java:510)
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: output:   CDKtemp/CDKDesc_8_temp.csv
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: type:     all
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor

From using CDK, I know that it is not thread safe. Does OPERA need a lock around its CDK calls?
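
To make the locking question concrete, here is a minimal sketch of serializing access to a component that is not thread safe. It does not use the real CDK or OPERA APIs: `UnsafeCalculator`, `LockedCalculator`, and `calculateAll` are hypothetical names, and the "descriptor" is just a character sum standing in for a real calculation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a library call that is not thread safe:
// it keeps mutable state in a field, so concurrent calls could corrupt it.
class UnsafeCalculator {
    private int scratch; // shared mutable state

    int calculate(String molecule) {
        scratch = 0;
        for (char c : molecule.toCharArray()) {
            scratch += c;
        }
        return scratch;
    }
}

// Wrapper that lets only one thread at a time into the unsafe call.
class LockedCalculator {
    private final UnsafeCalculator delegate = new UnsafeCalculator();
    private final ReentrantLock lock = new ReentrantLock();

    int calculate(String molecule) {
        lock.lock();
        try {
            return delegate.calculate(molecule);
        } finally {
            lock.unlock(); // always release, even if the call throws
        }
    }

    // Runs all inputs on a fixed pool; results keep the input order.
    static List<Integer> calculateAll(List<String> molecules, int workers) {
        LockedCalculator calc = new LockedCalculator();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String m : molecules) {
                futures.add(pool.submit(() -> calc.calculate(m)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("calculation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A single lock like this serializes the guarded calls, so it trades away most of the parallel speedup for safety. Giving each worker its own instance of the unsafe component is the usual alternative when the library allows it.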

Actually, I'm getting crashes sometimes in the standard version too, but during PaDEL descriptor calculation:

PaDEL calculating 2D descriptors...
Exception in thread "Thread-113" java.lang.NullPointerException
        at org.openscience.cdk1.qsar.AtomValenceTool.getValence(AtomValenceTool.java:95)
        at libpadeldescriptor.ExtendedTopochemicalAtomDescriptor.calculate(Unknown Source)
        at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-112" java.lang.ClassCastException: org.openscience.cdk1.qsar.result.DoubleArrayResult cannot be cast to org.openscience.cdk1.qsar.result.DoubleResult
        at libpadeldescriptor.EStateAtomTypeDescriptor.calculate(Unknown Source)
        at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-107" java.lang.NullPointerException
        at org.openscience.cdk1.qsar.AtomValenceTool.getValence(AtomValenceTool.java:95)
        at libpadeldescriptor.PaDELChiIndexUtils.getValenceElectronCount(Unknown Source)
        at libpadeldescriptor.PaDELChiIndexUtils.evalValenceIndex(Unknown Source)
        at libpadeldescriptor.PaDELChiPathDescriptor.calculate(Unknown Source)
        at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-104" java.lang.NullPointerException
        at org.openscience.cdk1.qsar.AtomValenceTool.getValence(AtomValenceTool.java:95)
        at org.openscience.cdk1.qsar.descriptors.molecular.ChiIndexUtils.getValenceElectronCount(ChiIndexUtils.java:182)
        at org.openscience.cdk1.qsar.descriptors.molecular.ChiIndexUtils.evalValenceIndex(ChiIndexUtils.java:169)
        at org.openscience.cdk1.qsar.descriptors.molecular.ChiChainDescriptor.calculate(ChiChainDescriptor.java:198)
        at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Descriptor calculation completed in 0.240 secs . Average speed: 0.24 s/mol.
PaDEL descriptors calculated for: 1 molecules.

It looks like the first crash might only happen when the number of inputs is smaller than the number of workers.
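
If that is the cause (an assumption; nothing in the log confirms how OPERA splits its input), one common guard is to cap the worker count at the number of inputs so that no worker receives an empty chunk. The sketch below is illustrative only: `WorkerSizing`, `effectiveWorkers`, and `partition` are hypothetical names, not part of OPERA or CDK.

```java
import java.util.ArrayList;
import java.util.List;

class WorkerSizing {
    // Never use more workers than there are inputs, and always at least one.
    static int effectiveWorkers(int requestedWorkers, int inputCount) {
        if (requestedWorkers < 1) {
            throw new IllegalArgumentException("requestedWorkers must be >= 1");
        }
        if (inputCount < 0) {
            throw new IllegalArgumentException("inputCount must be >= 0");
        }
        return Math.max(1, Math.min(requestedWorkers, inputCount));
    }

    // Splits inputs into contiguous chunks, one per worker, none of them empty.
    // Returns an empty list when there is nothing to process.
    static List<List<String>> partition(List<String> inputs, int requestedWorkers) {
        List<List<String>> chunks = new ArrayList<>();
        if (inputs.isEmpty()) {
            return chunks;
        }
        int workers = effectiveWorkers(requestedWorkers, inputs.size());
        int base = inputs.size() / workers;   // minimum chunk size
        int extra = inputs.size() % workers;  // first 'extra' chunks get one more
        int start = 0;
        for (int i = 0; i < workers; i++) {
            int size = base + (i < extra ? 1 : 0);
            chunks.add(new ArrayList<>(inputs.subList(start, start + size)));
            start += size;
        }
        return chunks;
    }
}
```

With 3 inputs and 8 requested workers, `partition` yields 3 chunks of one molecule each instead of 8 chunks with 5 of them empty.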

Thank you Rahul,

The parallel version is only recommended for 5000 chemicals or more at a time, as mentioned on the releases page. It is still limited by your computational resources, though: if you run 50,000 chemicals with limited RAM, it will crash.
The second example does not look like a crash to me. If you ran one molecule, the output indicates that the calculation finished. Some of the descriptors occasionally throw exceptions. If you would rather not see the full output, use verbose mode "1" (minimal) or "0" (silent). I hope this helps.