rdk / p2rank

P2Rank: Protein-ligand binding site prediction tool based on machine learning. Stand-alone command line program / Java library for predicting ligand binding pockets from protein structure.

Home Page: https://rdk.github.io/p2rank/

No CSV output?

lazear opened this issue

Hi,

I'm trying to run p2rank on some AlphaFold structures, and I'm not seeing <struct_file>_predictions.csv or <struct_file>_residues.csv files being generated.

 ~/p2rank_2.4 $ ./prank predict -c alphafold -f ../AF-Q13526-F1-model_v4.cif
 ~/p2rank_2.4 $ ls test_output/predict_AF-Q13526-F1-model_v4/
params.txt  run.log  visualizations
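For anyone who wants to confirm which of the expected files are missing without inspecting the directory by hand, here is a minimal sketch in Java. It only assumes the two file-name patterns mentioned above (`<struct_file>_predictions.csv` and `<struct_file>_residues.csv`); the class name `OutputCheck`, the method `missingCsvFiles`, and the exact value to pass as `<struct_file>` are hypothetical and not part of P2Rank.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of P2Rank: reports which of the two
// CSV outputs named in this thread are absent from an output directory.
class OutputCheck {

    // Suffixes taken from the file names quoted in the post above.
    private static final String[] SUFFIXES = {"_predictions.csv", "_residues.csv"};

    /** Returns the expected CSV file names that do not exist in outDir. */
    static List<String> missingCsvFiles(Path outDir, String structFile) {
        List<String> missing = new ArrayList<>();
        for (String suffix : SUFFIXES) {
            String name = structFile + suffix;
            // A missing directory counts the same as a missing file.
            if (!Files.isRegularFile(outDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: OutputCheck <output_dir> <struct_file>");
            System.exit(2);
        }
        List<String> missing = missingCsvFiles(Path.of(args[0]), args[1]);
        if (missing.isEmpty()) {
            System.out.println("all expected CSV files are present");
        } else {
            System.out.println("missing: " + missing);
            System.exit(1);
        }
    }
}
```

The non-zero exit code on missing files makes the check usable from a script, which matters here because the run itself reports success even when the CSV files are absent.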

Here's the contents of the run.log file:

[INFO] Console - predicting pockets for proteins from dataset [AF-Q13526-F1-model_v4.cif]
[INFO] PredictPocketsRoutine - outdir: /home/michael/p2rank_2.4/test_output/predict_AF-Q13526-F1-model_v4
[INFO] FeatureSetup - enabledFeatures: [chem, volsite, protrusion, atom_table]
[INFO] Dataset - processing dataset [AF-Q13526-F1-model_v4.cif] using 0 threads
[INFO] Dataset -
------------------------------------------------------------------------------------------------------------------------
processing [AF-Q13526-F1-model_v4.cif] (1/1)
------------------------------------------------------------------------------------------------------------------------

[INFO] Console - processing [AF-Q13526-F1-model_v4.cif] (1/1)
[INFO] Protein - loading protein [/home/michael/AF-Q13526-F1-model_v4.cif]
[INFO] PdbUtils - loading file [../AF-Q13526-F1-model_v4.cif]
[INFO] Struct - groups in chain A: 163
[INFO] Struct - groups in chain A: 163
[INFO] Struct - 163 groups in chain A
[INFO] Protein - structure atoms: 1282
[INFO] Protein - protein   atoms: 1248
[INFO] Protein - ignoring ligands
[INFO] FeatureSetup - enabledFeatures: [chem, volsite, protrusion, atom_table]
[INFO] Protein - SAS points: 3210
[INFO] Protein - exposed protein atoms: 899 of 1248
[INFO] InstancePredictor - Creating WekaInstancePredictor
[INFO] SLinkClusterer - clustering [27] elements
[INFO] SLinkClusterer - clusters: [id:0( 1 ), id:1( 1 ), id:7( 2 ), id:8( 1 ), id:9( 1 ), id:25( 21 )]
[INFO] SLinkClusterer - clusters together: 27 / 27
[INFO] PocketPredictor - PREDICTING POCKETS.... ====================================
[INFO] PocketPredictor - SAS POINTS: 3210
[INFO] PocketPredictor - LIGANDABLE POINTS: 27
[INFO] PocketPredictor - CLUSTERS: 6
[INFO] PocketPredictor - FILTERED CLUSTERS: 1
[INFO] PocketPredictor - pocket 1 -  surf_atoms:  28   points:  21   score:   10.3
[INFO] OldPymolRenderer - copying [/home/michael/AF-Q13526-F1-model_v4.cif] to [/home/michael/p2rank_2.4/test_output/predict_AF-Q13526-F1-model_v4/visualizations/data/AF-Q13526-F1-model_v4.cif]
[INFO] Console - predicting pockets finished in 0 hours 0 minutes 3.248 seconds
[INFO] Console - results saved to directory [/home/michael/p2rank_2.4/test_output/predict_AF-Q13526-F1-model_v4]
[INFO] Console -
[INFO] Console - ----------------------------------------------------------------------------------------------
[INFO] Console -  finished successfully in 0 hours 0 minutes 3.797 seconds
[INFO] Console - ----------------------------------------------------------------------------------------------

I'm running the latest release of p2rank on WSL:

openjdk 17.0.5 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

I just tried running with -visualizations 0 and I can now create the CSV files - is there a way to generate both kinds of output?

Also, the only file in the visualizations/data/ folder is the input file - any way to get the .pml file or associated data?

Attempting to build from source yields several failing tests:

$ ./unit-tests.sh
cz.siret.prank.program.api.impl.DafaultPrankPredictorTest > runPrediction FAILED
    java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at cz.siret.prank.program.api.impl.DafaultPrankPredictorTest.runPrediction(DafaultPrankPredictorTest.groovy:86)
Running test: Test testDefaultParams(cz.siret.prank.program.params.ConfigLoaderTest)
Running test: Test testOverride(cz.siret.prank.program.params.ConfigLoaderTest)
Running test: Test testTrainEvalFeatureImportances(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalFeatureImportances FAILED
    java.lang.NoClassDefFoundError: Could not initialize class java.awt.Toolkit
        at java.desktop/java.awt.Dimension.<clinit>(Dimension.java:90)
        at org.bounce.net.DefaultAuthenticator.<clinit>(DefaultAuthenticator.java:65)
        at weka.core.packageManagement.PackageManager.setProxyAuthentication(PackageManager.java:192)
        at weka.core.WekaPackageManager.establishWekaHome(WekaPackageManager.java:487)
        at weka.core.WekaPackageManager.<clinit>(WekaPackageManager.java:251)
        at weka.core.ResourceUtils.readProperties(ResourceUtils.java:241)
        at weka.core.ResourceUtils.readProperties(ResourceUtils.java:184)
        at weka.core.Utils.readProperties(Utils.java:183)
        at weka.core.Capabilities.<clinit>(Capabilities.java:104)
        at cz.siret.prank.fforest.FasterTree.getCapabilities(FasterTree.java:143)
        at cz.siret.prank.fforest.FasterForest.getCapabilities(FasterForest.java:608)
        at cz.siret.prank.fforest.FasterForest.buildClassifier(FasterForest.java:622)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEval(TrainEvalRoutineTest.groovy:48)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalFeatureImportances(TrainEvalRoutineTest.groovy:165)
Running test: Test testTrainEvalFF(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalFF FAILED
    java.lang.NoClassDefFoundError: Could not initialize class weka.core.Capabilities
        at cz.siret.prank.fforest.FasterTree.getCapabilities(FasterTree.java:143)
        at cz.siret.prank.fforest.FasterForest.getCapabilities(FasterForest.java:608)
        at cz.siret.prank.fforest.FasterForest.buildClassifier(FasterForest.java:622)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEval(TrainEvalRoutineTest.groovy:48)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalFF(TrainEvalRoutineTest.groovy:119)
Running test: Test testTrainEvalRF(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalRF FAILED
    java.lang.NoClassDefFoundError: Could not initialize class weka.core.Capabilities
        at weka.classifiers.AbstractClassifier.getCapabilities(AbstractClassifier.java:509)
        at weka.classifiers.trees.RandomTree.getCapabilities(RandomTree.java:690)
        at weka.classifiers.trees.RandomForest.getCapabilities(RandomForest.java:228)
        at weka.classifiers.meta.Bagging.buildClassifier(Bagging.java:681)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEval(TrainEvalRoutineTest.groovy:48)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalRF(TrainEvalRoutineTest.groovy:146)
Running test: Test testTrainEvalFF2(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalFF2 FAILED
    java.lang.NoClassDefFoundError: Could not initialize class weka.core.Capabilities
        at weka.classifiers.AbstractClassifier.getCapabilities(AbstractClassifier.java:509)
        at cz.siret.prank.fforest2.FasterForest2Tree.getCapabilities(FasterForest2Tree.java:173)
        at cz.siret.prank.fforest2.FasterForest2.getCapabilities(FasterForest2.java:640)
        at cz.siret.prank.fforest2.FasterForest2.buildClassifier(FasterForest2.java:654)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEval(TrainEvalRoutineTest.groovy:48)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalFF2(TrainEvalRoutineTest.groovy:137)
Running test: Test testTrainEvalFRF(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalFRF FAILED
    java.lang.NoClassDefFoundError: Could not initialize class weka.core.Capabilities
        at weka.classifiers.AbstractClassifier.getCapabilities(AbstractClassifier.java:509)
        at hr.irb.fastRandomForest.FastRandomTree.getCapabilities(FastRandomTree.java:155)
        at hr.irb.fastRandomForest.FastRandomForest.getCapabilities(FastRandomForest.java:568)
        at hr.irb.fastRandomForest.FastRandomForest.buildClassifier(FastRandomForest.java:582)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEval(TrainEvalRoutineTest.groovy:48)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalFRF(TrainEvalRoutineTest.groovy:128)
Running test: Test testTrainEvalResidueMode(cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest)

cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest > testTrainEvalResidueMode FAILED
    java.lang.NoClassDefFoundError: Could not initialize class weka.core.Capabilities
        at cz.siret.prank.fforest.FasterTree.getCapabilities(FasterTree.java:143)
        at cz.siret.prank.fforest.FasterForest.getCapabilities(FasterForest.java:608)
        at cz.siret.prank.fforest.FasterForest.buildClassifier(FasterForest.java:622)
        at cz.siret.prank.utils.WekaUtils.trainClassifier(WekaUtils.groovy:114)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainModel(TrainEvalRoutine.groovy:200)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutine.trainAndEvalModel(TrainEvalRoutine.groovy:158)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.doTestTrainEvalForResidues(TrainEvalRoutineTest.groovy:96)
        at cz.siret.prank.program.routines.traineval.TrainEvalRoutineTest.testTrainEvalResidueMode(TrainEvalRoutineTest.groovy:155)

and

$ ./tests.sh qucik
testing command [./prank.sh traineval -t fpocket.ds -e test.ds -loop 1 -fail_fast 1 -out_subdir TEST/TESTS]
[ERROR] (exit code: 1) time: 2 s
testing command [./prank.sh crossval fpocket.ds -loop 1 -fail_fast 1 -out_subdir TEST/TESTS]
[ERROR] (exit code: 1) time: 1 s
testing command [./prank.sh ploop -t fpocket.ds -e test.ds -loop 1 -fail_fast 1 -r_generate_plots 0 -feature_filters ((-chem.*),(-chem.*,chem.atoms),(protrusion.*,bfactor.*)) -out_subdir TEST/TESTS]
[ERROR] (exit code: 1) time: 2 s
testing command [./prank.sh traineval -t fpocket.ds -e test.ds -loop 1 -fail_fast 1 -tessellation 1 -train_tessellation 3 -out_subdir TEST/TESTS]
[ERROR] (exit code: 1) time: 2 s
commented

@lazear, thank you for your bug report.

It seems to be a problem specific to WSL.
I tested the same input (AF-Q13526-F1-model_v4.cif) on native Linux (Ubuntu 22.04.1) and on Windows 10, and P2Rank produces all expected files by default (with different distributions of Java 17).

I was also able to reproduce your errors on WSL, both with the given input and when running ./unit-tests.sh.
WSL environment details:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy

$ java -version
openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu122.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu122.04, mixed mode, sharing)

I wasn't able to find the cause of the errors, but they seem to be related to general problems with running Java on WSL.
I would recommend running P2Rank natively on Windows instead.

Is there any reason you want to run it inside WSL?

commented

Installing the JDK on WSL (instead of just the JRE) with sudo apt install openjdk-17-jdk seems to have resolved all the errors.

Thanks - this did fix it. I tracked it down to the "headless" version of the JRE missing java.awt, which caused some silent failures.
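Since the failures were silent, a small probe can help tell whether a given Java installation is affected. The sketch below simply tries to load and initialize `java.awt.Toolkit`, the class named in the `NoClassDefFoundError` stack trace earlier in this thread. The class name `AwtCheck` and the method `canInitialize` are hypothetical, and this is only an assumption about how to detect the problem, not something P2Rank itself does.

```java
// Hypothetical diagnostic, not part of P2Rank: checks whether a class
// can be loaded and statically initialized by the running JVM.
class AwtCheck {

    /** Returns true if the named class loads and its static initializer succeeds. */
    static boolean canInitialize(String className) {
        try {
            // Class.forName(String) also runs the static initializer, which is
            // where "Could not initialize class ..." errors originate.
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, UnsatisfiedLinkError
            // and ExceptionInInitializerError.
            System.err.println(className + " is not usable: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        boolean ok = canInitialize("java.awt.Toolkit");
        System.out.println("java.awt.Toolkit usable: " + ok);
        System.exit(ok ? 0 : 1);
    }
}
```

Note that running it directly as `java AwtCheck.java` relies on the source-file launcher, which needs a JDK; on a JRE-only installation the class would have to be compiled with `javac` elsewhere and then run with `java AwtCheck`.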

commented

@lazear, thanks for reporting back.