KennthShang / PhaGCN2.0

1. It runs slowly; how can I accelerate it? Can metagenomes be used as input?

TiAmoTYX opened this issue

input: !python run_Speed_up.py --contigs ComND.Sep_nonredundant.fasta
output:
folder pred exist... cleaning dictionary
Dictionary cleaned
Creating Diamond database...
diamond v0.9.14.115 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 128
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database file: database/ALL_protein.fasta
Opening the database file... [8.6e-05s]
Loading sequences... [1.03251s]
Masking sequences... [19.8524s]
Writing sequences... [0.135956s]
Loading sequences... [8e-06s]
Writing trailer... [0.003759s]
Closing the input file... [1e-05s]
Closing the database file... [1.9e-05s]
Processed 355277 sequences, 89213859 letters.
Total time = 21.0249s
Running Diamond...
diamond v0.9.14.115 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 128
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
#Target sequences to report alignments for: 25
Temporary directory: database
Opening the database... [1.3e-05s]
Opening the input file... [2.2e-05s]
Opening the output file... [2.6e-05s]
Loading query sequences... [0.463304s]
Masking queries... [20.5479s]
Building query seed set... [0.000739s]
Algorithm: Double-indexed
Building query histograms... [13.9358s]
Allocating buffers... [0.004801s]
Loading reference sequences... [0.296878s]
Building reference histograms... [10.6063s]
Allocating buffers... [0.00364s]
Initializing temporary storage... [0.04697s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 0.
Building reference index... [2.77132s]
Building query index... [2.70239s]
Building seed filter... [0.192776s]
Searching alignments... [194.589s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 1.
Building reference index... [1.98594s]
Building query index... [1.94714s]
Building seed filter... [0.135148s]
Searching alignments... [180.56s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 2.
Building reference index... [2.16275s]
Building query index... [2.58315s]
Building seed filter... [0.242484s]
Searching alignments... [177.161s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 3.
Building reference index... [1.82576s]
Building query index... [1.79691s]
Building seed filter... [0.135568s]
Searching alignments... [178.369s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 0.
Building reference index... [1.7556s]
Building query index... [1.77965s]
Building seed filter... [0.140798s]
Searching alignments... [158.161s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 1.
Building reference index... [1.96956s]
Building query index... [2.29852s]
Building seed filter... [0.239073s]
Searching alignments... [160.846s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 2.
Building reference index... [2.95242s]
Building query index... [2.85216s]
Building seed filter... [0.206348s]
Searching alignments... [158.888s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 3.
Building reference index... [2.5536s]
Building query index... [2.54275s]
Building seed filter... [0.141991s]
Searching alignments... [157.112s]
Processing query chunk 0, reference chunk 0, shape 2, index chunk 0.
Building reference index... [2.41696s]
Building query index... [1.78749s]
Building seed filter... [0.139807s]
Searching alignments... [166.157s]
Processing query chunk 0, reference chunk 0, shape 2, index chunk 1.
Building reference index... [2.36725s]
Building query index... [3.19314s]
Building seed filter... [0.230165s]
Searching alignments... [168.983s]
Processing query chunk 0, reference chunk 0, shape 2, index chunk 2.
Building reference index... [2.08511s]
Building query index... [2.02835s]
Building seed filter... [0.13463s]
Searching alignments... [166.332s]
Processing query chunk 0, reference chunk 0, shape 2, index chunk 3.
Building reference index... [2.54531s]
Building query index... [2.73737s]
Building seed filter... [0.19258s]
Searching alignments... [163.168s]
Processing query chunk 0, reference chunk 0, shape 3, index chunk 0.
Building reference index... [1.88082s]
Building query index... [1.84685s]
Building seed filter... [0.13918s]
Searching alignments... [163.837s]
Processing query chunk 0, reference chunk 0, shape 3, index chunk 1.
Building reference index... [2.73969s]
Building query index... [1.9788s]
Building seed filter... [0.145996s]
Searching alignments... [158.893s]
Processing query chunk 0, reference chunk 0, shape 3, index chunk 2.
Building reference index... [2.10557s]
Building query index... [3.11283s]
Building seed filter... [0.208613s]
Searching alignments... [159.014s]
Processing query chunk 0, reference chunk 0, shape 3, index chunk 3.
Building reference index... [2.51071s]
Building query index... [1.84304s]
Building seed filter... [0.138735s]
Searching alignments... [157.097s]
Processing query chunk 0, reference chunk 0, shape 4, index chunk 0.
Building reference index... [1.77338s]
Building query index... [1.77826s]
Building seed filter... [0.138602s]
Searching alignments... [161.503s]
Processing query chunk 0, reference chunk 0, shape 4, index chunk 1.
Building reference index... [2.79382s]
Building query index... [2.34932s]
Building seed filter... [0.142117s]
Searching alignments... [156.45s]
Processing query chunk 0, reference chunk 0, shape 4, index chunk 2.
Building reference index... [1.99907s]
Building query index... [1.99978s]
Building seed filter... [0.138183s]
Searching alignments... [155.676s]
Processing query chunk 0, reference chunk 0, shape 4, index chunk 3.
Building reference index... [2.63258s]
Building query index... [2.4668s]
Building seed filter... [0.19429s]
Searching alignments... [157.919s]
Processing query chunk 0, reference chunk 0, shape 5, index chunk 0.
Building reference index... [1.98301s]
Building query index... [1.78562s]
Building seed filter... [0.141065s]
Searching alignments... [157.891s]
Processing query chunk 0, reference chunk 0, shape 5, index chunk 1.
Building reference index... [1.91915s]
Building query index... [2.90744s]
Building seed filter... [0.191985s]
Searching alignments... [159.187s]
Processing query chunk 0, reference chunk 0, shape 5, index chunk 2.
Building reference index... [2.89677s]
Building query index... [2.06991s]
Building seed filter... [0.135317s]
Searching alignments... [155.446s]
Processing query chunk 0, reference chunk 0, shape 5, index chunk 3.
Building reference index... [1.76301s]
Building query index... [1.76524s]
Building seed filter... [0.137115s]
Searching alignments... [156.355s]
Processing query chunk 0, reference chunk 0, shape 6, index chunk 0.
Building reference index... [2.70931s]
Building query index... [2.44232s]
Building seed filter... [0.210021s]
Searching alignments... [153.192s]
Processing query chunk 0, reference chunk 0, shape 6, index chunk 1.
Building reference index... [2.81729s]
Building query index... [2.81792s]
Building seed filter... [0.220948s]
Searching alignments... [153.437s]
Processing query chunk 0, reference chunk 0, shape 6, index chunk 2.
Building reference index... [2.04254s]
Building query index... [1.99006s]
Building seed filter... [0.139398s]
Searching alignments... [152.918s]
Processing query chunk 0, reference chunk 0, shape 6, index chunk 3.
Building reference index... [1.75688s]
Building query index... [1.75614s]
Building seed filter... [0.13578s]
Searching alignments... [152.378s]
Processing query chunk 0, reference chunk 0, shape 7, index chunk 0.
Building reference index... [1.77747s]
Building query index... [1.76655s]
Building seed filter... [0.14206s]
Searching alignments... [159.664s]
Processing query chunk 0, reference chunk 0, shape 7, index chunk 1.
Building reference index... [3.05402s]
Building query index... [2.73481s]
Building seed filter... [0.135822s]
Searching alignments... [157.53s]
Processing query chunk 0, reference chunk 0, shape 7, index chunk 2.
Building reference index... [2.03399s]
Building query index... [1.98592s]
Building seed filter... [0.139896s]
Searching alignments... [163.199s]
Processing query chunk 0, reference chunk 0, shape 7, index chunk 3.
Building reference index... [2.51295s]
Building query index... [2.24477s]
Building seed filter... [0.14088s]
Searching alignments... [159.357s]
Processing query chunk 0, reference chunk 0, shape 8, index chunk 0.
Building reference index... [1.83476s]
Building query index... [1.83461s]
Building seed filter... [0.143065s]
Searching alignments... [167.139s]
Processing query chunk 0, reference chunk 0, shape 8, index chunk 1.
Building reference index... [1.99016s]
Building query index... [1.97977s]
Building seed filter... [0.144597s]
Searching alignments... [163.77s]
Processing query chunk 0, reference chunk 0, shape 8, index chunk 2.
Building reference index... [2.99346s]
Building query index... [2.8226s]
Building seed filter... [0.138908s]
Searching alignments... [161.696s]
Processing query chunk 0, reference chunk 0, shape 8, index chunk 3.
Building reference index... [2.26193s]
Building query index... [2.70167s]
Building seed filter... [0.237773s]
Searching alignments... [163.447s]
Processing query chunk 0, reference chunk 0, shape 9, index chunk 0.
Building reference index... [1.76149s]
Building query index... [1.78078s]
Building seed filter... [0.138977s]
Searching alignments... [151.89s]
Processing query chunk 0, reference chunk 0, shape 9, index chunk 1.
Building reference index... [1.91701s]
Building query index... [1.89495s]
Building seed filter... [0.135462s]
Searching alignments... [152.142s]
Processing query chunk 0, reference chunk 0, shape 9, index chunk 2.
Building reference index... [1.95088s]
Building query index... [2.86783s]
Building seed filter... [0.183687s]
Searching alignments... [150.635s]
Processing query chunk 0, reference chunk 0, shape 9, index chunk 3.
Building reference index... [2.64842s]
Building query index... [2.52154s]
Building seed filter... [0.193807s]
Searching alignments... [152.376s]
Processing query chunk 0, reference chunk 0, shape 10, index chunk 0.
Building reference index... [2.63351s]
Building query index... [2.4957s]
Building seed filter... [0.188702s]
Searching alignments... [160.252s]
Processing query chunk 0, reference chunk 0, shape 10, index chunk 1.
Building reference index... [1.9763s]
Building query index... [1.97704s]
Building seed filter... [0.136681s]
Searching alignments... [162.237s]
Processing query chunk 0, reference chunk 0, shape 10, index chunk 2.
Building reference index... [2.75475s]
Building query index... [2.48302s]
Building seed filter... [0.138881s]
Searching alignments... [159.801s]
Processing query chunk 0, reference chunk 0, shape 10, index chunk 3.
Building reference index... [1.77611s]
Building query index... [2.73928s]
Building seed filter... [0.210197s]
Searching alignments... [164.105s]
Processing query chunk 0, reference chunk 0, shape 11, index chunk 0.
Building reference index... [1.76388s]
Building query index... [1.75482s]
Building seed filter... [0.135396s]
Searching alignments... [158.67s]
Processing query chunk 0, reference chunk 0, shape 11, index chunk 1.
Building reference index... [3.03792s]
Building query index... [2.68562s]
Building seed filter... [0.190658s]
Searching alignments... [155.332s]
Processing query chunk 0, reference chunk 0, shape 11, index chunk 2.
Building reference index... [2.71286s]
Building query index... [2.23418s]
Building seed filter... [0.135291s]
Searching alignments... [153.041s]
Processing query chunk 0, reference chunk 0, shape 11, index chunk 3.
Building reference index... [1.72534s]
Building query index... [1.77007s]
Building seed filter... [0.14015s]
Searching alignments... [154.226s]
Processing query chunk 0, reference chunk 0, shape 12, index chunk 0.
Building reference index... [1.84135s]
Building query index... [1.84928s]
Building seed filter... [0.149452s]
Searching alignments... [156.875s]
Processing query chunk 0, reference chunk 0, shape 12, index chunk 1.
Building reference index... [2.80547s]
Building query index... [2.65604s]
Building seed filter... [0.191667s]
Searching alignments... [156.386s]
Processing query chunk 0, reference chunk 0, shape 12, index chunk 2.
Building reference index... [1.9844s]
Building query index... [2.00921s]
Building seed filter... [0.139653s]
Searching alignments... [154.626s]
Processing query chunk 0, reference chunk 0, shape 12, index chunk 3.
Building reference index... [1.7737s]
Building query index... [1.77326s]
Building seed filter... [0.141034s]
Searching alignments... [155.39s]
Processing query chunk 0, reference chunk 0, shape 13, index chunk 0.
Building reference index... [2.5683s]
Building query index... [2.62349s]
Building seed filter... [0.211697s]
Searching alignments...

Hi Wen-Guang,
My run is very slow. Is there any way to accelerate it?
The sequencing depth of my metagenomic data is not very high, and the viral metagenomes I selected are small. Can I use the metagenomic data directly as input?

Sorry for my slow response.

For the first question, you can run our program with multithreading. The specific procedure is described in https://github.com/KennthShang/PhaGCN2.0/issues/3.
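Before changing how the program is run, it can help to confirm which step takes the time. The following is a minimal sketch, not part of PhaGCN2.0, that sums the bracketed timings in a DIAMOND log like the one posted above, grouped by step name. The function name `summarize_timings` is hypothetical, and the sketch assumes each timed line has the form `Step name... [12.3s]`, as in that log.

```python
import re
from collections import defaultdict

# Matches timed progress lines such as "Searching alignments... [194.589s]".
# Lines without a bracketed timing (for example a step that has not finished) are skipped.
STEP_RE = re.compile(r"^(?P<step>.+?)\.\.\.\s+\[(?P<secs>[0-9.eE+-]+)s\]\s*$")


def summarize_timings(log_text):
    """Return {step name: total seconds} for every timed line in the log text."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            totals[match.group("step")] += float(match.group("secs"))
    return dict(totals)


if __name__ == "__main__":
    # A few lines copied from the log above, used only as sample input.
    sample = """Building reference index... [2.77132s]
Building query index... [2.70239s]
Building seed filter... [0.192776s]
Searching alignments... [194.589s]
Building reference index... [1.98594s]
Searching alignments... [180.56s]
Searching alignments...
"""
    ranked = sorted(summarize_timings(sample).items(), key=lambda kv: -kv[1])
    for step, secs in ranked:
        print(f"{step}: {secs:.1f} s")
```

In the log above, each "Searching alignments" step takes roughly 150 to 195 seconds while the other steps take a few seconds each, so the alignment search is where most of the time goes.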

For the second question, PhaGCN2 cannot predict whether a sequence is a virus, so I recommend using a tool like CheckV (https://anaconda.org/bioconda/checkv) to determine whether a sequence is viral before running PhaGCN2.0.
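The pre-filtering step could look like the sketch below. It assumes you already have a set of contig IDs that an upstream tool judged to be viral; how that set is produced depends on the tool and is not shown. The function names and the file names in the usage comment are hypothetical, and the code uses only the Python standard library.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(in_path, keep_ids, out_path):
    """Write records whose ID (first word of the header) is in keep_ids.

    Returns the number of records written.
    """
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            parts = header.split()
            if parts and parts[0] in keep_ids:
                out.write(f">{header}\n{seq}\n")
                kept += 1
    return kept


# Hypothetical usage: viral_ids.txt holds one contig ID per line.
# with open("viral_ids.txt") as fh:
#     viral_ids = {line.strip() for line in fh if line.strip()}
# filter_fasta("ComND.Sep_nonredundant.fasta", viral_ids, "viral_contigs.fasta")
```

The filtered file can then be passed to the same command as in the original post, with `--contigs` pointing at the new FASTA file.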

Thank you for your question.

All the best,
Wen-Guang