twlab / TEProf2Paper

TEProf2 Pipeline used to find promoters and predict protein sequences from RNA-sequencing data

All reads (read, startread, endread) were 0 after calculating read information at Step 5

ptnaimelmm opened this issue · comments

Hi Nakul,

Thanks for developing this wonderful tool, and congratulations on your recent paper in Nature Genetics.
I am using your pipeline to analyze my data, but I cannot get past Step 5 (calculating read information). All read counts are 0 in the attached filter_read_stats.txt. Running filterReadCandidates.R at the next step then fails with the following error:

"[mliu63@c720 gtffiles]$ Rscript /home/mliu63/data-jwang8/mliu63/TEProf2Paper/bin/filterReadCandidates.R

Attaching package: ‘Xmisc’

The following object is masked from ‘package:base’:

dir.exists

[1] "Arguments"
[1] "Read Support Minimum in 1 File: 10"
[1] "PE Chimeric Read Support: 1"
[1] "Exonization Percentage Maximum: 0.15"
[1] "Distance Upstream TE must be from start of reference transcript: 2500"
Warning message:
In max(filter_combined_table_final_stat[filter_combined_table_final_stat$uniqid == :
no non-missing arguments to max; returning -Inf
Error in if (row1$strand == "+") { :
missing value where TRUE/FALSE needed
Calls: apply -> FUN -> chooseTopIsoform
Execution halted"

Could you please offer any suggestions? Thanks very much in advance.

Best regards,

Mingming

filter_read_stats.txt.txt

Hello, thank you for your comments on the paper!

Hmm, I cannot think of why there would be no reads calculated off the bat.

Can you let me know if you are working with single-end or paired-end sequencing data? Could you also run samtools flagstat on your file and let me know what the output is? http://www.htslib.org/doc/samtools-flagstat.html

Finally, can you let me know the versions of python, bedtools, and R that you are using?
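
For anyone collecting the same information, the version checks requested above can be gathered in one go. This is only a minimal sketch, not part of TEProf2: the function names `collect_versions` and `flagstat` are hypothetical, and it assumes the script is run inside the same conda environment as the pipeline so that the Python interpreter and PATH match.

```python
import shutil
import subprocess
import sys

# Version commands for the tools asked about in this thread.
# The Python entry uses the interpreter running this script (an assumption:
# the script is launched from the same environment as the pipeline).
VERSION_COMMANDS = {
    "python": [sys.executable, "--version"],
    "bedtools": ["bedtools", "--version"],
    "R": ["R", "--version"],
    "samtools": ["samtools", "--version"],
}


def collect_versions(commands=VERSION_COMMANDS):
    """Return {tool: first line of version output, or a 'not found' note}."""
    report = {}
    for name, cmd in commands.items():
        # A tool that is absent from PATH is reported instead of raising.
        if shutil.which(cmd[0]) is None:
            report[name] = "NOT FOUND on PATH"
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        output = (result.stdout or result.stderr).strip()
        report[name] = output.splitlines()[0] if output else "no output"
    return report


def flagstat(bam_path):
    """Return the text output of `samtools flagstat`, or None if samtools is missing."""
    if shutil.which("samtools") is None:
        return None
    result = subprocess.run(
        ["samtools", "flagstat", bam_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for tool, version in collect_versions().items():
        print(f"{tool}: {version}")
```

A tool reported as "NOT FOUND on PATH" is worth investigating before looking at the read counts themselves.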

Hi Nakul,

Thank you so much for your quick reply.
Your questions prompted me to discover that bedtools was missing from my conda environment, which was built from TEProf2.yml. It turns out that the bedtools package is omitted from the TEProf2.yml file. Without bedtools, the read counting failed silently, producing zero counts instead of an error message. After installing bedtools I got read counts, and the pipeline has run smoothly so far. Thanks again for developing this wonderful package.
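
Because the missing dependency showed up as zero counts and not as an error, a small preflight check run before the pipeline can make this kind of problem visible early. The following is a rough sketch only: the helper names are hypothetical, and the tool list is an assumption based on the tools mentioned in this thread, not an official list of TEProf2 dependencies.

```python
import shutil

# Assumption: tools named in this thread; adjust to your own setup.
REQUIRED_TOOLS = ["bedtools", "samtools", "Rscript"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def require_tools(tools=REQUIRED_TOOLS):
    """Exit with a clear message if any required tool is missing."""
    missing = missing_tools(tools)
    if missing:
        raise SystemExit("Missing required tools on PATH: " + ", ".join(missing))


if __name__ == "__main__":
    require_tools()
    print("All required tools found.")
```

Running this inside the activated conda environment before Step 5 would stop with an explicit message if bedtools were absent.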

Best regards,

Mingming

Glad to hear it worked!