All reads (read, startread, endread) were 0 after calculating read information at Step 5

Question

ptnaimelmm opened this issue a year ago · comments

Hi Nakul,

Thanks for developing this wonderful tool, and congratulations on your recent paper in Nature Genetics.
I am using your pipeline to analyze my data, but I cannot get past Step 5 (calculating read information). All read counts are 0 in filter_read_stats.txt (attached). Running filterReadCandidates.R at the next step then fails with the following error:

"[mliu63@c720 gtffiles]$ Rscript /home/mliu63/data-jwang8/mliu63/TEProf2Paper/bin/filterReadCandidates.R

Attaching package: ‘Xmisc’

The following object is masked from ‘package:base’:

dir.exists

[1] "Arguments"
[1] "Read Support Minimum in 1 File: 10"
[1] "PE Chimeric Read Support: 1"
[1] "Exonization Percentage Maximum: 0.15"
[1] "Distance Upstream TE must be from start of reference transcript: 2500"
Warning message:
In max(filter_combined_table_final_stat[filter_combined_table_final_stat$uniqid == :
no non-missing arguments to max; returning -Inf
Error in if (row1$strand == "+") { :
missing value where TRUE/FALSE needed
Calls: apply -> FUN -> chooseTopIsoform
Execution halted"

Do you have any suggestions? Thanks very much in advance.

Best regards,

Mingming

filter_read_stats.txt.txt

Nakul Shah · Answer 1 · Fri Apr 07 2023 18:35:06 GMT+0800 (China Standard Time)

Hello, thank you for your comments on the paper!

Hmm, I cannot think of why there would be no reads calculated off the bat.

Can you let me know if you are working with single-end or paired-end sequencing data? Could you also run samtools flagstat on your file and let me know what the output is? http://www.htslib.org/doc/samtools-flagstat.html

Finally, can you let me know the versions of python, bedtools, and R that you are using?
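One possible way to collect these versions in a single step is sketched below. This is only an illustration: the helper names `tool_version` and `report_versions` are hypothetical and not part of the pipeline, and it assumes that `bedtools` and `Rscript` accept a `--version` flag and are expected on the PATH.

```python
import subprocess
import sys


def tool_version(command):
    """Return the first line printed by `command`, or None if the program is not installed."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The executable could not be found on PATH.
        return None
    # Some tools print their version to stderr, so look at both streams.
    output = (result.stdout + result.stderr).strip()
    return output.splitlines()[0] if output else ""


def report_versions():
    """Collect version strings for the tools asked about (assumed command names)."""
    commands = {
        "python": [sys.executable, "--version"],
        "bedtools": ["bedtools", "--version"],
        "R": ["Rscript", "--version"],
    }
    return {name: tool_version(cmd) for name, cmd in commands.items()}
```

Calling `report_versions()` inside the activated conda environment returns a dictionary in which any tool that is not installed maps to `None`.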

ptnaimelmm · Answer 2 · Sat Apr 08 2023 06:20:05 GMT+0800 (China Standard Time)

Hi Nakul,

Thank you so much for your quick reply.
Your questions prompted me to discover that bedtools was missing from my conda environment, which I had built from TEProf2.yml; the bedtools package is omitted from that file. Without bedtools, the read counting failed silently, producing zero counts rather than an error message. After installing bedtools I got nonzero read counts, and the pipeline has run smoothly so far. Thanks again for developing this wonderful package.
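For anyone hitting the same silent failure, a quick pre-flight check along these lines may help. It is a minimal sketch, not part of the pipeline: the function names are hypothetical, and the list of required programs is an assumption based only on the tools mentioned in this thread.

```python
import shutil

# Assumed list of external programs; adjust it to whatever the pipeline actually calls.
REQUIRED_TOOLS = ["bedtools", "samtools", "Rscript"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def check_environment():
    """Stop with a clear message if any required tool is missing."""
    missing = missing_tools()
    if missing:
        raise SystemExit("Missing from PATH: " + ", ".join(missing))
    print("All required tools found.")
```

Running `check_environment()` in the activated conda environment before Step 5 would have reported the missing bedtools up front instead of producing zero read counts.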

Best regards,

Mingming

Nakul Shah · Answer 3 · Sat Apr 08 2023 10:16:50 GMT+0800 (China Standard Time)

Glad to hear it worked!