Error: no genomic sequence available (check -g option!).

raita27 opened this issue 2 years ago

"gffread-0.12.7.Linux_x86_64/gffread" -w "transcripts.fa" -g "genome.fa" "stringtie_merged.gtf"

Hello,

I'm trying to output an isoform nucleotide fasta file for IsoformSwitchAnalyzeR using the above gffread command. However, I run into the following error:

Warning: couldn't find fasta record for 'chr1'!
Error: no genomic sequence available (check -g option!).

I was able to build a HISAT2 index from GENCODE release M29 (mouse) and create the merged GTF file with StringTie. Oddly, an output file of about 70,000 KB is still written, so I'm not sure why the error occurs. Any advice would be much appreciated!

genome fasta file source: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M29/gencode.vM29.transcripts.fa.gz
genome gtf file source: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M29/gencode.vM29.annotation.gtf.gz

hisat2 version: 2.2.0
stringtie version: 2.2.1

paigeduffin · Answer 1 · Thu Jun 16 2022 04:49:07 GMT+0800 (China Standard Time)

Though details are slightly different, I'm getting the same error!

My code:

ml gffread/0.11.6-GCCcore-8.3.0
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/010/994/315/GCA_010994315.2_ASM1099431v2/GCA_010994315.2_ASM1099431v2_genomic.fna.gz
gunzip GCA_010994315.2_ASM1099431v2_genomic.fna.gz
gffread -w pisaster_transcriptome.fa -g GCA_010994315.1_ASM1099431v1_genomic.fna genome_annotation.gff3

Note: genome_annotation.gff3 is from Dryad published Pisaster ochraceus annotation file here: https://doi.org/10.6071/M3ND50

Error message reads:

FASTA index file GCA_010994315.1_ASM1099431v1_genomic.fna.fai created.
Warning: couldn't find fasta record for 'Sc28pcJ_680'!
Error: no genomic sequence available (check -g option!).

Pulling my hair out over this!

paigeduffin · Answer 2 · Thu Jun 16 2022 04:53:01 GMT+0800 (China Standard Time)

I think I just found my problem, and the route to solving it, by reading through the closed GitHub issues until I found someone with the same issue: #34

The first comment there applies to me: my headers/sequence names DO have spaces, and that's a problem.

Hope this helps you too!

Geo Pertea · Answer 3 · Thu Jun 16 2022 05:16:49 GMT+0800 (China Standard Time)

Sequence names (IDs) cannot contain spaces. You did not show the content of the genome_annotation.gff3 file you used, but its first column should not contain spaces either (the FASTA header does not matter). Also, after indexing that genome file with samtools faidx (the same indexing scheme), I saw no contig/chromosome named 'Sc28pcJ_680' in it.
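The check described above (do the names in the annotation's first column exist as records in the genome FASTA?) can be sketched with standard shell tools. This is only an illustration, not part of gffread: the function name missing_seqids is hypothetical, and it assumes that the record ID is the header text up to the first whitespace, as in faidx-style indexing.

```shell
# Hypothetical helper: list sequence IDs that appear in column 1 of a GFF/GTF
# file but have no record in the FASTA file.
# Assumption: a FASTA record ID is the header text up to the first whitespace.
missing_seqids() {
    fasta=$1
    annot=$2
    fa_ids=$(mktemp)
    an_ids=$(mktemp)
    # Record IDs from the FASTA headers, without the leading '>'
    awk '/^>/ {print substr($1, 2)}' "$fasta" | LC_ALL=C sort -u > "$fa_ids"
    # Column 1 of the annotation, skipping comment lines and lines without tabs
    awk -F'\t' '!/^#/ && NF > 1 {print $1}' "$annot" | LC_ALL=C sort -u > "$an_ids"
    # comm -13 keeps only the lines unique to the second (annotation) list
    LC_ALL=C comm -13 "$fa_ids" "$an_ids"
    rm -f "$fa_ids" "$an_ids"
}

# Example call (file names taken from the command earlier in this thread):
#   missing_seqids GCA_010994315.1_ASM1099431v1_genomic.fna genome_annotation.gff3
```

Any name it prints is one that gffread would presumably fail to find in the file given with -g; empty output means every annotation sequence name has a FASTA record.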

As for @raita27, it looks like a different issue: they seem to have used a transcripts FASTA file as the genome sequence (the "genome fasta file source" they list is gencode.vM29.transcripts.fa.gz).
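One quick way to tell a transcripts FASTA from a genome FASTA is to look at its record IDs. The sketch below uses hypothetical helper names (fasta_ids, has_seqid) and again assumes the record ID is the header text up to the first whitespace. A genome file should contain the names used in the annotation, such as chr1, while a transcripts file would be expected to list transcript identifiers instead.

```shell
# Hypothetical helper: print the record ID of every FASTA entry
# (header text up to the first whitespace, without the leading '>').
fasta_ids() {
    awk '/^>/ {print substr($1, 2)}' "$1"
}

# Hypothetical helper: exit status 0 only if the FASTA file contains a record
# whose ID is exactly the given name.
has_seqid() {
    awk -v id="$2" '/^>/ && substr($1, 2) == id {found = 1; exit}
                    END {exit !found}' "$1"
}

# Example calls (genome.fa is the file name from the original command):
#   fasta_ids genome.fa | head -n 3
#   has_seqid genome.fa chr1 || echo "chr1 is not a record in genome.fa"
```

If has_seqid fails for chr1, the file passed to -g does not hold that chromosome, which matches the "couldn't find fasta record for 'chr1'" warning.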

paigeduffin · Answer 4 · Thu Jun 16 2022 10:05:34 GMT+0800 (China Standard Time)

Thanks so much for helping me diagnose my issue @gpertea !

xue-222 · Answer 5 · Wed Apr 26 2023 19:33:09 GMT+0800 (China Standard Time)

"gffread-0.12.7.Linux_x86_64/gffread" -w "transcripts.fa" -g "genome.fa" "stringtie_merged.gtf"

Hello,

I'm trying to output an isoform nucleotide fasta file for IsoformSwitchAnalyzeR using the above gffread command. However, I run into the following error:

Warning: couldn't find fasta record for 'chr1'!
Error: no genomic sequence available (check -g option!).

I've been able to successfully build a hisat2 gencode version m29 (mouse) index and create the merged gtf file using stringtie. It's weird because I do get a file outputted that's about 70,000 kb, but I'm not sure why the error is occurring. Any advice would be much appreciated!

genome fasta file source: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M29/gencode.vM29.transcripts.fa.gz
genome gtf file source: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M29/gencode.vM29.annotation.gtf.gz

hisat2 version: 2.2.0
stringtie version: 2.2.1

Did you solve it?