snakemake-workflows / rna-seq-star-deseq2

RNA-seq workflow using STAR and DESeq2

Missing input files for rule rseqc_gtf2bed

EasyPiPi opened this issue

When I try to run the pipeline, I get the error:

MissingInputException in line 3 of /home/yixin/Desktop/snakemake_workflow/rna-seq-star-deseq2-1.0.0/rules/qc.smk:
Missing input files for rule rseqc_gtf2bed:
~/Desktop/project_data/Ensembl/index/star/dmel/annotation.gtf

However, I checked, and the file is indeed there:

head ~/Desktop/project_data/Ensembl/index/star/dmel/annotation.gtf
#!genome-build BDGP6.22
#!genome-version BDGP6.22
#!genome-build-accession GCA_000001215.4
3R	FlyBase	gene	567076	2532932	.	+	.	gene_id "FBgn0267431"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding";
3R	FlyBase	transcript	567076	2532932	.	+	.	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding";
3R	FlyBase	exon	567076	567268	.	+	.	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; exon_number "1"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding"; exon_id "FBtr0392909-E1";
3R	FlyBase	exon	835376	835491	.	+	.	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; exon_number "2"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding"; exon_id "FBtr0392909-E2";
3R	FlyBase	CDS	835378	835491	.	+	0	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; exon_number "2"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding"; protein_id "FBpp0352251";
3R	FlyBase	start_codon	835378	835380	.	+	0	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; exon_number "2"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding";
3R	FlyBase	exon	869486	869548	.	+	.	gene_id "FBgn0267431"; transcript_id "FBtr0392909"; exon_number "3"; gene_name "Myo81F"; gene_source "FlyBase"; gene_biotype "protein_coding"; transcript_name "Myo81F-RB"; transcript_source "FlyBase"; transcript_biotype "protein_coding"; exon_id "FBtr0392909-E3";

Any hints would be highly appreciated.

Here is my config.yaml:

# path or URL to sample sheet (TSV format, columns: sample, condition, ...)
samples: samples.tsv
# path or URL to sequencing unit sheet (TSV format, columns: sample, unit, fq1, fq2)
# Units are technical replicates (e.g. lanes, or resequencing of the same biological
# sample).
units: units.tsv

# the sequencing adapter
adapter: ACGGATCGATCGATCGATCGAT

ref:
  # the STAR index
  index: "~/Desktop/project_data/Ensembl/index/star/dmel"
  # gtf file with transcripts
  annotation: "~/Desktop/project_data/Ensembl/index/star/dmel/annotation.gtf"

pca:
  labels:
    # columns of sample sheet to use for PCA
    - condition

diffexp:
  # contrasts for the deseq2 results method
  contrasts:
    treated-vs-untreated:
      - control
      - miR983_mutant

params:
  star: ""
  cutadapt-se: ""
  cutadapt-pe: ""

Snakemake does not resolve ~ because it is platform-specific. This is actually a good catch, though: it should print a better error message in such a case, telling you that ~ is not allowed.
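
For anyone hitting the same error: `head` finds the file because the shell expands ~ before the command runs, whereas a path read from config.yaml reaches Python as a literal string. The simplest fix is to write the full path in config.yaml. Alternatively, a minimal sketch of expanding the path yourself in plain Python is shown below. The helper name `resolve_config_path` is hypothetical and not part of Snakemake or this workflow; the path is the one from the config above.

```python
import os


def resolve_config_path(path: str) -> str:
    """Expand a leading ~ to the user's home directory and return an absolute path.

    Hypothetical helper for illustration; not part of Snakemake or this workflow.
    """
    return os.path.abspath(os.path.expanduser(path))


# Path exactly as written in config.yaml (ref -> annotation).
raw = "~/Desktop/project_data/Ensembl/index/star/dmel/annotation.gtf"

# Without expansion, Python treats "~" as an ordinary directory name
# relative to the current working directory.
print("literal path exists: ", os.path.exists(raw))

# After expansion the path points into the home directory.
resolved = resolve_config_path(raw)
print("resolved path:       ", resolved)
print("resolved path exists:", os.path.exists(resolved))
```

If you prefer to keep ~ in the config, the same call could be applied to `config["ref"]["annotation"]` and `config["ref"]["index"]` near the top of the Snakefile, assuming nothing else reads those values before they are rewritten.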

Updated the error message in the master branch of Snakemake.