STOmics / SAW

Instructions for building genome index references are not correct

gringer opened this issue

The genome index run script doesn't work as written for two reasons:

  1. The base directory doesn't include the $refName directory created in the previous step (which has genes and genome as subdirectories)
  2. STAR v2.7.2b (and whatever has been created from it here) won't automatically create the target directory. This has been fixed in later versions of STAR.

https://github.com/STOmics/SAW/blob/411bab897e0d4642715f1f7b60780b545fb21d12/Scripts/pre_buildIndexedRef/README.md?plain=1#L28C1-L34C1

Here is what I ran to get the genome index building working for me:

referenceDir=/mnt/md0/deccles/STAR/gencode_M34_SJ100
mkdir -p ${referenceDir}
export SINGULARITY_BIND=${referenceDir}
# download reference annotation ("This is the main annotation file for most users")
cd ${referenceDir}; mkdir genes; cd genes
wget 'https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M34/gencode.vM34.basic.annotation.gtf.gz'
gunzip gencode.vM34.basic.annotation.gtf.gz
# download genome
cd ${referenceDir}; mkdir genome; cd genome
wget 'https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M34/GRCm39.genome.fa.gz'
gunzip GRCm39.genome.fa.gz
# Generate index
# (the working directory is still ${referenceDir}/genome here, so create the index directory by full path)
mkdir -p ${referenceDir}/STAR_SJ100
singularity exec /mnt/md0/deccles/singularityImages/SAW_7.0.sif mapping --runMode genomeGenerate \
  --genomeDir ${referenceDir}/STAR_SJ100 \
  --genomeFastaFiles ${referenceDir}/genome/GRCm39.genome.fa \
  --sjdbGTFfile ${referenceDir}/genes/gencode.vM34.basic.annotation.gtf \
  --sjdbOverhang 99 --runThreadN 12
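
The directory handling in the commands above could also be wrapped in a small guard, so that both problems from the list (a missing genes/genome layout and a missing target directory) show up before the index build starts. This is only a sketch: `prepare_index_dir` is a hypothetical helper name, not part of SAW or STAR, and it assumes the `genes` and `genome` subdirectory layout used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SAW or STAR): checks the assumed
# genes/ and genome/ layout, pre-creates the index directory, and
# prints its path on success.
prepare_index_dir() {
  local reference_dir="$1"   # base directory holding genes/ and genome/
  local index_name="$2"      # name of the index directory, e.g. STAR_SJ100
  local sub
  for sub in genes genome; do
    if [ ! -d "${reference_dir}/${sub}" ]; then
      echo "missing directory: ${reference_dir}/${sub}" >&2
      return 1
    fi
  done
  # Create the target directory up front rather than relying on the tool.
  mkdir -p "${reference_dir}/${index_name}" || return 1
  printf '%s\n' "${reference_dir}/${index_name}"
}
```

It would be called as `genomeDir=$(prepare_index_dir "${referenceDir}" STAR_SJ100) || exit 1`, with `${genomeDir}` then passed to `--genomeDir`.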

Thank you for your suggestion. We will update it in the future.