outFileNamePrefix is ignored when running in genomeGenerate runMode
Stikus opened this issue
Hello, thanks for the great tool.
We are trying to integrate STAR into our pipeline and found something strange: when we use the genomeGenerate runMode, the outFileNamePrefix parameter does not work properly.
Content of /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Log.out:
STAR version=2.7.11b
STAR compilation time,server,dir=2024-01-25T16:12:02-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
STAR git: On branch master ; commit a72e5fa27331108f524211d667949dc5ff4072e8 ; diff files: CHANGES.md README.md doc/STARmanual.pdf extras/doc-latex/STARmanual.tex extras/doc-latex/parametersDefault.tex extras/docker/Dockerfile source/VERSION
##### Command Line:
/soft/STAR-2.7.11b/bin/Linux_x86_64/STAR --runThreadN 192 --runMode genomeGenerate --genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 --genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa --sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf --sjdbOverhang 99 --outFileNamePrefix /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__
##### Initial USER parameters from Command Line:
outFileNamePrefix /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__
###### All USER parameters from Command Line:
runThreadN 192 ~RE-DEFINED
runMode genomeGenerate ~RE-DEFINED
genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 ~RE-DEFINED
genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa ~RE-DEFINED
sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf ~RE-DEFINED
sjdbOverhang 99 ~RE-DEFINED
outFileNamePrefix /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__ ~RE-DEFINED
##### Finished reading parameters from all sources
##### Final user re-defined parameters-----------------:
runMode genomeGenerate
runThreadN 192
genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099
genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa
outFileNamePrefix /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__
sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf
sjdbOverhang 99
-------------------------------
##### Final effective command line:
/soft/STAR-2.7.11b/bin/Linux_x86_64/STAR --runMode genomeGenerate --runThreadN 192 --genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 --genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa --outFileNamePrefix /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__ --sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf --sjdbOverhang 99
----------------------------------------
Number of fastq files for each mate = 1
ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters: 3000 0.99 10
Finished loading and checking parameters
--genomeDir directory exists and will be overwritten: /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/
As you can see, the outFileNamePrefix we use is /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Test__ and it is parsed, but the log is named Log.out and not Test__Log.out, whereas the temporary directory does get the prefix (Test___STARtmp).
If we run the command with a local prefix:
/soft/STAR-2.7.11b/bin/Linux_x86_64/STAR --runThreadN 192 --runMode genomeGenerate --genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 --genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa --sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf --sjdbOverhang 99 --outFileNamePrefix Test__
We get a warning message:
/soft/STAR-2.7.11b/bin/Linux_x86_64/STAR --runThreadN 192 --runMode genomeGenerate --genomeDir /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 --genomeFastaFiles /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa --sjdbGTFfile /ref/gtf/gencode.v22.annotation.gtf --sjdbOverhang 99 --outFileNamePrefix Test__
STAR version: 2.7.11b compiled: 2024-01-25T16:12:02-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Feb 02 17:57:30 ..... started STAR run
!!!!! WARNING: Could not move Log.out file from Test__Log.out into /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Log.out. Will keep Test__Log.out
And the log stays in the run directory. In other modes we don't have this problem.
https://github.com/alexdobin/STAR/blob/master/source/Genome_genomeGenerate.cpp#L101-L111 looks like the part of the code that moves Log.out, and it does not mention outFileNamePrefix at all:
createDirectory(pGe.gDir, P.runDirPerm, "--genomeDir", P);
{//move Log.out file into genome directory
    string logfn=pGe.gDir+"Log.out";
    if ( rename( P.outLogFileName.c_str(), logfn.c_str() ) ) {
        warningMessage("Could not move Log.out file from " + P.outLogFileName + " into " + logfn + ". Will keep " + P.outLogFileName +"\n", \
                       std::cerr, P.inOut->logMain, P);
    } else {
        P.outLogFileName=logfn;
    };
};
Hi @Stikus,
--outFileNamePrefix is not used for genome index output; genome generation uses only the path in --genomeDir.
@alexdobin But --outFileNamePrefix is still used for the STARtmp directory, even for genome indexing, as you can see in the first example. Why is it not used for Log.out?
Moreover, as you can see in the local prefix example:
!!!!! WARNING: Could not move Log.out file from Test__Log.out into /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099/Log.out. Will keep Test__Log.out
--outFileNamePrefix Test__ is somehow used, maybe because of this part of the code:
https://github.com/alexdobin/STAR/blob/master/source/Parameters.cpp#L369-L370
outLogFileName=outFileNamePrefix + "Log.out";
inOut->logMain.open(outLogFileName.c_str());
What do you suggest? Should we avoid --outFileNamePrefix entirely in the genome index command? Is there any way to define the 'Log.out' name/prefix for genome indexing? Or should we rename the file manually?
For genome generation, the easiest way is to create the genome directory, cd to it, and run STAR from there with --genomeDir ./ and without --outFileNamePrefix.
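The suggestion above can be sketched as a small shell function. This is only an illustration, not an official STAR script: the function name generate_index and the STAR_BIN variable are hypothetical, and the FASTA and GTF paths are assumed to be absolute because the command runs after changing into the genome directory. Only options that already appear in this thread are used.

```shell
# Sketch only: build a STAR index from inside the genome directory.
# generate_index and STAR_BIN are hypothetical names, not part of STAR.
generate_index() (
    genome_dir=$1   # index directory; created if it does not exist
    fasta=$2        # absolute path, because we cd before running STAR
    gtf=$3          # absolute path, same reason
    overhang=$4     # value passed to --sjdbOverhang

    mkdir -p "$genome_dir" || exit 1
    # The function body is a subshell, so the caller's working
    # directory is left unchanged by this cd.
    cd "$genome_dir" || exit 1

    # No --outFileNamePrefix here: output is left to go to the
    # current directory, which is the genome directory itself.
    "${STAR_BIN:-STAR}" \
        --runMode genomeGenerate \
        --genomeDir ./ \
        --genomeFastaFiles "$fasta" \
        --sjdbGTFfile "$gtf" \
        --sjdbOverhang "$overhang"
)
```

A call with the paths from this issue would look like: generate_index /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 /ref/GRCh38.d1.vd1/GRCh38.d1.vd1.fa /ref/gtf/gencode.v22.annotation.gtf 99. Run this way, Log.out and the temporary directory should both end up in the genome directory, so no move across directories is needed.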
OK, if this is intended, you can close this issue. I was asking because of the inconsistent behavior of the --outFileNamePrefix option.
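If a prefixed log name is still wanted, the manual renaming raised earlier in the thread could be done after genome generation finishes. The helper below is a minimal sketch under that assumption; rename_index_log is a hypothetical name, and it only handles the case where STAR has already placed Log.out inside the genome directory.

```shell
# Sketch only: give the index log a prefixed name after genomeGenerate.
# rename_index_log is a hypothetical helper, not part of STAR.
rename_index_log() (
    genome_dir=$1   # directory that was passed to --genomeDir
    prefix=$2       # file name prefix to apply, e.g. Test__

    src="$genome_dir/Log.out"
    dst="$genome_dir/${prefix}Log.out"

    # Refuse to continue if there is no log to rename.
    if [ ! -f "$src" ]; then
        echo "no Log.out found in $genome_dir" >&2
        exit 1
    fi

    mv -- "$src" "$dst"
)
```

For the index in this issue, rename_index_log /ref/STAR/GRCh38.d1.vd1_gencode.v22.annotation_index_STAR099 Test__ would leave the log as Test__Log.out in that directory.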