NationalGenomicsInfrastructure / piper

A genomics pipeline built on top of the GATK Queue framework

sthlm2UUSNP problem with dual-indexing

vezzi opened this issue

Hej Johan,
it looks like sthlm2UUSNP does not like dual indexing in the file name.

This is what one of the M.Kaller_14_06 samples looks like:

tree P1171_102/
P1171_102/
`-- A
    `-- 140702_AC41A2ANXX
        |-- P1171_102_ATTCAGAA-CCTATCCT_L001_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L001_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L002_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L002_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L003_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L003_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L004_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L004_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L005_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L005_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L006_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L006_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L007_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L007_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L008_R1_001.fastq.gz
        `-- P1171_102_ATTCAGAA-CCTATCCT_L008_R2_001.fastq.gz

When I try to convert this project/sample to UUSNPSEQ format I get the following:

sthlm2UUSNP -i /proj/a2010002/nobackup/NGI/analysis_ready/DATA/M.Kaller_14_06/ -o /proj/a2010002/nobackup/NGI/analysis_ready/ANALYSIS/M.Kaller_14_06_UUSNP
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Just one sample hit should be possible for regexp, found: 0
        at scala.Predef$.require(Predef.scala:233)
        at molmed.apps.Sthlm2UUSNP$.parseSampleInfoFromFileHierarchy(Sthlm2UUSNP.scala:164)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$9.apply(Sthlm2UUSNP.scala:241)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$9.apply(Sthlm2UUSNP.scala:240)
        at scala.collection.immutable.Stream.map(Stream.scala:376)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2.apply(Sthlm2UUSNP.scala:240)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2.apply(Sthlm2UUSNP.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1.apply(Sthlm2UUSNP.scala:234)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1.apply(Sthlm2UUSNP.scala:233)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1.apply(Sthlm2UUSNP.scala:233)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1.apply(Sthlm2UUSNP.scala:232)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$.runApp(Sthlm2UUSNP.scala:232)
        at molmed.apps.Sthlm2UUSNP$$anonfun$4.apply(Sthlm2UUSNP.scala:39)
        at molmed.apps.Sthlm2UUSNP$$anonfun$4.apply(Sthlm2UUSNP.scala:37)
        at scala.Option.map(Option.scala:145)
        at molmed.apps.Sthlm2UUSNP$delayedInit$body.apply(Sthlm2UUSNP.scala:37)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
        at scala.App$class.main(App.scala:71)
        at molmed.apps.Sthlm2UUSNP$.main(Sthlm2UUSNP.scala:16)
        at molmed.apps.Sthlm2UUSNP.main(Sthlm2UUSNP.scala)

If I remove the dual index, leaving only the first part of it, the tool works fine.

It is probably only a matter of changing a regular expression.

Yes. This is a problematic regexp. I'll fix it.

This is fixed in 094cafd.
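
For readers hitting the same error, here is a minimal sketch of how a file-name pattern can fail on dual indexes and how an optional second index fixes it. It is not the code from 094cafd: the thread does not show the actual regexp in Sthlm2UUSNP.scala, so both patterns and the helper `parse_fastq_name` are hypothetical, and the sketch is in Python rather than the tool's Scala. It assumes file names of the form shown in the tree listing above.

```python
import re

# Hypothetical pattern that only accepts a single index (e.g. ATTCAGAA).
# The real regexp used by sthlm2UUSNP is not shown in this thread.
SINGLE_INDEX = re.compile(
    r"^(?P<sample>\w+?)_(?P<index>[ACGTN]+)"
    r"_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)

# Same pattern, but with an optional second index after a hyphen
# (e.g. ATTCAGAA-CCTATCCT).
DUAL_INDEX = re.compile(
    r"^(?P<sample>\w+?)_(?P<index>[ACGTN]+(?:-[ACGTN]+)?)"
    r"_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)


def parse_fastq_name(name, pattern=DUAL_INDEX):
    """Return sample, index, lane and read parsed from a fastq file name."""
    match = pattern.match(name)
    if match is None:
        # Analogous to the "found: 0" requirement failure in the stack trace.
        raise ValueError(f"no sample hit for regexp: {name}")
    return match.groupdict()


dual = "P1171_102_ATTCAGAA-CCTATCCT_L001_R1_001.fastq.gz"
single = "P1171_102_ATTCAGAA_L001_R1_001.fastq.gz"

print(SINGLE_INDEX.match(dual))   # the single-index pattern rejects the hyphen
print(parse_fastq_name(dual))     # the dual-index pattern accepts it
print(parse_fastq_name(single))   # single-index names still parse
```

Making the second index optional keeps single-index names working, which matches the observation that the tool runs fine once the second half of the index is removed.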