shendurelab / MIPGEN

One stop MIP design and analysis

collapsing has terminated

LINLIUNIG opened this issue · comments

Hi Evan,
I recently ran the builder.py scripts several times and used the following sh script to process one sample:
script_dir=/home/llin/MIPGEN/tools/
fqprefix=/home/llin/fastqfile/trimmedto90/trim90/test/sampletest
pear_read1=/home/llin/fastqfile/trimmedto90/trim90/test/S1R1_90.fastq
pear_read2=/home/llin/fastqfile/trimmedto90/trim90/test/S1R2_90.fastq
mips=/home/llin/fastqfile/trimmedto90/trim90/MIPDE.txt
ext_tag_size=5
lig_tag_size=0
cut_read1=/home/llin/fastqfile/trimmedto90/trim90/test/sampletest.assembled.fastq
gref=/home/llin/fastqfile/trimmedto90/trim90/hsindex/hs.fsa
bamprefix=/home/llin/fastqfile/trimmedto90/trim90/test/sampletest.indexed.sort
pear -f $pear_read1 -r $pear_read2 -o $fqprefix &&
python ${script_dir}mipgen_fq_cutter_se.py $cut_read1 -m ${lig_tag_size},${ext_tag_size} -o $fqprefix &&
bwa mem $gref ${fqprefix}.indexed.fq > ${fqprefix}.indexed.sam &&
samtools view -bS ${fqprefix}.indexed.sam | samtools sort - $bamprefix &&
samtools view -h $bamprefix.bam | python ${script_dir}mipgen_smmip_collapser.py 5 $bamprefix.collapse -m $mips -f 1 -s &&
echo "analysis commands have terminated (successfully or otherwise)"

The program always prints "collapsing has terminated". What I don't understand is that all my target sequences end up in the file "*collapse.discordant_arms.sam", even though they should be on-target reads. As a result, the statistics output is empty. Could you please give me a hint as to how this could happen?
Many thanks,
Li

I'm not sure. Sometimes MIP probes are low quality or truncated, sometimes the capture fails because of low effective enzyme concentration, and sometimes sequencing quality is low. I would start by checking that the fastq reads actually look promising.
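
For a quick first look at the fastq files, something like the following might help. It is only a rough sketch and not part of MIPGEN: the function name `fastq_summary` and its parameters are made up for illustration, and it assumes standard four-line FASTQ records with Phred+33 quality encoding. It reports the number of reads sampled, their mean length, and their mean base quality.

```python
import gzip
from itertools import islice


def fastq_summary(path, max_reads=100000, phred_offset=33):
    """Return read count, mean read length, and mean base quality.

    Assumes four-line FASTQ records and Phred+33 qualities (an assumption;
    adjust phred_offset for older encodings). Only the first max_reads
    records are sampled.
    """
    opener = gzip.open if path.endswith(".gz") else open
    n_reads = 0
    total_bases = 0
    total_qual = 0
    with opener(path, "rt") as handle:
        while n_reads < max_reads:
            record = list(islice(handle, 4))
            if len(record) < 4:
                break  # end of file or truncated final record
            seq = record[1].strip()
            qual = record[3].strip()
            n_reads += 1
            total_bases += len(seq)
            total_qual += sum(ord(c) - phred_offset for c in qual)
    return {
        "reads": n_reads,
        "mean_length": total_bases / n_reads if n_reads else 0.0,
        "mean_quality": total_qual / total_bases if total_bases else 0.0,
    }
```

Calling `fastq_summary` on the two input files and on the assembled file (for example `sampletest.assembled.fastq`) gives a rough idea of how many reads survive merging and whether their lengths and qualities look plausible. Very short reads or low mean quality would point to a problem upstream of the collapser.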

You can also use a newer pipeline to process the data (the basic scripts I wrote are pretty old): https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007956