broadinstitute / Drop-seq

Java tools for analyzing Drop-seq data

DigitalExpression discrepancy v.1.13 -> v.2.0.0

Hoohm opened this issue

I'm testing v2.0.0 to integrate it into my pipeline, but I'm getting different UMI counts.

Here is some sampled data. The SAM file is the output of the final step and contains only one cell.
sample.txt

The file is attached as .txt because of GitHub's attachment restrictions.

This is the command I use:
DigitalExpression I=sample.sam O=res.tsv NUM_CORE_BARCODES=1

With version 1.13 I get:

GENE	ATGCG
Gm16547	1
Gm7266	1
Rps27-ps1	1
Snrpe	1
Txn-ps1	1

With version 2.0.0 I get:

GENE	ATGCG

I know that counting should be more restrictive, but I don't think this is the step that deals with the stringency. Am I missing something?

To make sure: you're saying you only get the header output with no results?

Exactly

You were right about the tagging: I was using the old version. I've switched to TagReadWithGeneFunction, but for some reason the new tags don't show up in the BAM file.
Here is one read as an example. Before tagging with TagReadWithGeneFunction:

HISEQ:185:H5VVMBCXY:1:1101:4571:2619	16	1	3595162	0	29S16M	*	0	0	CTATGGACCTAGACACTGCTCGCTCCCATCCATTAGATCCTGAAG	GGAGIIIGGGAIIIGGGGGGGGAGAGGAGAGG<GGIGGGIGAAGG	XC:Z:GAGTTMD:Z:16	PG:Z:STAR	RG:Z:A	NH:i:5	NM:i:0	XM:Z:TACACACAGA	UQ:i:0	AS:i:15

After:

HISEQ:185:H5VVMBCXY:1:1101:4571:2619	16	1	3595162	0	29S16M	*	0	0	CTATGGACCTAGACACTGCTCGCTCCCATCCATTAGATCCTGAAG	GGAGIIIGGGAIIIGGGGGGGGAGAGGAGAGG<GGIGGGIGAAGG	XC:Z:GAGTTMD:Z:16	GE:Z:Gm38148	XF:Z:CODING	PG:Z:STAR	RG:Z:A	NH:i:5	NM:i:0	XM:Z:TACACACAGA	UQ:i:0	AS:i:15	GS:Z:-

Command used:

TagReadWithGeneExonFunction INPUT=data/sample2.Aligned.merged.bam OUTPUT=data/sample2_gene_exon_tagged.bam ANNOTATIONS_FILE=ref/annotation.chr1.refFlat

I'll keep digging into it using the test data provided.
Has refFlat generation also changed in v2.0.0?

OK, my bad: I found out that I was using TagReadWithGeneExonFunction instead of TagReadWithGeneFunction.
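
For anyone who hits the same header-only output: a quick way to tell which tagger produced a BAM is to look at the optional tags on a few reads. Below is a minimal Python sketch that does this on SAM text (for example, the output of `samtools view`). The upper-case tag names (GE, XF, GS) are the ones visible on the tagged read earlier in this thread. The lower-case names (gn, gf, gs) are my assumption of what TagReadWithGeneFunction writes in v2.0.0, so check them against the documentation for your version. The function names are made up for illustration.

```python
from collections import Counter

# Tags seen on the read tagged with TagReadWithGeneExonFunction in this thread.
LEGACY_TAGS = {"GE", "XF", "GS"}
# ASSUMPTION: tag names written by TagReadWithGeneFunction; verify for your version.
NEW_TAGS = {"gn", "gf", "gs"}


def optional_tags(sam_line):
    """Return the set of optional tag names on one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional fields look like TAG:TYPE:VALUE.
    return {f.split(":", 1)[0] for f in fields[11:] if f.count(":") >= 2}


def tagging_style(sam_line):
    """Classify a read as 'new', 'legacy', 'mixed', or 'untagged'."""
    tags = optional_tags(sam_line)
    has_new = bool(tags & NEW_TAGS)        # SAM tag names are case-sensitive
    has_legacy = bool(tags & LEGACY_TAGS)
    if has_new and has_legacy:
        return "mixed"
    if has_new:
        return "new"
    if has_legacy:
        return "legacy"
    return "untagged"


def summarize(lines):
    """Count tagging styles over an iterable of SAM lines, skipping headers."""
    counts = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        counts[tagging_style(line)] += 1
    return counts
```

Calling `summarize(open("tagged.sam"))` on a SAM dump of the tagged file (the file name is a placeholder) gives a count per style. If everything comes back as "legacy", the reads were most likely tagged with the older program, which would be consistent with the empty result described above.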