broadinstitute / Drop-seq

Java tools for analyzing Drop-seq data

AssignCellsToSamples Empty Output

drneavin opened this issue

Affected tool(s)

AssignCellsToSamples

Affected version(s)

  • Latest public release version [Version:2.5.1(680c2ea_1642084299)]

Description

This might not be a bug, but I can't nail down what is causing this issue. After running AssignCellsToSamples on my VCF and BAM files, I receive empty outputs. Both files are on hg38 and use the same chromosome naming convention. At the end of the run it reports Processed [0] SNPs in BAM + VCF, but I'm unclear why there would be no overlap between them, as I have used this BAM and VCF with other demultiplexing tools. Here's the complete log:

[Tue Oct 11 10:37:04 AEDT 2022] AssignCellsToSamples --INPUT_BAM possorted_genome_bam.bam --VCF MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf --OUTPUT assignments.tsv.gz --VCF_OUTPUT out_vcf.vcf --CELL_BARCODE_TAG CB --MOLECULAR_BARCODE_TAG UB --CELL_BC_FILE barcodes.tsv --SAMPLE_FILE Individuals.txt --FUNCTION_TAG XF --EDIT_DISTANCE 1 --READ_MQ 10 --GQ_THRESHOLD 30 --RETAIN_MONOMORPIC_SNPS false --FRACTION_SAMPLES_PASSING 0.5 --IGNORED_CHROMOSOMES X --IGNORED_CHROMOSOMES Y --IGNORED_CHROMOSOMES MT --ADD_MISSING_VALUES true --DNA_MODE false --SNP_LOG_RATE 1000 --GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf --STRAND_STRATEGY SENSE --LOCUS_FUNCTION_LIST CODING --LOCUS_FUNCTION_LIST UTR --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Tue Oct 11 10:37:04 AEDT 2022] Executing as drenea@zeta-4-27.local on Linux 3.10.0-1160.42.2.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_101-b13; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.5.1(680c2ea_1642084299)
INFO    2022-10-11 10:37:04     AssignCellsToSamples    Number of contigs in common: 24.
Contigs only in BAM INPUT(S): GL000008.2, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000208.1, GL000213.1, GL000214.1, GL000216.2, GL000218.1, GL000219.1, GL000220.1, GL000221.1, GL000224.1, GL000225.1, GL000226.1, KI270302.1, KI270303.1, KI270304.1, KI270305.1, KI270310.1, KI270311.1, KI270312.1, KI270315.1, KI270316.1, KI270317.1, KI270320.1, KI270322.1, KI270329.1, KI270330.1, KI270333.1, KI270334.1, KI270335.1, KI270336.1, KI270337.1, KI270338.1, KI270340.1, KI270362.1, KI270363.1, KI270364.1, KI270366.1, KI270371.1, KI270372.1, KI270373.1, KI270374.1, KI270375.1, KI270376.1, KI270378.1, KI270379.1, KI270381.1, KI270382.1, KI270383.1, KI270384.1, KI270385.1, KI270386.1, KI270387.1, KI270388.1, KI270389.1, KI270390.1, KI270391.1, KI270392.1, KI270393.1, KI270394.1, KI270395.1, KI270396.1, KI270411.1, KI270412.1, KI270414.1, KI270417.1, KI270418.1, KI270419.1, KI270420.1, KI270422.1, KI270423.1, KI270424.1, KI270425.1, KI270429.1, KI270435.1, KI270438.1, KI270442.1, KI270448.1, KI270465.1, KI270466.1, KI270467.1, KI270468.1, KI270507.1, KI270508.1, KI270509.1, KI270510.1, KI270511.1, KI270512.1, KI270515.1, KI270516.1, KI270517.1, KI270518.1, KI270519.1, KI270521.1, KI270522.1, KI270528.1, KI270529.1, KI270530.1, KI270538.1, KI270539.1, KI270544.1, KI270548.1, KI270579.1, KI270580.1, KI270581.1, KI270582.1, KI270583.1, KI270584.1, KI270587.1, KI270588.1, KI270589.1, KI270590.1, KI270591.1, KI270593.1, KI270706.1, KI270707.1, KI270708.1, KI270709.1, KI270710.1, KI270711.1, KI270712.1, KI270713.1, KI270714.1, KI270715.1, KI270716.1, KI270717.1, KI270718.1, KI270719.1, KI270720.1, KI270721.1, KI270722.1, KI270723.1, KI270724.1, KI270725.1, KI270726.1, KI270727.1, KI270728.1, KI270729.1, KI270730.1, KI270731.1, KI270732.1, KI270733.1, KI270734.1, KI270735.1, KI270736.1, KI270737.1, KI270738.1, KI270739.1, KI270740.1, KI270741.1, KI270742.1, KI270743.1, KI270744.1, KI270745.1, KI270746.1, KI270747.1, KI270748.1, KI270749.1, KI270750.1, KI270751.1, KI270752.1, 
KI270753.1, KI270754.1, KI270755.1, KI270756.1, KI270757.1, MT
Contigs only in MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf: M

INFO    2022-10-11 10:37:04     AssignCellsToSamples    Genotype Quality [GQ] not found in header.  Disabling GQ_THRESHOLD parameter
INFO    2022-10-11 10:37:04     AssignCellsToSamples    Found 7 samples in VCF and requested sample list out of 7 requested
INFO    2022-10-11 10:37:04     AssignCellsToSamples    Found 7 samples in VCF and requested sample list out of 7 requested
INFO    2022-10-11 10:37:04     AssignCellsToSamples    Genotype Quality Filter disabled.  Enabling A/T, C/G SNP Filter to eliminate potential allele flipping variants
INFO    2022-10-11 10:37:04     AssignCellsToSamples    Scanning VCF to find potential SNP sites
INFO    2022-10-11 10:37:26     AssignCellsToSamples    Found [127971] potential SNP sites to query.
INFO    2022-10-11 10:37:26     AssignCellsToSamples    Found 7 samples in VCF and requested sample list out of 7 requested
INFO    2022-10-11 10:37:26     AssignCellsToSamples    Genotype Quality Filter disabled.  Enabling A/T, C/G SNP Filter to eliminate potential allele flipping variants
INFO    2022-10-11 10:37:26     AssignCellsToSamples    Found 912 cell barcodes in file
INFO    2022-10-11 10:37:31     SNPUMIBasePileupIterator        Processed     1,000,000 records.  Elapsed time: 00:00:04s.  Time for last 1,000,000:    4s.  Last read position: 1:23,695,822
INFO    2022-10-11 10:37:34     SNPUMIBasePileupIterator        Processed     2,000,000 records.  Elapsed time: 00:00:08s.  Time for last 1,000,000:    3s.  Last read position: 1:42,702,133
INFO    2022-10-11 10:37:38     SNPUMIBasePileupIterator        Processed     3,000,000 records.  Elapsed time: 00:00:12s.  Time for last 1,000,000:    3s.  Last read position: 1:85,583,464
INFO    2022-10-11 10:37:42     SNPUMIBasePileupIterator        Processed     4,000,000 records.  Elapsed time: 00:00:15s.  Time for last 1,000,000:    3s.  Last read position: 1:113,982,031
INFO    2022-10-11 10:37:45     SNPUMIBasePileupIterator        Processed     5,000,000 records.  Elapsed time: 00:00:19s.  Time for last 1,000,000:    3s.  Last read position: 1:153,543,642
INFO    2022-10-11 10:37:49     SNPUMIBasePileupIterator        Processed     6,000,000 records.  Elapsed time: 00:00:23s.  Time for last 1,000,000:    3s.  Last read position: 1:165,869,748
INFO    2022-10-11 10:37:53     SNPUMIBasePileupIterator        Processed     7,000,000 records.  Elapsed time: 00:00:27s.  Time for last 1,000,000:    3s.  Last read position: 10:1,040,160
INFO    2022-10-11 10:37:57     SNPUMIBasePileupIterator        Processed     8,000,000 records.  Elapsed time: 00:00:31s.  Time for last 1,000,000:    3s.  Last read position: 10:17,237,440
INFO    2022-10-11 10:38:01     SNPUMIBasePileupIterator        Processed     9,000,000 records.  Elapsed time: 00:00:35s.  Time for last 1,000,000:    3s.  Last read position: 10:96,567,344
INFO    2022-10-11 10:38:05     SNPUMIBasePileupIterator        Processed    10,000,000 records.  Elapsed time: 00:00:38s.  Time for last 1,000,000:    3s.  Last read position: 11:866,768
INFO    2022-10-11 10:38:09     SNPUMIBasePileupIterator        Processed    11,000,000 records.  Elapsed time: 00:00:42s.  Time for last 1,000,000:    3s.  Last read position: 11:33,708,963
INFO    2022-10-11 10:38:13     SNPUMIBasePileupIterator        Processed    12,000,000 records.  Elapsed time: 00:00:46s.  Time for last 1,000,000:    4s.  Last read position: 11:61,964,676
INFO    2022-10-11 10:38:17     SNPUMIBasePileupIterator        Processed    13,000,000 records.  Elapsed time: 00:00:50s.  Time for last 1,000,000:    4s.  Last read position: 11:62,690,367
INFO    2022-10-11 10:38:20     SNPUMIBasePileupIterator        Processed    14,000,000 records.  Elapsed time: 00:00:54s.  Time for last 1,000,000:    3s.  Last read position: 11:65,499,647
INFO    2022-10-11 10:38:24     SNPUMIBasePileupIterator        Processed    15,000,000 records.  Elapsed time: 00:00:58s.  Time for last 1,000,000:    3s.  Last read position: 11:75,404,743
INFO    2022-10-11 10:38:28     SNPUMIBasePileupIterator        Processed    16,000,000 records.  Elapsed time: 00:01:02s.  Time for last 1,000,000:    3s.  Last read position: 12:1,647,006
INFO    2022-10-11 10:38:32     SNPUMIBasePileupIterator        Processed    17,000,000 records.  Elapsed time: 00:01:06s.  Time for last 1,000,000:    3s.  Last read position: 12:48,938,714
INFO    2022-10-11 10:38:36     SNPUMIBasePileupIterator        Processed    18,000,000 records.  Elapsed time: 00:01:10s.  Time for last 1,000,000:    4s.  Last read position: 12:56,043,444
INFO    2022-10-11 10:38:40     SNPUMIBasePileupIterator        Processed    19,000,000 records.  Elapsed time: 00:01:14s.  Time for last 1,000,000:    3s.  Last read position: 12:66,057,608
INFO    2022-10-11 10:38:44     SNPUMIBasePileupIterator        Processed    20,000,000 records.  Elapsed time: 00:01:18s.  Time for last 1,000,000:    3s.  Last read position: 12:111,843,126
INFO    2022-10-11 10:38:48     SNPUMIBasePileupIterator        Processed    21,000,000 records.  Elapsed time: 00:01:22s.  Time for last 1,000,000:    3s.  Last read position: 12:124,911,844
INFO    2022-10-11 10:38:52     SNPUMIBasePileupIterator        Processed    22,000,000 records.  Elapsed time: 00:01:25s.  Time for last 1,000,000:    3s.  Last read position: 13:75,526,600
INFO    2022-10-11 10:38:55     SNPUMIBasePileupIterator        Processed    23,000,000 records.  Elapsed time: 00:01:29s.  Time for last 1,000,000:    3s.  Last read position: 14:65,075,265
INFO    2022-10-11 10:38:59     SNPUMIBasePileupIterator        Processed    24,000,000 records.  Elapsed time: 00:01:33s.  Time for last 1,000,000:    3s.  Last read position: 15:40,036,151
INFO    2022-10-11 10:39:03     SNPUMIBasePileupIterator        Processed    25,000,000 records.  Elapsed time: 00:01:37s.  Time for last 1,000,000:    3s.  Last read position: 15:60,347,492
INFO    2022-10-11 10:39:07     SNPUMIBasePileupIterator        Processed    26,000,000 records.  Elapsed time: 00:01:41s.  Time for last 1,000,000:    3s.  Last read position: 15:72,199,203
INFO    2022-10-11 10:39:11     SNPUMIBasePileupIterator        Processed    27,000,000 records.  Elapsed time: 00:01:45s.  Time for last 1,000,000:    4s.  Last read position: 16:1,962,138
INFO    2022-10-11 10:39:15     SNPUMIBasePileupIterator        Processed    28,000,000 records.  Elapsed time: 00:01:49s.  Time for last 1,000,000:    4s.  Last read position: 16:18,784,747
INFO    2022-10-11 10:39:19     SNPUMIBasePileupIterator        Processed    29,000,000 records.  Elapsed time: 00:01:53s.  Time for last 1,000,000:    3s.  Last read position: 16:81,084,481
INFO    2022-10-11 10:39:23     SNPUMIBasePileupIterator        Processed    30,000,000 records.  Elapsed time: 00:01:57s.  Time for last 1,000,000:    4s.  Last read position: 17:4,945,715
INFO    2022-10-11 10:39:27     SNPUMIBasePileupIterator        Processed    31,000,000 records.  Elapsed time: 00:02:01s.  Time for last 1,000,000:    3s.  Last read position: 17:19,446,180
INFO    2022-10-11 10:39:31     SNPUMIBasePileupIterator        Processed    32,000,000 records.  Elapsed time: 00:02:05s.  Time for last 1,000,000:    3s.  Last read position: 17:41,690,832
INFO    2022-10-11 10:39:35     SNPUMIBasePileupIterator        Processed    33,000,000 records.  Elapsed time: 00:02:09s.  Time for last 1,000,000:    3s.  Last read position: 17:75,135,402
INFO    2022-10-11 10:39:39     SNPUMIBasePileupIterator        Processed    34,000,000 records.  Elapsed time: 00:02:13s.  Time for last 1,000,000:    3s.  Last read position: 18:12,326,486
INFO    2022-10-11 10:39:43     SNPUMIBasePileupIterator        Processed    35,000,000 records.  Elapsed time: 00:02:17s.  Time for last 1,000,000:    4s.  Last read position: 19:2,272,974
INFO    2022-10-11 10:39:47     SNPUMIBasePileupIterator        Processed    36,000,000 records.  Elapsed time: 00:02:21s.  Time for last 1,000,000:    3s.  Last read position: 19:13,778,176
INFO    2022-10-11 10:39:51     SNPUMIBasePileupIterator        Processed    37,000,000 records.  Elapsed time: 00:02:25s.  Time for last 1,000,000:    3s.  Last read position: 19:37,564,365
INFO    2022-10-11 10:39:55     SNPUMIBasePileupIterator        Processed    38,000,000 records.  Elapsed time: 00:02:29s.  Time for last 1,000,000:    3s.  Last read position: 19:48,330,370
INFO    2022-10-11 10:39:59     SNPUMIBasePileupIterator        Processed    39,000,000 records.  Elapsed time: 00:02:33s.  Time for last 1,000,000:    3s.  Last read position: 19:48,966,653
INFO    2022-10-11 10:40:03     SNPUMIBasePileupIterator        Processed    40,000,000 records.  Elapsed time: 00:02:37s.  Time for last 1,000,000:    3s.  Last read position: 19:49,491,477
INFO    2022-10-11 10:40:07     SNPUMIBasePileupIterator        Processed    41,000,000 records.  Elapsed time: 00:02:41s.  Time for last 1,000,000:    4s.  Last read position: 2:3,576,624
INFO    2022-10-11 10:40:11     SNPUMIBasePileupIterator        Processed    42,000,000 records.  Elapsed time: 00:02:45s.  Time for last 1,000,000:    3s.  Last read position: 2:55,021,586
INFO    2022-10-11 10:40:15     SNPUMIBasePileupIterator        Processed    43,000,000 records.  Elapsed time: 00:02:48s.  Time for last 1,000,000:    3s.  Last read position: 2:96,593,401
INFO    2022-10-11 10:40:19     SNPUMIBasePileupIterator        Processed    44,000,000 records.  Elapsed time: 00:02:52s.  Time for last 1,000,000:    3s.  Last read position: 2:180,872,729
INFO    2022-10-11 10:40:23     SNPUMIBasePileupIterator        Processed    45,000,000 records.  Elapsed time: 00:02:56s.  Time for last 1,000,000:    3s.  Last read position: 2:237,098,356
INFO    2022-10-11 10:40:26     SNPUMIBasePileupIterator        Processed    46,000,000 records.  Elapsed time: 00:03:00s.  Time for last 1,000,000:    3s.  Last read position: 20:44,196,917
INFO    2022-10-11 10:40:30     SNPUMIBasePileupIterator        Processed    47,000,000 records.  Elapsed time: 00:03:04s.  Time for last 1,000,000:    3s.  Last read position: 21:34,239,107
INFO    2022-10-11 10:40:34     SNPUMIBasePileupIterator        Processed    48,000,000 records.  Elapsed time: 00:03:08s.  Time for last 1,000,000:    3s.  Last read position: 22:37,678,541
INFO    2022-10-11 10:40:38     SNPUMIBasePileupIterator        Processed    49,000,000 records.  Elapsed time: 00:03:12s.  Time for last 1,000,000:    3s.  Last read position: 22:39,313,213
INFO    2022-10-11 10:40:42     SNPUMIBasePileupIterator        Processed    50,000,000 records.  Elapsed time: 00:03:16s.  Time for last 1,000,000:    3s.  Last read position: 3:23,919,227
INFO    2022-10-11 10:40:46     SNPUMIBasePileupIterator        Processed    51,000,000 records.  Elapsed time: 00:03:20s.  Time for last 1,000,000:    4s.  Last read position: 3:49,357,204
INFO    2022-10-11 10:40:50     SNPUMIBasePileupIterator        Processed    52,000,000 records.  Elapsed time: 00:03:24s.  Time for last 1,000,000:    3s.  Last read position: 3:129,169,577
INFO    2022-10-11 10:40:54     SNPUMIBasePileupIterator        Processed    53,000,000 records.  Elapsed time: 00:03:28s.  Time for last 1,000,000:    3s.  Last read position: 4:6,717,527
INFO    2022-10-11 10:40:58     SNPUMIBasePileupIterator        Processed    54,000,000 records.  Elapsed time: 00:03:31s.  Time for last 1,000,000:    3s.  Last read position: 4:105,137,453
INFO    2022-10-11 10:41:02     SNPUMIBasePileupIterator        Processed    55,000,000 records.  Elapsed time: 00:03:35s.  Time for last 1,000,000:    3s.  Last read position: 5:14,651,757
INFO    2022-10-11 10:41:05     SNPUMIBasePileupIterator        Processed    56,000,000 records.  Elapsed time: 00:03:39s.  Time for last 1,000,000:    3s.  Last read position: 5:72,195,204
INFO    2022-10-11 10:41:09     SNPUMIBasePileupIterator        Processed    57,000,000 records.  Elapsed time: 00:03:42s.  Time for last 1,000,000:    3s.  Last read position: 5:134,606,541
INFO    2022-10-11 10:41:13     SNPUMIBasePileupIterator        Processed    58,000,000 records.  Elapsed time: 00:03:46s.  Time for last 1,000,000:    3s.  Last read position: 5:171,410,601
INFO    2022-10-11 10:41:16     SNPUMIBasePileupIterator        Processed    59,000,000 records.  Elapsed time: 00:03:50s.  Time for last 1,000,000:    3s.  Last read position: 6:26,138,783
INFO    2022-10-11 10:41:21     SNPUMIBasePileupIterator        Processed    60,000,000 records.  Elapsed time: 00:03:54s.  Time for last 1,000,000:    4s.  Last read position: 6:34,424,667
INFO    2022-10-11 10:41:25     SNPUMIBasePileupIterator        Processed    61,000,000 records.  Elapsed time: 00:03:58s.  Time for last 1,000,000:    3s.  Last read position: 6:73,517,681
INFO    2022-10-11 10:41:29     SNPUMIBasePileupIterator        Processed    62,000,000 records.  Elapsed time: 00:04:02s.  Time for last 1,000,000:    3s.  Last read position: 6:151,097,030
INFO    2022-10-11 10:41:32     SNPUMIBasePileupIterator        Processed    63,000,000 records.  Elapsed time: 00:04:06s.  Time for last 1,000,000:    3s.  Last read position: 7:22,510,435
INFO    2022-10-11 10:41:36     SNPUMIBasePileupIterator        Processed    64,000,000 records.  Elapsed time: 00:04:10s.  Time for last 1,000,000:    3s.  Last read position: 7:76,304,049
INFO    2022-10-11 10:41:40     SNPUMIBasePileupIterator        Processed    65,000,000 records.  Elapsed time: 00:04:13s.  Time for last 1,000,000:    3s.  Last read position: 7:116,559,264
INFO    2022-10-11 10:41:44     SNPUMIBasePileupIterator        Processed    66,000,000 records.  Elapsed time: 00:04:17s.  Time for last 1,000,000:    3s.  Last read position: 8:30,117,511
INFO    2022-10-11 10:41:48     SNPUMIBasePileupIterator        Processed    67,000,000 records.  Elapsed time: 00:04:21s.  Time for last 1,000,000:    3s.  Last read position: 8:99,878,190
INFO    2022-10-11 10:41:52     SNPUMIBasePileupIterator        Processed    68,000,000 records.  Elapsed time: 00:04:25s.  Time for last 1,000,000:    4s.  Last read position: 9:19,376,289
INFO    2022-10-11 10:41:55     SNPUMIBasePileupIterator        Processed    69,000,000 records.  Elapsed time: 00:04:29s.  Time for last 1,000,000:    3s.  Last read position: 9:87,731,195
INFO    2022-10-11 10:41:59     SNPUMIBasePileupIterator        Processed    70,000,000 records.  Elapsed time: 00:04:33s.  Time for last 1,000,000:    3s.  Last read position: 9:128,342,671
INFO    2022-10-11 10:42:03     SNPUMIBasePileupIterator        Processed    71,000,000 records.  Elapsed time: 00:04:37s.  Time for last 1,000,000:    3s.  Last read position: 9:136,940,631
INFO    2022-10-11 10:42:06     SNPUMIBasePileupIterator        Processed    72,000,000 records.  Elapsed time: 00:04:40s.  Time for last 1,000,000:    3s.  Last read position: MT:15,767
INFO    2022-10-11 10:42:10     SNPUMIBasePileupIterator        Processed    73,000,000 records.  Elapsed time: 00:04:44s.  Time for last 1,000,000:    3s.  Last read position: X:12,977,073
INFO    2022-10-11 10:42:14     SNPUMIBasePileupIterator        Processed    74,000,000 records.  Elapsed time: 00:04:48s.  Time for last 1,000,000:    3s.  Last read position: X:72,272,737
INFO    2022-10-11 10:42:18     SNPUMIBasePileupIterator        Processed    75,000,000 records.  Elapsed time: 00:04:51s.  Time for last 1,000,000:    3s.  Last read position: X:154,400,612
INFO    2022-10-11 10:42:21     SNPUMIBasePileupIterator        Processed    76,000,000 records.  Elapsed time: 00:04:55s.  Time for last 1,000,000:    3s.  Last read position: */*
INFO    2022-10-11 10:42:24     SNPUMIBasePileupIterator        Processed    77,000,000 records.  Elapsed time: 00:04:58s.  Time for last 1,000,000:    2s.  Last read position: */*
INFO    2022-10-11 10:42:27     SNPUMIBasePileupIterator        Processed    78,000,000 records.  Elapsed time: 00:05:01s.  Time for last 1,000,000:    2s.  Last read position: */*
INFO    2022-10-11 10:42:29     AssignCellsToSamples    Processed [0] SNPs in BAM + VCF
INFO    2022-10-11 10:42:29     AssignCellsToSamples    Finished!
[Tue Oct 11 10:42:29 AEDT 2022] org.broadinstitute.dropseqrna.barnyard.digitalallelecounts.sampleassignment.AssignCellsToSamples done. Elapsed time: 5.42 minutes.
Runtime.totalMemory()=1682964480

This is the complete output for the assignments file:

#INPUT_BAM=/directflow/SCCGGroupShare/projects/data/experimental_data/CLEAN/scFibroblast_EQTL/scFibroblast_EQTL_Sample1_V1/outs/possorted_genome_bam.bam      INPUT_VCF=/directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/data/fibroblasts/Imputed/MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf   DONOR_FILE=/directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/output/fibroblasts/scFibroblast_EQTL_Sample1/popscle/Individuals.txt        CELL_BC_FILE=/directflow/SCCGGroupShare/projects/data/experimental_data/CLEAN/scFibroblast_EQTL/scFibroblast_EQTL_Sample1_V1/outs/filtered_gene_bc_matrices/Homo_sapiens_GRCh38p10/barcodes.tsv     GQ_THRESHOLD=-1 FRACTION_SAMPLES_PASSING=0.5    READ_MQ=10      FIXED_ERROR_RATE=NA     MAX_ERROR_RATE=NA     LOCUS_FUNCTION=[CODING, UTR]
cell    num_snps        num_umis        ratio   pvalue  FDR_pvalue      bestLikelihood  bestSample      median_likelihood     population_average_likelihood   110_THBP-292    326_THBP-516    195_THBP-453    28_THBP-114  11_THBP-54       181_THBP-178    53_THBP-25

and the output vcf file:

##fileformat=VCFv4.2
##FILTER=<ID=GENOTYPED,Description="Site was genotyped">
##FILTER=<ID=PASS,Description="All filters passed">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated Alternate Allele Dosage : [P(0/1)+2*P(1/1)]">
##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated Posterior Probabilities for Genotypes 0/0, 0/1 and 1/1">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy">
##INFO=<ID=ReverseComplementedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been reverse complemented in liftover since the mapping from the previous reference to the current one was on the negative strand.">
##INFO=<ID=SwappedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been swapped in liftover due to changes in the reference. It is possible that not all INFO annotations reflect this swap, and in the genotypes, only the GT, PL, and AD fields have been modified. You should check the TAGS_TO_REVERSE parameter that was used during the LiftOver to be sure.">
##bcftools_filterCommand=filter --include 'MAF>=0.05 & R2>=0.3' -O v --output /directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/data/fibroblasts/Imputed/MAF0.05.dose.vcf /directflow/SCCGGroupShare/projects/DrewNeavin/Fibroblast_eQTLs/data/fibroblast/merged_imputed_AllChrs_includingChr10.vcf.gz; Date=Wed Nov  4 09:23:13 2020
##bcftools_filterVersion=1.9+htslib-1.9
##contig=<ID=1,length=248956422>
##contig=<ID=10,length=133797422>
##contig=<ID=11,length=135086622>
##contig=<ID=12,length=133275309>
##contig=<ID=13,length=114364328>
##contig=<ID=14,length=107043718>
##contig=<ID=15,length=101991189>
##contig=<ID=16,length=90338345>
##contig=<ID=17,length=83257441>
##contig=<ID=18,length=80373285>
##contig=<ID=19,length=58617616>
##contig=<ID=2,length=242193529>
##contig=<ID=20,length=64444167>
##contig=<ID=21,length=46709983>
##contig=<ID=22,length=50818468>
##contig=<ID=3,length=198295559>
##contig=<ID=4,length=190214555>
##contig=<ID=5,length=181538259>
##contig=<ID=6,length=170805979>
##contig=<ID=7,length=159345973>
##contig=<ID=8,length=145138636>
##contig=<ID=9,length=138394717>
##contig=<ID=M,length=16569>
##contig=<ID=X,length=156040895>
##contig=<ID=Y,length=57227415>
##filedate=2017.9.4
##reference=file:/directflow/SCCGGroupShare/projects/DrewNeavin/References/UCSCrefs/hg38/hg38.fa
##source=Minimac3
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  110_THBP-292    326_THBP-516    195_THBP-453  28_THBP-114     11_THBP-54      181_THBP-178    53_THBP-25

Steps to reproduce

The command used is:

AssignCellsToSamples --INPUT_BAM possorted_genome_bam.bam --VCF MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf --OUTPUT assignments.tsv.gz --VCF_OUTPUT out_vcf.vcf --CELL_BARCODE_TAG CB --MOLECULAR_BARCODE_TAG UB --CELL_BC_FILE barcodes.tsv --SAMPLE_FILE Individuals.txt --FUNCTION_TAG XF --EDIT_DISTANCE 1 --READ_MQ 10 --GQ_THRESHOLD 30 --RETAIN_MONOMORPIC_SNPS false --FRACTION_SAMPLES_PASSING 0.5 --IGNORED_CHROMOSOMES X --IGNORED_CHROMOSOMES Y --IGNORED_CHROMOSOMES MT --ADD_MISSING_VALUES true --DNA_MODE false --SNP_LOG_RATE 1000 --GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf --STRAND_STRATEGY SENSE --LOCUS_FUNCTION_LIST CODING --LOCUS_FUNCTION_LIST UTR --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false

Here's an example of the bam file:

NS500239:222:HMTLVBGX2:2:21207:6232:7464        272     1       11879   1       98M     *       0       0       CGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATCTGCTGTCTCTTAGCCCAGACTTCCCGTGTCCTTTCCACCGGGCCTTTGGGAGGTC      EE/A/EEE/AEEE/EEEEEEEEEEE//AE<//A/EAEEAEEEEEEEAE/EEAEEEAEE/EEEEEEEEEE/E/EEEEAEE<EEE</EEE<AEEEAAAAA      NH:i:4  HI:i:2  AS:i:92 nM:i:2  RE:A:I  CR:Z:CGGCGCCGGTGTTT     CY:Z:/A6///6/A/E6//     UR:Z:CCCCCCACCG UY:Z:AA/AA///EE UB:Z:CCCCCCACCG BC:Z:ACGCAGCT   QT:Z:AAAA6EEE   RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:2
NS500239:222:HMTLVBGX2:3:22610:4588:7781        272     1       12107   0       1S94M3S *       0       0       CGGGCCATTGTGCATATTCTGGCCCCTGTTGTCTGCATGTAACCTAATACCACGACCAGGCATGGGGGAAAGATTGGAGGAAAGTTGAGTGAGAGGAC      /EE////A<////E/EAEEAEE/A//AAA/E//AEEA/E6AAEE/AE/EAEEAE/EE6/AEEEE/<//EEAE//AEEEEEEEEEEEEEE/EEEAAA//      NH:i:6  HI:i:2  AS:i:80 nM:i:6  RE:A:I  CR:Z:CGTGGACTCGGTAG     CY:Z:////A6E6E6/E//     UR:Z:CCACCCACCG UY:Z:AA<AAE/EEE UB:Z:CCACCCACCG BC:Z:TATGGTTG   QT:Z:/6//AEAA   RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:3
NS500239:222:HMTLVBGX2:4:23611:25156:7680       256     1       14089   0       90M8S   *       0       0       CTTGAGCAAACTCCAAGACATCTTCTACCCCACCACCAGCAATTGTGCCAAGGGCCATTAGGCTCTCAGCATGACTATTTTTAGAGCCCCGTGTAGGT      /AAA/AEEEE<AEEEEE6AEE/EEEEAE</EA//EEEAAEEEE//EE//EEAEEAEEEEEEE/A<EE////E/EEEEAE/6EEEAE/AEE/EEE//AE      NH:i:6  HI:i:2  AS:i:84 nM:i:2  RE:A:I  CR:Z:CCACGGGATATCTC     CY:Z://6//////A//6/     CB:Z:CCACGGGATATCTC-1   UR:Z:TTCAATGCCT UY:Z:AAAAAEEEEE UB:Z:TTCAATGCCT BC:Z:ACGCAGCT   QT:Z:/AA6AE/E   RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:4
NS500239:222:HMTLVBGX2:4:23603:13566:2314       256     1       14116   0       1S96M1S *       0       0       GCCCCAACACAAGCAATTGTGCCAAGGTCCATTAGGCTCTCAGCATGACTATTTTTAGATACCCCGTTTCTGTCACTGAAACCGTTTTTGTGGGAGAA      AAAAA6EEEA/E/EEAE/EE/A/AE/////EAEA//EA<E/E//A//EEEE/E///EEE/AA///A6//<EE/EE<</EE/E//<E/E/EE///EE/A      NH:i:5  HI:i:2  AS:i:84 nM:i:5  RE:A:I  CR:Z:CGATAACACCATTC     CY:Z:/A6///////6/A/     UR:Z:CCGTGAAGCC UY:Z:AA//A/A/AE UB:Z:CCGTGAAGCC BC:Z:GATGGTTG   QT:Z:AAAA/AE6   RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:4

I probably can't share genotype data easily due to ethics but please let me know if any files would be helpful in reproducing this.

Expected behavior

I would expect the resulting vcf and assignment files to have the expected genotype and individual assignment contents.

Actual behavior

The vcf and assignment files output by AssignCellsToSamples are empty.

Really appreciate your help and let me know if I can provide any additional information that could help identify the issue I am facing.

Since I can't see the entire file, I have to guess at what's wrong. In the BAM records you posted, I don't see the tags we typically use for gene function annotation: gn (gene name), gs (gene strand), and gf (gene function). I would pre-process the BAM with TagReadWithGeneFunction first (with your GTF of choice; 10x's should be fine), and then run AssignCellsToSamples on that output.

Yes, that seems to have resolved it. Thanks for your assistance!
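
For anyone hitting the same Processed [0] SNPs symptom, the diagnosis in this thread (reads lacking the gn, gs, and gf tags) can be checked before rerunning. The following is only a rough sketch, not part of Drop-seq: it assumes tab-delimited SAM text as input, takes the tag names from the --GENE_NAME_TAG, --GENE_STRAND_TAG, and --GENE_FUNCTION_TAG values in the command above, and the function names are made up for illustration.

```python
# Tag names taken from the command line in this thread
# (--GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf).
REQUIRED_TAGS = ("gn", "gs", "gf")


def optional_tags(sam_line):
    """Return the set of optional tag names on one tab-delimited SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 SAM columns are mandatory; the rest are TAG:TYPE:VALUE.
    return {field.split(":", 1)[0] for field in fields[11:] if ":" in field}


def count_required_tags(sam_lines, required=REQUIRED_TAGS):
    """Count alignment records and how many carry each required tag."""
    counts = {tag: 0 for tag in required}
    total = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        present = optional_tags(line)
        for tag in required:
            if tag in present:
                counts[tag] += 1
    return total, counts
```

One way to use it is to pipe a sample of records from samtools view into a small script that calls count_required_tags(sys.stdin) and prints the result. If every count is 0, as it would be for the four records pasted above, the BAM most likely still needs the gene annotation step suggested in the reply.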