AssignCellsToSamples Empty Output
drneavin opened this issue · comments
Affected tool(s)
AssignCellsToSamples
Affected version(s)
- Latest public release version [Version:2.5.1(680c2ea_1642084299)]
Description
This might not be a bug, but I can't pin down what is causing this issue. After running AssignCellsToSamples on my VCF and BAM files, I receive empty outputs. Both files are aligned to hg38 and use the same chromosome naming. At the end of the run it reports Processed [0] SNPs in BAM + VCF, but I don't see why there would be no overlap between them, since I have used this BAM and VCF with other demultiplexing tools. Here's the complete log:
[Tue Oct 11 10:37:04 AEDT 2022] AssignCellsToSamples --INPUT_BAM possorted_genome_bam.bam --VCF MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf --OUTPUT assignments.tsv.gz --VCF_OUTPUT out_vcf.vcf --CELL_BARCODE_TAG CB --MOLECULAR_BARCODE_TAG UB --CELL_BC_FILE barcodes.tsv --SAMPLE_FILE Individuals.txt --FUNCTION_TAG XF --EDIT_DISTANCE 1 --READ_MQ 10 --GQ_THRESHOLD 30 --RETAIN_MONOMORPIC_SNPS false --FRACTION_SAMPLES_PASSING 0.5 --IGNORED_CHROMOSOMES X --IGNORED_CHROMOSOMES Y --IGNORED_CHROMOSOMES MT --ADD_MISSING_VALUES true --DNA_MODE false --SNP_LOG_RATE 1000 --GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf --STRAND_STRATEGY SENSE --LOCUS_FUNCTION_LIST CODING --LOCUS_FUNCTION_LIST UTR --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Tue Oct 11 10:37:04 AEDT 2022] Executing as drenea@zeta-4-27.local on Linux 3.10.0-1160.42.2.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_101-b13; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.5.1(680c2ea_1642084299)
INFO 2022-10-11 10:37:04 AssignCellsToSamples Number of contigs in common: 24.
Contigs only in BAM INPUT(S): GL000008.2, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000208.1, GL000213.1, GL000214.1, GL000216.2, GL000218.1, GL000219.1, GL000220.1, GL000221.1, GL000224.1, GL000225.1, GL000226.1, KI270302.1, KI270303.1, KI270304.1, KI270305.1, KI270310.1, KI270311.1, KI270312.1, KI270315.1, KI270316.1, KI270317.1, KI270320.1, KI270322.1, KI270329.1, KI270330.1, KI270333.1, KI270334.1, KI270335.1, KI270336.1, KI270337.1, KI270338.1, KI270340.1, KI270362.1, KI270363.1, KI270364.1, KI270366.1, KI270371.1, KI270372.1, KI270373.1, KI270374.1, KI270375.1, KI270376.1, KI270378.1, KI270379.1, KI270381.1, KI270382.1, KI270383.1, KI270384.1, KI270385.1, KI270386.1, KI270387.1, KI270388.1, KI270389.1, KI270390.1, KI270391.1, KI270392.1, KI270393.1, KI270394.1, KI270395.1, KI270396.1, KI270411.1, KI270412.1, KI270414.1, KI270417.1, KI270418.1, KI270419.1, KI270420.1, KI270422.1, KI270423.1, KI270424.1, KI270425.1, KI270429.1, KI270435.1, KI270438.1, KI270442.1, KI270448.1, KI270465.1, KI270466.1, KI270467.1, KI270468.1, KI270507.1, KI270508.1, KI270509.1, KI270510.1, KI270511.1, KI270512.1, KI270515.1, KI270516.1, KI270517.1, KI270518.1, KI270519.1, KI270521.1, KI270522.1, KI270528.1, KI270529.1, KI270530.1, KI270538.1, KI270539.1, KI270544.1, KI270548.1, KI270579.1, KI270580.1, KI270581.1, KI270582.1, KI270583.1, KI270584.1, KI270587.1, KI270588.1, KI270589.1, KI270590.1, KI270591.1, KI270593.1, KI270706.1, KI270707.1, KI270708.1, KI270709.1, KI270710.1, KI270711.1, KI270712.1, KI270713.1, KI270714.1, KI270715.1, KI270716.1, KI270717.1, KI270718.1, KI270719.1, KI270720.1, KI270721.1, KI270722.1, KI270723.1, KI270724.1, KI270725.1, KI270726.1, KI270727.1, KI270728.1, KI270729.1, KI270730.1, KI270731.1, KI270732.1, KI270733.1, KI270734.1, KI270735.1, KI270736.1, KI270737.1, KI270738.1, KI270739.1, KI270740.1, KI270741.1, KI270742.1, KI270743.1, KI270744.1, KI270745.1, KI270746.1, KI270747.1, KI270748.1, KI270749.1, KI270750.1, KI270751.1, KI270752.1, 
KI270753.1, KI270754.1, KI270755.1, KI270756.1, KI270757.1, MT
Contigs only in MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf: M
INFO 2022-10-11 10:37:04 AssignCellsToSamples Genotype Quality [GQ] not found in header. Disabling GQ_THRESHOLD parameter
INFO 2022-10-11 10:37:04 AssignCellsToSamples Found 7 samples in VCF and requested sample list out of 7 requested
INFO 2022-10-11 10:37:04 AssignCellsToSamples Found 7 samples in VCF and requested sample list out of 7 requested
INFO 2022-10-11 10:37:04 AssignCellsToSamples Genotype Quality Filter disabled. Enabling A/T, C/G SNP Filter to eliminate potential allele flipping variants
INFO 2022-10-11 10:37:04 AssignCellsToSamples Scanning VCF to find potential SNP sites
INFO 2022-10-11 10:37:26 AssignCellsToSamples Found [127971] potential SNP sites to query.
INFO 2022-10-11 10:37:26 AssignCellsToSamples Found 7 samples in VCF and requested sample list out of 7 requested
INFO 2022-10-11 10:37:26 AssignCellsToSamples Genotype Quality Filter disabled. Enabling A/T, C/G SNP Filter to eliminate potential allele flipping variants
INFO 2022-10-11 10:37:26 AssignCellsToSamples Found 912 cell barcodes in file
INFO 2022-10-11 10:37:31 SNPUMIBasePileupIterator Processed 1,000,000 records. Elapsed time: 00:00:04s. Time for last 1,000,000: 4s. Last read position: 1:23,695,822
INFO 2022-10-11 10:37:34 SNPUMIBasePileupIterator Processed 2,000,000 records. Elapsed time: 00:00:08s. Time for last 1,000,000: 3s. Last read position: 1:42,702,133
INFO 2022-10-11 10:37:38 SNPUMIBasePileupIterator Processed 3,000,000 records. Elapsed time: 00:00:12s. Time for last 1,000,000: 3s. Last read position: 1:85,583,464
INFO 2022-10-11 10:37:42 SNPUMIBasePileupIterator Processed 4,000,000 records. Elapsed time: 00:00:15s. Time for last 1,000,000: 3s. Last read position: 1:113,982,031
INFO 2022-10-11 10:37:45 SNPUMIBasePileupIterator Processed 5,000,000 records. Elapsed time: 00:00:19s. Time for last 1,000,000: 3s. Last read position: 1:153,543,642
INFO 2022-10-11 10:37:49 SNPUMIBasePileupIterator Processed 6,000,000 records. Elapsed time: 00:00:23s. Time for last 1,000,000: 3s. Last read position: 1:165,869,748
INFO 2022-10-11 10:37:53 SNPUMIBasePileupIterator Processed 7,000,000 records. Elapsed time: 00:00:27s. Time for last 1,000,000: 3s. Last read position: 10:1,040,160
INFO 2022-10-11 10:37:57 SNPUMIBasePileupIterator Processed 8,000,000 records. Elapsed time: 00:00:31s. Time for last 1,000,000: 3s. Last read position: 10:17,237,440
INFO 2022-10-11 10:38:01 SNPUMIBasePileupIterator Processed 9,000,000 records. Elapsed time: 00:00:35s. Time for last 1,000,000: 3s. Last read position: 10:96,567,344
INFO 2022-10-11 10:38:05 SNPUMIBasePileupIterator Processed 10,000,000 records. Elapsed time: 00:00:38s. Time for last 1,000,000: 3s. Last read position: 11:866,768
INFO 2022-10-11 10:38:09 SNPUMIBasePileupIterator Processed 11,000,000 records. Elapsed time: 00:00:42s. Time for last 1,000,000: 3s. Last read position: 11:33,708,963
INFO 2022-10-11 10:38:13 SNPUMIBasePileupIterator Processed 12,000,000 records. Elapsed time: 00:00:46s. Time for last 1,000,000: 4s. Last read position: 11:61,964,676
INFO 2022-10-11 10:38:17 SNPUMIBasePileupIterator Processed 13,000,000 records. Elapsed time: 00:00:50s. Time for last 1,000,000: 4s. Last read position: 11:62,690,367
INFO 2022-10-11 10:38:20 SNPUMIBasePileupIterator Processed 14,000,000 records. Elapsed time: 00:00:54s. Time for last 1,000,000: 3s. Last read position: 11:65,499,647
INFO 2022-10-11 10:38:24 SNPUMIBasePileupIterator Processed 15,000,000 records. Elapsed time: 00:00:58s. Time for last 1,000,000: 3s. Last read position: 11:75,404,743
INFO 2022-10-11 10:38:28 SNPUMIBasePileupIterator Processed 16,000,000 records. Elapsed time: 00:01:02s. Time for last 1,000,000: 3s. Last read position: 12:1,647,006
INFO 2022-10-11 10:38:32 SNPUMIBasePileupIterator Processed 17,000,000 records. Elapsed time: 00:01:06s. Time for last 1,000,000: 3s. Last read position: 12:48,938,714
INFO 2022-10-11 10:38:36 SNPUMIBasePileupIterator Processed 18,000,000 records. Elapsed time: 00:01:10s. Time for last 1,000,000: 4s. Last read position: 12:56,043,444
INFO 2022-10-11 10:38:40 SNPUMIBasePileupIterator Processed 19,000,000 records. Elapsed time: 00:01:14s. Time for last 1,000,000: 3s. Last read position: 12:66,057,608
INFO 2022-10-11 10:38:44 SNPUMIBasePileupIterator Processed 20,000,000 records. Elapsed time: 00:01:18s. Time for last 1,000,000: 3s. Last read position: 12:111,843,126
INFO 2022-10-11 10:38:48 SNPUMIBasePileupIterator Processed 21,000,000 records. Elapsed time: 00:01:22s. Time for last 1,000,000: 3s. Last read position: 12:124,911,844
INFO 2022-10-11 10:38:52 SNPUMIBasePileupIterator Processed 22,000,000 records. Elapsed time: 00:01:25s. Time for last 1,000,000: 3s. Last read position: 13:75,526,600
INFO 2022-10-11 10:38:55 SNPUMIBasePileupIterator Processed 23,000,000 records. Elapsed time: 00:01:29s. Time for last 1,000,000: 3s. Last read position: 14:65,075,265
INFO 2022-10-11 10:38:59 SNPUMIBasePileupIterator Processed 24,000,000 records. Elapsed time: 00:01:33s. Time for last 1,000,000: 3s. Last read position: 15:40,036,151
INFO 2022-10-11 10:39:03 SNPUMIBasePileupIterator Processed 25,000,000 records. Elapsed time: 00:01:37s. Time for last 1,000,000: 3s. Last read position: 15:60,347,492
INFO 2022-10-11 10:39:07 SNPUMIBasePileupIterator Processed 26,000,000 records. Elapsed time: 00:01:41s. Time for last 1,000,000: 3s. Last read position: 15:72,199,203
INFO 2022-10-11 10:39:11 SNPUMIBasePileupIterator Processed 27,000,000 records. Elapsed time: 00:01:45s. Time for last 1,000,000: 4s. Last read position: 16:1,962,138
INFO 2022-10-11 10:39:15 SNPUMIBasePileupIterator Processed 28,000,000 records. Elapsed time: 00:01:49s. Time for last 1,000,000: 4s. Last read position: 16:18,784,747
INFO 2022-10-11 10:39:19 SNPUMIBasePileupIterator Processed 29,000,000 records. Elapsed time: 00:01:53s. Time for last 1,000,000: 3s. Last read position: 16:81,084,481
INFO 2022-10-11 10:39:23 SNPUMIBasePileupIterator Processed 30,000,000 records. Elapsed time: 00:01:57s. Time for last 1,000,000: 4s. Last read position: 17:4,945,715
INFO 2022-10-11 10:39:27 SNPUMIBasePileupIterator Processed 31,000,000 records. Elapsed time: 00:02:01s. Time for last 1,000,000: 3s. Last read position: 17:19,446,180
INFO 2022-10-11 10:39:31 SNPUMIBasePileupIterator Processed 32,000,000 records. Elapsed time: 00:02:05s. Time for last 1,000,000: 3s. Last read position: 17:41,690,832
INFO 2022-10-11 10:39:35 SNPUMIBasePileupIterator Processed 33,000,000 records. Elapsed time: 00:02:09s. Time for last 1,000,000: 3s. Last read position: 17:75,135,402
INFO 2022-10-11 10:39:39 SNPUMIBasePileupIterator Processed 34,000,000 records. Elapsed time: 00:02:13s. Time for last 1,000,000: 3s. Last read position: 18:12,326,486
INFO 2022-10-11 10:39:43 SNPUMIBasePileupIterator Processed 35,000,000 records. Elapsed time: 00:02:17s. Time for last 1,000,000: 4s. Last read position: 19:2,272,974
INFO 2022-10-11 10:39:47 SNPUMIBasePileupIterator Processed 36,000,000 records. Elapsed time: 00:02:21s. Time for last 1,000,000: 3s. Last read position: 19:13,778,176
INFO 2022-10-11 10:39:51 SNPUMIBasePileupIterator Processed 37,000,000 records. Elapsed time: 00:02:25s. Time for last 1,000,000: 3s. Last read position: 19:37,564,365
INFO 2022-10-11 10:39:55 SNPUMIBasePileupIterator Processed 38,000,000 records. Elapsed time: 00:02:29s. Time for last 1,000,000: 3s. Last read position: 19:48,330,370
INFO 2022-10-11 10:39:59 SNPUMIBasePileupIterator Processed 39,000,000 records. Elapsed time: 00:02:33s. Time for last 1,000,000: 3s. Last read position: 19:48,966,653
INFO 2022-10-11 10:40:03 SNPUMIBasePileupIterator Processed 40,000,000 records. Elapsed time: 00:02:37s. Time for last 1,000,000: 3s. Last read position: 19:49,491,477
INFO 2022-10-11 10:40:07 SNPUMIBasePileupIterator Processed 41,000,000 records. Elapsed time: 00:02:41s. Time for last 1,000,000: 4s. Last read position: 2:3,576,624
INFO 2022-10-11 10:40:11 SNPUMIBasePileupIterator Processed 42,000,000 records. Elapsed time: 00:02:45s. Time for last 1,000,000: 3s. Last read position: 2:55,021,586
INFO 2022-10-11 10:40:15 SNPUMIBasePileupIterator Processed 43,000,000 records. Elapsed time: 00:02:48s. Time for last 1,000,000: 3s. Last read position: 2:96,593,401
INFO 2022-10-11 10:40:19 SNPUMIBasePileupIterator Processed 44,000,000 records. Elapsed time: 00:02:52s. Time for last 1,000,000: 3s. Last read position: 2:180,872,729
INFO 2022-10-11 10:40:23 SNPUMIBasePileupIterator Processed 45,000,000 records. Elapsed time: 00:02:56s. Time for last 1,000,000: 3s. Last read position: 2:237,098,356
INFO 2022-10-11 10:40:26 SNPUMIBasePileupIterator Processed 46,000,000 records. Elapsed time: 00:03:00s. Time for last 1,000,000: 3s. Last read position: 20:44,196,917
INFO 2022-10-11 10:40:30 SNPUMIBasePileupIterator Processed 47,000,000 records. Elapsed time: 00:03:04s. Time for last 1,000,000: 3s. Last read position: 21:34,239,107
INFO 2022-10-11 10:40:34 SNPUMIBasePileupIterator Processed 48,000,000 records. Elapsed time: 00:03:08s. Time for last 1,000,000: 3s. Last read position: 22:37,678,541
INFO 2022-10-11 10:40:38 SNPUMIBasePileupIterator Processed 49,000,000 records. Elapsed time: 00:03:12s. Time for last 1,000,000: 3s. Last read position: 22:39,313,213
INFO 2022-10-11 10:40:42 SNPUMIBasePileupIterator Processed 50,000,000 records. Elapsed time: 00:03:16s. Time for last 1,000,000: 3s. Last read position: 3:23,919,227
INFO 2022-10-11 10:40:46 SNPUMIBasePileupIterator Processed 51,000,000 records. Elapsed time: 00:03:20s. Time for last 1,000,000: 4s. Last read position: 3:49,357,204
INFO 2022-10-11 10:40:50 SNPUMIBasePileupIterator Processed 52,000,000 records. Elapsed time: 00:03:24s. Time for last 1,000,000: 3s. Last read position: 3:129,169,577
INFO 2022-10-11 10:40:54 SNPUMIBasePileupIterator Processed 53,000,000 records. Elapsed time: 00:03:28s. Time for last 1,000,000: 3s. Last read position: 4:6,717,527
INFO 2022-10-11 10:40:58 SNPUMIBasePileupIterator Processed 54,000,000 records. Elapsed time: 00:03:31s. Time for last 1,000,000: 3s. Last read position: 4:105,137,453
INFO 2022-10-11 10:41:02 SNPUMIBasePileupIterator Processed 55,000,000 records. Elapsed time: 00:03:35s. Time for last 1,000,000: 3s. Last read position: 5:14,651,757
INFO 2022-10-11 10:41:05 SNPUMIBasePileupIterator Processed 56,000,000 records. Elapsed time: 00:03:39s. Time for last 1,000,000: 3s. Last read position: 5:72,195,204
INFO 2022-10-11 10:41:09 SNPUMIBasePileupIterator Processed 57,000,000 records. Elapsed time: 00:03:42s. Time for last 1,000,000: 3s. Last read position: 5:134,606,541
INFO 2022-10-11 10:41:13 SNPUMIBasePileupIterator Processed 58,000,000 records. Elapsed time: 00:03:46s. Time for last 1,000,000: 3s. Last read position: 5:171,410,601
INFO 2022-10-11 10:41:16 SNPUMIBasePileupIterator Processed 59,000,000 records. Elapsed time: 00:03:50s. Time for last 1,000,000: 3s. Last read position: 6:26,138,783
INFO 2022-10-11 10:41:21 SNPUMIBasePileupIterator Processed 60,000,000 records. Elapsed time: 00:03:54s. Time for last 1,000,000: 4s. Last read position: 6:34,424,667
INFO 2022-10-11 10:41:25 SNPUMIBasePileupIterator Processed 61,000,000 records. Elapsed time: 00:03:58s. Time for last 1,000,000: 3s. Last read position: 6:73,517,681
INFO 2022-10-11 10:41:29 SNPUMIBasePileupIterator Processed 62,000,000 records. Elapsed time: 00:04:02s. Time for last 1,000,000: 3s. Last read position: 6:151,097,030
INFO 2022-10-11 10:41:32 SNPUMIBasePileupIterator Processed 63,000,000 records. Elapsed time: 00:04:06s. Time for last 1,000,000: 3s. Last read position: 7:22,510,435
INFO 2022-10-11 10:41:36 SNPUMIBasePileupIterator Processed 64,000,000 records. Elapsed time: 00:04:10s. Time for last 1,000,000: 3s. Last read position: 7:76,304,049
INFO 2022-10-11 10:41:40 SNPUMIBasePileupIterator Processed 65,000,000 records. Elapsed time: 00:04:13s. Time for last 1,000,000: 3s. Last read position: 7:116,559,264
INFO 2022-10-11 10:41:44 SNPUMIBasePileupIterator Processed 66,000,000 records. Elapsed time: 00:04:17s. Time for last 1,000,000: 3s. Last read position: 8:30,117,511
INFO 2022-10-11 10:41:48 SNPUMIBasePileupIterator Processed 67,000,000 records. Elapsed time: 00:04:21s. Time for last 1,000,000: 3s. Last read position: 8:99,878,190
INFO 2022-10-11 10:41:52 SNPUMIBasePileupIterator Processed 68,000,000 records. Elapsed time: 00:04:25s. Time for last 1,000,000: 4s. Last read position: 9:19,376,289
INFO 2022-10-11 10:41:55 SNPUMIBasePileupIterator Processed 69,000,000 records. Elapsed time: 00:04:29s. Time for last 1,000,000: 3s. Last read position: 9:87,731,195
INFO 2022-10-11 10:41:59 SNPUMIBasePileupIterator Processed 70,000,000 records. Elapsed time: 00:04:33s. Time for last 1,000,000: 3s. Last read position: 9:128,342,671
INFO 2022-10-11 10:42:03 SNPUMIBasePileupIterator Processed 71,000,000 records. Elapsed time: 00:04:37s. Time for last 1,000,000: 3s. Last read position: 9:136,940,631
INFO 2022-10-11 10:42:06 SNPUMIBasePileupIterator Processed 72,000,000 records. Elapsed time: 00:04:40s. Time for last 1,000,000: 3s. Last read position: MT:15,767
INFO 2022-10-11 10:42:10 SNPUMIBasePileupIterator Processed 73,000,000 records. Elapsed time: 00:04:44s. Time for last 1,000,000: 3s. Last read position: X:12,977,073
INFO 2022-10-11 10:42:14 SNPUMIBasePileupIterator Processed 74,000,000 records. Elapsed time: 00:04:48s. Time for last 1,000,000: 3s. Last read position: X:72,272,737
INFO 2022-10-11 10:42:18 SNPUMIBasePileupIterator Processed 75,000,000 records. Elapsed time: 00:04:51s. Time for last 1,000,000: 3s. Last read position: X:154,400,612
INFO 2022-10-11 10:42:21 SNPUMIBasePileupIterator Processed 76,000,000 records. Elapsed time: 00:04:55s. Time for last 1,000,000: 3s. Last read position: */*
INFO 2022-10-11 10:42:24 SNPUMIBasePileupIterator Processed 77,000,000 records. Elapsed time: 00:04:58s. Time for last 1,000,000: 2s. Last read position: */*
INFO 2022-10-11 10:42:27 SNPUMIBasePileupIterator Processed 78,000,000 records. Elapsed time: 00:05:01s. Time for last 1,000,000: 2s. Last read position: */*
INFO 2022-10-11 10:42:29 AssignCellsToSamples Processed [0] SNPs in BAM + VCF
INFO 2022-10-11 10:42:29 AssignCellsToSamples Finished!
[Tue Oct 11 10:42:29 AEDT 2022] org.broadinstitute.dropseqrna.barnyard.digitalallelecounts.sampleassignment.AssignCellsToSamples done. Elapsed time: 5.42 minutes.
Runtime.totalMemory()=1682964480
This is the complete output for the assignments file:
#INPUT_BAM=/directflow/SCCGGroupShare/projects/data/experimental_data/CLEAN/scFibroblast_EQTL/scFibroblast_EQTL_Sample1_V1/outs/possorted_genome_bam.bam INPUT_VCF=/directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/data/fibroblasts/Imputed/MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf DONOR_FILE=/directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/output/fibroblasts/scFibroblast_EQTL_Sample1/popscle/Individuals.txt CELL_BC_FILE=/directflow/SCCGGroupShare/projects/data/experimental_data/CLEAN/scFibroblast_EQTL/scFibroblast_EQTL_Sample1_V1/outs/filtered_gene_bc_matrices/Homo_sapiens_GRCh38p10/barcodes.tsv GQ_THRESHOLD=-1 FRACTION_SAMPLES_PASSING=0.5 READ_MQ=10 FIXED_ERROR_RATE=NA MAX_ERROR_RATE=NA LOCUS_FUNCTION=[CODING, UTR]
cell num_snps num_umis ratio pvalue FDR_pvalue bestLikelihood bestSample median_likelihood population_average_likelihood 110_THBP-292 326_THBP-516 195_THBP-453 28_THBP-114 11_THBP-54 181_THBP-178 53_THBP-25
and the output vcf file:
##fileformat=VCFv4.2
##FILTER=<ID=GENOTYPED,Description="Site was genotyped">
##FILTER=<ID=PASS,Description="All filters passed">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated Alternate Allele Dosage : [P(0/1)+2*P(1/1)]">
##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated Posterior Probabilities for Genotypes 0/0, 0/1 and 1/1">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy">
##INFO=<ID=ReverseComplementedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been reverse complemented in liftover since the mapping from the previous reference to the current one was on the negative strand.">
##INFO=<ID=SwappedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been swapped in liftover due to changes in the reference. It is possible that not all INFO annotations reflect this swap, and in the genotypes, only the GT, PL, and AD fields have been modified. You should check the TAGS_TO_REVERSE parameter that was used during the LiftOver to be sure.">
##bcftools_filterCommand=filter --include 'MAF>=0.05 & R2>=0.3' -O v --output /directflow/SCCGGroupShare/projects/DrewNeavin/Demultiplex_Benchmark/data/fibroblasts/Imputed/MAF0.05.dose.vcf /directflow/SCCGGroupShare/projects/DrewNeavin/Fibroblast_eQTLs/data/fibroblast/merged_imputed_AllChrs_includingChr10.vcf.gz; Date=Wed Nov 4 09:23:13 2020
##bcftools_filterVersion=1.9+htslib-1.9
##contig=<ID=1,length=248956422>
##contig=<ID=10,length=133797422>
##contig=<ID=11,length=135086622>
##contig=<ID=12,length=133275309>
##contig=<ID=13,length=114364328>
##contig=<ID=14,length=107043718>
##contig=<ID=15,length=101991189>
##contig=<ID=16,length=90338345>
##contig=<ID=17,length=83257441>
##contig=<ID=18,length=80373285>
##contig=<ID=19,length=58617616>
##contig=<ID=2,length=242193529>
##contig=<ID=20,length=64444167>
##contig=<ID=21,length=46709983>
##contig=<ID=22,length=50818468>
##contig=<ID=3,length=198295559>
##contig=<ID=4,length=190214555>
##contig=<ID=5,length=181538259>
##contig=<ID=6,length=170805979>
##contig=<ID=7,length=159345973>
##contig=<ID=8,length=145138636>
##contig=<ID=9,length=138394717>
##contig=<ID=M,length=16569>
##contig=<ID=X,length=156040895>
##contig=<ID=Y,length=57227415>
##filedate=2017.9.4
##reference=file:/directflow/SCCGGroupShare/projects/DrewNeavin/References/UCSCrefs/hg38/hg38.fa
##source=Minimac3
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 110_THBP-292 326_THBP-516 195_THBP-453 28_THBP-114 11_THBP-54 181_THBP-178 53_THBP-25
Steps to reproduce
The command used is:
AssignCellsToSamples --INPUT_BAM possorted_genome_bam.bam --VCF MAF0.05.dose_GeneFiltered.vcf.recode.hg38_nochr.vcf --OUTPUT assignments.tsv.gz --VCF_OUTPUT out_vcf.vcf --CELL_BARCODE_TAG CB --MOLECULAR_BARCODE_TAG UB --CELL_BC_FILE barcodes.tsv --SAMPLE_FILE Individuals.txt --FUNCTION_TAG XF --EDIT_DISTANCE 1 --READ_MQ 10 --GQ_THRESHOLD 30 --RETAIN_MONOMORPIC_SNPS false --FRACTION_SAMPLES_PASSING 0.5 --IGNORED_CHROMOSOMES X --IGNORED_CHROMOSOMES Y --IGNORED_CHROMOSOMES MT --ADD_MISSING_VALUES true --DNA_MODE false --SNP_LOG_RATE 1000 --GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf --STRAND_STRATEGY SENSE --LOCUS_FUNCTION_LIST CODING --LOCUS_FUNCTION_LIST UTR --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Here's an example of the bam file:
NS500239:222:HMTLVBGX2:2:21207:6232:7464 272 1 11879 1 98M * 0 0 CGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATCTGCTGTCTCTTAGCCCAGACTTCCCGTGTCCTTTCCACCGGGCCTTTGGGAGGTC EE/A/EEE/AEEE/EEEEEEEEEEE//AE<//A/EAEEAEEEEEEEAE/EEAEEEAEE/EEEEEEEEEE/E/EEEEAEE<EEE</EEE<AEEEAAAAA NH:i:4 HI:i:2 AS:i:92 nM:i:2 RE:A:I CR:Z:CGGCGCCGGTGTTT CY:Z:/A6///6/A/E6// UR:Z:CCCCCCACCG UY:Z:AA/AA///EE UB:Z:CCCCCCACCG BC:Z:ACGCAGCT QT:Z:AAAA6EEE RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:2
NS500239:222:HMTLVBGX2:3:22610:4588:7781 272 1 12107 0 1S94M3S * 0 0 CGGGCCATTGTGCATATTCTGGCCCCTGTTGTCTGCATGTAACCTAATACCACGACCAGGCATGGGGGAAAGATTGGAGGAAAGTTGAGTGAGAGGAC /EE////A<////E/EAEEAEE/A//AAA/E//AEEA/E6AAEE/AE/EAEEAE/EE6/AEEEE/<//EEAE//AEEEEEEEEEEEEEE/EEEAAA// NH:i:6 HI:i:2 AS:i:80 nM:i:6 RE:A:I CR:Z:CGTGGACTCGGTAG CY:Z:////A6E6E6/E// UR:Z:CCACCCACCG UY:Z:AA<AAE/EEE UB:Z:CCACCCACCG BC:Z:TATGGTTG QT:Z:/6//AEAA RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:3
NS500239:222:HMTLVBGX2:4:23611:25156:7680 256 1 14089 0 90M8S * 0 0 CTTGAGCAAACTCCAAGACATCTTCTACCCCACCACCAGCAATTGTGCCAAGGGCCATTAGGCTCTCAGCATGACTATTTTTAGAGCCCCGTGTAGGT /AAA/AEEEE<AEEEEE6AEE/EEEEAE</EA//EEEAAEEEE//EE//EEAEEAEEEEEEE/A<EE////E/EEEEAE/6EEEAE/AEE/EEE//AE NH:i:6 HI:i:2 AS:i:84 nM:i:2 RE:A:I CR:Z:CCACGGGATATCTC CY:Z://6//////A//6/ CB:Z:CCACGGGATATCTC-1 UR:Z:TTCAATGCCT UY:Z:AAAAAEEEEE UB:Z:TTCAATGCCT BC:Z:ACGCAGCT QT:Z:/AA6AE/E RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:4
NS500239:222:HMTLVBGX2:4:23603:13566:2314 256 1 14116 0 1S96M1S * 0 0 GCCCCAACACAAGCAATTGTGCCAAGGTCCATTAGGCTCTCAGCATGACTATTTTTAGATACCCCGTTTCTGTCACTGAAACCGTTTTTGTGGGAGAA AAAAA6EEEA/E/EEAE/EE/A/AE/////EAEA//EA<E/E//A//EEEE/E///EEE/AA///A6//<EE/EE<</EE/E//<E/E/EE///EE/A NH:i:5 HI:i:2 AS:i:84 nM:i:5 RE:A:I CR:Z:CGATAACACCATTC CY:Z:/A6///////6/A/ UR:Z:CCGTGAAGCC UY:Z:AA//A/A/AE UB:Z:CCGTGAAGCC BC:Z:GATGGTTG QT:Z:AAAA/AE6 RG:Z:scFibroblast_GWAS_Sample1_V1:MissingLibrary:1:HMTLVBGX2:4
I probably can't share genotype data easily due to ethics but please let me know if any files would be helpful in reproducing this.
Expected behavior
I would expect the output VCF and assignment files to contain the genotypes and the per-cell individual assignments.
Actual behavior
The VCF and assignment files output by AssignCellsToSamples contain only header lines; no variants or cell assignments are written.
Really appreciate your help and let me know if I can provide any additional information that could help identify the issue I am facing.
Since I can't see the entire file, I have to guess at what's wrong. In the BAM lines you posted, I don't see the tags we typically use for gene function annotation: gn (gene name), gs (gene strand), and gf (gene function). I would pre-process the BAM with TagReadWithGeneFunction first (with your GTF of choice; 10x's should be fine), and then run AssignCellsToSamples on that output.
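A quick way to confirm whether the gene annotation tags are present is to count them in a sample of alignment records. The following is a minimal sketch, not part of Drop-seq tools: the function name count_tags and the script name check_tags.py are hypothetical, and the tag names are taken from the --GENE_NAME_TAG, --GENE_STRAND_TAG and --GENE_FUNCTION_TAG values in the log above. It assumes SAM text on standard input.

```python
import sys
from collections import Counter

# Tag names taken from the command line in this issue
# (--GENE_NAME_TAG gn --GENE_STRAND_TAG gs --GENE_FUNCTION_TAG gf).
REQUIRED_TAGS = ("gn", "gs", "gf")


def count_tags(sam_lines, tags=REQUIRED_TAGS):
    """Return (number of alignment records, Counter of records carrying each tag)."""
    counts = Counter()
    n_records = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        n_records += 1
        # SAM is tab-delimited; fall back to whitespace for pasted text.
        fields = line.rstrip("\n").split("\t") if "\t" in line else line.split()
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        present = {f[:2] for f in fields[11:] if f[2:3] == ":"}
        for tag in tags:
            if tag in present:
                counts[tag] += 1
    return n_records, counts


if __name__ == "__main__":
    n, counts = count_tags(sys.stdin)
    for tag in REQUIRED_TAGS:
        print(f"{tag}: {counts[tag]} of {n} records")
```

Assuming samtools is installed, it could be run as: samtools view possorted_genome_bam.bam | head -n 100000 | python check_tags.py. If all three counts are zero, as they would be for the four records posted above, the BAM has not been annotated with gene function tags yet.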
Yes, that seems to have resolved it. Thanks for your assistance!