Frequent reporting of a null allele for 10X data

vincentwalter opened this issue 8 months ago

Thanks for creating arcasHLA!

Since 10X mentions that arcasHLA does a good job at genotyping, I tried using it with scRNA-seq data.

I extracted reads using the following command:

arcasHLA extract \
    --single --unmapped \
    sample_alignments.bam

And then genotyped:

arcasHLA genotype \
    -g A,B,C,DQA1,DQB1,DRB1 \
    -p caucasian \
    --single \
    sample_alignments.extracted.fq.gz

This results in the following genotypes being assigned:

locus	subject_1	subject_2	subject_3	subject_4	subject_5
A	A*01:01:140,A*11:303	A*24:608N,A*24:608N	A*01:01:143,A*03:01:103	A*29:01:01,A*29:01:01	A*03:01:119,A*24:608N
B	B*07:386N,B*07:386N	B*07:386N,B*40:01:02	B*07:386N,B*57:01:01	B*07:386N,B*07:386N	B*07:386N,B*07:386N
C	C*07:01:106,C*07:02:104	C*03:392N,C*07:02:101	C*07:02:128,C*07:02:128	C*15:05:02,C*15:05:02	C*07:02:101,C*07:02:101
DQA1	DQA1*03:01:11,DQA1*03:03:01	DQA1*01:02:01,DQA1*01:02:01	DQA1*01:03:01,DQA1*01:05:01	DQA1*01:01:01,DQA1*01:05:01	DQA1*05:05:01,DQA1*01:02:01
DQB1	DQB1*03:02:01,DQB1*03:01:01	DQB1*06:304N,DQB1*06:02:01	DQB1*06:352,DQB1*05:01:01	DQB1*05:01:01,DQB1*05:01:01	DQB1*03:01:46,DQB1*06:02:01
DRB1	DRB1*04:01:01,DRB1*04:01:01	DRB1*08:01:01,DRB1*08:01:01	DRB1*10:01:01,DRB1*13:01:01	DRB1*10:01:01,DRB1*01:02:01	DRB1*11:04:01,DRB1*15:01:01

Since the subjects were selected because they share a CD8+ T cell response to a certain antigen, I'm not surprised that they share HLA-B alleles. However, I don't think it's plausible that they all carry B*07:386N, since it's a null allele. For subjects 1, 4, and 5, the result even implies that both alleles are B*07:386N.

Do you have an explanation for this?
Is there a way to exclude null alleles from the reference?
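
For reference, this is roughly how I am flagging the suspicious calls at the moment. It is only a minimal sketch, assuming the genotypes are stored as a tab-separated table like the one above and that a null allele can be recognised by the trailing "N" in its name. The function names are my own and not part of arcasHLA, and the script only post-processes the output; it does not change the reference.

```python
import csv
import io


def is_null_allele(allele: str) -> bool:
    """HLA nomenclature marks non-expressed (null) alleles with a trailing 'N'."""
    return allele.strip().endswith("N")


def flag_null_calls(tsv_text: str):
    """Return (locus, subject, alleles) for every call containing a null allele."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    subjects = header[1:]  # first column is the locus name
    flagged = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        locus, calls = row[0], row[1:]
        for subject, call in zip(subjects, calls):
            alleles = [a.strip() for a in call.split(",")]
            if any(is_null_allele(a) for a in alleles):
                flagged.append((locus, subject, alleles))
    return flagged


# Small excerpt of the table above, used only as an illustration.
example = (
    "locus\tsubject_1\tsubject_2\n"
    "A\tA*01:01:140,A*11:303\tA*24:608N,A*24:608N\n"
    "B\tB*07:386N,B*07:386N\tB*07:386N,B*40:01:02\n"
)

for locus, subject, alleles in flag_null_calls(example):
    both = all(is_null_allele(a) for a in alleles)
    print(locus, subject, ",".join(alleles), "both null" if both else "one null")
```

Calls where both alleles are flagged are the ones I find least plausible, since they would imply no expression at that locus.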
