Sumstat standardizer fails to handle reverse strand
hsun3163 opened this issue · comments
hsun3163 commented
Also, the flip should be "TRUE"; it is in the allele_flip_qc() function.
Actually, the flipping is not an issue: the SNP ID is misleading, but the actual ref and alt alleles are correct.
This is because the order of A0 and A1 in the index is defined by alphabetical order rather than by ref/alt role, as shown below. This was kept to get some other functions running. Once those dependencies are resolved, this is to be changed.
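def namebyordA0_A1(df, cols=['CHR', 'POS', 'A0', 'A1']):
    # Build SNP IDs whose allele pair is ordered by string comparison, not by ref/alt role.
    df.columns = cols
    # Any extra columns come first, then CHR and POS, all joined with ':'.
    prefix = df[[x for x in cols if x not in ['CHR', 'POS', 'A0', 'A1']] + ['CHR', 'POS']].astype(str).agg(':'.join, axis=1)
    names = []
    for p, A0, A1 in zip(prefix, df.A0, df.A1):
        # The lexicographically larger allele is always written first.
        tmp = A0 + ':' + A1 if A0 > A1 else A1 + ':' + A0
        names.append('_'.join([p, tmp]))
    return names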
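
To make the effect concrete, here is a minimal sketch of the same naming rule applied to a single variant. The helper name allele_order_free_id and the example variant are hypothetical and only for illustration; it assumes the default CHR, POS, A0, A1 columns with no extra prefix columns. It shows why the ID looks the same whether or not the alleles are swapped, so the ID alone cannot tell you which allele is ref and which is alt.

```python
def allele_order_free_id(chrom, pos, a0, a1):
    """Hypothetical helper mirroring the naming rule above for one variant."""
    # The lexicographically larger allele goes first, regardless of ref/alt role.
    first, second = (a0, a1) if a0 > a1 else (a1, a0)
    return f"{chrom}:{pos}_{first}:{second}"


# Made-up variant, named once with A0=A, A1=G and once with the alleles swapped.
same_variant = allele_order_free_id(1, 1000, "A", "G")
swapped = allele_order_free_id(1, 1000, "G", "A")
print(same_variant, swapped)  # both print 1:1000_G:A
```

So a flipped record keeps the same ID, and the A0 and A1 columns themselves are the only place to check the actual orientation.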
hsun3163 commented
There doesn't seem to be any residual impact; closing for now.