Inconsistency in Predictive Match Categories

zabeen opened this issue 2 years ago · comments

Describe the bug
Predictive match categories are assigned from the decimal probability values rather than the rounded percentages, and this appears to make the assignment of the Mismatch grade inconsistent.

Example from a CBU search run on UAT; the match probabilities are from the same donor, loci A and B:

"A": {
    "MatchProbabilities": {
        "ZeroMismatchProbability": {
            "Decimal": 0.0000000032175224840974139724,
            "Percentage": 0
        },
        "OneMismatchProbability": {
            "Decimal": 0.9997048020482089823501507505,
            "Percentage": 100
        },
        "TwoMismatchProbability": {
            "Decimal": 0.0002951947342684898495425796,
            "Percentage": 0
        }
    },
    "MatchCategory": "Potential",
    "PositionalMatchCategories": {
        "Position1": "Potential",
        "Position2": "Potential"
    }
},
"B": {
    "MatchProbabilities": {
        "ZeroMismatchProbability": {
            "Decimal": 0.0,
            "Percentage": 0
        },
        "OneMismatchProbability": {
            "Decimal": 0.9989001643539674846590959957,
            "Percentage": 100
        },
        "TwoMismatchProbability": {
            "Decimal": 0.0010998356460324716380113067,
            "Percentage": 0
        }
    },
    "MatchCategory": "Mismatch",
    "PositionalMatchCategories": {
        "Position1": "Mismatch",
        "Position2": "Potential"
    }
}

Expected behaviour
Loci A and B should both have been assigned one Mismatch and one Potential grade, since they have the same percentage probabilities and the percentages are what will be displayed to end users.
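
To illustrate the discrepancy, here is a minimal Python sketch. It assumes a simplified, hypothetical grading rule (a locus is graded Mismatch when its zero-mismatch probability is zero, and Potential otherwise) and half-up rounding to whole percentages. Neither assumption is confirmed as the actual Atlas logic, and the function names are invented for this example.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_percentage(probability: Decimal) -> int:
    """Round a decimal probability to a whole-number percentage (assumed half-up)."""
    return int((probability * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def category_from_decimal(zero_mismatch: Decimal) -> str:
    # Hypothetical rule applied to the raw decimal value.
    return "Mismatch" if zero_mismatch == 0 else "Potential"


def category_from_percentage(zero_mismatch: Decimal) -> str:
    # Same hypothetical rule, applied to the percentage shown to end users.
    return "Mismatch" if to_percentage(zero_mismatch) == 0 else "Potential"


# ZeroMismatchProbability decimals for loci A and B from the example above.
locus_a = Decimal("0.0000000032175224840974139724")
locus_b = Decimal("0.0")

for name, value in (("A", locus_a), ("B", locus_b)):
    print(
        name,
        to_percentage(value),
        category_from_decimal(value),
        category_from_percentage(value),
    )
```

Under these assumptions, grading on the decimal value separates the two loci even though both display 0%, while grading on the rounded percentage treats them identically.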

Atlas Build & Runtime Info:

Azure, UAT-ATLAS
Atlas Version: 1.4.2
GitHub commit ID of Atlas build: 3f635714

Dr Zabeen Patel · Answer 1 · Thu Nov 24 2022 02:45:00 GMT+0800 (China Standard Time)

Testing

Tested on UAT after deploying a copy of stable/1.4.2 with the fix commit cherry-picked on top: 04c4704.
Re-ran the patient HLA that led to the bug being discovered:

"searchHlaData": {
    "a": {
      "position1": "01:XX",
      "position2": "01:XX"
    },
    "b": {
      "position1": "08:XX",
      "position2": "08:XX"
    },
    "c": {
      "position1": "07:XX",
      "position2": "07:XX"
    },
    "drb1": {
      "position1": "01:XX",
      "position2": "03:XX"
    }
}

Match categories are now as expected, i.e. the same percentage values lead to the same categories ✅:

"A": {
    "MatchProbabilities": {
        "ZeroMismatchProbability": {
            "Decimal": 0.0000000032175224840974139724,
            "Percentage": 0
        },
        "OneMismatchProbability": {
            "Decimal": 0.9997048020482089823501507505,
            "Percentage": 100
        },
        "TwoMismatchProbability": {
            "Decimal": 0.0002951947342684898495425796,
            "Percentage": 0
        }
    },
    "MatchCategory": "Mismatch",
    "PositionalMatchCategories": {
        "Position1": "Mismatch",
        "Position2": "Exact"
    }
},
"B": {
    "MatchProbabilities": {
        "ZeroMismatchProbability": {
            "Decimal": 0.0,
            "Percentage": 0
        },
        "OneMismatchProbability": {
            "Decimal": 0.9989001643539674846590959957,
            "Percentage": 100
        },
        "TwoMismatchProbability": {
            "Decimal": 0.0010998356460324716380113067,
            "Percentage": 0
        }
    },
    "MatchCategory": "Mismatch",
    "PositionalMatchCategories": {
        "Position1": "Mismatch",
        "Position2": "Exact"
    }
}

Similar improvements were observed throughout the result set. ✅
Match probability values remain unchanged (whole result set was checked). ✅
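
A check of this kind could be scripted roughly as follows. This is only a sketch, not necessarily how the testing above was carried out: it groups locus results by their three percentage values and reports any group that received more than one category assignment. The helper names `inconsistent_groups` and `make_locus` are hypothetical, and the sample data keeps only the percentages and categories from the results quoted in this issue.

```python
from collections import defaultdict

PROBABILITY_KEYS = (
    "ZeroMismatchProbability",
    "OneMismatchProbability",
    "TwoMismatchProbability",
)


def inconsistent_groups(loci):
    """Return percentage triples that map to more than one category assignment."""
    seen = defaultdict(set)
    for result in loci.values():
        probabilities = result["MatchProbabilities"]
        key = tuple(probabilities[name]["Percentage"] for name in PROBABILITY_KEYS)
        # Sort positional categories so the position order does not matter.
        positions = tuple(sorted(result["PositionalMatchCategories"].values()))
        seen[key].add((result["MatchCategory"], positions))
    return {key: found for key, found in seen.items() if len(found) > 1}


def make_locus(percentages, category, position1, position2):
    """Build a trimmed-down locus result (percentages only) for the example."""
    return {
        "MatchProbabilities": {
            name: {"Percentage": value}
            for name, value in zip(PROBABILITY_KEYS, percentages)
        },
        "MatchCategory": category,
        "PositionalMatchCategories": {"Position1": position1, "Position2": position2},
    }


before_fix = {
    "A": make_locus((0, 100, 0), "Potential", "Potential", "Potential"),
    "B": make_locus((0, 100, 0), "Mismatch", "Mismatch", "Potential"),
}
after_fix = {
    "A": make_locus((0, 100, 0), "Mismatch", "Mismatch", "Exact"),
    "B": make_locus((0, 100, 0), "Mismatch", "Mismatch", "Exact"),
}

print(inconsistent_groups(before_fix))
print(inconsistent_groups(after_fix))
```

With the data from this issue, the pre-fix results produce one inconsistent group for the percentages (0, 100, 0), and the post-fix results produce none.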

Testing Passed ✅