Incorrect Donor Potential and Exact Match Counts

Question

zabeen opened this issue 2 years ago · comments

Describe the bug
When running a scoring request, potentialMatchCount was lower than expected, suggesting that potential matches were not being counted correctly. As a knock-on effect, exactMatchCount was higher than expected, because it is calculated by subtracting the potential count from the total count.

To Reproduce
Submit this scoring request to the ScoreBatch function on the Matching Algorithm functions app:

{
    "donorsHla": [
        {
            "donorId": "donor-id",
            "a": {
                "position1": "*23:XX",
                "position2": "*33:XX"
            },
            "b": {
                "position1": "*14:02",
                "position2": "*49:XX"
            },
            "c": {
                "position1": "*07:XX",
                "position2": "*08:XX"
            },
            "dqb1": {
                "position1": "*05:01:01",
                "position2": "*06:02:01"
            },
            "drb1": {
                "position1": "*01:02",
                "position2": "*15:01"
            }
        }
    ],
    "patientHla": {
        "a": {
            "position1": "*02:01:01",
            "position2": "*33:05"
        },
        "b": {
            "position1": "*14:02:01",
            "position2": "*49:01:01"
        },
        "c": {
            "position1": "*07:01:01",
            "position2": "*08:02:01"
        },
        "dqb1": {
            "position1": "*05:01:01",
            "position2": "*06:02:01"
        },
        "drb1": {
            "position1": "*01:02:01",
            "position2": "*15:01:01"
        }
    },
    "scoringCriteria": {
        "lociToScore": [
            0,
            1,
            2,
            3,
            4,
            5
        ],
        "lociToExcludeFromAggregateScore": [
            3
        ]
    }
}

Expected behaviour
potentialMatchCount was expected to be 4 but was actually 2, and exactMatchCount was expected to be 5 but was actually 7.

Extract of full result below:

{
    "donorId": "donor-id",
    "scoringResult": {
        "totalMatchCount": 9,
        "potentialMatchCount": 2,
        "exactMatchCount": 7,
        "searchResultAtLocusA": {
            "scoreDetailsAtPositionOne": {
                "matchConfidence": "Mismatch"
            },
            "scoreDetailsAtPositionTwo": {
                "matchConfidence": "Potential"
            }
        },
        "searchResultAtLocusB": {
            "scoreDetailsAtPositionOne": {
                "matchConfidence": "Exact"
            },
            "scoreDetailsAtPositionTwo": {
                "matchConfidence": "Potential"
            }
        },
        "searchResultAtLocusC": {
            "scoreDetailsAtPositionOne": {
                "matchConfidence": "Potential"
            },
            "scoreDetailsAtPositionTwo": {
                "matchConfidence": "Potential"
            }
        },
        "searchResultAtLocusDqb1": {
            "scoreDetailsAtPositionOne": {
                "matchConfidence": "Definite"
            },
            "scoreDetailsAtPositionTwo": {
                "matchConfidence": "Definite"
            }
        },
        "searchResultAtLocusDrb1": {
            "scoreDetailsAtPositionOne": {
                "matchConfidence": "Exact"
            },
            "scoreDetailsAtPositionTwo": {
                "matchConfidence": "Exact"
            }
        }
    }
}
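The expected counts can be checked by hand from the per-position confidences in the extract above. The following is only a minimal Python sketch of that arithmetic, not Atlas code: the dictionary layout and variable names are made up for illustration, and it assumes that every non-Mismatch position contributes to totalMatchCount.

```python
from collections import Counter

# Per-position match confidences copied from the result extract above.
# The dict layout is illustrative only; it is not the Atlas result model.
confidences = {
    "A": ("Mismatch", "Potential"),
    "B": ("Exact", "Potential"),
    "C": ("Potential", "Potential"),
    "Dqb1": ("Definite", "Definite"),
    "Drb1": ("Exact", "Exact"),
}

# Tally how often each confidence occurs across all ten positions.
tally = Counter(c for pair in confidences.values() for c in pair)

# Assumption: every position that is not a Mismatch counts as a match.
total_match_count = sum(n for c, n in tally.items() if c != "Mismatch")
potential_match_count = tally["Potential"]
# As described in the bug report: exact = total - potential.
exact_match_count = total_match_count - potential_match_count

print(total_match_count, potential_match_count, exact_match_count)  # 9 4 5
```

The total of 9 agrees with the reported totalMatchCount, so only the split between potential and exact is off.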

Atlas Build & Runtime Info (please complete the following information):

Runtime Environment: Azure cloud
Atlas Version: 1.4.2

Dr Zabeen Patel · Answer 1 · Tue Dec 06 2022 20:54:03 GMT+0800 (China Standard Time)

Investigation

Individual match confidences assigned at each scored position seemed correct.
It was only the total sum of Potential confidences that was wrong.
This is the point where Potential matches are counted during result aggregation: [code]
This is the definition of IsPotentialMatch: [code]
- For some reason, only loci where both positions are Potential are being considered in the aggregation.
- git blame doesn't point to any specific requirement covering this decision (the code was inherited from the parent Nova.SearchAlgorithm codebase) so it was possibly a genuine misunderstanding of how potential matches should be counted.
- This is highlighted by the unit tests covering this logic: example1 example2
- It does not seem as though IsPotentialMatch is used for any purpose other than counting potential matches; fixing this part of the scoring service should be a fairly isolated change.
Note: this bug went unnoticed for so long because the property is not used by live search, so testers would not have spotted it; it was only visible in the raw results file.
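To illustrate the difference between the two counting rules, here is a rough Python sketch (the actual Atlas code is not reproduced here, and both function names are hypothetical). It assumes that, under the old behaviour, a locus with both positions Potential contributed both of its positions to the count, which is consistent with the reported value of 2.

```python
from typing import Dict, Tuple

# Per-position confidences from the scoring result in the bug report.
# Illustrative layout only; not the Atlas result model.
RESULTS: Dict[str, Tuple[str, str]] = {
    "A": ("Mismatch", "Potential"),
    "B": ("Exact", "Potential"),
    "C": ("Potential", "Potential"),
    "Dqb1": ("Definite", "Definite"),
    "Drb1": ("Exact", "Exact"),
}

def count_potential_per_locus(results: Dict[str, Tuple[str, str]]) -> int:
    """Old behaviour as described above: a locus is counted only when
    BOTH of its positions are Potential (hypothetical reconstruction)."""
    return sum(
        2 for p1, p2 in results.values()
        if p1 == "Potential" and p2 == "Potential"
    )

def count_potential_per_position(results: Dict[str, Tuple[str, str]]) -> int:
    """Expected behaviour: every Potential position is counted individually."""
    return sum(c == "Potential" for pair in results.values() for c in pair)

TOTAL_MATCH_COUNT = 9  # totalMatchCount from the reported result

old_potential = count_potential_per_locus(RESULTS)
new_potential = count_potential_per_position(RESULTS)

print(old_potential, TOTAL_MATCH_COUNT - old_potential)  # 2 7
print(new_potential, TOTAL_MATCH_COUNT - new_potential)  # 4 5
```

Under the per-locus rule only locus C qualifies, which reproduces the reported counts of 2 and 7; counting per position gives the expected 4 and 5.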

Dr Zabeen Patel · Answer 2 · Tue Dec 06 2022 22:27:46 GMT+0800 (China Standard Time)

Testing

Re-ran ScoreBatch request: Request Body
Comparison of the results before and after shows that the potential and exact match counts are now correct. ✅
- ScoreBatchResults.xlsx

Testing Passed ✅