koaning / doubtlab

Doubt your data, find bad labels.

Home Page: https://koaning.github.io/doubtlab/

Doubt Reason based on Margin

amitness opened this issue

In the active learning literature, there is a heuristic called margin: the difference between the highest and second-highest predicted class probabilities. If the margin is very low, the model is torn between two classes, which could indicate doubt.

Example:

cat: 0.9, dog: 0.05, elephant: 0.05 -> margin = 0.9 - 0.05 = 0.85 (high margin)

cat: 0.4, dog: 0.4, elephant: 0.2 -> margin = 0.4 - 0.4 = 0 (low margin)
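
For illustration, here is a minimal NumPy sketch of the margin computation on the two examples above. The function names `margin_scores` and `low_margin_indices` and the `threshold=0.2` default are hypothetical choices for this sketch, not part of doubtlab's API.

```python
import numpy as np


def margin_scores(proba):
    """Return top-1 minus top-2 predicted probability for each row."""
    proba = np.asarray(proba, dtype=float)
    # Sort each row ascending; the last two columns hold the two largest values.
    top_two = np.sort(proba, axis=1)[:, -2:]
    return top_two[:, 1] - top_two[:, 0]


def low_margin_indices(proba, threshold=0.2):
    """Indices of rows whose margin is below the threshold (assumed value)."""
    return np.flatnonzero(margin_scores(proba) < threshold)


# Columns: cat, dog, elephant (the two examples from this issue).
proba = np.array([
    [0.9, 0.05, 0.05],
    [0.4, 0.4, 0.2],
])

print(margin_scores(proba))       # roughly 0.85 and 0.0
print(low_margin_indices(proba))  # only the second row is flagged
```

In practice `proba` would come from something like a scikit-learn classifier's `predict_proba`, and the threshold would need tuning per dataset.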

Sorry, just realized there is already an issue on this 😅