Why set FUZZY_THRESHOLD = 0.95 ?
jerrychan807 opened this issue · comments
FUZZY_THRESHOLD = 0.95
vulnerable = all(ratios.values()) and min(ratios.values()) < FUZZY_THRESHOLD < max(ratios.values()) and abs(ratios[True] - ratios[False]) > FUZZY_THRESHOLD / 10
I am confused about this part of the code.
Can you explain the reason for this judgment? Thank you very much~
A) A couple of lines up you can see this check:
if any(original[_] == contents[True][_] != contents[False][_] for _ in (HTTPCODE, TITLE)):
Hence, DSSS expects either 100% identical content or it falls back to that same FUZZY_THRESHOLD.
B) quick_ratio() sometimes behaves erratically. It is far from precise, so you can expect it to have a really wide margin of error. For that same reason I've put the threshold 5% below a perfect match.
C) One more thing. If the fuzzy (quick) comparison says 90%, you can be pretty sure the real value is somewhere around that, but not exactly that value. Hence, from my own experience, in TRUE cases you'll either have a perfect 100% score or a fuzzy score below 95%.
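
To make points A) to C) concrete, here is a minimal sketch of the fuzzy check, not the actual DSSS code. The condition is copied from the line quoted in the question; the helper names quick_similarity and looks_vulnerable are hypothetical, and it assumes ratios maps True/False to the similarity between the original response and the response to the TRUE/FALSE payload.

```python
import difflib

FUZZY_THRESHOLD = 0.95  # value quoted in the question


def quick_similarity(a, b):
    """Fast similarity estimate (hypothetical helper).

    quick_ratio() only compares character counts and ignores order,
    so it is an upper bound on the slower ratio().
    """
    return difflib.SequenceMatcher(None, a, b).quick_ratio()


def looks_vulnerable(ratios):
    """Apply the condition from the question (hypothetical helper).

    ratios: {True: similarity of the TRUE-payload response to the original,
             False: similarity of the FALSE-payload response to the original}
    """
    return (
        all(ratios.values())  # neither response is completely different
        # one score sits above the threshold and the other below it
        and min(ratios.values()) < FUZZY_THRESHOLD < max(ratios.values())
        # and the gap between them is too wide to be quick_ratio() noise
        and abs(ratios[True] - ratios[False]) > FUZZY_THRESHOLD / 10
    )


# Same characters in a different order: quick_ratio() still reports a
# perfect match, while the exact ratio() does not.
matcher = difflib.SequenceMatcher(None, "abcd", "dcba")
print(matcher.quick_ratio(), matcher.ratio())

print(looks_vulnerable({True: 1.0, False: 0.80}))   # clear split around 0.95
print(looks_vulnerable({True: 0.99, False: 0.97}))  # both above the threshold
print(looks_vulnerable({True: 0.96, False: 0.90}))  # gap smaller than 0.095
```

Only the first call passes all three parts of the condition. The third shows why the extra abs(...) term is there: two scores that straddle 0.95 by a few points are treated as noise rather than as a TRUE/FALSE difference.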