Provide guidelines on getting accuracy feedback in the wild
biancadanforth opened this issue
Bianca Danforth commented
Part of maintaining a Fathom ruleset in production is monitoring its performance over time. As mentioned in #143, the web is constantly changing, so a ruleset's accuracy can decline over time.
- How do we monitor a ruleset's health in the wild?
- How does this feedback get integrated into ruleset and training/test data updates?
- How do we do this in a privacy-preserving way? (One possible shape is sketched after this list.)
- Could the same UI for crowdsourcing page labeling (#141) be applicable here?
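To make the privacy question concrete, here is a minimal sketch of what aggregate accuracy feedback could look like: only per-ruleset-version tallies of confirmed vs. corrected detections leave the client, never URLs or page content. Every name in it is hypothetical (the `AccuracyTally` shape, `recordOutcome`, `flushTally`, and the `submitTelemetry` stand-in for whatever telemetry channel the embedding product already has); it is one possible shape, not a committed design.

```typescript
// All names below are hypothetical; this is a sketch of the idea,
// not an existing Fathom or Firefox API.

// Aggregate counters for one ruleset version. No URLs, page content, or
// per-page events are recorded: only tallies of outcomes.
interface AccuracyTally {
  rulesetVersion: string;
  confirmed: number;  // the user accepted what the ruleset found
  corrected: number;  // the user overrode it with a different element
  noTarget: number;   // the ruleset found nothing where the user expected something
}

const tally: AccuracyTally = {
  rulesetVersion: "2020.09.0",  // hypothetical version identifier
  confirmed: 0,
  corrected: 0,
  noTarget: 0,
};

// Called from wherever the product already observes the user's reaction to
// the ruleset's output (e.g. accepting or fixing a detected element).
function recordOutcome(outcome: "confirmed" | "corrected" | "noTarget"): void {
  tally[outcome] += 1;
}

// Stand-in for whatever telemetry channel the embedding product provides.
declare function submitTelemetry(pingType: string, payload: object): void;

// Periodically submit only the aggregate counts, then reset. In-the-wild
// accuracy can then be estimated server-side as
// confirmed / (confirmed + corrected + noTarget) per ruleset version.
function flushTally(): void {
  submitTelemetry("fathom-ruleset-accuracy", { ...tally });
  tally.confirmed = 0;
  tally.corrected = 0;
  tally.noTarget = 0;
}
```

Because only counts leave the client, a drop in the confirmed share over time could flag a ruleset for retraining without revealing which pages users visited, and the crowdsourced labeling UI from #141 could then be used to gather fresh training/test pages for the update.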