mozilla / fathom

A framework for extracting meaning from web pages

Home Page: http://mozilla.github.io/fathom/

Provide guidelines on getting accuracy feedback in the wild

biancadanforth opened this issue

Part of maintaining a Fathom ruleset in production is monitoring its performance over time. As mentioned in #143, the web is constantly changing, so a ruleset's accuracy can decline after it ships.

  • How do we monitor a ruleset's health in the wild?
  • How does this feedback get integrated into ruleset and training/test data updates?
  • How do we do this in a privacy-preserving way?
  • Could the same UI for crowdsourcing page labeling (#141) be applicable here?