sjacks26 / FlockWatch

Build better data collections by finding new collection terms

Add option to ignore hashtags and/or usernames

sjacks26 opened this issue

We might want to ignore hashtags and usernames only for bigrams, though it could also make sense to ignore them for co-occurring terms.

Option to ignore handles in all results added in 855d4c1

The logic of ignoring hashtags for bigrams:

  • Hashtags are themselves ngrams. They often contain more than one word.
  • Is there a realistic case where one hashtag doesn't have discriminatory power but two hashtags do?
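
To make the idea concrete, here is a minimal sketch of how ignoring hashtags and/or handles for bigrams could work. It is not taken from the FlockWatch code: the names `is_ignored` and `count_bigrams` and the flags `ignore_hashtags` and `ignore_handles` are hypothetical, and tokenization is plain whitespace splitting, so a token with trailing punctuation (such as `#tag,`) is still treated as a hashtag but is counted separately from `#tag`.

```python
from collections import Counter


def is_ignored(token, ignore_hashtags=False, ignore_handles=False):
    """Return True if the token is a hashtag or handle that should be skipped."""
    if ignore_hashtags and token.startswith("#") and len(token) > 1:
        return True
    if ignore_handles and token.startswith("@") and len(token) > 1:
        return True
    return False


def count_bigrams(tweets, ignore_hashtags=False, ignore_handles=False):
    """Count adjacent word pairs, dropping any pair that contains an ignored token."""
    counts = Counter()
    for text in tweets:
        tokens = text.lower().split()
        # Pair tokens first, then drop pairs, so that removing a hashtag
        # does not make its two neighbours look adjacent.
        for pair in zip(tokens, tokens[1:]):
            if any(is_ignored(t, ignore_hashtags, ignore_handles) for t in pair):
                continue
            counts[pair] += 1
    return counts


tweets = [
    "vote early #election2020 today",
    "@user vote early please",
]
filtered = count_bigrams(tweets, ignore_hashtags=True, ignore_handles=True)
unfiltered = count_bigrams(tweets)
```

In this sketch the bigrams are built before filtering. Stripping hashtags from the token list first would be simpler, but it would create pairs such as ("early", "today") that never appeared next to each other in the tweet.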