astanway / sportspurge

Enough is enough. Bayes to the rescue.

Home Page: https://sportspurge.com


Filter sports for days. Labeled training data is continuously scraped from news and Twitter sources. The classifier is optimized for fewer false positives (non-sports classified as sports) at the expense of missing some sports content, because deleting non-sports content is more detrimental than failing to delete sports content. Current stats:

> python classify.py 
Training set size: 192044

MultinomialNB
             precision    recall  f1-score   support

         -1       0.96      0.98      0.97     33749
          1       0.97      0.96      0.97     30265

avg / total       0.97      0.97      0.97     64014

[[32944   805]
 [ 1200 29065]]

LogisticRegression
             precision    recall  f1-score   support

         -1       0.97      0.97      0.97     33749
          1       0.97      0.97      0.97     30265

avg / total       0.97      0.97      0.97     64014

[[32717  1032]
 [  932 29333]]

Ensemble at .9 threshold:
             precision    recall  f1-score   support

         -1       0.92      0.99      0.96     33749
          1       0.99      0.90      0.95     30265

avg / total       0.95      0.95      0.95     64014

[[33510   239]
 [ 2882 27383]]
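The trade-off in these numbers can be checked by hand from the confusion matrices. The sketch below recomputes precision and recall for the sports class (label 1) from the matrices printed above, taking rows as true labels and columns as predictions in the order [-1, 1], since the row sums match the support column. It also shows one possible thresholding rule. The output does not say how the ensemble combines the two models, so `ensemble_label` and its probability averaging are assumptions for illustration, not the code in `classify.py`.

```python
# Confusion matrices copied from the output above.
# Rows are true labels, columns are predicted labels, in the order [-1, 1].
REPORTED = {
    "MultinomialNB": [[32944, 805], [1200, 29065]],
    "LogisticRegression": [[32717, 1032], [932, 29333]],
    "Ensemble at .9 threshold": [[33510, 239], [2882, 27383]],
}


def precision_recall(matrix):
    """Return (precision, recall) for the sports class (label 1)."""
    false_pos = matrix[0][1]  # non-sports classified as sports
    false_neg = matrix[1][0]  # sports classified as non-sports
    true_pos = matrix[1][1]
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def ensemble_label(p_nb, p_lr, threshold=0.9):
    """Hypothetical rule: average the two models' sports probabilities
    and label an item as sports (1) only when the mean reaches the
    threshold; otherwise label it non-sports (-1)."""
    mean_probability = (p_nb + p_lr) / 2
    return 1 if mean_probability >= threshold else -1


if __name__ == "__main__":
    for name, matrix in REPORTED.items():
        precision, recall = precision_recall(matrix)
        print(f"{name}: false positives={matrix[0][1]}, "
              f"precision={precision:.4f}, recall={recall:.4f}")
```

On the reported matrices, the ensemble cuts false positives from 805 (MultinomialNB) and 1032 (LogisticRegression) to 239, while the number of sports items it misses rises to 2882. That is the stated design goal: fewer wrongly deleted non-sports items, at the cost of lower recall on sports.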



Languages

Python 91.2%, CSS 4.7%, Shell 4.2%