garcer3 / text-classification-small-datasets

Building a text classifier with extremely small datasets

Text Classification With Extremely Small Datasets

Accompanying blog: https://towardsdatascience.com/text-classification-with-extremely-small-datasets-333d322caee2

Credits:

  1. Abhijnan Chakraborty, Bhargavi Paranjape, Sourya Kakarla, and Niloy Ganguly. "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media." In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, US, August 2016.
  2. Potthast et al. (2016) https://webis.de/downloads/publications/papers/stein_2016b.pdf
  3. Terrier stop word list: https://github.com/terrier-org/terrier-desktop/blob/master/share/stopword-list.txt
  4. Downworthy: https://github.com/snipe/downworthy
  5. Dale-Chall easy word list: http://www.readabilityformulas.com/articles/dale-chall-readability-word-list.php

Languages

Jupyter Notebook 99.2%, Python 0.8%