david26694 / datathon-racism

datathon-racism

Key modelling ideas:

  • Train on all sentences
  • Cross-validation:
    • Split according to sentences
    • Hold-out set, nothing fancy
  • "Safe" threshold selection
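
The README does not spell out how the split or the "safe" threshold were implemented, so the following is only a minimal sketch under stated assumptions. It assumes that "split according to sentences" means every row sharing the same sentence text lands on the same side of a single hold-out split, and that a "safe" threshold is one taken from the middle of the widest range of thresholds whose hold-out F1 is close to the best. The function names, the F1 metric, the tolerance and the threshold grid are all hypothetical choices, not taken from the repository.

```python
import random
from collections import defaultdict


def holdout_split_by_sentence(sentences, holdout_fraction=0.2, seed=0):
    """Return (train_idx, holdout_idx); identical sentences never straddle the split.

    Assumption: rows are grouped by exact sentence text.
    """
    rows_by_sentence = defaultdict(list)
    for i, sentence in enumerate(sentences):
        rows_by_sentence[sentence].append(i)

    # Shuffle the unique sentences, not the rows, then cut once.
    unique = sorted(rows_by_sentence)
    random.Random(seed).shuffle(unique)
    n_holdout = max(1, round(len(unique) * holdout_fraction))

    holdout_idx = sorted(i for s in unique[:n_holdout] for i in rows_by_sentence[s])
    train_idx = sorted(i for s in unique[n_holdout:] for i in rows_by_sentence[s])
    return train_idx, holdout_idx


def f1_at(threshold, scores, labels):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def safe_threshold(scores, labels, tolerance=0.01, grid_size=99):
    """Pick the middle of the longest run of thresholds with near-best F1.

    Assumption: "safe" means robust to small shifts in the score
    distribution, rather than the single F1-maximising point.
    """
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]  # 0.01 .. 0.99
    f1s = [f1_at(t, scores, labels) for t in grid]
    best = max(f1s)

    best_run, run = [], []
    for t, f in zip(grid, f1s):
        if f >= best - tolerance:
            run.append(t)
            if len(run) > len(best_run):
                best_run = list(run)
        else:
            run = []
    return best_run[len(best_run) // 2]
```

In this sketch the threshold would be chosen on the hold-out rows only, using scores from a model fitted on the training rows. Taking the centre of a plateau rather than the exact maximum trades a little hold-out F1 for less sensitivity to noise in a small hold-out set.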

Learnings:

  • Hugging Face is great for open source
  • NLP models generalise poorly

Languages

  • Python 76.1%
  • Jupyter Notebook 23.9%