Course materials for "Applied Natural Language Processing" (INFO 256, Fall 2021, UC Berkeley).

Syllabus: http://people.ischool.berkeley.edu/~dbamman/info256.html

Notebook | Description |
---|---|
1.words/EvaluateTokenizationForSentiment | The impact of tokenization choices on sentiment classification. |
1.words/ExploreTokenization | Different methods for tokenizing texts (whitespace, NLTK, spaCy, regex). |
1.words/TokenizePrintedBooks | Design a better tokenizer for printed books. |
1.words/Text_Complexity | Implement type-token ratio and Flesch-Kincaid Grade Level scores for text. |
2.compare/ChiSquare, Mann-Whitney Tests | Explore two tests for finding distinctive terms. |