llzes / anlp21

Data and code to support "Applied Natural Language Processing" (INFO 256, Fall 2021, UC Berkeley)

anlp21

Course materials for "Applied Natural Language Processing" (INFO 256, Fall 2021, UC Berkeley).

Syllabus: http://people.ischool.berkeley.edu/~dbamman/info256.html

| Notebook | Description |
| --- | --- |
| 1.words/EvaluateTokenizationForSentiment | The impact of tokenization choices on sentiment classification. |
| 1.words/ExploreTokenization | Different methods for tokenizing texts (whitespace, NLTK, spaCy, regex). |
| 1.words/TokenizePrintedBooks | Design a better tokenizer for printed books. |
| 1.words/Text_Complexity | Implement type-token ratio and Flesch-Kincaid Grade Level scores for text. |
| 2.compare/ChiSquare, Mann-Whitney Tests | Explore two tests for finding distinctive terms. |
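
To give a sense of what the Text_Complexity notebook covers, here is a minimal sketch of type-token ratio and Flesch-Kincaid Grade Level. It is not taken from the course notebooks: the function names, the regex tokenizer, and the vowel-group syllable heuristic are all assumptions made for illustration. Only the Flesch-Kincaid formula itself (0.39 × words per sentence + 11.8 × syllables per word − 15.59) is the standard published one.

```python
import re


def tokenize(text):
    """Lowercased word tokens from a simple regex (illustrative, not the course tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(tokens):
    """Distinct tokens (types) divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def count_syllables(word):
    """Rough heuristic (an assumption): count vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level with naive sentence splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = tokenize(text)
    if not sentences or not tokens:
        return 0.0
    syllables = sum(count_syllables(t) for t in tokens)
    words_per_sentence = len(tokens) / len(sentences)
    syllables_per_word = syllables / len(tokens)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was a remarkably comfortable mat."
    print(type_token_ratio(tokenize(sample)))
    print(flesch_kincaid_grade(sample))
```

Both scores depend heavily on the tokenizer and syllable counter, so results from this sketch will differ from those of a dictionary-based syllable counter or a library tokenizer such as NLTK or spaCy.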

Languages

Jupyter Notebook 96.5%, Python 3.5%