bjprogrammer / Comparing-corpora-using-corpora-statistics-My_first_taste_of_python

Compares the writing styles of two authors with different personalities and professions, using NLTK.

Comparing-corpora-with-corpora-statistics

Corpora used, from Gutenberg: Speeches & Letters of Abraham Lincoln, 1832-1865 and Mark Twain’s Letters and Speeches, 1901-1906

Step 1: Preprocessing (tokenization, normalization, stemming, lemmatization)
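
A minimal sketch of what the preprocessing step could look like, using only the standard library so it runs without downloading any NLTK data. This is not the notebook's code: the stopword set, the `crude_stem` helper, and the sample sentence are all hypothetical. With NLTK, the usual counterparts would be `nltk.word_tokenize`, `nltk.stem.PorterStemmer`, and `nltk.stem.WordNetLemmatizer`.

```python
import re

# Hypothetical stopword set for illustration; NLTK ships a fuller list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was"}

def tokenize(text):
    """Lowercase the text and return word tokens, dropping punctuation and digits."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def crude_stem(token):
    """Toy suffix stripper standing in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, remove stopwords, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOPWORDS]

# Made-up sample sentence, not taken from either corpus.
sample = "The speeches of Lincoln were printed and reprinted."
tokens = preprocess(sample)
print(tokens)
```

The toy stemmer only strips a handful of suffixes, so it will miss or mangle many words that a proper stemmer or lemmatizer handles correctly; it is here only to show where that step sits in the pipeline.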

Step 2: Frequency distribution (selected the top 50 words, bigrams, and trigrams)
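
Counting the most frequent words, bigrams, and trigrams can be sketched as follows. This version uses `collections.Counter` as a stand-in for `nltk.FreqDist`, and a small `ngrams` helper in place of `nltk.bigrams` / `nltk.trigrams`. The helper names and the sample tokens are assumptions for illustration, not the notebook's own code.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the consecutive n-token tuples of a token list."""
    return list(zip(*(tokens[i:] for i in range(n))))

def top_ngrams(tokens, n, k=50):
    """Return up to k most frequent n-grams with their counts."""
    return Counter(ngrams(tokens, n)).most_common(k)

# Made-up token list; in practice this would be the preprocessed corpus.
tokens = "of the people by the people for the people".split()

top_words = top_ngrams(tokens, 1, k=2)      # unigrams are 1-tuples
top_bigrams = top_ngrams(tokens, 2, k=1)
top_trigrams = top_ngrams(tokens, 3, k=50)  # fewer than 50 exist here
print(top_bigrams)
```

Asking for the top 50 on a short input simply returns every distinct n-gram, since `most_common(k)` never pads its result.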

Step 3: Comparison of the authors' writing styles based on the results from step 2
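
The README does not say how the comparison was carried out. One possible way to put numbers on it, assumed here purely for illustration, is to compare relative word frequencies (so corpora of different sizes are comparable) and to measure how much the two top-k vocabularies overlap. The function names and token lists below are hypothetical.

```python
from collections import Counter

def relative_freq(tokens):
    """Map each token to its share of all tokens in the corpus."""
    total = len(tokens)
    return {word: count / total for word, count in Counter(tokens).items()}

def top_overlap(tokens_a, tokens_b, k=50):
    """Jaccard overlap between the top-k word sets of two corpora."""
    top_a = {w for w, _ in Counter(tokens_a).most_common(k)}
    top_b = {w for w, _ in Counter(tokens_b).most_common(k)}
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0

# Placeholder token lists, not taken from the actual corpora.
author_a = ["union", "people", "government", "people"]
author_b = ["people", "joke", "river", "joke"]

overlap = top_overlap(author_a, author_b, k=50)
share_a = relative_freq(author_a)["people"]
print(overlap, share_a)
```

A low overlap suggests the two authors favour different vocabulary, while words that both use often can be compared directly through their relative frequencies.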


Languages

Language: Jupyter Notebook 100.0%