strider187 / Twitter-Streaming-and-Sentiment-Analysis

In this repository, I stream data from Twitter using Tweepy and also perform Sentiment Analysis on about 200 tweets using VaderSentiment and Naive Bayes Classifier from nltk.




Tweepy is an easy-to-use Python library for accessing the Twitter API; see the Tweepy documentation for details. To install Tweepy, use the command:
pip install tweepy


JSON (JavaScript Object Notation) is a lightweight data-interchange format. Python's json module is part of the standard library, so it needs no installation; import it directly with:
import json
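Streamed tweets typically arrive as JSON text, so a minimal sketch of decoding one with the standard-library json module might look like the following. The payload shape (a "text" field and a nested "user" object with "screen_name") and the helper name extract_text_and_author are assumptions for illustration, not taken from the notebook.

```python
import json


def extract_text_and_author(raw_line):
    """Decode one JSON-encoded tweet and return (text, author).

    Assumes the payload has "text" and "user.screen_name" fields;
    adjust the keys to match the data you actually receive.
    """
    tweet = json.loads(raw_line)
    return tweet["text"], tweet["user"]["screen_name"]


# Hypothetical sample payload, not real Twitter data.
sample = '{"id": 1, "text": "Loving the new release!", "user": {"screen_name": "example_user"}}'
text, author = extract_text_and_author(sample)
```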


Pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language. The pandas user guide and documentation are available on the pandas website. To install pandas, use the command:
pip install pandas


NumPy is the fundamental package for scientific computing with Python. To learn more, see the NumPy documentation. To install NumPy, use the command:
pip install numpy

re - Regular expression operations

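The re module is part of Python's standard library, so it needs no installation; see the Python documentation to learn more about it. The command below installs regex, a separate third-party package with a compatible API, and is only needed if you want that package instead of re:
pip install regex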
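Regular expressions are commonly used to clean tweets before sentiment analysis. The sketch below is one possible cleaning step using only the standard-library re module; the function name clean_tweet and the specific rules (drop URLs and @mentions, keep hashtag words) are assumptions, not the notebook's actual preprocessing.

```python
import re


def clean_tweet(text):
    """Strip common Twitter noise from a tweet (illustrative rules only)."""
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"@\w+", "", text)          # drop @mentions
    text = text.replace("#", "")              # keep the hashtag word, drop the symbol
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace


cleaned = clean_tweet("@bob Check this out https://t.co/abc #Python rocks")
```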


VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains. Have a look at the project page here. The pip install command for VaderSentiment is:
pip install vaderSentiment
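As a rough sketch of how VADER can be applied to a tweet: the analyzer returns a dictionary of scores, and the compound score is usually thresholded into a label. The cutoffs of ±0.05 follow the convention suggested in the VADER project README; the helper names label_from_compound and vader_label are hypothetical and not taken from the notebook.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def vader_label(text):
    """Score one tweet with VADER and return its label."""
    # Imported here so the thresholding helper above works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)  # keys: neg, neu, pos, compound
    return label_from_compound(scores["compound"])
```

Calling vader_label on each tweet's text would give one label per tweet; in practice, create the analyzer once and reuse it rather than rebuilding it per call.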

nltk - Natural Language Toolkit

NLTK or Natural Language Toolkit is a leading platform for building Python programs to work with human language data. Head over to this page to take a look at the documentation of nltk. To install nltk, use the following command:
pip install nltk
In this project, I have used several nltk modules, especially nltk.corpus, from which a number of datasets were downloaded. The packages are:

  • Stopwords
  • Twitter Samples (twitter_samples)

To download these packages, start the Python interpreter by running the python command. Once in the interpreter, use the following commands:
import nltk
nltk.download('PACKAGE NAME')
For example:
nltk.download('stopwords')
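Once the two corpora are downloaded, a Naive Bayes classifier can be trained on them. The following is only a sketch under stated assumptions: the helper names tweet_features and train_classifier, the bag-of-words features, and the limit of 100 tweets per class are illustrative choices, not the notebook's actual code.

```python
def tweet_features(tokens, stop_words):
    """Build the {word: True} feature dict that NLTK classifiers accept."""
    return {tok.lower(): True for tok in tokens if tok.lower() not in stop_words}


def train_classifier(limit=100):
    """Train a Naive Bayes classifier on NLTK's twitter_samples corpus.

    Requires nltk plus the 'stopwords' and 'twitter_samples' downloads.
    """
    import nltk
    from nltk.corpus import stopwords, twitter_samples

    stop_words = set(stopwords.words("english"))
    positive = twitter_samples.tokenized("positive_tweets.json")[:limit]
    negative = twitter_samples.tokenized("negative_tweets.json")[:limit]

    # Each training item is a (feature dict, label) pair.
    labeled = [(tweet_features(t, stop_words), "positive") for t in positive]
    labeled += [(tweet_features(t, stop_words), "negative") for t in negative]
    return nltk.NaiveBayesClassifier.train(labeled)
```

A new tweet could then be labeled with classifier.classify(tweet_features(tokens, stop_words)), where tokens is the tokenized tweet.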



