leedstyh / reddit-sentiment-analysis

This program scans Reddit, finds the most-mentioned tickers, and uses VADER's SentimentIntensityAnalyzer to calculate a compound sentiment score for each ticker.

Program Parameters

subs = []           subreddits to search
post_flairs = {}    post flairs to search; posts with no flair (None) are always considered
goodAuth = {}       authors whose comments are allowed more than once
uniqueCmt = True    allow only one comment per author per symbol
ignoreAuthP = {}    authors to ignore for posts
ignoreAuthC = {}    authors to ignore for comments
upvoteRatio = float upvote ratio required for a post to be considered, 0.70 = 70%
ups = int           # of upvotes; a post is considered if its upvotes exceed this #
limit = int         the 'replace more' limit for comments
upvotes = int       # of upvotes; a comment is considered if its upvotes exceed this #
picks = int         # of picks to print, shown as "Top ## picks are:"
picks_ayz = int     # of picks to include in the sentiment analysis
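
As a rough illustration of how the comment-related parameters could interact, here is a minimal sketch. The `Comment` class, the `accept_comment` helper, and all parameter values are hypothetical and invented for this example; they are not taken from the program's source, which may apply the rules differently.

```python
from dataclasses import dataclass

# Parameter names follow the list above; the values are made up for illustration.
goodAuth = {"AutoModerator"}
uniqueCmt = True
ignoreAuthC = {"example_bot"}
upvotes = 2


@dataclass
class Comment:
    """Hypothetical stand-in for a Reddit comment (not the real API object)."""
    author: str
    score: int
    body: str


def accept_comment(comment, symbol, seen):
    """Return True if the comment should count as a mention of `symbol`.

    `seen` maps each symbol to the set of authors already counted for it.
    """
    if comment.author in ignoreAuthC:
        return False
    if comment.score <= upvotes:  # considered only if upvotes exceed the threshold
        return False
    authors = seen.setdefault(symbol, set())
    if uniqueCmt and comment.author in authors and comment.author not in goodAuth:
        return False  # this author was already counted for this symbol
    authors.add(comment.author)
    return True


seen = {}
comments = [
    Comment("alice", 5, "GME to the moon"),
    Comment("alice", 9, "still holding GME"),        # second comment by the same author
    Comment("example_bot", 50, "GME daily thread"),  # ignored author
    Comment("bob", 1, "GME is overvalued"),          # too few upvotes
]
accepted = [c.body for c in comments if accept_comment(c, "GME", seen)]
print(accepted)
```

Under these assumed rules only the first comment is counted: the second is a repeat by the same author, the third comes from an ignored author, and the fourth does not exceed the upvote threshold.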

Sample Output

It took 216.65 seconds to analyze 5862 comments in 80 posts in 4 subreddits.

Posts analyzed saved in titles

10 most mentioned picks:
GME: 197
BB: 72
FB: 56
PLTR: 36
TSLA: 25
PLUG: 17
RC: 15
NIO: 14
SPCE: 10
TLRY: 10

        Bearish   Neutral   Bullish   Total/Compound
GME     0.087     0.763     0.150     0.161
BB      0.058     0.768     0.175     0.261
FB      0.119     0.708     0.173     0.127
PLTR    0.062     0.804     0.134     0.235
TSLA    0.124     0.690     0.187     0.195

Data:

Includes US stocks with a market cap above $100 million and a price above $3. Penny stocks are not included.
You can download the data here:
Source (US stocks): https://www.nasdaq.com/market-activity/stocks/screener?exchange=nasdaq&letter=0&render=download
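
The filter described above could be applied to a downloaded CSV roughly as follows. This is only a sketch: the column names (`Symbol`, `Last Sale`, `Market Cap`), the value formats, the sample rows, and the `load_symbols` helper are all assumptions made for illustration, so check them against the file you actually download.

```python
import csv
import io

# Tiny inline sample standing in for the downloaded file; rows are invented.
SAMPLE_CSV = """Symbol,Last Sale,Market Cap
AAA,$12.50,2500000000.00
BBB,$1.20,900000000.00
CCC,$8.00,50000000.00
DDD,$45.10,
"""

MIN_MARKET_CAP = 100_000_000  # market cap above $100 million
MIN_PRICE = 3.0               # price above $3


def to_float(text):
    """Parse values such as '$12.50'; return None for blanks."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


def load_symbols(handle):
    """Return the set of symbols that pass both filters."""
    symbols = set()
    for row in csv.DictReader(handle):
        price = to_float(row["Last Sale"])
        cap = to_float(row["Market Cap"])
        if price is None or cap is None:
            continue  # skip rows with missing data
        if cap > MIN_MARKET_CAP and price > MIN_PRICE:
            symbols.add(row["Symbol"].strip())
    return symbols


us_stocks = load_symbols(io.StringIO(SAMPLE_CSV))
print(sorted(us_stocks))
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample. Returning a set keeps later ticker lookups cheap.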

Implementation: I use sets for 'x in s' membership tests. For a set, "x in s" takes O(1) time on average, compared to O(n) for a list.
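
A minimal sketch of counting ticker mentions with set membership is shown below. The symbol set, the comments, and the tokenizing rules (strip punctuation, require uppercase) are assumptions for illustration and may differ from what the program actually does.

```python
from collections import Counter

# Hypothetical symbol set and comments, invented for this example.
us_stocks = {"GME", "BB", "PLTR", "TSLA"}
comments = [
    "GME and BB are all anyone talks about",
    "Holding GME, watching PLTR",
    "I like the stock",
]

mentions = Counter()
for text in comments:
    for word in text.split():
        token = word.strip("$.,!?()")  # drop surrounding punctuation and '$'
        # Set lookup: average O(1) per word, regardless of how many symbols exist.
        if token.isupper() and token in us_stocks:
            mentions[token] += 1

print(mentions.most_common(3))
```

With a list of several thousand symbols, each `in` check would scan the list; with a set, the cost per word stays roughly constant, which matters when thousands of comments are processed.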

Limitations: The results depend mainly on the parameters defined above. The current implementation completely ignores heavily downvoted comments, so there can be times when the most-mentioned ticker appears mostly in heavily downvoted comments and is undercounted. You can change this with the upvotes variable.
