PetrKorab / FinVADER

VADER sentiment classifier updated with financial lexicons

pypi License: MIT

FinVADER

The VADER (Valence Aware Dictionary and sEntiment Reasoner) classifier is a mainstream model for sentiment analysis. It relies on a general-language, human-curated lexicon that includes linguistic features common on social media. As a result, the model performs worse on texts that use domain-specific language, such as finance or economics.

FinVADER improves VADER's classification accuracy by adding two finance lexicons: SentiBigNomics and Henry's word list. SentiBigNomics is a detailed financial lexicon for aspect-based sentiment analysis, with approximately 7,300 terms, each assigned a polarity score in the range [-1, 1]. Henry's lexicon covers 189 words that appear in company earnings press releases.
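To make the idea of "updating" a lexicon-based classifier concrete, here is a toy sketch of merging a domain lexicon into a general one. It is not FinVADER's implementation: the dictionaries, their scores, and the helper names `merge_lexicons` and `raw_valence` are all hypothetical, and any rescaling between lexicons is left out.

```python
# Toy illustration only: these are NOT the real VADER, SentiBigNomics, or Henry lexicons.
general_lexicon = {"good": 1.9, "bad": -2.5, "drop": -1.1}      # made-up general scores
finance_lexicon = {"drop": -2.0, "profit": 2.0, "loss": -2.2}   # made-up finance scores


def merge_lexicons(base, *extras):
    """Return a new lexicon in which later dictionaries override earlier ones."""
    merged = dict(base)
    for extra in extras:
        merged.update(extra)
    return merged


def raw_valence(text, lexicon):
    """Sum the scores of known words; unknown words contribute 0."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


merged = merge_lexicons(general_lexicon, finance_lexicon)

# "profit" is unknown to the general lexicon but known to the merged one.
print(raw_valence("Profit rose.", general_lexicon))
print(raw_valence("Profit rose.", merged))
```

The point of the sketch is only that domain terms missing from a general lexicon start contributing to the score once a domain lexicon is merged in.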

FinVADER outperforms VADER on Financial PhraseBank data:

[Figures: finvader_accuracy, vader_accuracy]

The code for this benchmark test is here


Installation

FinVADER requires Python 3.8 to 3.11 and NLTK.

To install using pip, use:

pip install finvader

Data requirements

FinVADER requires complete text data with no NaN values or empty strings. Remove such entries during pre-processing.
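One possible way to do this filtering in plain Python is sketched below. The helper names `is_usable_text` and `clean_texts` are hypothetical, and treating whitespace-only strings as empty is an assumption rather than something the package documents.

```python
import math


def is_usable_text(value):
    """Return True only for a string that contains non-whitespace characters."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False  # NaN, e.g. from a missing cell in a CSV file
    # Assumption: whitespace-only strings count as empty.
    return isinstance(value, str) and value.strip() != ""


def clean_texts(values):
    """Keep only the entries that can safely be passed to a text classifier."""
    return [v for v in values if is_usable_text(v)]


raw = ["Sales rose 5% in Q2.", "", None, float("nan"), "   ", "Operating profit fell."]
cleaned = clean_texts(raw)
print(cleaned)
```

With a pandas DataFrame, a similar effect can usually be had by dropping missing rows with `dropna` on the text column and then filtering out blank strings.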

Usage

  • Import the library:
from finvader import finvader
  • Select lexicons:
def finvader(text = 'str',                     # Text to classify
             indicator = 'str',                # VADER's indicator: 'pos'/'neg'/'neu'/'compound'
             use_sentibignomics: bool = False, # Use the SentiBigNomics lexicon
             use_henry: bool = False):         # Use Henry's lexicon
    ...                                        # Signature only; body omitted here
  • Use the classifier:
text = "The period's sales dropped to EUR 30.6 m from EUR 38.3 m, according to the interim report, released today."

scores = finvader(text, 
                  use_sentibignomics = True, 
                  use_henry = True, 
                  indicator = 'compound' )
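A compound score is often turned into a discrete label. The sketch below uses the cutoffs of +0.05 and -0.05 that are conventional for VADER's compound score. This README does not state which thresholds FinVADER's authors use, so treat them as an assumption, along with the hypothetical helper name `label_from_compound` and the assumption that `indicator = 'compound'` yields a single float.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to 'positive', 'negative', or 'neutral'.

    The default threshold follows the usual VADER convention; it is an
    assumption here, not something taken from FinVADER itself.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Example scores are illustrative values, not real FinVADER output.
for score in (0.42, -0.31, 0.0):
    print(score, label_from_compound(score))
```

Keeping the threshold as a parameter makes it easy to tune the width of the neutral band for a particular dataset.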

Documentation, examples and tutorials

Example of using the classifier:

import pandas as pd
from finvader import finvader

data = pd.read_csv("ecb_speeches.csv")                             # read data
data['finvader'] = data.contents.apply(finvader,                   # apply FinVADER and create a new column in data df
                                       use_sentibignomics = True,  # Use lexicon 1 (SentiBigNomics)
                                       use_henry = True,           # Use lexicon 2 (Henry's word list)
                                       indicator = "compound")     # Use VADER's compound indicator

For coding examples, see these tutorials:

FinVADER: Sentiment Analysis for Financial Applications here

Fine-tuning VADER Classifier with Domain-specific Lexicons here


Please visit here for any questions, issues, bugs, and suggestions.

License:Apache License 2.0

