Unlocking the black box of a sentiment analysis framework using spaCy

spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.

A typical spaCy NLP pipeline has three main components:

Tagger (POS tagger): reads text in a given language and assigns a part of speech, such as noun, verb, or adjective, to each word (and other token).
Parser (dependency parser): extracts a dependency parse of a sentence, which represents its grammatical structure and defines the relationships between “head” words and the words that modify those heads.
NER (named entity recognition): a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
To learn more about spaCy, see: https://spacy.io/usage/training
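
To make the three components concrete, here is a minimal sketch of how their output can be read from a processed document. It assumes spaCy and the small English model en_core_web_sm are installed; the model name, the sample sentence, and the helper name describe are illustrative choices, not part of the original project.

```python
def describe(doc):
    """Collect what each pipeline component contributes to a processed doc.

    Tagger -> token.pos_, parser -> token.dep_ and token.head,
    NER    -> doc.ents (each with .text and .label_).
    """
    tokens = [(t.text, t.pos_, t.dep_, t.head.text) for t in doc]
    entities = [(e.text, e.label_) for e in doc.ents]
    return tokens, entities


def main():
    try:
        import spacy
        # Assumption: the small English model has been downloaded beforehand.
        nlp = spacy.load("en_core_web_sm")
    except (ImportError, OSError):
        print("spaCy or the en_core_web_sm model is not installed")
        return

    # Names of the components in the loaded pipeline
    # (the exact list depends on the spaCy version and model).
    print(nlp.pipe_names)

    tokens, entities = describe(nlp("Apple is opening a new office in London."))
    for text, pos, dep, head in tokens:
        print(f"{text:10} pos={pos:6} dep={dep:10} head={head}")
    for text, label in entities:
        print(f"entity: {text} ({label})")


if __name__ == "__main__":
    main()
```

The tags and entities printed depend on the model version, so no expected output is shown here.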

To achieve this, we use a dataset that we annotated ourselves with Prodigy. The training dataset has the following columns/attributes:

textID: text identifier
text: the review or tweet text
sentiment: the emotion associated with the text (Positive, Negative, or Neutral)
selected_text: the part of the text that contributes most to deciding its sentiment
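
As a rough sketch of how rows with these columns could be prepared for spaCy, the snippet below turns each row into a (text, annotations) pair with character offsets for selected_text. It makes several assumptions that the description above does not confirm: that the data is stored as CSV, that selected_text appears verbatim inside text, and that the sentiment value is used as the span label. The two sample rows and the helper name to_training_example are made up for illustration.

```python
import csv
import io


def to_training_example(row):
    """Turn one row into (text, {"entities": [(start, end, label)]}).

    Returns None when selected_text is empty or not found verbatim in text.
    """
    text = row["text"]
    selected = row["selected_text"]
    start = text.find(selected)
    if not selected or start == -1:
        return None
    end = start + len(selected)
    # Assumption: the sentiment value doubles as the span label.
    return (text, {"entities": [(start, end, row["sentiment"])]})


# Made-up sample rows with the same columns as the training dataset.
SAMPLE_CSV = """textID,text,sentiment,selected_text
a1,The battery life is great,Positive,great
a2,Delivery was slow and the box was damaged,Negative,slow
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
examples = [ex for ex in map(to_training_example, rows) if ex is not None]

for text, annotations in examples:
    start, end, label = annotations["entities"][0]
    print(label, repr(text[start:end]), (start, end))
```

Rows whose selected_text cannot be located are skipped rather than guessed at, so annotation problems show up as a shorter examples list instead of silently wrong offsets.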

The test dataset has the following columns/attributes:

textID: text identifier
text: the review or tweet text
sentiment: the emotion associated with the text (Positive, Negative, or Neutral)
selected_text: the word or words that influence the polarity of the text (to be predicted)