dasmith / stanford-corenlp-python

Python wrapper for Stanford CoreNLP tools v3.4.1

Sentiment Analysis Confidence Scores

shawnbeaulieu opened this issue

Hello,

For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimated probabilities themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter out predictions that fall below a certain confidence threshold.

Thank you.

Hi, I had the same question, but I think I figured it out.
With this setup:
nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate(some_sentence, properties={
    'annotators': 'sentiment',
    'outputFormat': 'json',
    'timeout': 1000,
})
the result res is structured like this:
res['sentences'] = [result_first_sentence, result_second_sentence, ..., result_last_sentence]
The result for each sentence is a dict with the following keys:

['index',
 'parse',
 'basicDependencies',
 'enhancedDependencies',
 'enhancedPlusPlusDependencies',
 'sentimentValue',
 'sentiment',
 'sentimentDistribution',
 'sentimentTree',
 'tokens']

'sentimentDistribution' should be the one you are looking for.
So if you are interested in the sentiment distribution of the first sentence:
res['sentences'][0]['sentimentDistribution']
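
To get back to the original question about filtering by confidence, here is a minimal sketch, assuming res has the structure described above. The function name confident_sentences and the sample dict are made up for illustration; the sample is a hand-written stand-in for a server response (its first distribution is the one from the question), not real CoreNLP output.

```python
def confident_sentences(res, threshold):
    """Return (sentence index, top class index, top probability) for every
    sentence whose most likely class has probability >= threshold."""
    kept = []
    for i, sentence in enumerate(res['sentences']):
        dist = sentence['sentimentDistribution']
        # Index of the class with the highest estimated probability.
        top = max(range(len(dist)), key=dist.__getitem__)
        if dist[top] >= threshold:
            kept.append((i, top, dist[top]))
    return kept


# Hypothetical stand-in for a parsed JSON response (illustrative values only).
sample = {
    'sentences': [
        {'sentimentDistribution': [0.60, 0.25, 0.10, 0.025, 0.025]},
        {'sentimentDistribution': [0.10, 0.20, 0.40, 0.20, 0.10]},
    ]
}

print(confident_sentences(sample, 0.5))  # [(0, 0, 0.6)]
```

With a threshold of 0.5 only the first sentence is kept, because the second sentence's best class only reaches 0.40.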

@vacous What does the sentimentDistribution represent? I receive an array of five numbers. Do you know what those five numbers mean?

@AlexFine It represents the probabilities of the five sentiment classes for the sentence, in this order: "--" (very negative), "-", "0", "+", "++".
Please see https://nlp.stanford.edu/sentiment/ for more details.
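
If it helps, here is a small sketch that attaches readable names to the five numbers. It assumes the array is ordered from very negative to very positive, as listed above; the label strings and the helper names label_distribution and most_likely are my own, not part of CoreNLP's output.

```python
# Assumed order: "--", "-", "0", "+", "++" (label strings are illustrative).
LABELS = ['very negative', 'negative', 'neutral', 'positive', 'very positive']


def label_distribution(dist):
    """Pair each probability with its assumed class label."""
    if len(dist) != len(LABELS):
        raise ValueError('expected %d probabilities, got %d'
                         % (len(LABELS), len(dist)))
    return dict(zip(LABELS, dist))


def most_likely(dist):
    """Return the label with the highest probability."""
    labelled = label_distribution(dist)
    return max(labelled, key=labelled.get)


# The example distribution from the original question.
example = [0.60, 0.25, 0.10, 0.025, 0.025]
print(label_distribution(example))
print(most_likely(example))  # very negative
```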

@vacous thanks