Sentiment Analysis Confidence Scores
shawnbeaulieu opened this issue
Hello,
For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimations themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter probabilities below a certain confidence threshold.
Thank you.
Hi, I had the same question, but I think I figured it out.
With this setup:
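from pycorenlp import StanfordCoreNLP  # assumption: the pycorenlp client, which matches the calls below
nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate(some_sentence, properties={'annotators': 'sentiment', 'outputFormat': 'json', 'timeout': 1000})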
The result, res, is structured like this:
res['sentences'] = [result_first_sentence, result_second_sentence, ..., result_last_sentence]
Each sentence's result is a dict with the following keys:
['index',
'parse',
'basicDependencies',
'enhancedDependencies',
'enhancedPlusPlusDependencies',
'sentimentValue',
'sentiment',
'sentimentDistribution',
'sentimentTree',
'tokens']
'sentimentDistribution' should be the one you are looking for.
For example, to get the sentiment distribution of the first sentence:
res['sentences'][0]['sentimentDistribution']
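To collect the distribution for every sentence rather than only the first, a loop over res['sentences'] should work. This is a minimal sketch, assuming res has the JSON shape described above. The sample_res dict is made-up stand-in data so the snippet runs without a server, and sentence_distributions is a hypothetical helper name, not part of any library.

```python
def sentence_distributions(res):
    """Return one (index, sentiment, distribution) tuple per sentence."""
    rows = []
    # .get() keeps the helper safe if the server returned no sentences.
    for sentence in res.get("sentences", []):
        rows.append((
            sentence["index"],
            sentence["sentiment"],
            sentence["sentimentDistribution"],
        ))
    return rows


# Made-up data in the shape described above (only the keys used here).
sample_res = {
    "sentences": [
        {"index": 0, "sentiment": "Positive",
         "sentimentDistribution": [0.02, 0.08, 0.15, 0.45, 0.30]},
        {"index": 1, "sentiment": "Negative",
         "sentimentDistribution": [0.30, 0.50, 0.15, 0.03, 0.02]},
    ]
}

for index, sentiment, distribution in sentence_distributions(sample_res):
    print(index, sentiment, distribution)
```

With a live server you would pass the real res in place of sample_res.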
@vacous What does sentimentDistribution represent? I receive an array of five numbers. Do you know what those five numbers mean?
@AlexFine It represents the probabilities of the five sentiment classes for the sentence, in this order: "--" (very negative), "-", "0", "+", "++".
Please see https://nlp.stanford.edu/sentiment/ for more details.
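For the original question (filtering by a confidence threshold), the five numbers can be paired with class names and checked against a cutoff. This is a rough sketch, assuming the array is ordered from very negative to very positive as described above. The label names are borrowed from the original question, and label_distribution, confident_label, and the 0.5 default threshold are hypothetical choices for illustration.

```python
# Assumed order: index 0 = very negative ... index 4 = very positive.
LABELS = ["very_negative", "negative", "neutral", "positive", "very_positive"]


def label_distribution(distribution):
    """Pair each probability with its class name."""
    if len(distribution) != len(LABELS):
        raise ValueError("expected %d probabilities, got %d"
                         % (len(LABELS), len(distribution)))
    return dict(zip(LABELS, distribution))


def confident_label(distribution, threshold=0.5):
    """Return the top class name, or None if its probability is below threshold."""
    labeled = label_distribution(distribution)
    best = max(labeled, key=labeled.get)
    return best if labeled[best] >= threshold else None


# The example distribution from the original question.
example = [0.60, 0.25, 0.10, 0.025, 0.025]
print(label_distribution(example))
print(confident_label(example, threshold=0.5))
print(confident_label(example, threshold=0.7))
```

Returning None for low-confidence sentences lets the caller decide whether to drop them or treat them as neutral.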