witlox / sentiment-utils

Valence Aware Dictionary and sEntiment Reasoner (Vader in Scala)

Vader Sentiment in Scala with Spark

This is a Scala port of https://github.com/cjhutto/vaderSentiment.
It provides Spark UDF wrappers for the functions and returns a map of the sentiment scores for a given text.
Built with Java 1.8, Scala 2.11.8, SBT 0.13.15 and Spark 2.1.

Map => { positive polarity, negative polarity, neutral polarity, compound polarity }

Compound polarity:

  • positive sentiment: compound score >= 0.5
  • neutral sentiment: (compound score > -0.5) and (compound score < 0.5)
  • negative sentiment: compound score <= -0.5
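The thresholds above can be turned into a label with a small helper. This is a minimal sketch, not part of the library: the object `CompoundLabel` and its `label` function are hypothetical names, and the cut-offs are copied from the list above.

```scala
object CompoundLabel {
  // Hypothetical helper (not provided by sentiment-utils).
  // Maps a compound polarity score to a coarse label using the
  // thresholds listed in this README (>= 0.5 and <= -0.5).
  def label(compound: Double): String =
    if (compound >= 0.5) "positive"
    else if (compound <= -0.5) "negative"
    else "neutral"
}
```

For example, `CompoundLabel.label(0.7)` falls in the positive range, while any score strictly between -0.5 and 0.5 is labelled neutral.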

Direct calling in Scala:

SentimentIntensityAnalyser.polarityScores("sentimental text here") 

or as UDF:

import io.witlox.sentiment.Vader._

...

// with a single UDF you would have to unwrap the map into its respective columns; here the functions
// are split by element of the map, so simply add them column by column

val dfWithPositiveSentiment = df.withColumn("positive", positive($"text"))

val dfWithCompoundSentiment = dfWithPositiveSentiment.withColumn("sentiment", compound($"text"))

Also included is a bucketing function for dates. To aggregate sentiments over time (for example, for Tweets), their timestamps first need to be grouped into buckets.

import io.witlox.utils.TimePartition

// create a TimePartition on year (smallest available is milliseconds)
val tp = TimePartition("year")

val bucket = tp.bucketize("2012-01-01T01:01:01.000Z")

or as UDF:

val tp = TimePartition("year")

val dfWithBucket = df.withColumn("bucket", tp.bucket($"ISODateTimeFormatString"))
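To illustrate what bucketing a timestamp means, here is a minimal sketch using only `java.time`. It is not the implementation of `TimePartition`: the object `BucketSketch`, the function `bucketStart`, the supported unit names other than "year", and the choice to return the start of the bucket in UTC are all assumptions made for illustration.

```scala
import java.time.{OffsetDateTime, ZoneOffset}
import java.time.temporal.ChronoUnit

object BucketSketch {
  // Hypothetical helper (not the library's TimePartition).
  // Parses an ISO-8601 timestamp and returns the start of the bucket
  // it falls into, normalised to UTC.
  def bucketStart(iso: String, unit: String): OffsetDateTime = {
    val t = OffsetDateTime.parse(iso).withOffsetSameInstant(ZoneOffset.UTC)
    unit match {
      // truncatedTo only supports units up to DAYS, so year and month
      // buckets first move to the first day, then drop the time of day.
      case "year"  => t.withDayOfYear(1).truncatedTo(ChronoUnit.DAYS)
      case "month" => t.withDayOfMonth(1).truncatedTo(ChronoUnit.DAYS)
      case "day"   => t.truncatedTo(ChronoUnit.DAYS)
      case "hour"  => t.truncatedTo(ChronoUnit.HOURS)
      case other   => throw new IllegalArgumentException(s"unsupported unit: $other")
    }
  }
}
```

With this sketch, all timestamps from 2012 map to the same value for the "year" unit, which is what makes a subsequent group-by on the bucket column meaningful.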

License

GNU Lesser General Public License v3.0