whosaidthat
Features
| # | Feature | Description |
|---|---|---|
| 0 | utterance length | number of words in the line |
| 1 | average word length | average length of the words in the line |
| 2 | word diversity | type-token ratio for the line |
| 3 | stop words ratio | percentage of words in the line that are stop words |
| 4 | neologisms ratio | percentage of words in the line that are not in our vocabulary |
| 5 | number of numbers | how many numbers the line contains |
| 6 | number of profanity words | how many profanity words the line contains |
| 7 | subjectivity | subjectivity score from TextBlob |
| 8 | polarity | polarity score from TextBlob |
| 9 | question count | number of sentences in the line that are questions |
| 10 | exclamation count | number of sentences in the line that end in an exclamation mark |
| 11 | ellipses count | number of ellipses the line contains |
| 12 to 12+N-1 | top words | for each of the N main characters of the show, the number of words in the line that are also among that character's 20 most frequent words |
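
As a rough illustration of how some of these features could be computed, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the tokenizer, the sentence splitter, the tiny `STOP_WORDS` set, and the function names `surface_features`, `build_top_words`, and `top_word_features` are all assumptions made for this example. Ratios are returned as fractions between 0 and 1 rather than percentages, and the top-words sets are built without any stop-word filtering, which may differ from the real pipeline. The vocabulary, profanity, and TextBlob-based features are left out because their word lists and settings are not shown here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (the project's real list is not shown).
STOP_WORDS = {"a", "an", "and", "are", "i", "in", "is", "it", "of", "the", "to", "you"}

TOKEN_RE = re.compile(r"[A-Za-z0-9']+")     # assumed tokenizer: letters, digits, apostrophes
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")    # integers and simple decimals
SENTENCE_RE = re.compile(r"[^.!?]+[.!?]+")  # a run of text plus its closing punctuation
ELLIPSIS_RE = re.compile(r"\.{3,}|\u2026")  # "..." or the single ellipsis character


def tokenize(line):
    """Lowercase word tokens of a line."""
    return [t.lower() for t in TOKEN_RE.findall(line)]


def surface_features(line):
    """Features 0-3, 5 and 9-11 from the table; ratios are fractions in [0, 1]."""
    words = tokenize(line)
    n = len(words)
    sentences = SENTENCE_RE.findall(line)
    return {
        "utterance_length": n,
        "avg_word_length": sum(len(w) for w in words) / n if n else 0.0,
        "word_diversity": len(set(words)) / n if n else 0.0,
        "stop_words_ratio": sum(w in STOP_WORDS for w in words) / n if n else 0.0,
        "number_count": len(NUMBER_RE.findall(line)),
        "question_count": sum(s.endswith("?") for s in sentences),
        "exclamation_count": sum(s.endswith("!") for s in sentences),
        "ellipses_count": len(ELLIPSIS_RE.findall(line)),
    }


def build_top_words(lines_by_character, k=20):
    """Map each character to the set of their k most frequent words."""
    top = {}
    for name, lines in lines_by_character.items():
        counts = Counter(w for line in lines for w in tokenize(line))
        top[name] = {w for w, _ in counts.most_common(k)}
    return top


def top_word_features(line, top_words, characters):
    """Features 12 to 12+N-1: one count per character, in the given order."""
    words = tokenize(line)
    return [sum(w in top_words[name] for w in words) for name in characters]
```

Calling `surface_features("Wait... you have 3 cats? Bazinga!")` yields one question, one exclamation, one ellipsis, and one number under these assumptions. The order of the `characters` list fixes which feature index belongs to which character, so it should stay the same between training and prediction.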
Characters
Big Bang Theory (45,825 lines, 7 characters)
Amy (3,473), Bernadette (2,687), Howard (5,858), Leonard (9,765), Penny (7,659), Raj (4,680), Sheldon (11,703)
The Simpsons (67,955 lines, 5 characters)
Bart (13,139), Homer (28,447), Lisa (10,945), Marge (13,367), Ned Flanders (2,057)
Desperate Housewives (18,437 lines, 4 characters)
Bree (4,130), Gabrielle (4,564), Lynette (4,618), Susan (5,125)
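
The line counts above are unevenly spread across characters, which is worth keeping in mind when reading any per-character results. The following sketch, which only restates the counts listed here, computes each character's share of the lines in a show. The names `SHOWS` and `class_shares` are made up for this example and are not part of the project.

```python
# Line counts copied from the lists above.
SHOWS = {
    "Big Bang Theory": {
        "Amy": 3473, "Bernadette": 2687, "Howard": 5858, "Leonard": 9765,
        "Penny": 7659, "Raj": 4680, "Sheldon": 11703,
    },
    "The Simpsons": {
        "Bart": 13139, "Homer": 28447, "Lisa": 10945, "Marge": 13367,
        "Ned Flanders": 2057,
    },
    "Desperate Housewives": {
        "Bree": 4130, "Gabrielle": 4564, "Lynette": 4618, "Susan": 5125,
    },
}


def class_shares(counts):
    """Fraction of a show's lines spoken by each character."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    for show, counts in SHOWS.items():
        shares = class_shares(counts)
        most_frequent = max(shares, key=shares.get)
        print(f"{show}: {sum(counts.values())} lines, "
              f"most frequent is {most_frequent} ({shares[most_frequent]:.1%})")
```

With these counts, Sheldon speaks about 25.5% of the Big Bang Theory lines, Homer about 41.9% of the Simpsons lines, and Susan about 27.8% of the Desperate Housewives lines.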