##Purpose
This is my playground for learning to use Node.js as a server and to build SVG charts with D3. It has not been tested on a production server, so be warned.

What it should do:

a) Provide a Node.js HTTP server that lets humans classify tweets in a browser.
b) Train a Bayesian classifier with these classifications.
c) Produce some statistical representations based on the results.

And why?

In January/February 2013, (mostly German-speaking) women on Twitter started to post personal stories of experienced sexism and harassment under the hashtag #aufschrei (#outcry). Soon many more people were using the hashtag to post opinions, links, troll comments and spam. I want to analyze this large number of tweets and maybe contribute the results (if they are of any use) to the aufschreiStat project.
##Current state in words
The result data is NOT RELIABLE yet!
##Current state in pictures
##Requirements
You need Node.js (http://nodejs.org/), Bower (http://bower.io/) and MongoDB (http://www.mongodb.org/).

Run

    npm install

in the root folder to install the required Node.js packages.

Run

    bower install

in /static/ to install the required client-side JS packages.
##Usage
Copy "config.dist.js" and rename the copy to "config.js".
Now change the connection details to match your settings:

    const mongo_settings = {
        "hostname": "localhost",
        "port": 27017,
        "username": "aufschreib",
        "password": "ohsosecret",
        "name": "aufschreib",
        "db": "aufschreib"
    };
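
To see how these settings fit together, here is a minimal sketch that turns them into a standard MongoDB connection string. `buildMongoUri` is a hypothetical helper and not part of this project. How the app really connects may differ, and the sketch ignores the `name` field because its use is not described here.

```javascript
// Hypothetical helper, not part of the project: builds a standard
// MongoDB connection string from the settings object shown above.
function buildMongoUri(settings) {
  // Encode the credentials so special characters cannot break the URI.
  const user = encodeURIComponent(settings.username);
  const pass = encodeURIComponent(settings.password);
  return 'mongodb://' + user + ':' + pass + '@' +
    settings.hostname + ':' + settings.port + '/' + settings.db;
}

const mongo_settings = {
  "hostname": "localhost",
  "port": 27017,
  "username": "aufschreib",
  "password": "ohsosecret",
  "name": "aufschreib",
  "db": "aufschreib"
};

console.log(buildMongoUri(mongo_settings));
// mongodb://aufschreib:ohsosecret@localhost:27017/aufschreib
```

If the printed string works in the `mongo` shell or another client, the settings should be fine for the app, too.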
Put your base JSON file, named "tweets.json", into the /data/ folder.
The format of each tweet must match the one Twitter uses:

    [
        {
            "created_at": "Thu, 31 Jan 2013 18:22:47 +0000",
            "id_str": "297047589672343473",
            "source": "<a href=\"http://client.url/\">Client</a>",
            "text": "Some Tweet text with #hashtags, @usernames and http/https-links",
            "user": {
                "profile_image_url": "http://a0.twimg.com/profile_images/nr/some.png",
                "screen_name": "TwitterUser"
            }
        },
        ...
    ]

Alternatively, implement support for another format in "prepare.js".
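
Before importing a large file, it can help to check that the tweets really have this shape. The following is only a sketch: `missingTweetFields` is a hypothetical function, not taken from "prepare.js", and it assumes that exactly the fields shown in the sample above are required.

```javascript
// Hypothetical check, not taken from "prepare.js": lists the fields from
// the sample tweet that are missing or are not strings.
const REQUIRED_FIELDS = ['created_at', 'id_str', 'source', 'text'];
const REQUIRED_USER_FIELDS = ['profile_image_url', 'screen_name'];

function missingTweetFields(tweet) {
  const missing = [];
  REQUIRED_FIELDS.forEach(function (field) {
    if (typeof tweet[field] !== 'string') missing.push(field);
  });
  const user = tweet.user || {};
  REQUIRED_USER_FIELDS.forEach(function (field) {
    if (typeof user[field] !== 'string') missing.push('user.' + field);
  });
  return missing;
}

const sample = JSON.parse(JSON.stringify({
  created_at: 'Thu, 31 Jan 2013 18:22:47 +0000',
  id_str: '297047589672343473',
  source: '<a href="http://client.url/">Client</a>',
  text: 'Some Tweet text with #hashtags, @usernames and http/https-links',
  user: {
    profile_image_url: 'http://a0.twimg.com/profile_images/nr/some.png',
    screen_name: 'TwitterUser'
  }
}));

console.log(missingTweetFields(sample));        // []
console.log(missingTweetFields({ text: 'x' })); // every field except "text"
```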
Edit "consts.js":

    const cats = [
        {
            id: 'outcry',
            name: 'Aufschrei',
            icon: 'icon-bullhorn',
            color: '#5e8c6A'
        },
        ...
    ];

If you edit the categories, you need to set the thresholds for the Bayesian filter, too.
"Specify the classification thresholds for each category. To classify an item in a category with a threshold of x, the probability that the item is in the category has to be more than x times the probability that it's in any other category. Default value is 1." (Source)

    const thresholds = {
        spam: 3,
        troll: 2,
        report: 2,
        comment: 1,
        outcry: 1
    };
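
The threshold rule quoted above can be sketched roughly like this. `pickCategory` and the fallback name `'unknown'` are hypothetical, and the probabilities are made up for illustration. The classifier library used here may implement the rule differently in detail.

```javascript
// Hypothetical illustration of the threshold rule; the classifier
// library used by the project may implement it differently.
function pickCategory(probabilities, thresholds, fallback) {
  // Find the category with the highest probability.
  let best = null;
  Object.keys(probabilities).forEach(function (cat) {
    if (best === null || probabilities[cat] > probabilities[best]) best = cat;
  });
  if (best === null) return fallback;

  // The winner must be more than `threshold` times as probable as
  // every other category, otherwise the item stays unclassified.
  const threshold = thresholds[best] || 1; // default value is 1
  const beatsAll = Object.keys(probabilities).every(function (cat) {
    return cat === best || probabilities[best] > threshold * probabilities[cat];
  });
  return beatsAll ? best : fallback;
}

const thresholds = { spam: 3, troll: 2, report: 2, comment: 1, outcry: 1 };

// spam leads, but 0.5 is not more than 3 * 0.3, so no decision is made.
console.log(pickCategory({ spam: 0.5, outcry: 0.3, comment: 0.2 }, thresholds, 'unknown'));
// spam leads clearly enough: 0.8 is more than 3 * 0.15.
console.log(pickCategory({ spam: 0.8, outcry: 0.15, comment: 0.05 }, thresholds, 'unknown'));
// outcry only needs to be the most probable category (threshold 1).
console.log(pickCategory({ outcry: 0.5, spam: 0.3, comment: 0.2 }, thresholds, 'unknown'));
```

A higher threshold makes a category harder to reach, which is why spam and troll get larger values here: a tweet should only land there when the classifier is fairly sure.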
Run in /bin/:

    node "longifyurls.js"

This expands Twitter's short URLs (t.co) through http://www.longurlplease.com/. The expanded URLs are then checked for other URL-shortening services, too.
A file "urls.json" with the expanded URLs will be created and used.
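
The idea of expanding a short URL repeatedly, until no shortener is left, might look like the sketch below. It is only an illustration: `resolveUrl` and `expandLinks` are hypothetical names, and the lookup object is an assumed shape, not the actual format of "urls.json".

```javascript
// Hypothetical sketch: assumes a plain { shortUrl: longUrl } lookup
// object, which is NOT necessarily the format of "urls.json".
function resolveUrl(url, urlMap) {
  const seen = new Set(); // guards against shorteners pointing in a circle
  let current = url;
  while (Object.prototype.hasOwnProperty.call(urlMap, current) && !seen.has(current)) {
    seen.add(current);
    current = urlMap[current];
  }
  return current;
}

// Replaces every http/https link in a tweet text with its resolved form.
function expandLinks(text, urlMap) {
  return text.replace(/https?:\/\/\S+/g, function (url) {
    return resolveUrl(url, urlMap);
  });
}

const urlMap = {
  'http://t.co/abc': 'http://bit.ly/xyz',
  'http://bit.ly/xyz': 'http://example.org/article'
};

console.log(expandLinks('Read this http://t.co/abc #aufschrei', urlMap));
// Read this http://example.org/article #aufschrei
```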
Run in /bin/:

    node "prepare.js"

The collections are created and filled with data. Aaaand wait until the process finishes.
We're nearly there.
Edit "config.js" if you want to change where the server can be reached:

    const server_settings = {
        listento: '0.0.0.0',
        port: 8081
    };
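
As a rough sketch of what these two values mean for Node's built-in `http` module: `listento` is the interface the server binds to and `port` is the TCP port. `createApp` and `browserAddress` are hypothetical stand-ins and not the code in "app.js".

```javascript
const http = require('http');

const server_settings = {
  listento: '0.0.0.0',
  port: 8081
};

// Hypothetical stand-in for "app.js": answers every request with plain text.
function createApp() {
  return http.createServer(function (req, res) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('aufschreib is running\n');
  });
}

// 0.0.0.0 means "listen on all interfaces", so a browser on the same
// machine would use localhost instead.
function browserAddress(settings) {
  const host = settings.listento === '0.0.0.0' ? 'localhost' : settings.listento;
  return 'http://' + host + ':' + settings.port + '/';
}

console.log(browserAddress(server_settings)); // http://localhost:8081/

// To actually start listening, uncomment the next line:
// createApp().listen(server_settings.port, server_settings.listento);
```

With the default settings, the server accepts connections from other machines as well. Setting `listento` to `'127.0.0.1'` would restrict it to the local machine.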
Now run

    node "app.js"

and open the address in your browser.

The default username is: admin
The password is: totalsupergehaim

Happy classifying!