Bachelor Thesis - Sentiment Analysis of Nordic Languages

This project is a bachelor thesis in computer science at Halmstad University, carried out by Fredrik Mårtensson and Jesper Holmblad. It investigates whether tonality can be extracted through sentiment analysis of three Nordic languages: Swedish, Danish and Norwegian.

The web service is built on the Flask framework and is started by running python server.py

Load a model

Run the MachineLearning.py file, edit layer1/layer2, and use the following parameters/settings:

LSTM model:

Batch_size: 50, Epochs: 40, Units: 80. This gives approximately 65.8% correlation.

GRU model:

Batch_size: 50, Epochs: 15, Units: 100. This gives approximately 61.7% correlation.
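The two configurations above can be kept side by side in a small lookup so that switching between models is a one-line change. This is only a sketch: the names `ModelSettings`, `SETTINGS` and `get_settings` are hypothetical and are not taken from MachineLearning.py; only the numeric values come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSettings:
    """Training settings for one recurrent model (hypothetical helper)."""
    batch_size: int
    epochs: int
    units: int


# Values copied from the LSTM and GRU settings listed in this README.
SETTINGS = {
    "lstm": ModelSettings(batch_size=50, epochs=40, units=80),
    "gru": ModelSettings(batch_size=50, epochs=15, units=100),
}


def get_settings(model_name: str) -> ModelSettings:
    """Return the settings for 'lstm' or 'gru' (case-insensitive)."""
    key = model_name.lower()
    if key not in SETTINGS:
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of {sorted(SETTINGS)}"
        )
    return SETTINGS[key]
```

Calling `get_settings("gru")` returns the GRU values, which can then be passed on to whatever code builds and trains the layers.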

The settings can be modified, but training requires good hardware. Check the size of the files being loaded to estimate how much RAM is required.
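A rough way to estimate the memory needed is to add up the on-disk size of the files and, for the embeddings, to multiply vocabulary size by vector size. The sketch below assumes the vectors are held as 32-bit floats once loaded; the helper names are hypothetical, and the real footprint will be higher because of Python and framework overhead.

```python
import os


def total_size_bytes(paths):
    """Sum the on-disk size of the given files, skipping paths that do not exist."""
    return sum(os.path.getsize(p) for p in paths if os.path.isfile(p))


def embedding_matrix_bytes(vocab_size, vector_size=100, bytes_per_value=4):
    """Memory for a dense embedding matrix of shape (vocab_size, vector_size).

    bytes_per_value=4 assumes float32 storage (an assumption, not a project fact).
    """
    return vocab_size * vector_size * bytes_per_value


def to_gib(n_bytes):
    """Convert a byte count to gibibytes."""
    return n_bytes / (1024 ** 3)
```

As an illustration with a made-up vocabulary of 1,000,000 words, `embedding_matrix_bytes(1_000_000)` is 400,000,000 bytes, or roughly 0.37 GiB for the matrix alone.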

Configuration:

Check that the correct directory and files are selected, and that the CSV or XLSX files exist inside the folder.
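The check described above can be automated before training starts. The following is a minimal sketch, assuming the data files use the .csv or .xlsx extension; `find_data_files` is a hypothetical helper and is not part of the project code.

```python
from pathlib import Path


def find_data_files(folder, extensions=(".csv", ".xlsx")):
    """Return the data files in `folder` with a matching suffix, sorted by path.

    Raises FileNotFoundError if the folder itself does not exist.
    """
    directory = Path(folder)
    if not directory.is_dir():
        raise FileNotFoundError(f"Data directory not found: {directory}")
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )
```

An empty list means the folder exists but holds no usable data files, which is worth reporting before a long training run begins.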

Save the model that is created, add it to MLHandler.py, and start the web service through server.py

Embedding

The embedding can be found at: http://vectors.nlpl.eu/repository/

Swedish:

ID: 69
Vector size: 100
Corpus: Swedish CoNLL17 corpus
Algorithm: Word2Vec Continuous Skipgram
Lemmatization: False

Danish:

ID: 38
Vector size: 100
Corpus: Danish CoNLL17 corpus
Algorithm: Word2Vec Continuous Skipgram
Lemmatization: False

Norwegian:

ID: 58
Vector size: 100
Corpus: Norwegian-Bokmaal CoNLL17 corpus
Algorithm: Word2Vec Continuous Skipgram
Lemmatization: False
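Once an embedding has been downloaded, its vectors can be read into memory. The sketch below assumes the vectors are available in the plain-text word2vec format (a header line with vocabulary size and dimension, then one word per line followed by its values); `read_word2vec_text` and `EMBEDDING_IDS` are hypothetical names, and the project itself may load the vectors differently.

```python
import io

# Repository IDs as listed in this README.
EMBEDDING_IDS = {"swedish": 69, "danish": 38, "norwegian": 58}


def read_word2vec_text(stream, limit=None):
    """Parse word2vec text format from an open text stream.

    Returns (dim, vectors), where vectors maps each word to a list of floats.
    `limit` optionally stops after that many words to save memory.
    """
    header = stream.readline().split()
    dim = int(header[1])  # header is "<vocab_size> <dim>"
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break
    return dim, vectors


# Tiny in-memory sample with 3-dimensional vectors (made-up values).
sample = io.StringIO("2 3\nhej 0.1 0.2 0.3\nvärld 0.4 0.5 0.6\n")
dim, vectors = read_word2vec_text(sample)
```

For a real file, the stream would come from `open(path, encoding="utf-8")`, where `path` points at the unpacked text file; with the embeddings listed above, `dim` would be 100.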

MIT License