arthur0804 / 562proj

final project for COMP 562 (2019 Fall)

Predict a book's tones using its reviews

Introduction

This is the group project for COMP 562 (Fall 2019) at UNC. In this project, we help Novelist, a publisher founded by UNC alumni, solve a problem in their workflow. In brief, they assign "tones" to each book to help readers find the books that appeal to them most. See https://www.ebscohost.com/promoMaterials/NoveList-Guide-to-Story-Elements.pdf

However, their current workflow relies on human assessors who read each book and manually assign the labels, which is time-consuming. They therefore asked us whether we could develop machine learning algorithms that automatically tag a book using its metadata.

For the report, please refer to the report folder.

For the scripts, please refer to the scripts folder.

For the TeX source files, please refer to the tex folder.

Team Members

| Name | PID | Mail |
| --- | --- | --- |
| Jiaming Qu | 730205251 | jiaming AT ad DOT unc DOT edu |
| Ximing Wen | 730347350 | ximing AT live DOT unc DOT edu |
| Jiesong He | 730264869 | j DOT he AT unc DOT edu |
| Wan Zhang | 730341932 | wanz63 AT live DOT unc DOT edu |

Dataset

Because of our data privacy agreement with Novelist, we do not upload the dataset to a public repository. Anyone interested in the project or the data should reach out to the staff at Novelist. Their contact information can be found on their website: https://www.ebscohost.com/novelist/novelist-contact-us

Models

Unigram model with TF-IDF weighting + Logistic Regression
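
To make the model above concrete, here is a minimal from-scratch sketch of unigram TF-IDF features fed into a logistic regression classifier. It is not the code from the scripts folder. The reviews, the labels, and the single "haunting" tone are made-up placeholders (the real dataset is not public), and treating one tone as a binary label is an assumption. All function names are hypothetical.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Unigrams: lowercase whitespace-separated tokens with punctuation stripped.
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    # Term count times IDF, L2-normalised; terms unseen in training are ignored.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def predict_proba(vec, weights, bias):
    # Logistic (sigmoid) function applied to the linear score.
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(vectors, labels, lr=0.5, epochs=200):
    # Stochastic gradient descent on the (unregularised) log loss.
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            error = predict_proba(vec, weights, bias) - y
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * error * v
            bias -= lr * error
    return weights, bias

# Made-up toy reviews; 1 means the book carries the hypothetical "haunting" tone.
reviews = [
    "A dark and haunting story that kept me uneasy",
    "Haunting, bleak, and dark from start to finish",
    "Funny and heartwarming, a charming light read",
    "Charming characters and a funny, heartwarming plot",
]
labels = [1, 1, 0, 0]

idf = fit_idf(reviews)
weights, bias = train_logreg([tfidf(r, idf) for r in reviews], labels)

for text in ["dark and haunting", "funny and charming"]:
    print(text, round(predict_proba(tfidf(text, idf), weights, bias), 3))
```

A book usually has several tones, so one way to extend this sketch would be to train one such classifier per tone. In practice a library implementation of TF-IDF and logistic regression, with regularisation and a held-out evaluation set, would be the more robust choice.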

Acknowledgement

We thank Novelist for providing the data and Dr. Yue Wang from the School of Information and Library Science for supervising the project and offering suggestions.
