zapersea / Yun_Cup

"Yun Cup" Scenic Reputation Evaluation Score Forecast 3rd-Place Solution

"Yun Cup" Scenic Reputation Evaluation Score Forecast

Introduction

This package contains the 3rd-place solution for the "Yun Cup" Scenic Reputation Evaluation Score Forecast.

Directory

model: machine learning models, plus the deep learning models' meta-features used for stacking.
preprocess: preprocessing for the machine learning models.
stacking: the stacking model.
yuntext: deep learning models (including detailed setup instructions).

Ensemble

Stacking gives better performance on the leaderboard (LB) than any single model.
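As a rough illustration of the stacking idea, the sketch below builds out-of-fold predictions from two base models and fits a meta-learner on them. It is not the code in the `stacking` directory: the helper names (`ridge_fit`, `out_of_fold_predictions`, `stack`), the closed-form ridge base models, the 5-fold split, and the synthetic data are all assumptions made for the example, and in the real solution the meta-feature columns would come from the models listed in the Score table.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def ridge_predict(model, X):
    w, b = model
    return X @ w + b


def out_of_fold_predictions(X, y, alpha, n_splits=5, seed=0):
    """Build one meta-feature column: every row is predicted by a model
    that did not see that row during training."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    oof = np.zeros(len(y))
    for k, valid_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        model = ridge_fit(X[train_idx], y[train_idx], alpha)
        oof[valid_idx] = ridge_predict(model, X[valid_idx])
    return oof


def stack(feature_views, y, alphas, meta_alpha=1.0):
    """Fit a ridge meta-learner on the base models' out-of-fold predictions."""
    meta_X = np.column_stack(
        [out_of_fold_predictions(X, y, a) for X, a in zip(feature_views, alphas)]
    )
    return meta_X, ridge_fit(meta_X, y, meta_alpha)


# Synthetic stand-in data (hypothetical): two feature views, one per base model.
rng = np.random.default_rng(42)
X_a = rng.normal(size=(200, 5))
X_b = rng.normal(size=(200, 3))
y = X_a @ rng.normal(size=5) + X_b @ rng.normal(size=3) + rng.normal(scale=0.1, size=200)

meta_X, meta_model = stack([X_a, X_b], y, alphas=[1.0, 1.0])
stacked = ridge_predict(meta_model, meta_X)
```

Using out-of-fold predictions as meta-features keeps the meta-learner from being trained on predictions that the base models made for rows they had already seen, which would otherwise overstate how much each base model can be trusted.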

Score

model	score
FastText	0.54018 (pretrained embedding)
Ridge	0.54449
Select-K-Best	~0.543
Word2vec	0.549
CNN	0.556
RCNN	0.555
Capsule	0.549
HAN(LSTM-Attention)	0.550
RNN	0.547

Failed

Data augmentation
TF-IDF-CD
Crawling comments from scenic reputation websites to pretrain word embeddings
Pseudo-Labelling

Reference

Kaggle Toxic Comment Classification Challenge
Large Scale Multi-label Text Classification With Deep Learning
Convolutional Neural Networks for Sentence Classification
Recurrent Convolutional Neural Networks for Text Classification
Neural Machine Translation of Rare Words with Subword Units
A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification

Acknowledgments

WindWard
fpc
