Pathairush / learn_nlp_with_bags_of_popcorn

Practicing NLP (natural language processing) techniques from the IMDB sentiment analysis dataset.

Practice NLP with Bags of Popcorn

Introduction

This repository is a hands-on implementation of sentiment analysis on IMDB movie reviews. The tutorial part follows the Bags of Popcorn Kaggle competition, which walks through step-by-step improvements using different NLP techniques on the same dataset. I've extended the tutorial based on my own interests, for example by adding a deep learning model and multiple input features.

Learning objectives

  • Get familiar with the tools

    • nltk
    • gensim
    • keras
  • Learn the processes

    • Data cleansing (removing, filtering, tokenizing, lemmatization, padding, etc.)
    • Feature processing (BOW, TF-IDF, feature engineering, word embedding)
    • Input structure required by different models (tabular, sequences, multi-input)
    • Model structure (tree-based model, deep learning, recurrent NN, etc.)
  • Apply to other problems

    • Binary classification (sentiment analysis)
    • Multi-class classification (types of review)
    • Regression (polarization score)
    • Text features applied to other kinds of problems (propensity, churn, etc.)
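
To make the "processes" objectives more concrete, here is a minimal, dependency-free sketch of cleansing, bag-of-words, TF-IDF and sequence padding. It is not the code from this repository (which uses nltk, gensim and keras); all function names, the two sample reviews and the unsmoothed TF-IDF formula are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def clean_and_tokenize(review):
    """Strip HTML tags, lowercase, and keep alphabetic tokens only."""
    no_html = re.sub(r"<[^>]+>", " ", review)
    return re.findall(r"[a-z]+", no_html.lower())


def build_vocabulary(token_lists):
    """Map each distinct token to an integer id; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def bag_of_words(tokens, vocab):
    """Tabular input: one count per vocabulary entry (column id-1)."""
    row = [0] * len(vocab)
    for tok, n in Counter(tokens).items():
        if tok in vocab:
            row[vocab[tok] - 1] = n
    return row


def tfidf(bow_rows):
    """Weight counts by log(N / document frequency); plain, unsmoothed variant."""
    n_docs = len(bow_rows)
    n_terms = len(bow_rows[0])
    df = [sum(1 for row in bow_rows if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in bow_rows:
        weighted.append(
            [row[j] * math.log(n_docs / df[j]) if df[j] else 0.0
             for j in range(n_terms)]
        )
    return weighted


def to_padded_sequence(tokens, vocab, max_len):
    """Sequence input: token ids, truncated or right-padded with 0 to max_len."""
    ids = [vocab[t] for t in tokens if t in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


# Two made-up reviews, only for illustration.
reviews = ["A <b>great</b> movie, great cast!", "A dull movie.<br />"]
tokens = [clean_and_tokenize(r) for r in reviews]
vocab = build_vocabulary(tokens)
bow = [bag_of_words(t, vocab) for t in tokens]               # tabular features
weights = tfidf(bow)                                         # TF-IDF features
sequences = [to_padded_sequence(t, vocab, 6) for t in tokens]  # sequence features
```

The `bow` and `weights` rows are the kind of tabular input a tree-based model expects, while `sequences` is the fixed-length integer input an embedding layer followed by a recurrent network would take. Right-padding is an arbitrary choice in this sketch; library helpers may pad on the other side by default.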

Roadmap

Languages

Language:Python 100.0%