EishaMazhar / Product-Title-Classification

Problem:

When an e-commerce site scales to a catalog of 1M or more products, labeling categories by hand becomes impractical, yet accurate categorization remains crucial.

Methodology:

In this project, I perform 3-tier product title classification on Lazada's dataset using LSTMs.
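An LSTM consumes fixed-length sequences of integer token IDs, so product titles usually need to be encoded before training. The notebook's actual preprocessing is not reproduced here; the following is only a minimal sketch of one common approach. The helper names `build_vocab` and `encode`, the whitespace tokenizer, the `max_len` value, and the choice to pad at the end are all assumptions made for illustration.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved IDs: padding and out-of-vocabulary words (assumed convention)


def build_vocab(titles, max_words=20000):
    """Map the most frequent lowercase words to integer IDs starting at 2."""
    counts = Counter(word for title in titles for word in title.lower().split())
    return {word: idx + 2 for idx, (word, _) in enumerate(counts.most_common(max_words))}


def encode(title, vocab, max_len=20):
    """Turn a title into exactly max_len IDs: truncate long titles, pad short ones."""
    ids = [vocab.get(word, UNK) for word in title.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical titles, not taken from the Lazada data.
titles = ["Red Cotton Shirt", "Blue Cotton Jeans"]
vocab = build_vocab(titles)
encoded = encode("Green Cotton Shirt", vocab, max_len=5)
```

In this sketch, "green" never appears in the titles used to build the vocabulary, so it maps to `UNK`, and the two trailing positions are filled with `PAD`.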

Metrics:

Metrics used for comparison:

  • Accuracy Score
  • Cohen's kappa coefficient
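    • Cohen's kappa coefficient is a statistic that measures inter-rater reliability for qualitative (categorical) items. Applied to a classifier, it measures agreement between predicted and true categories while correcting for agreement expected by chance.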
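Both metrics can be computed from a list of true labels and a list of predicted labels. The sketch below uses the standard definition of kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance from the label frequencies. The function names and the example labels are illustrative assumptions, not code or data from the notebook.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: probability that both "raters" pick the same label
    # if each chose independently according to its own label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both sides use a single identical label
    return (p_o - p_e) / (1 - p_e)


# Hypothetical category labels, for illustration only.
y_true = ["fashion", "fashion", "mobile", "mobile"]
y_pred = ["fashion", "fashion", "mobile", "fashion"]
acc = accuracy(y_true, y_pred)       # 3 of 4 correct
kappa = cohen_kappa(y_true, y_pred)  # p_o = 0.75, p_e = 0.5
```

Here the accuracy is 0.75 but kappa is only 0.5, because half of the agreement would be expected by chance given how often each label occurs.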

Data Source:

  • Source: Mined from Lazada (e-commerce website); the data is also available at https://arxiv.org/abs/1804.01000
  • Train Data Size: 11,446
  • Test Data Size: 5,528

Reference Paper:

https://github.com/EishaMazhar/Product-Title-Classification/blob/master/Product%20Title%20vs%20text%20classifictaion.pdf

Languages

Language: Jupyter Notebook 100.0%