ekohrt / predict-videogame-tags-from-description

A data science project that uses a dataset of Steam games to try to predict a game's tags from just its description.

Predict a Game's Tags from its Description

Each video game on the Steam online store has a text description and a set of user-defined tags. Is it possible to predict a game's tags from the text description alone? We investigate this question on a heavily cleaned dataset of 80,000 Steam games from Kaggle.

To predict the tags, we vectorize the descriptions with TF-IDF and train a scikit-learn MultiOutputClassifier, which fits a separate Logistic Regression classifier for each tag. The dataset contains 424 unique tags, far too many for this one-model-per-tag approach, and most of them are used only a few times, so we limit the label set to the 30 most frequent tags.
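The approach described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the descriptions and tags are made-up toy data, `TOP_N` is set to 3 instead of 30 so the example stays small, and `LogisticRegression` is assumed as the per-tag classifier.

```python
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy stand-in for the Steam data (hypothetical descriptions and tags).
descriptions = [
    "Explore a vast open world and fight dragons with sword and magic",
    "Build a farm, grow crops and befriend the villagers",
    "Fast-paced shooter with guns, explosions and online matches",
    "Solve relaxing puzzles in a cozy farm village",
    "Fight monsters in dark dungeons and level up your hero",
    "Team-based online combat with ranked matches",
]
tags = [
    ["RPG", "Open World"],
    ["Casual", "Simulation"],
    ["Action", "Shooter"],
    ["Casual", "Puzzle"],
    ["RPG", "Action"],
    ["Action", "Multiplayer"],
]

TOP_N = 3  # assumption for the toy data; the project keeps the top 30

# Keep only the most frequent tags as prediction targets.
tag_counts = Counter(tag for game_tags in tags for tag in game_tags)
top_tags = [tag for tag, _ in tag_counts.most_common(TOP_N)]
filtered_tags = [[t for t in game_tags if t in top_tags] for game_tags in tags]

# Encode the tags as a binary indicator matrix: one column per kept tag.
mlb = MultiLabelBinarizer(classes=top_tags)
Y = mlb.fit_transform(filtered_tags)

# TF-IDF features feeding one independent binary classifier per tag.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(descriptions, Y)

# Predict tags for an unseen description and map columns back to tag names.
predicted = model.predict(["Fight dragons in a huge fantasy world"])
print(mlb.inverse_transform(predicted))
```

Each column of `Y` is fit independently, so the model ignores correlations between tags. Restricting the label set to frequent tags also ensures every per-tag classifier sees both positive and negative examples.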

Languages

Language:Jupyter Notebook 100.0%