adaj / ted-nlp

TED-Talks transcripts text processing playground (TF-iDF featurizer and LDA topic modeling)

TED talks - NLP tutorial

In this project, you will see how to extract patterns from text data. The dataset is a collection of TED talk transcripts, and the questions to answer are "what are the most important words?" and "what are the main subjects/topics that these talks cover?". In NLP terms, these are feature extraction and topic modeling problems. I hope it helps you understand how much information these techniques can draw out of text data.
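To make the "most important words" question concrete, here is a minimal from-scratch sketch of TF-IDF scoring in plain Python. It is not the code from this repository's notebooks: the toy sentences, the `tokenize` and `tf_idf` helpers, and the unsmoothed formula (term frequency times `log(N / df)`) are all assumptions chosen for illustration, and library implementations typically add smoothing and normalization on top.

```python
import math
import re
from collections import Counter

# Toy stand-ins for transcripts (made up for illustration, not taken from the dataset).
docs = [
    "the brain learns language through play",
    "the ocean absorbs heat and the climate changes",
    "the brain and the climate of ideas",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores


scores = tf_idf(docs)
for i, doc_scores in enumerate(scores):
    # Show the three highest-scoring terms of each document.
    top = sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(i, [(term, round(score, 3)) for term, score in top])
```

A word that appears in every document, such as "the" here, gets a score of zero, while words specific to a single document rank highest. That is the intuition behind using TF-IDF to find the words that characterize each talk.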

Prerequisites

This project uses widely used Python packages, so you may already have them installed. If you don't, just run pip install [package_name] in your environment.

Authors

  • Adelson de Araujo Junior - adaj

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
