walkenho / LaTeX2Wordcloud

A web-hosted app (latex2wordcloud.herokuapp.com) providing text cleaning and analysis functionality for general text and LaTeX-formatted text documents.

Home Page: https://latex2wordcloud.herokuapp.com/

LaTeX 2 Wordcloud

A Streamlit app that allows the user to clean text data and visualize word frequencies.

The app offers several data cleaning steps and then visualizes the most frequent words in the text using a bar chart.
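To illustrate the word-frequency step, here is a minimal sketch using only the Python standard library. It is not the app's own code: the function names `top_words` and `render_bars` are hypothetical, and the plain-text bars merely stand in for the bar chart the app draws.

```python
from collections import Counter


def top_words(tokens, n=10):
    """Return the n most frequent tokens as (word, count) pairs."""
    return Counter(tokens).most_common(n)


def render_bars(counts, width=20):
    """Render (word, count) pairs as a plain-text bar chart.

    Hypothetical stand-in for the app's bar chart; bars are scaled so
    that the most frequent word gets `width` characters.
    """
    if not counts:
        return ""
    longest = max(len(word) for word, _ in counts)
    peak = max(count for _, count in counts)
    lines = []
    for word, count in counts:
        bar = "#" * max(1, round(width * count / peak))
        lines.append(f"{word.ljust(longest)} | {bar} {count}")
    return "\n".join(lines)


if __name__ == "__main__":
    tokens = ["cloud", "word", "cloud", "latex", "cloud", "word"]
    print(render_bars(top_words(tokens, n=3)))
```

Counting on already-cleaned tokens matters here: without cleaning, stopwords and punctuation would dominate the top of the chart.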

As a special feature, it uses the LaTeXStripper library to let users strip the formatting from LaTeX files, so that the analysis covers the actual content instead of formatting features ;)
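The idea behind stripping LaTeX formatting can be sketched with a few regular expressions. This is a deliberately crude stand-in and does not use or reproduce the LaTeXStripper API; the function name `strip_latex` is hypothetical, and the sketch ignores math mode, nested optional arguments, and many other LaTeX constructs.

```python
import re


def strip_latex(text: str) -> str:
    """Crudely remove LaTeX markup, keeping the text inside braces."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", text)
    # Drop environment markers such as \begin{document} entirely.
    text = re.sub(r"\\(begin|end)\{[^}]*\}", " ", text)
    # Drop command names (and an optional [...] argument),
    # but keep whatever follows in braces.
    text = re.sub(r"\\[a-zA-Z]+\*?(\[[^\]]*\])?", " ", text)
    # Remove the braces that are left over.
    text = re.sub(r"[{}]", " ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = r"\section{Intro} We \textbf{love} words. % a comment"
    print(strip_latex(sample))
```

Without a step like this, command names such as `textbf` or `begin` would show up among the most frequent "words" of a LaTeX document.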

It currently offers the following cleaning options:

  • Split Hyphenation
  • Lemmatization
  • Deletion of Stopwords
  • Deletion of Punctuation
  • Deletion of Single Characters
  • Deletion of LaTeX Formatting
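Most of the options listed above can be sketched as a small, toggleable pipeline. This is a standard-library sketch under stated assumptions, not the app's implementation: `clean_text` and `STOPWORDS` are hypothetical names, the stopword list is a tiny illustrative sample, "Split Hyphenation" is assumed to mean breaking hyphenated words into their parts, and lowercasing is added for convenience. Lemmatization is left out because it needs a language resource beyond the standard library, and LaTeX removal is not covered here.

```python
import string

# Tiny illustrative stopword list (assumption); a real one is much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "to"}


def clean_text(
    text,
    split_hyphenation=True,
    remove_punctuation=True,
    remove_stopwords=True,
    remove_single_chars=True,
):
    """Apply the selected cleaning steps and return a list of tokens."""
    text = text.lower()  # assumption: case-insensitive counting
    if split_hyphenation:
        # "well-known" becomes "well known"
        text = text.replace("-", " ")
    if remove_punctuation:
        text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if remove_single_chars:
        tokens = [t for t in tokens if len(t) > 1]
    return tokens


if __name__ == "__main__":
    print(clean_text("The well-known method is a state-of-the-art x, y approach."))
```

The order of the steps is a design choice in this sketch: hyphens are split before punctuation is removed, since otherwise the hyphen would be deleted and "well-known" would collapse into "wellknown".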

A beta version is currently deployed under the following URL: https://latex2wordcloud.herokuapp.com/

Languages

  • Jupyter Notebook: 99.7%
  • Python: 0.3%
  • Shell: 0.0%