borisfoko / Spark-Text-Clustering

Text clustering in Spark with Scala, using an LDA model on a TF-IDF matrix

Spark-Text-Clustering

This project demonstrates how to use LDA models in Scala on Spark, applied to TF-IDF matrices built from texts, in order to cluster those texts into topics.
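The repository's own code is not reproduced here, so the following is only a minimal sketch of how such a pipeline might look with the `org.apache.spark.ml` API. The object name `LdaTfIdfSketch`, the helper `dominantTopic`, the column names, the toy corpus, and the parameter values (`k = 2`, 20 iterations) are all assumptions made for illustration, not the project's actual settings.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.clustering.LDA
import org.apache.spark.ml.feature.{CountVectorizer, IDF, RegexTokenizer, StopWordsRemover}
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical object name; not taken from the repository.
object LdaTfIdfSketch {

  // Index of the most probable topic in an LDA topic distribution.
  def dominantTopic(distribution: Vector): Int = distribution.argmax

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lda-tfidf-sketch")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    import spark.implicits._

    // Toy corpus, invented for this sketch.
    val docs = Seq(
      (0L, "the striker scored a late goal in the football match"),
      (1L, "the goalkeeper saved a penalty during the match"),
      (2L, "the central bank raised interest rates again"),
      (3L, "markets fell after the bank announced new rates")
    ).toDF("id", "text")

    // Split on non-word characters and lowercase the tokens.
    val tokenizer = new RegexTokenizer()
      .setInputCol("text").setOutputCol("tokens").setPattern("\\W+")
    // Drop common English stop words.
    val remover = new StopWordsRemover()
      .setInputCol("tokens").setOutputCol("filtered")
    // Term frequencies over a learned vocabulary.
    val tf = new CountVectorizer()
      .setInputCol("filtered").setOutputCol("tf")
    // Rescale term frequencies by inverse document frequency.
    val idf = new IDF()
      .setInputCol("tf").setOutputCol("features")
    // Fit LDA on the TF-IDF vectors; k is the number of topics.
    val lda = new LDA()
      .setK(2).setMaxIter(20).setSeed(42L).setFeaturesCol("features")

    val model = new Pipeline()
      .setStages(Array(tokenizer, remover, tf, idf, lda))
      .fit(docs)

    // "topicDistribution" is the default output column of the LDA model.
    val dominantTopicUdf = udf((v: Vector) => dominantTopic(v))
    model.transform(docs)
      .withColumn("cluster", dominantTopicUdf(col("topicDistribution")))
      .select("id", "cluster", "topicDistribution")
      .show(truncate = false)

    spark.stop()
  }
}
```

Assigning each document to its most probable topic is one simple way to turn the soft topic mixture produced by LDA into hard clusters; other assignment rules are possible.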

Requirements

Java (jdk-13.0.1)
Scala (scala-sdk-2.12.10)
Spark (Spark-3.0.0) and sbt (sbt-1.3.10)
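As a rough illustration, the versions above could be pinned in an sbt build along these lines. This is an assumed `build.sbt`, not the one shipped with the repository; the project name and the choice of Spark modules are guesses based on the description.

```scala
// build.sbt (hypothetical; the repository's actual build file may differ)
name := "spark-text-clustering"

scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % "3.0.0",
  "org.apache.spark" %% "spark-mllib" % "3.0.0"
)
```

The sbt version itself is normally set in `project/build.properties` with the line `sbt.version=1.3.10`.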

About

License:MIT License


Languages

Language:Scala 100.0%