majewm / scala-spark-bigdata-ml-clustering

Scala applications using Spark for Machine Learning - clustering

Scala Applications for Big Data using Spark MLlib

Info

  • Clustering applications built with Scala and Apache Spark MLlib (machine learning).

Dependencies

  • Scala: 2.12.12
  • Apache Spark: 3.0.0
  • Maven Scala Plugin: 2.15.2
  • OpenJDK: Java 1.8
  • SBT: 1.2.1
  • Windows binaries for Hadoop versions: winutils
  • Apache Hive Configuration Properties
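
For readers building with SBT, the versions listed above could be pinned roughly as follows. This is only a sketch: the repository's actual build files are not shown here, so the project name and the choice of the `spark-sql` and `spark-mllib` artifacts are assumptions.

```scala
// build.sbt (hypothetical sketch, not the repository's actual build file)
name := "scala-spark-bigdata-ml-clustering" // assumed project name

// Versions taken from the dependency list above
scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  // Assumed artifact selection: SparkSession lives in spark-sql,
  // the clustering algorithms and ClusteringEvaluator in spark-mllib
  "org.apache.spark" %% "spark-sql"   % "3.0.0",
  "org.apache.spark" %% "spark-mllib" % "3.0.0"
)
```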

Technology stack

  • IntelliJ IDEA Ultimate

Features

  • SparkSession, clustering, BisectingKMeans, GaussianMixture, KMeans, LDA, ClusteringEvaluator, transform, ...
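
As an illustration of how the listed pieces fit together, here is a minimal sketch that fits `KMeans` on a tiny in-memory dataset, calls `transform` to assign clusters, and scores the result with `ClusteringEvaluator`. The object name `KMeansSketch`, the helper `clusterAndScore`, the toy data, and the local master setting are all made up for this example and are not taken from the repository.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.evaluation.ClusteringEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the repository.
object KMeansSketch {

  // Fits KMeans with k = 2 on toy data and returns the silhouette score.
  def clusterAndScore(spark: SparkSession): Double = {
    import spark.implicits._

    // Toy data (assumption): two well-separated groups of 2-D points
    val data = Seq(
      Vectors.dense(0.0, 0.0),
      Vectors.dense(0.1, 0.1),
      Vectors.dense(0.2, 0.0),
      Vectors.dense(9.0, 9.0),
      Vectors.dense(9.1, 9.2),
      Vectors.dense(9.2, 9.0)
    ).map(Tuple1.apply).toDF("features")

    // Estimator reads the default "features" column
    val kmeans = new KMeans().setK(2).setSeed(1L)
    val model = kmeans.fit(data)

    // transform appends a "prediction" column with the cluster index
    val predictions = model.transform(data)

    // ClusteringEvaluator computes the silhouette score by default
    new ClusteringEvaluator().evaluate(predictions)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KMeansSketch")
      .master("local[*]") // assumption: run locally for experimentation
      .getOrCreate()

    val silhouette = clusterAndScore(spark)
    println(s"Silhouette score: $silhouette")

    spark.stop()
  }
}
```

`BisectingKMeans` and `GaussianMixture` follow the same `fit` / `transform` pattern, so swapping the estimator in this sketch is usually enough to experiment with them. `LDA` also uses `fit` and `transform`, but it expects term-count vectors and its output is a topic distribution rather than a single cluster index.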

License: MIT License