These projects were developed during the Monash MDS. This unit focuses on big data processing, including the latest big data technologies (Spark) and NoSQL databases (MongoDB). The data processing covers data frames and various advanced data analytics techniques for big data. Programming exercises and assignments use Spark, MongoDB, DataFrames, and MLlib.
Two of the projects developed are:
- Big Data Processing, Analysis and Visualisation: PySpark RDD-based analysis of structured and unstructured data.
- Comparison of Machine Learning Algorithms on Big Data: PySpark DataFrame-based analysis and implementation of 4 algorithms using Spark MLlib and MongoDB.