ECE-Engineer / MachineLearning-BigData-Project

This project provides a GUI that interfaces with a custom HashTable implementation, several similarity metrics used in big data analytics and machine learning, and very large data sets from the Kepler Object API. The GUI lets the user display all the Kepler Objects of Interest, display only the Kepler Objects of Interest with selected features, or find the Kepler Object of Interest most similar to the one selected. An additional feature finds the two most similar Kepler Objects of Interest in the entire data set if needed.

MachineLearning-BigData-Project

The first part of this project provides a GUI that interfaces with a custom HashTable implementation, several similarity metrics used in big data analytics and machine learning, and very large data sets from the Kepler Object API. The GUI lets the user display all the Kepler Objects of Interest, display only the Kepler Objects of Interest with selected features, or find the Kepler Object of Interest most similar to the one selected. An additional feature finds the two most similar Kepler Objects of Interest in the entire data set if needed.
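
As an illustration of the similarity search described above, here is a minimal sketch, not the project's actual code. It assumes each Kepler Object of Interest has already been reduced to a numeric feature vector (`double[]`); the class name `SimilaritySketch` and its methods are hypothetical, and Euclidean distance and cosine similarity are used only as two common examples of similarity metrics.

```java
import java.util.List;

// Hypothetical sketch: each double[] stands in for the numeric features of one object.
final class SimilaritySketch {
    private SimilaritySketch() {}

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors must have equal length");
        }
    }

    /** Euclidean distance; a smaller value means the two objects are more similar. */
    static double euclidean(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity in [-1, 1]; returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Index of the row nearest to rows.get(queryIndex), skipping the query itself; -1 if none. */
    static int mostSimilar(List<double[]> rows, int queryIndex) {
        double[] query = rows.get(queryIndex);
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < rows.size(); i++) {
            if (i == queryIndex) {
                continue;
            }
            double d = euclidean(query, rows.get(i));
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    /** Brute-force O(n^2) search for the two closest rows; returns their indices, or null if fewer than two rows. */
    static int[] closestPair(List<double[]> rows) {
        int[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < rows.size(); i++) {
            for (int j = i + 1; j < rows.size(); j++) {
                double d = euclidean(rows.get(i), rows.get(j));
                if (d < bestDist) {
                    bestDist = d;
                    best = new int[] {i, j};
                }
            }
        }
        return best;
    }
}
```

The brute-force pair search compares every pair of rows, so its cost grows quadratically with the size of the data set; for a one-off "find the two most similar objects" query that may be acceptable.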

The second part of this project implements a custom persistent BTree structure and a hash-based cache, integrates them with the existing program, and uses k-means clustering with z-score normalization to display the clustering of the data to the user. The user can then select how many clusters to create and which of them to view.
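
The normalization and clustering steps mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the class name `ClusteringSketch` is hypothetical, the data is assumed to be a dense `double[][]` with one row per object, the population standard deviation is used for the z-score, and the k-means centroids are seeded with the first `k` rows so the result is deterministic.

```java
import java.util.Arrays;

// Hypothetical sketch of z-score normalization followed by Lloyd's k-means algorithm.
final class ClusteringSketch {
    private ClusteringSketch() {}

    /** Returns a copy in which every column has mean 0 and population standard deviation 1. */
    static double[][] zScore(double[][] data) {
        int n = data.length;
        int dims = data[0].length;
        double[][] out = new double[n][dims];
        for (int j = 0; j < dims; j++) {
            double mean = 0.0;
            for (double[] row : data) {
                mean += row[j];
            }
            mean /= n;
            double variance = 0.0;
            for (double[] row : data) {
                variance += (row[j] - mean) * (row[j] - mean);
            }
            double std = Math.sqrt(variance / n);
            for (int i = 0; i < n; i++) {
                // A constant column has std 0; map it to 0 instead of dividing by zero.
                out[i][j] = (std == 0.0) ? 0.0 : (data[i][j] - mean) / std;
            }
        }
        return out;
    }

    /** Assigns each row to one of k clusters; centroids start as the first k rows (an assumption). */
    static int[] kMeans(double[][] data, int k, int maxIterations) {
        int n = data.length;
        int dims = data[0].length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of rows");
        }
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = data[c].clone();
        }
        int[] labels = new int[n];
        Arrays.fill(labels, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every row to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double dist = 0.0;
                    for (int j = 0; j < dims; j++) {
                        double d = data[i][j] - centroids[c][j];
                        dist += d * d;
                    }
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // Converged: no row switched clusters.
            }
            // Update step: move each centroid to the mean of its assigned rows.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < dims; j++) {
                    sums[labels[i]][j] += data[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int j = 0; j < dims; j++) {
                        centroids[c][j] = sums[c][j] / counts[c];
                    }
                }
            }
        }
        return labels;
    }
}
```

Normalizing before clustering keeps a feature with a large numeric range from dominating the distance computation. A typical call would be `ClusteringSketch.kMeans(ClusteringSketch.zScore(data), k, 100)`, with `k` taken from the user's selection.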

Instructions After Downloading

---> cd 365
---> gradlew run

Project Decomposition

Assignment One

| Task Number | Task Title | Completed |
| --- | --- | --- |
| 1 | Find An Online API For Keyed Data | X |
| 2 | Implement A Custom HashTable Class | X |
| 3 | Create Classes For The Data Being Modeled | X |
| 4 | Parse The Online Data Through The API | X |
| 5 | Create A GUI To Display The Data | X |
| 6 | Implement Similarity Metrics For The Data | X |
| 7 | Edit The GUI To Take A Key And Display The Most Similar Piece Of Data | X |
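
For readers unfamiliar with task 2, a custom hash table can be sketched in a few dozen lines. The following is only an illustration and makes several assumptions that the project does not state: the class name `ChainedHashTable` is hypothetical, collisions are resolved by separate chaining, the table doubles in size once it is more than 75% full, and null keys are not supported.

```java
// Hypothetical sketch of a hash table with separate chaining; null keys are not supported.
final class ChainedHashTable<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int initialCapacity) {
        buckets = (Node<K, V>[]) new Node[Math.max(1, initialCapacity)];
    }

    // Clear the sign bit so the bucket index is never negative.
    private static int indexFor(Object key, int length) {
        return (key.hashCode() & 0x7fffffff) % length;
    }

    /** Stores the mapping and returns the previous value for the key, or null if there was none. */
    V put(K key, V value) {
        int i = indexFor(key, buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]);
        size++;
        if (size > buckets.length * 3 / 4) {
            resize();
        }
        return null;
    }

    /** Returns the value stored for the key, or null if the key is absent. */
    V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    // Doubles the bucket array and relinks every existing node into its new bucket.
    @SuppressWarnings("unchecked")
    private void resize() {
        Node<K, V>[] old = buckets;
        buckets = (Node<K, V>[]) new Node[old.length * 2];
        for (Node<K, V> head : old) {
            Node<K, V> n = head;
            while (n != null) {
                Node<K, V> next = n.next;
                int i = indexFor(n.key, buckets.length);
                n.next = buckets[i];
                buckets[i] = n;
                n = next;
            }
        }
    }
}
```

In a setting like this one, the key could be an object identifier and the value the parsed record, so that task 7 (look up a key, then find its most similar record) starts with a single `get` call.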

Assignment Two

| Task Number | Task Title | Completed |
| --- | --- | --- |
| 1 | Create A Hash-Based Cache For The Data | X |
| 2 | Create A Persistent BTree Class Of Keys, With Values In A Separate Persistent Structure Or Files | X |
| 3 | Preload At Least 1000 Key-Value Pairs | X |
| 4 | Pre-Categorize Keys Into Clusters Using K-Means Clustering Or A Similar Metric | X |
| 5 | Edit The GUI To Display The Clustering Of The Objects In The Data | X |
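
To illustrate task 1, here is one way a hash-based cache could sit in front of a slower persistent store. This sketch is an assumption-laden example rather than the project's design: it relies on the standard `java.util.LinkedHashMap` instead of a custom table, uses a least-recently-used eviction policy that the project does not specify, and the class name `LruCache` and its `loader` function (standing in for a read from the persistent structure) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: an LRU cache that loads missing values from a slower backing store.
final class LruCache<K, V> {
    private final int capacity;
    private final Function<K, V> loader; // Stand-in for a lookup in the persistent store.
    private final LinkedHashMap<K, V> entries;
    private int misses;

    LruCache(int capacity, Function<K, V> loader) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > LruCache.this.capacity;
            }
        };
    }

    /** Returns the cached value, loading and caching it first on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            // Note: a loader that returns null is reloaded on every call.
            value = loader.apply(key);
            misses++;
            entries.put(key, value);
        }
        return value;
    }

    /** Reports whether the key is currently cached, without changing the eviction order. */
    boolean isCached(K key) {
        return entries.containsKey(key);
    }

    int misses() {
        return misses;
    }
}
```

With a capacity of 2, the calls `get("a")`, `get("bb")`, `get("a")`, `get("ccc")` leave `"a"` and `"ccc"` cached and evict `"bb"`, because `"a"` was touched more recently.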

Latest Stable Copy

Class Project Zip

License: MIT License


Languages

Language: Java 100.0%