Project 2 - Attempting to predict the season based on mean temperature and sea level pressure
Make sure the makefile is available and execute the following:
make
./main
- Allocate memory for a months array and a distance array
- For each test point, calculate the Euclidean distance to every training point
- Store the calculated Euclidean distance in the distance array along with the corresponding month
- Sort the distance array and reorder the months array in step with it, so each distance stays paired with its month
- Take the k smallest distances from the distance array along with their corresponding months
- Find the most frequent month among those k neighbors and use it as the prediction
- If there is a tie, the first month is selected
- Repeat until every test point has a prediction
- Print out the number of correct and incorrect predictions
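The steps above can be sketched in C roughly as follows. This is a minimal illustration, not the project's actual code: the `Sample` and `Neighbor` types and the `knn_predict` function are hypothetical names, the distance and month are paired in one struct instead of two parallel arrays, and squared distances are compared because they rank neighbors the same way as the Euclidean distance. Ties are assumed to go to the lowest-numbered month, which is one reading of "the first month is selected".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record layout: one observation with its month label. */
typedef struct {
    double temp;      /* mean temperature */
    double pressure;  /* sea level pressure */
    int month;        /* 1..12 */
} Sample;

/* A distance paired with the month of the training point it came from. */
typedef struct {
    double dist2;     /* squared Euclidean distance */
    int month;
} Neighbor;

/* qsort comparator: ascending by squared distance. */
static int cmp_neighbor(const void *a, const void *b) {
    double da = ((const Neighbor *)a)->dist2;
    double db = ((const Neighbor *)b)->dist2;
    return (da > db) - (da < db);
}

/* Predict the month of `query` from its k nearest training points.
   Returns -1 if memory allocation fails. */
int knn_predict(const Sample *train, size_t n_train, Sample query, size_t k) {
    assert(k > 0 && k <= n_train);

    Neighbor *nb = malloc(n_train * sizeof *nb);
    if (nb == NULL) return -1;

    /* Distance from the query to every training point. */
    for (size_t i = 0; i < n_train; i++) {
        double dt = train[i].temp - query.temp;
        double dp = train[i].pressure - query.pressure;
        nb[i].dist2 = dt * dt + dp * dp;
        nb[i].month = train[i].month;
    }

    /* Sorting the pairs keeps each distance attached to its month. */
    qsort(nb, n_train, sizeof *nb, cmp_neighbor);

    /* Count votes among the k nearest neighbors. */
    int votes[13] = {0};
    for (size_t i = 0; i < k; i++) {
        int m = nb[i].month;
        if (m >= 1 && m <= 12) votes[m]++;
    }
    free(nb);

    /* Strict '>' means a tie goes to the lowest-numbered month (assumption). */
    int best = 1;
    for (int m = 2; m <= 12; m++) {
        if (votes[m] > votes[best]) best = m;
    }
    return best;
}
```

Calling `knn_predict` once per test point and comparing the result with the true month gives the correct and incorrect counts.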
- The parallel version runs much slower because of its sorting step: it sorts one very
large array, while the sequential version only keeps track of the top k values
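For comparison, keeping track of only the top k values could look something like the sketch below. The `Candidate` type and `topk_insert` function are hypothetical names, not taken from the project. Each new distance is offered to a small buffer kept in ascending order, so one test point costs roughly O(n * k) comparisons over n training points instead of the O(n log n) of a full sort.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pairing of a distance with its month label. */
typedef struct {
    double dist;
    int month;
} Candidate;

/* Offer candidate `c` to `best`, a buffer of capacity k that holds the
   `count` closest candidates seen so far in ascending order of distance.
   Returns the new number of stored candidates. */
size_t topk_insert(Candidate *best, size_t count, size_t k, Candidate c) {
    assert(k > 0 && count <= k);

    /* Buffer is full and c is no closer than the current worst: skip it. */
    if (count == k && c.dist >= best[k - 1].dist) return count;

    /* Start at the free slot, or at the worst entry if the buffer is full. */
    size_t i = (count < k) ? count : k - 1;

    /* Shift larger entries one slot right to make room for c. */
    while (i > 0 && best[i - 1].dist > c.dist) {
        best[i] = best[i - 1];
        i--;
    }
    best[i] = c;

    return (count < k) ? count + 1 : k;
}
```

After every training point has been offered, `best[0]` through `best[count - 1]` hold the nearest neighbors and can be passed straight to the voting step.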
- Import the dataset (Month, Mean temperature, Sea Level Pressure) [Status: COMPLETED]
- Figure out which station is which [Status: COMPLETED]
- Generate coordinates of Mean Temperature vs. Sea Level Pressure [Status: COMPLETED]
- Program k-means clustering with a total of 2 or 4 clusters (Winter, Spring, Summer, Fall)
- Design and implement ourselves
- Take 75% of the data randomly as training data
- Take 25% of the data as test data
- Analyze the predictions on the 25% test data and measure how accurate they are
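One possible way to do the random 75%/25% split and the accuracy check is sketched below. The function names `shuffle_indices`, `train_count`, and `accuracy` are hypothetical, and the use of `rand()` is an assumption rather than the project's actual choice. The idea is to shuffle an array of row indices, treat the first 75% as training rows and the rest as test rows, then score the predictions.

```c
#include <assert.h>
#include <stdlib.h>

/* Shuffle idx[0..n-1] in place with a Fisher-Yates shuffle.
   rand() % i has a slight modulo bias, acceptable for a sketch. */
void shuffle_indices(size_t *idx, size_t n) {
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;   /* pick j in 0..i-1 */
        size_t tmp = idx[i - 1];
        idx[i - 1] = idx[j];
        idx[j] = tmp;
    }
}

/* Number of rows that go to training in a 75/25 split (rounded down). */
size_t train_count(size_t n) {
    return (n * 3) / 4;
}

/* Fraction of predictions that match the true labels. */
double accuracy(const int *predicted, const int *actual, size_t n) {
    assert(n > 0);
    size_t correct = 0;
    for (size_t i = 0; i < n; i++) {
        if (predicted[i] == actual[i]) correct++;
    }
    return (double)correct / (double)n;
}
```

After shuffling, rows `idx[0]` to `idx[train_count(n) - 1]` would form the training set and the remaining rows the test set. Seeding with `srand` before the shuffle makes a given split reproducible.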