MapReduce_basicStatistics
Cloud Computing Project 1
This project gets you started with Hadoop and the MapReduce programming model. You may have already looked at the WordCount example in both its serial and Hadoop implementations. This problem is similar to WordCount, except that you will compute basic statistics (min, max, average, and standard deviation) of a given data set.
The input to the program is a text file containing exactly one floating-point number per line. The output should include the min, max, average, and standard deviation of these numbers.
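The computation described above can be sketched in plain Python, without any Hadoop API, to show why it fits the MapReduce pattern: each line maps to a small partial aggregate, and partial aggregates merge in any order. This is only a minimal sketch under stated assumptions, not the required solution. The names `map_line`, `merge`, `finalize`, and `basic_statistics` are hypothetical, and the sketch assumes the population standard deviation (dividing by n), which the assignment text does not specify.

```python
import math
from functools import reduce

# Hypothetical partial aggregate: (count, total, sum_of_squares, minimum, maximum).


def map_line(line):
    """Map step: turn one input line into a partial aggregate for one value."""
    x = float(line)
    return (1, x, x * x, x, x)


def merge(a, b):
    """Combine/reduce step: merge two partial aggregates.

    The operation is associative and commutative, so partial results can be
    merged in any grouping, which is what a combiner and a reducer need.
    """
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2], min(a[3], b[3]), max(a[4], b[4]))


def finalize(agg):
    """Turn the fully merged aggregate into the four requested statistics."""
    n, total, sum_sq, lo, hi = agg
    mean = total / n
    # Assumption: population variance (divide by n). Clamp at zero because
    # floating-point rounding can make the difference slightly negative.
    variance = max(sum_sq / n - mean * mean, 0.0)
    return {"min": lo, "max": hi, "average": mean, "stddev": math.sqrt(variance)}


def basic_statistics(lines):
    """Serial stand-in for the whole job; assumes at least one number."""
    partials = (map_line(line) for line in lines if line.strip())
    return finalize(reduce(merge, partials))
```

For example, `basic_statistics(["2.0", "4.0", "4.0", "4.0", "5.0", "5.0", "7.0", "9.0"])` gives a min of 2.0, a max of 9.0, an average of 5.0, and a standard deviation of 2.0. The sum-of-squares formula is simple to merge but can lose precision when the values are large relative to their spread, so a more careful implementation might merge running means and variances instead.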
Deliverables
You will need to complete the source code and write a report. Zip your work into a file named username project1.zip (replace 'username' with your own username) and submit the following:
- Complete source code
- A document with the following details:
  – Transformation of data during the computations, i.e. data type of key, value
  – The data structure used to transfer between Map and Reduce phases
  – How the data flow happens through disk and memory during the computation
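To make the first two report items concrete, here is one hedged illustration of what a key/value record passed from the Map phase to the Reduce phase could look like. It assumes text records with a tab between key and value, the default layout in Hadoop Streaming; in a Java job the analogous piece would typically be a custom `Writable`. The key name `stats`, the field order, and the helpers `encode` and `decode` are all hypothetical choices for this sketch.

```python
# Hypothetical single key, so that every record is routed to the same reducer.
KEY = "stats"


def encode(key, agg):
    """Serialize a (count, total, sum_of_squares, minimum, maximum) tuple
    as one text record: key, a tab, then comma-separated fields."""
    count, total, sum_sq, lo, hi = agg
    # repr() of a float round-trips exactly through float().
    return f"{key}\t{count},{total!r},{sum_sq!r},{lo!r},{hi!r}"


def decode(record):
    """Parse a record produced by encode() back into (key, aggregate)."""
    key, value = record.split("\t", 1)
    count, total, sum_sq, lo, hi = value.split(",")
    return key, (int(count), float(total), float(sum_sq), float(lo), float(hi))
```

In this layout the key is a string and the value is a fixed five-field record, which is the kind of type information the report is asked to document at each stage.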