Hadoop MapReduce job to find the most frequent English letters in a list of plain-text documents
In this project, I grabbed some free English books from Project Gutenberg. Please note that some of them might be copyrighted. Use the accompanying scripts to download and extract the files.
Tested and running on Cloudera CDH 5.8.0.
Sample output: https://docs.google.com/spreadsheets/d/1TGAn7JMOv0cJzTEXJ0gc9dt1j2NtzSe_Ar6knbSFL08/edit?usp=sharing
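
The letter count can be pictured as a map step that emits one pair per letter and a reduce step that sums the pairs. The sketch below is a local, plain-Python illustration of that idea, not the job's actual Hadoop code. The function names (`map_letters`, `reduce_counts`, `letter_frequencies`) are hypothetical, and it assumes that only the ASCII letters a-z are counted, case-insensitively.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def map_letters(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (letter, 1) for every ASCII English letter in a line."""
    for ch in line.lower():
        # Assumption: digits, punctuation, and non-ASCII characters are skipped.
        if "a" <= ch <= "z":
            yield ch, 1


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Reduce step: sum the counts emitted for each letter."""
    totals: Counter = Counter()
    for letter, count in pairs:
        totals[letter] += count
    return totals


def letter_frequencies(lines: Iterable[str]) -> Counter:
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = (pair for line in lines for pair in map_letters(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Hypothetical sample input; real input would be the extracted book files.
    sample = ["Hello, World!", "Hadoop counts letters."]
    for letter, count in letter_frequencies(sample).most_common(3):
        print(letter, count)
```

In a real Hadoop job the framework groups the emitted pairs by key between the two steps. Here a single `Counter` stands in for that grouping.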