spark-vs-mapreduce

The input data file is expected at /data/shuffled.tsv in an HDFS file system running on localhost.
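
As a rough illustration of how a job might point at that location, the sketch below builds a fully qualified HDFS URI for the path. Only the path /data/shuffled.tsv and the host localhost come from this README. The port 8020 and the helper names `hdfs_uri` and `parse_tsv_line` are assumptions made for the example; the actual NameNode address is whatever `fs.defaultFS` is set to in your Hadoop configuration.

```python
# Minimal sketch: locate the input file on a localhost HDFS instance.
# Assumption: the NameNode listens on port 8020. Check fs.defaultFS in
# core-site.xml, since many single-node setups use a different port.

DATA_PATH = "/data/shuffled.tsv"   # path given in this README
DEFAULT_HOST = "localhost"         # host given in this README
DEFAULT_PORT = 8020                # assumed, not stated in this README


def hdfs_uri(path=DATA_PATH, host=DEFAULT_HOST, port=DEFAULT_PORT):
    """Return a fully qualified hdfs:// URI for an absolute HDFS path."""
    if not path.startswith("/"):
        raise ValueError(f"HDFS path must be absolute: {path!r}")
    return f"hdfs://{host}:{port}{path}"


def parse_tsv_line(line):
    """Split one tab-separated record into a list of string fields."""
    return line.rstrip("\n").split("\t")


if __name__ == "__main__":
    print(hdfs_uri())
```

With PySpark available, the resulting URI could be passed to something like `spark.read.csv(hdfs_uri(), sep="\t")`. Whether the file has a header row or a fixed schema is not stated here, so any such options would need to be checked against the data.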