Csv2Lucene Note: code is largely a proof of concept. Description: Csv2Lucene's goal is to bulk index a large amount of huge CSV files quickly. The focus is on the record level and not the file level when creating threads. Multi-Threading on record lines instead of files has advantages when it comes to speed as well as recovery. Requirements: - Quickly index a large amount of huge CSV files. - Thread on record lines not on files. Advantages of threading on the record level instead of by file: - Working on a single huge database dump file. - Faster to index until the very end, keeping all threads busy until the last few lines - Simpler recovery from abrupt application halts, since we are reading a smaller set of files at a time. Note: More than one file can be read at a time, when reading the tail end of one file the beginning of the next file is read, keeping a continuous flow of record lines until the last record of the last file is read.