- main.py is the most recent file.
- main_0.py is the first file to run if you want to follow with the steps.
- It's better to pipe the results to less for example, cause the output might be huge:
python main.py | less
- Extract the words from the articles in a smarter way.
- Add multi threading.
- Store the counter results in a file so we don't run it each time, but just fetch it from the file.
- Change the most common K value.
- Fetch articles from the Internet.
- Better show the accuracy results.
- [For those who want to fly] generate graphs with the results for better visualisation.
- Chart.js maybe?
- Make HTML reports, look for html builder, or build it yourself.