Datasets
gidim opened this issue
Gideon Mendels commented
Thanks for this! Your code is readable and properly commented.
Could you recommend any datasets to train this on?
Michael A. Alcorn commented
Glad you found the code useful. Unfortunately, search data sets are generally proprietary, but one thing you could try instead is using a data set that includes document titles (e.g., the "Subjects" of 20 Newsgroups) and treating the titles as the "queries".
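As a rough illustration of that idea, here is a minimal sketch that turns one raw newsgroup post into a (query, document) pair by treating the `Subject` header as the query and the body as the document. The function name `post_to_query_doc`, the sample post, and the choice to strip leading "Re:" markers are assumptions made for this example and are not part of the repository. It assumes each post still carries its headers (for instance, scikit-learn's `fetch_20newsgroups` keeps them unless `remove` includes `"headers"`).

```python
import re
from email import message_from_string


def post_to_query_doc(raw_post):
    """Split a raw newsgroup post into a (query, document) pair.

    Hypothetical helper: the Subject header stands in for a search query
    and the message body stands in for the matching document.
    """
    msg = message_from_string(raw_post)
    subject = msg.get("Subject", "")
    # Assumption: drop leading reply markers ("Re:") so only the title text remains.
    query = re.sub(r"^(\s*re:\s*)+", "", subject, flags=re.IGNORECASE).strip()
    # For a plain (non-multipart) post, get_payload() returns the body as a string.
    document = msg.get_payload().strip()
    return query, document


# Made-up sample post in the usual header / blank line / body layout.
sample = (
    "From: someone@example.com\n"
    "Subject: Re: Best graphics card for ray tracing\n"
    "\n"
    "I have been comparing a few cards and would like some advice.\n"
)

query, document = post_to_query_doc(sample)
print(query)
print(document)
```

Applying this to every post in the collection would give a list of pairs to train on. Posts with an empty subject would probably need to be filtered out first.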