airalcorn2 / Deep-Semantic-Similarity-Model

My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.

Datasets

gidim opened this issue

Thanks for this! Your code is readable and properly commented.
Could you recommend any datasets to train this on?

Glad you found the code useful. Unfortunately, search data sets are generally proprietary, but one thing you could try instead is using a data set that includes document titles (e.g., the "Subjects" of 20 Newsgroups) and treating the titles as the "queries".
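As a rough illustration of that suggestion, here is a minimal sketch that turns raw newsgroup posts into (query, document) pairs by treating each post's "Subject" header as the query and its body as the document. The function names `split_subject_and_body` and `build_query_document_pairs` are hypothetical and not part of this repository, and the sketch assumes each post is a single string whose headers are separated from the body by the first blank line.

```python
import re


def split_subject_and_body(raw_post):
    """Return (subject, body) for one raw post, or None if it has no usable subject.

    Assumes headers and body are separated by the first blank line.
    """
    header, _, body = raw_post.partition("\n\n")
    subject = None
    for line in header.splitlines():
        if line.lower().startswith("subject:"):
            subject = line[len("subject:"):].strip()
            break
    if not subject:
        return None
    # Drop leading reply markers such as "Re: Re: " so the title reads like a query.
    subject = re.sub(r"^(re:\s*)+", "", subject, flags=re.IGNORECASE).strip()
    if not subject:
        return None
    return subject, body.strip()


def build_query_document_pairs(raw_posts):
    """Collect (query, document) pairs, skipping posts without a subject or a body."""
    pairs = []
    for post in raw_posts:
        result = split_subject_and_body(post)
        if result is not None and result[1]:
            pairs.append(result)
    return pairs


# Two hand-written sample posts (illustrative only, not real 20 Newsgroups data).
sample_posts = [
    "From: a@b.com\nSubject: Re: Re: Best GPU for training\nLines: 2\n\n"
    "I would pick the one with more memory.\n",
    "From: c@d.com\nLines: 1\n\nNo subject here.\n",
]
pairs = build_query_document_pairs(sample_posts)
```

With the two sample posts above, only the first yields a pair, since the second has no "Subject" header. One possible source of raw posts is scikit-learn's `fetch_20newsgroups`, which keeps the headers unless asked to remove them. The resulting pairs would still need whatever preprocessing the model expects before training.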