sfertman / text_similarity

Text Similarity

The article:

https://medium.com/@adriensieg/text-similarities-da019229c894

Methods studied in the article

  • Jaccard Similarity

  • Different embeddings + K-means

  • Different embeddings + Cosine Similarity

  • Word2Vec + Smooth Inverse Frequency + Cosine Similarity

  • Different embeddings + LSI + Cosine Similarity

  • Different embeddings + LDA + Jensen-Shannon distance

  • Different embeddings + Word Mover's Distance

  • Different embeddings + Variational Autoencoder (VAE)

  • Different embeddings + Universal Sentence Encoder

  • Different embeddings + Siamese Manhattan LSTM

  • Knowledge-based Measures
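
As a rough illustration of the two simplest measures in the list, the sketch below computes Jaccard similarity over token sets and cosine similarity over raw term counts. It is not taken from the repository's notebooks: the whitespace tokenizer and the function names `tokenize`, `jaccard_similarity`, and `cosine_similarity` are assumptions made for this example, and the article pairs cosine similarity with learned embeddings rather than plain counts.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for illustration;
    # real pipelines usually strip punctuation and stop words too).
    return text.lower().split()


def jaccard_similarity(a, b):
    # |A ∩ B| / |A ∪ B| over the sets of distinct tokens.
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    # Cosine of the angle between raw term-count vectors.
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for an empty text; return 0
    return dot / (norm_a * norm_b)
```

For "the cat sat" and "the cat ran", two of the four distinct tokens are shared, so the Jaccard score is 0.5, while the count vectors give a cosine of 2/3. Swapping the count vectors for embedding vectors leaves the cosine formula unchanged.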

SOURCES

https://towardsdatascience.com/elmo-contextual-language-embedding-335de2268604

http://robotics.stanford.edu/~scohen/research/emdg/emdg.html#flow_eqw_notopt

http://robotics.stanford.edu/~rubner/slides/sld014.htm

http://jxieeducation.com/2016-06-13/Document-Similarity-With-Word-Movers-Distance/

Problem: Compute distance between points with uncertain locations (given by samples, or differing observations, or clusters).
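
A minimal sketch of this problem in its simplest form: the Earth Mover's Distance between two equally sized, equally weighted sets of one-dimensional samples. In that special case the optimal transport plan matches the sorted samples in order, so no linear-program solver is needed. The function name `emd_1d` is hypothetical, and the general case behind Word Mover's Distance (high-dimensional word vectors with unequal weights) requires a full transport solver that this example does not attempt.

```python
def emd_1d(samples_a, samples_b):
    # Earth Mover's Distance for two 1-D sample sets of equal size,
    # where every sample carries the same weight 1/n.
    if not samples_a or len(samples_a) != len(samples_b):
        raise ValueError("expected two non-empty sample sets of equal size")
    # In one dimension the cheapest plan pairs the i-th smallest sample of
    # one set with the i-th smallest sample of the other.
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum(abs(x - y) for x, y in pairs) / len(samples_a)
```

Shifting every sample by the same amount moves the distance by exactly that amount: `emd_1d([0, 1, 3], [5, 6, 8])` evaluates to 5.0, and two sets holding the same values in a different order are at distance 0.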

[Figure: Google Sentence Encoder]


Languages

Language: Jupyter Notebook 100.0%