tertsch / okc_message_search

parse, explore, rank and search in your OKCupid message dump

This notebook:

  • parses a dump of OKC messages
  • compares the vocabulary size and most frequent terms in incoming and outgoing messages
  • offers basic search based on tf-idf (e.g. searching for "entropy" returns the messages in which that word carries the most weight)
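
The comparison and search steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names (`tokenize`, `vocabulary_stats`, `build_index`, `search`), the sample messages, and the particular tf-idf variant (term frequency times `log(N / df) + 1`) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_stats(messages, top_n=3):
    """Return (vocabulary size, most frequent terms) for a list of strings."""
    counts = Counter(tok for msg in messages for tok in tokenize(msg))
    return len(counts), counts.most_common(top_n)


def build_index(messages):
    """Compute one {term: tf-idf weight} dict per message."""
    docs = [Counter(tokenize(msg)) for msg in messages]
    n_docs = len(docs)
    # Document frequency: in how many messages each term occurs.
    doc_freq = Counter(term for doc in docs for term in doc)
    index = []
    for doc in docs:
        total = sum(doc.values())
        weights = {}
        for term, count in doc.items():
            tf = count / total
            idf = math.log(n_docs / doc_freq[term]) + 1.0  # +1 keeps it above zero
            weights[term] = tf * idf
        index.append(weights)
    return index


def search(query, messages, index, top_k=3):
    """Return up to top_k (message, score) pairs for a single-word query."""
    term = query.lower()
    scored = [(w[term], i) for i, w in enumerate(index) if term in w]
    # Highest weight first; ties fall back to original message order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(messages[i], score) for score, i in scored[:top_k]]


# Hypothetical sample data, standing in for a parsed message dump.
incoming = [
    "Hi there, how was your weekend?",
    "I love talking about entropy and information theory",
]
outgoing = [
    "Entropy is fun, but the weekend hike was more fun",
    "Entropy entropy entropy",
]

incoming_size, incoming_top = vocabulary_stats(incoming)
outgoing_size, outgoing_top = vocabulary_stats(outgoing)

all_messages = incoming + outgoing
index = build_index(all_messages)
results = search("entropy", all_messages, index)
```

A short message that consists mostly of the query word ranks first here because its term frequency is highest. A real notebook would more likely use a library vectorizer with length normalisation and stop-word handling.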

Getting started:

Coming up:

  • correlating match percentage and thread length
  • analysing features of entire threads with a person
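
Neither item is implemented yet, so the following is only a sketch of how the first one might look. It assumes each thread has already been reduced to a match percentage and a message count; the `pearson` and `correlate_threads` helpers and the `(match_percentage, messages)` tuple layout are hypothetical, not part of the notebook.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlate_threads(threads):
    """threads: iterable of (match_percentage, list_of_messages) tuples.

    Thread length is taken to be the number of messages in the thread.
    """
    threads = list(threads)
    percentages = [pct for pct, _ in threads]
    lengths = [len(msgs) for _, msgs in threads]
    return pearson(percentages, lengths)
```

A value near +1 would suggest that higher match percentages go with longer threads, and a value near 0 would suggest no linear relationship. With a small number of threads, the coefficient should be read with caution.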


Languages

Language: Jupyter Notebook 100.0%