juewang0607 / search_machine


The file “search_engine.cpp” builds the hash table and B-tree in C++. The code can be broken down into the following steps:

  1. Define the loadWordLibrary function to load the segmentation lexicon.
  2. Define the loadStopWords function to load the stop-word list of unnecessary Chinese characters.
  3. Define loadData to load the song lyrics.
  4. Define splitWords to segment the lyrics into words.
  5. Define deleteWords to remove the words that appear in the stop-word list.
  6. Define buildInvertIndex to build an inverted index.
  7. Define computeScore to compute the frequency of the words.
  8. Load the data and read the input from the command line.
  9. Return the recommendation.
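
The core of steps 4 to 7 can be sketched roughly as follows. This is a minimal illustration, not the code in search_engine.cpp: the function names come from the list above, but every signature, the `Tokens` and `InvertedIndex` type aliases, and the scoring rule (summing raw word counts per document) are assumptions. It also assumes the text is already separated by whitespace, so it does not perform lexicon-based Chinese segmentation, and it uses `std::unordered_map` as the hash table while leaving the B-tree out.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical types for illustration; the repository's real types may differ.
using Tokens = std::vector<std::string>;
// word -> (document id -> number of occurrences of the word in that document)
using InvertedIndex =
    std::unordered_map<std::string, std::unordered_map<std::size_t, int>>;

// Step 4 (simplified): split on whitespace.
// Assumption: real Chinese lyrics would need lexicon-based segmentation instead.
Tokens splitWords(const std::string& text) {
    std::istringstream in(text);
    Tokens words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}

// Step 5: drop every word that appears in the stop-word set.
Tokens deleteWords(const Tokens& words,
                   const std::unordered_set<std::string>& stopWords) {
    Tokens kept;
    for (const auto& word : words) {
        if (stopWords.count(word) == 0) {
            kept.push_back(word);
        }
    }
    return kept;
}

// Step 6: map each word to the documents containing it, with per-document counts.
InvertedIndex buildInvertIndex(const std::vector<Tokens>& docs) {
    InvertedIndex index;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        for (const auto& word : docs[id]) {
            ++index[word][id];
        }
    }
    return index;
}

// Step 7 (assumed scoring rule): a document's score is the total number of
// occurrences of the query words in that document.
std::unordered_map<std::size_t, int> computeScore(const InvertedIndex& index,
                                                  const Tokens& query) {
    std::unordered_map<std::size_t, int> scores;
    for (const auto& word : query) {
        auto hit = index.find(word);
        if (hit == index.end()) {
            continue;  // the word occurs in no document
        }
        for (const auto& posting : hit->second) {
            scores[posting.first] += posting.second;
        }
    }
    return scores;
}
```

To use the sketch, each lyric would be passed through `splitWords` and `deleteWords`, the resulting token lists collected into a vector for `buildInvertIndex`, and the command-line query tokenized the same way before calling `computeScore`. The highest-scoring document ids would then serve as the recommendation.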
