andrewkchan / search-engine-day-1

:mag_right: Dynamic indexer supporting incremental indexing of text documents. Implements a basic vector model and provides free and phrase text search queries.


search-engine-day-1

Day 1 of my personal "Build a Search Engine in 5 Days" challenge. Designing the search engine is going to take a lot longer than 5 days; the end goal is to have something simple that I can explain so that others can build it within 5 days.

naive-dynamic-ix

A hash-based inverted index. Uses Berkeley DB for the on-disk segment and a naive dynamic indexing scheme: a single auxiliary index in memory alongside a single main index on disk. Supports single-word and phrase queries.
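The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `NaiveDynamicIndex`, the `merge_threshold` parameter, and the tokenizer are all hypothetical, and a plain dict stands in for the Berkeley DB segment on disk. New postings go into the in-memory auxiliary index; once it grows past the threshold, it is merged into the main index. Queries consult both indexes, and phrase queries use the stored word positions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveDynamicIndex:
    """Positional inverted index: one in-memory auxiliary index plus one main index."""

    def __init__(self, main_store=None, merge_threshold=1000):
        # main_store is any dict-like object; here it stands in for the disk segment.
        self.main = main_store if main_store is not None else {}
        self.aux = defaultdict(dict)  # term -> {doc_id: [positions]}
        self.aux_postings = 0
        self.merge_threshold = merge_threshold

    def add_document(self, doc_id, text):
        for pos, term in enumerate(tokenize(text)):
            self.aux[term].setdefault(doc_id, []).append(pos)
            self.aux_postings += 1
        # Flush the auxiliary index once it has grown large enough.
        if self.aux_postings >= self.merge_threshold:
            self.merge()

    def merge(self):
        """Fold every auxiliary posting list into the main index, then reset."""
        for term, postings in self.aux.items():
            merged = dict(self.main.get(term, {}))
            for doc_id, positions in postings.items():
                merged[doc_id] = merged.get(doc_id, []) + positions
            self.main[term] = merged
        self.aux.clear()
        self.aux_postings = 0

    def postings(self, term):
        """Combine the main and auxiliary posting lists for one term."""
        combined = dict(self.main.get(term, {}))
        for doc_id, positions in self.aux.get(term, {}).items():
            combined[doc_id] = combined.get(doc_id, []) + positions
        return combined

    def search_word(self, word):
        return sorted(self.postings(word.lower()))

    def search_phrase(self, phrase):
        terms = tokenize(phrase)
        if not terms:
            return []
        per_term = [self.postings(t) for t in terms]
        # Only documents containing every term can contain the phrase.
        candidates = set(per_term[0])
        for p in per_term[1:]:
            candidates &= set(p)
        hits = []
        for doc_id in candidates:
            later = [set(p[doc_id]) for p in per_term[1:]]
            # The phrase matches if term i+1 appears exactly i+1 positions after a start.
            if any(all(start + i + 1 in s for i, s in enumerate(later))
                   for start in per_term[0][doc_id]):
                hits.append(doc_id)
        return sorted(hits)
```

Because `main_store` only needs dict-style `get` and item assignment, a persistent key-value store could in principle be swapped in, provided keys and posting lists are serialized to bytes first.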

Benchmarks are still WIP.

Installation

  1. sudo apt-get install build-essential libdb-dev
  2. pip install bsddb3 (use Python 3)

