TalalNasir3 / HashingSearch

Using the Hash data structure to perform searching on a given data set.

HashingSearch

Description

Using the hash data structure to perform searching on a given data set. The data set is read line by line and hashed into a hash table. The program's only purpose is to show how the hashing function works and how a keyword is hashed into the table so that it can be retrieved later. It demonstrates the implementation method but does not include further functionality such as ranking. It serves as a basis for hash searching.
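To illustrate what "hashing a keyword into the table" means, here is a minimal sketch of a hash function that maps a word to a bucket index. The function name hashKeyword, the multiplier 31, and the polynomial scheme are assumptions made for illustration; the repository's actual hash function may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical hash function (not taken from the repository):
// a polynomial rolling hash, reduced modulo the table size at each step
// so the running value always stays a valid bucket index.
std::size_t hashKeyword(const std::string& word, std::size_t tableSize) {
    assert(tableSize > 0);  // a table needs at least one bucket
    std::size_t h = 0;
    for (unsigned char c : word) {
        h = (h * 31 + c) % tableSize;
    }
    return h;
}
```

The same word always maps to the same index, which is what lets the program find it again later. Different words can map to the same index, so the table still needs a way to handle collisions.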

How to Run:

Requirements to run this program: C++11.

The program builds an inverted index on top of a hash map, using two kinds of lists: the bucket list and the posting list.

The bucket list represents the buckets used to store values. Each bucket is a list because collisions are possible, and they are resolved by separate chaining. The buckets are indexed by a hash function.

Each bucket entry holds a value, the word itself, and every word has its own postings. A posting is the document ID and the position at which the word occurs in that document. So even if the data set is very large, the positions of a word can be looked up in O(1) time on average. The posting list holds all the instances of the same word.

The hash table can therefore be seen as three-dimensional: the table of bucket and index forms two dimensions, and the posting list is the third.
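The structure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: the names Posting, Entry, InvertedIndex, add, and find are hypothetical, and the hash function is a simple polynomial hash chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// One occurrence of a word: the document it is in and where it appears.
struct Posting {
    int docId;
    std::size_t position;
};

// One node in a bucket's chain: the word plus its posting list.
struct Entry {
    std::string word;
    std::vector<Posting> postings;
};

// Hypothetical inverted index: a vector of buckets, each bucket a chain
// (separate chaining), each chain entry carrying its own posting list.
class InvertedIndex {
public:
    explicit InvertedIndex(std::size_t bucketCount) : buckets_(bucketCount) {
        assert(bucketCount > 0);
    }

    // Record that `word` occurs in document `docId` at `position`.
    void add(const std::string& word, int docId, std::size_t position) {
        std::list<Entry>& chain = buckets_[indexOf(word)];
        for (Entry& e : chain) {
            if (e.word == word) {  // word already known: extend its postings
                e.postings.push_back(Posting{docId, position});
                return;
            }
        }
        Entry entry;               // new word: append a node to the chain
        entry.word = word;
        entry.postings.push_back(Posting{docId, position});
        chain.push_back(entry);
    }

    // Return the posting list for `word`, or nullptr if it was never added.
    const std::vector<Posting>* find(const std::string& word) const {
        const std::list<Entry>& chain = buckets_[indexOf(word)];
        for (const Entry& e : chain) {
            if (e.word == word) return &e.postings;
        }
        return nullptr;
    }

private:
    // Assumed hash function (polynomial, base 31); any reasonable one works.
    std::size_t indexOf(const std::string& word) const {
        std::size_t h = 0;
        for (unsigned char c : word) h = (h * 31 + c) % buckets_.size();
        return h;
    }

    std::vector<std::list<Entry>> buckets_;
};
```

For example, after idx.add("hash", 1, 0) and idx.add("hash", 2, 5), calling idx.find("hash") returns a posting list with two entries. The lookup walks only the one chain selected by the hash, which is why it is constant time on average as long as the chains stay short.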


Languages

Language:C++ 100.0%