UtkarshTiwari123 / Information-Retrieval-System

The aim of the code is to present a solution for retrieving specific passages or paragraphs from documents along with the document names based on user queries.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. Contact

About The Project

Retrieving specific information from documents can be challenging, particularly for complex documents such as policy documents. Such documents often contain multiple sections with important information, and searching through them manually is time-consuming, which motivates an automated solution. This project retrieves specific passages or paragraphs from documents based on user queries: it extracts the text from each document and then searches for the sections relevant to the query.
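
The extract-then-search idea described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the in-memory `documents` mapping, the sample file names, and the case-insensitive substring matching are all assumptions made for the example.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs on one or more blank lines."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


def search_paragraphs(documents, query):
    """Return (document name, paragraph) pairs whose paragraph contains the query.

    `documents` maps a document name to its full text. Matching is a
    case-insensitive substring test, the simplest possible criterion.
    """
    needle = query.lower()
    hits = []
    for name, text in documents.items():
        for paragraph in split_paragraphs(text):
            if needle in paragraph.lower():
                hits.append((name, paragraph))
    return hits


# Hypothetical sample data; real input would be read from files on disk.
documents = {
    "leave_policy.txt": "Employees accrue leave monthly.\n\nUnused leave expires in March.",
    "travel_policy.txt": "Travel must be approved in advance.\n\nLeave requests are separate.",
}

for name, paragraph in search_paragraphs(documents, "leave"):
    print(f"{name}: {paragraph}")
```

Each hit carries the document name with the matching paragraph, which mirrors the stated goal of returning passages along with the names of the documents they came from.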

Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

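Before running the code, install the following third-party Python libraries:

  • nltk
pip install nltk
  • chardet
pip install chardet

The code also uses re and glob. Both ship with the Python standard library and need no installation. The regex and glob2 packages on PyPI are separate third-party projects; install them only if the code imports them under those names.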

Installation

  1. Clone the repo
    git clone https://github.com/UtkarshTiwari123/Information-Retrieval-System.git
  2. Install Python packages (see Prerequisites)
    pip install <Package-name>
  3. Run the main file using the command
    python Main.py

Usage

After installing all the packages successfully and running the Python file, you are prompted to choose whether to exit or to enter a text query, a phrase query, or a wildcard query.
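
As an illustration of what a wildcard query might do, the sketch below expands a pattern such as `retriev*` against the words of a text. It relies on Python's standard `fnmatch` module; whether the project handles wildcards this way is not stated, so the helper names and the matching strategy are assumptions.

```python
import fnmatch
import re


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_wildcard(pattern, vocabulary):
    """Return the sorted vocabulary words matching a wildcard pattern.

    `*` matches any run of characters and `?` matches a single character.
    """
    pattern = pattern.lower()
    return sorted(w for w in vocabulary if fnmatch.fnmatchcase(w, pattern))


def paragraph_matches(paragraph, pattern):
    """Return True if any word in the paragraph matches the wildcard pattern."""
    return bool(expand_wildcard(pattern, set(tokenize(paragraph))))


# Hypothetical sample text used only to build a small vocabulary.
text = "Retrieval systems retrieve relevant passages. Users refine their queries."
vocabulary = set(tokenize(text))

print(expand_wildcard("retriev*", vocabulary))
print(expand_wildcard("*s", vocabulary))
```

Expanding the pattern into concrete words first means the expanded terms can then be searched like ordinary text queries.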

Roadmap

  • Stopword Removal
  • Stemming
  • Return all the paragraphs which contain the phrase or text along with the file name
  • Wildcard Query Handling
  • Spelling Correction
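
The preprocessing items in this list (stopword removal, stemming, spelling correction) can be sketched with the standard library alone. This is only an illustration under stated assumptions: the tiny stopword set and the naive suffix stripper are stand-ins for a full stopword list and a proper stemmer such as those NLTK provides, and `difflib`-based correction is one possible approach, not necessarily the one the project uses.

```python
import difflib
import re

# Tiny illustrative stopword set; a stand-in for a full list such as NLTK's.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "and", "for"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword set."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of stem.

    This is NOT the Porter algorithm, only a crude approximation.
    """
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def correct_spelling(word, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word, or the word itself if none is close."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


tokens = remove_stopwords(tokenize("The policies are stored in the documents"))
print(tokens)
print([naive_stem(t) for t in tokens])
print(correct_spelling("polcy", ["policy", "document", "leave"]))
```

Applying the same preprocessing to both the documents and the query keeps the two comparable, which is the usual reason for normalizing text before matching.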

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contact

Utkarsh Tiwari - utkarsh.samar@gmail.com

Project Link: https://github.com/UtkarshTiwari123/Information-Retrieval-System

Languages

Jupyter Notebook 91.5%, Python 8.5%