4bdul4ziz / GraphInsight

A multimedia summarisation tool harnessing natural language processing.

GraphInsight: A Graph-based Approach to Summarizing Multimedia Content

GraphInsight is a Python-based tool that uses natural language processing (NLP) and computer vision techniques to extract and summarize text from sources such as images, PDFs, and websites. It relies on NLTK, docx, bs4, cv2, pytesseract, and tika to preprocess the input and generate a concise, relevant summary.
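The README does not describe how the graph-based summarization works internally. As a rough illustration only, the sketch below shows one common graph-based technique, a TextRank-style extractive summarizer: sentences become nodes, word overlap becomes edge weight, and a PageRank-like iteration scores each sentence. The function names (split_sentences, similarity, summarize) and all parameters are hypothetical and are not taken from the GraphInsight code. It uses only the standard library; a real pipeline would more likely tokenize with NLTK.

```python
import math
import re


def split_sentences(text):
    """Naive splitter on ., ! and ?; nltk.sent_tokenize would be more robust."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a, b):
    """Shared-word count between two token lists, normalised by log length."""
    if len(a) < 2 or len(b) < 2:
        return 0.0
    overlap = len(set(a) & set(b))
    return overlap / (math.log(len(a)) + math.log(len(b)))


def summarize(text, top_n=2, damping=0.85, iterations=30):
    """Return the top_n highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= top_n:
        return sentences
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]

    # Build the weighted sentence graph as an adjacency matrix.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # PageRank-style power iteration over the sentence graph.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]

    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]
```

Sentences that share no words with the rest of the text receive only the baseline score, so they are the first to be dropped from the summary.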

Installation

  1. Clone the repository: git clone https://github.com/4bdul4ziz/GraphInsight.git

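  2. Install the required packages using pip: nltk, docx, bs4, cv2, pytesseract, and tika. These are the import names; on PyPI, docx, bs4, and cv2 are published as python-docx, beautifulsoup4, and opencv-python respectively.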

Note: pytesseract requires the Tesseract OCR engine to be installed on the system. Please follow the installation instructions for your operating system.
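Before running the tool, it can help to confirm that the dependencies above are actually present. The following is a minimal sketch, not part of the repository: missing_modules and tesseract_available are hypothetical helper names, and the check assumes the Tesseract executable is called tesseract and is on the PATH.

```python
import importlib.util
import shutil

# Import names listed in this README.
REQUIRED_MODULES = ["nltk", "docx", "bs4", "cv2", "pytesseract", "tika"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def tesseract_available(binary="tesseract"):
    """Return True if the Tesseract executable is on the PATH (assumed name)."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules() or "none")
    print("Tesseract on PATH:", tesseract_available())
```

If Tesseract is installed somewhere outside the PATH, pytesseract can be pointed at it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.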

Usage

  1. Run the main.py file: python main.py

  2. Enter the path to the input file when prompted. Make sure the input files are in the same directory as main.py.
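Since the tool accepts images, PDFs, and websites, the input presumably has to be routed to a matching extractor (OCR, PDF parsing, or HTML parsing). How main.py does this is not documented here; the sketch below only illustrates one plausible way to classify an input string. The function name classify_input, the category labels, and the list of image suffixes are all assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumed set of image extensions; the real tool may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def classify_input(source):
    """Guess which kind of extractor an input string would need."""
    # Anything with an http(s) scheme is treated as a web page.
    if urlparse(source).scheme in ("http", "https"):
        return "website"
    suffix = Path(source).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "image"   # would go to OCR, e.g. cv2 + pytesseract
    if suffix == ".pdf":
        return "pdf"     # would go to a PDF parser, e.g. tika
    if suffix == ".docx":
        return "docx"    # would go to python-docx
    return "unknown"
```

For example, classify_input("report.pdf") gives "pdf", while an unrecognized extension falls through to "unknown" so the caller can report an error instead of guessing.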

Languages

Python 100.0%