This project leverages Azure AI services for Optical Character Recognition (OCR) to extract text from images and PDF documents. The extracted text is then presented through a user-friendly web interface built using Streamlit, allowing users to easily upload and process their files.
- Text Extraction: Uses Azure AI services to extract text from images and PDF documents.
- User-Friendly Interface: Streamlit provides an intuitive and interactive interface for users to upload and process their files.
- Customizable: Easily extend or modify the project to suit specific requirements or integrate with other services.
To get started with the project, follow these steps:
- Clone the Repository:
  git clone https://github.com/AjibolaMatthew1/text-extraction-azure.git
- Install Dependencies:
  pip install -r requirements.txt
- Set Up Azure AI Credentials:
  Obtain Azure AI Cognitive Services API credentials and update the .env file.
- Run the App:
  streamlit run app.py
  The app will be accessible at http://localhost:8501.
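
The credentials step can be illustrated with a minimal, standard-library-only sketch of reading settings from a .env file. The variable names `AZURE_ENDPOINT` and `AZURE_KEY`, and the helper names `parse_env`, `load_env_file`, and `get_azure_credentials`, are assumptions made for illustration; check the repository's code for the names it actually expects. The project may well use a library such as python-dotenv for this instead.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that are not KEY=VALUE
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file from disk and parse it."""
    return parse_env(Path(path).read_text(encoding="utf-8"))


def get_azure_credentials(env):
    """Return (endpoint, key), failing early if either setting is missing."""
    # AZURE_ENDPOINT and AZURE_KEY are hypothetical names, not taken from the repo.
    required = ("AZURE_ENDPOINT", "AZURE_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing settings in .env: " + ", ".join(missing))
    return env["AZURE_ENDPOINT"], env["AZURE_KEY"]
```

Failing early on a missing setting gives a clearer error than a rejected request to the Azure service later on. Keep the .env file out of version control, since it holds a secret key.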
To use the app:
- Launch the app by running streamlit run app.py.
- Open your web browser and go to http://localhost:8501.
- Upload an image or PDF document.
- Click the "Extract Text" button.
- View and copy the extracted text.
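
Between clicking "Extract Text" and viewing the result, the OCR response has to be flattened into plain text. The sketch below shows one way that could look, assuming a response shaped like the JSON returned by the Azure Computer Vision Read API v3.2 (`analyzeResult` → `readResults` → `lines` → `text`). The repository may use a different Azure service or SDK with a different result shape. The function name `extract_text` and the sample data are made up for illustration.

```python
def extract_text(result):
    """Join the recognized lines of every page into a single string."""
    # Assumed shape: {"analyzeResult": {"readResults": [{"lines": [{"text": ...}]}]}}
    pages = result.get("analyzeResult", {}).get("readResults", [])
    page_texts = []
    for page in pages:
        lines = [line["text"] for line in page.get("lines", [])]
        page_texts.append("\n".join(lines))
    # Separate pages with a blank line so page boundaries stay visible.
    return "\n\n".join(page_texts)


# Hand-written sample in the assumed shape, not real service output.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "Invoice 001"}, {"text": "Total: 42"}]},
            {"page": 2, "lines": [{"text": "Thank you"}]},
        ]
    },
}

text = extract_text(sample)
print(text)
```

In a Streamlit app, a string like `text` could then be shown with a widget such as `st.text_area`, which lets the user select and copy it.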