[Feature] Extract text from documents
mawandm opened this issue
Description
As a user, I'd like to extract text from the document.
Detail
Text extraction is useful as an intermediary step before document ingestion. It will allow for other processes such as:
- Data cleansing
- Data exclusion based on an exclusion list.
- Approval workflows
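To make the first two steps concrete, here is a minimal sketch of what cleansing and exclusion could look like once the text has been extracted. The function names `cleanse_text` and `apply_exclusions`, and the sentence-level exclusion rule, are assumptions for illustration only and are not part of the existing codebase.

```python
import re


def cleanse_text(text: str) -> str:
    """Collapse every run of whitespace into one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def apply_exclusions(text: str, exclusion_list: list[str]) -> str:
    """Drop each sentence that contains a term from the exclusion list.

    Matching is a case-insensitive substring check (an assumed rule).
    """
    terms = [term.lower() for term in exclusion_list]
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = [s for s in sentences if not any(t in s.lower() for t in terms)]
    return " ".join(kept)
```

For example, `apply_exclusions(cleanse_text(raw), ["salary"])` would remove any sentence mentioning "salary" before the text moves on to approval or ingestion.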
Acceptance Criteria
- An API /v1/extractions/text in the RAG microservice.
- Extraction path added to the API microservice during document processing.
- Persisting the extracted text to an external SQL datasource.
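As a rough sketch of how the criteria above might fit together, the snippet below shows handler logic that could sit behind /v1/extractions/text and persist its result. Everything here is an assumption for discussion: the `extracted_text` table and its columns, the `pending_approval` status, the function names, and the use of SQLite as a stand-in for the external SQL datasource. The HTTP layer is left out, and `extract_text` only decodes plain text where a real implementation would need format-specific parsers.

```python
import sqlite3
from datetime import datetime, timezone


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds extracted text."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS extracted_text (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               document_name TEXT NOT NULL,
               content TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending_approval',
               created_at TEXT NOT NULL
           )"""
    )
    conn.commit()


def extract_text(raw: bytes) -> str:
    """Placeholder extractor: treats the upload as UTF-8 plain text."""
    return raw.decode("utf-8", errors="replace")


def handle_text_extraction(conn: sqlite3.Connection, document_name: str, raw: bytes) -> dict:
    """Extract text from an uploaded document and persist it for later steps."""
    text = extract_text(raw)
    cursor = conn.execute(
        "INSERT INTO extracted_text (document_name, content, created_at) VALUES (?, ?, ?)",
        (document_name, text, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    # Return a small summary that an API response could be built from.
    return {"id": cursor.lastrowid, "document_name": document_name, "characters": len(text)}
```

Storing a status column alongside the text is one way to let an approval workflow pick up rows later without changing the extraction step.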