HURIDOCS

HURIDOCS's repositories

uwazi

Uwazi is a web-based, open-source solution for building and sharing document collections

Language: TypeScript | License: MIT | 225 28 3366

casebox

Casebox: Secure all your information and team communication in one place

Language: JavaScript | License: NOASSERTION | 50 160

pdf-document-layout-analysis

A Docker-powered service for PDF document layout analysis. It segments and classifies the parts of each PDF page, identifying elements such as text, titles, pictures and tables.

Language: Python | License: Apache-2.0 | 31 6 5

OpenEvSys

OpenEvSys is free open source software designed for use by organisations who need a software tool to manage information on human rights violations

Language: PHP | License: AGPL-3.0 | 30 250

pdf-reading-order

Language: Python | 6 70

preserve

Preserve is a tool for capturing and saving online digital content. Integrated with Uwazi, Preserve captures content from websites, social media and communication platforms, and archives it with key accompanying metadata to ensure evidentiary value by establishing and demonstrating authenticity and chain of custody.

Language: TypeScript | License: MIT | 6 4 81

topic-classification

Language: Python | License: MIT | 5 15 11

pdf-text-extraction

This project extracts text from PDF files using the outputs generated by the pdf-document-layout-analysis service. It relies on that service's segmentation and classification of page elements to automate the extraction.

Language: Python | License: Apache-2.0 | 4 60

pdf_metadata_extraction

pdf_information_extraction

Language: Python | 4 8 3

pdf-labeled-data

Language: TypeScript | License: Apache-2.0 | 3 110

semantic-search

Language: Python | 3 13 1

uwazi-design

300

uwazi-fixtures

Language: Shell | 3 120

pdf-table-of-contents-extractor

This project extracts Table of Contents (TOC) information from PDF files using the outputs generated by the pdf-document-layout-analysis service. It relies on that service's segmentation and classification of page elements to identify and structure the document's TOC automatically.

Language: Python | License: Apache-2.0 | 2 50