There are 10 repositories under the metadata-extraction topic.
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
Free database schema discovery and comprehension tool
Web version of ytmdl. Allows downloading songs with embedded metadata from various sources such as iTunes, Gaana, Last.fm, etc.
Tern is a software composition analysis tool and Python library that generates a Software Bill of Materials for container images and Dockerfiles. The SBOM that Tern generates will give you a layer-by-layer view of what's inside your container in a variety of formats including human-readable, JSON, HTML, SPDX and more.
Fast, cross-platform Node.js access to ExifTool
Utility to download and extract document metadata from an organization. This technique can be used to identify: domains, usernames, software/version numbers and naming conventions.
ExifLooter finds geolocation data in all image URLs and directories, and also integrates with OpenStreetMap.
Android application for analyzing installed apps
PhotoStructure for Servers
A collection of tools for forensic analysis
Adult Media Manager is the ultimate media manager for your adult movies and videos. Organize your content for Kodi, Plex, and other media centers.
Digital forensic analysis tool that provides a user-friendly interface for investigating disk images.
Metadata HTML scraper and parser for Node.js (supports Promises only)
This package implements a complete SpyWare.
🏷️ A JavaScript library for scraping/parsing metadata from a web page.
LazyOwn RedTeam/APT Framework is the first RedTeam Framework with an AI-powered C&C, featuring rootkits to conceal campaigns, undetectable malleable implants compatible with Windows/Linux/Mac OSX, and self-configuring backdoors. With its Web interface and powerful Console Client, it is the best combination for your RedTeam/APT campaigns.
Goblyn is a Python tool focused on enumerating and capturing metadata from website files.
Sample code with integration between Data Catalog and RDBMS data sources.
A native Go SDK for the Extensible Metadata Platform (XMP)
Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
A cross-platform library and command-line tool that extracts the currently playing track in Traktor and optionally outputs to a file with configurable formatting.
The this.url class is designed to fetch and parse URL data, returning an object with structured information that can then be used for machine learning algorithms in a database or other storage.
Natural language processing of Gene Expression Omnibus data
A browser extension to display EXIF data of images directly on web pages.
A Python library to read metadata from images created by Stable Diffusion.
Pure Python (no additional dependencies) helper module to read metadata from the header of OpenEXR files.
A pure-Dart audio metadata reader
A modern image browser and search tool that uses AI to generate a "semantic map" of your collection.
Big Data tool for metadata extraction (Exif), enrichment (using deep learning), and analysis
🚀 Next-gen responsive images in React