There are 6 repositories under the url-extractor topic.
Extract and decompose URLs (including emails, which are conceptually a part of URLs) with robust patterns.
A fast tool to fetch URLs from HTML attributes by crawling.
A minimal yet powerful crawler for extracting all the internal, external, and fuzzable links from a website.
Recursively extract URLs from a web page for reconnaissance.
Tika-based link (URL) extractor for httpreserve
An Apache Drill UDF for working with Twitter tweet text via the twitter-text Java library (https://github.com/twitter/twitter-text/tree/master/java)
🍊🔗 Squeeze some juice from URLs: A URL crawler/extraction library.
The eBay Listing Matcher is a Python script designed to compare and match eBay listings with parts from an Inventree instance. This script utilizes the eBay Trading API and the Inventree API to gather and process data.
A Python script to extract URLs from text or a paragraph.
Extract article title, description, images, keywords and authors from any URL
Extract URLs, endpoints, paths, and word lists from source files
Extract all URLs from anchor and image tags within an HTML/XHTML page and its children.
Extract URLs from a file or web address
Website URL Scanner is a simple command-line tool that allows you to scan a website and extract all URLs. It can be useful for various purposes, such as link analysis or checking for broken links.
Extract http/https URLs from any kind of text content.
URL Extractor is a simple Python script designed to extract the domain name from each URL in a list stored in a text file. This application provides a convenient way to extract and process URLs efficiently.
A small tool for extracting all URLs from a blob of binary data (e.g., PDFs).
File attachment and URL extractor for EML & MSG files using Python
LinkLifter is a Python script that searches for URLs in a given text file or recursively in a directory and its subdirectories. The found URLs, along with the file they are located in, are saved to a CSV file.
Laboratoria Bootcamp - Final product of sprint 4. npm library for extracting links from a Markdown document.
URL Title Extractor is a Python program that extracts the titles of web pages from a file containing URLs. It uses the requests and BeautifulSoup libraries to extract the title and decode any HTML entities.
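Several of the tools listed above share one core step: pulling http/https URLs out of free text and, in some cases, reducing each URL to its domain. The sketch below illustrates that step using only the Python standard library. It is not taken from any of the listed repositories; the names `URL_PATTERN`, `extract_urls`, and `extract_domain` are hypothetical, and the regular expression is deliberately simple rather than a complete URL grammar.

```python
import re
from urllib.parse import urlparse

# Assumption: a URL is "http://" or "https://" followed by any run of
# characters that are not whitespace, quotes, or angle brackets.
# This is an illustrative pattern, not a full RFC 3986 matcher.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text):
    """Return the http/https URLs found in text, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def extract_domain(url):
    """Return the lowercased host name of a URL, or an empty string if absent."""
    return urlparse(url).hostname or ""


if __name__ == "__main__":
    sample = "See https://example.com/a, and http://test.org/path?x=1."
    for found in extract_urls(sample):
        print(found, extract_domain(found))
```

Stripping trailing punctuation is a heuristic: it will also shorten a URL that legitimately ends in one of those characters, so a real tool may want a stricter pattern or a dedicated parsing library.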