CodeDotJS / urlist

A Python script that extracts URLs from a text file containing a list of websites.

URLs from URL(s)


Purpose

  • This script extracts all the URLs found on the websites listed in a text file, and saves them in JSON format.
  • Handles missing URL schemes (for example, a bare `example.com`) and resolves relative URLs to absolute ones to ensure accurate results.
  • Uses multithreading to concurrently process multiple websites, so it's fast!
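
The three points above can be sketched roughly as follows. This is a minimal illustration and not the code in extractor.py: it uses only the standard library (`html.parser`, `urllib.parse`, `concurrent.futures`) rather than the packages installed in the Usage section, and the names `ensure_scheme`, `extract_links`, `crawl`, and the caller-supplied `fetch` function are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin


def ensure_scheme(site, default="https"):
    """Prepend a scheme when the input is a bare host such as 'example.com'."""
    site = site.strip()
    if "://" in site:
        return site
    return f"{default}://{site}"


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) URLs found in html, resolved against base_url."""
    collector = _LinkCollector()
    collector.feed(html)
    absolute = (urljoin(base_url, href) for href in collector.hrefs)
    # Drop non-web links (mailto:, javascript:, ...) and de-duplicate in order.
    web_only = (u for u in absolute if u.startswith(("http://", "https://")))
    return list(dict.fromkeys(web_only))


def crawl(sites, fetch, max_workers=8):
    """Fetch all sites concurrently and map each site URL to its links.

    `fetch` is an assumed, caller-supplied function: url -> HTML string.
    """
    urls = [ensure_scheme(s) for s in sites if s.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = pool.map(fetch, urls)
        return {url: extract_links(url, page) for url, page in zip(urls, pages)}
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; the resulting dictionary can be written out with `json.dump`.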

Usage

  • Install the required modules
$ pip install aiohttp beautifulsoup4 fake_useragent
  • Download the script
$ curl -OL https://raw.githubusercontent.com/CodeDotJS/urlist/master/extractor.py
  • Run
$ python extractor.py

Note: If you need to save all the links present in the JSON to a text file, you can download the companion script generateTxt.py:

$ curl -OL https://raw.githubusercontent.com/CodeDotJS/urlist/master/generateTxt.py
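
For reference, a JSON-to-text conversion of this kind might look like the sketch below. It is not the contents of generateTxt.py: the layout of the JSON file is not documented here, so the code makes no assumption about its shape and simply walks the decoded data, collecting every string value that looks like an http(s) URL. The function names and the file names in the usage note are hypothetical.

```python
import json


def collect_urls(node):
    """Recursively yield every http(s) string value in a decoded JSON structure.

    Only values are inspected; dictionary keys are not treated as links.
    """
    if isinstance(node, str):
        if node.startswith(("http://", "https://")):
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from collect_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_urls(item)


def json_to_text(json_path, text_path):
    """Write each unique URL from json_path to text_path, one per line."""
    with open(json_path, encoding="utf-8") as fh:
        data = json.load(fh)
    urls = list(dict.fromkeys(collect_urls(data)))  # de-duplicate, keep order
    with open(text_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(urls) + ("\n" if urls else ""))
    return len(urls)
```

A call such as `json_to_text("links.json", "links.txt")` would then return the number of URLs written.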

Reason

I needed a tool to generate thousands of active URLs and dump them as JSON, so I built one.

License

MIT
