michaelisprihanto / Broken-Link-Crawler

:snake: + :robot: Python bot that crawls your website looking for dead resources like links and images

Broken Link Crawler

Let's say I have a website and I want to find any dead links and images on this website.

$ python deadseeker.py 'https://healeycodes.github.io/'
> 404 - https://docs.python.org/3/library/missing.html
> 404 - https://github.com/microsoft/solitare2

It's that simple. The website is crawled, and a request is sent to every URL found in an href or src attribute. Any URL that responds with an error is reported. This bot doesn't observe robots.txt, but you should.
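As a rough illustration of the approach described above (not the actual deadseeker.py source), the sketch below collects href and src values with the standard library's HTMLParser, resolves them against the page URL, and requests each one. The names LinkCollector, status_of and dead_links are hypothetical, it checks a single page rather than a whole site, and it adds an optional robots.txt check that the original bot does not perform.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects an absolute URL for every href and src attribute on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative paths such as "logo.png" against the page URL.
                self.found.append(urljoin(self.base_url, value))


def status_of(url):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(request, timeout=10) as response:
            return response.status
    except HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions by urllib.
        return err.code
    except (URLError, ValueError, OSError):
        return None


def dead_links(page_url, html, robots_lines=()):
    """Yield (status, url) for each link on the page that fails.

    robots_lines is an assumption of this sketch: the lines of the target
    site's robots.txt, used to skip URLs the site asks bots not to fetch.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)
    collector = LinkCollector(page_url)
    collector.feed(html)
    for url in collector.found:
        if not url.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, and similar schemes
        if not rules.can_fetch("*", url):
            continue
        code = status_of(url)
        if code is None or code >= 400:
            yield code, url
```

To mimic the report format shown earlier, each result could be printed with something like print(f"> {code} - {url}") while looping over dead_links(page_url, html).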

This was written for my tutorial on building a dead link checker, so its scope has been kept quite small.

It is not a clever bot. But it is a good bot.

About

License:MIT License


Languages

Language:Python 100.0%