[Feature] Broken Link Checker - GitHub Action
JuanPabloDiaz opened this issue
Description
I believe it would be a great idea to set up a GitHub Action that runs periodically and checks for broken links.
I just implemented it in my project, and I would like to share what I learned and receive feedback to make it better.
Screenshots
here is an example from my own project:
Checklist
- I have checked the existing issues
- I have read the Contributing Guidelines
- I am willing to work on this issue (optional)
Hello JuanPabloDiaz!
Thank you for raising this issue! Your contribution is valuable to us!
Please make sure to follow our Contributing Guidelines.
Please only work on an issue if you're assigned; otherwise, the PR will be automatically closed.
Our review team will carefully assess the issue and reach out to you soon!
We appreciate your patience!
That seems possible @JuanPabloDiaz. @rupali-codes, @Anmol-Baranwal, @aftabrehan, thoughts?
It seems great, but how does that actually work, @JuanPabloDiaz?
It is a GitHub Action that checks for broken links in Markdown, HTML, and text files using Lychee, a fast link checker written in Rust.
Here is a full example of a GitHub workflow file:
It will check all repository links once per day and create an issue in case of errors.
name: Links
on:
  repository_dispatch:
  workflow_dispatch:
  schedule:
    - cron: "00 18 * * *"
jobs:
  linkChecker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Link Checker
        id: lychee
        uses: lycheeverse/lychee-action@v1
      - name: Create Issue From File
        if: env.lychee_exit_code != 0
        uses: peter-evans/create-issue-from-file@v4
        with:
          title: Link Checker Report
          content-filepath: ./lychee/out.md
          labels: report, automated issue
The URLs are in JSON files (in this codebase), so maybe try this in a sample private repo to see if it works.
Project implementations:
I'd like to share my experience with Lychee. I came across it yesterday and integrated it into my project. While it was helpful in identifying some broken links, it also generated some false positives (links that appear broken but function correctly). I'm still exploring the tool's functionality to potentially fine-tune its accuracy.
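For anyone hitting the same false positives, here is a rough sketch of how the checker step could be tuned through the action's `args` input. The exclude pattern (`example.com`) and the timeout value are placeholders I made up for illustration, not values from my project, and I have not verified how the action handles quoting in every version:

```yaml
# Sketch only: the exclude pattern and timeout below are illustrative assumptions.
- name: Link Checker
  id: lychee
  uses: lycheeverse/lychee-action@v1
  with:
    # --exclude takes a regular expression; --timeout is given in seconds.
    args: >-
      --verbose --no-progress
      --timeout 40
      --exclude '^https://example\.com/'
      './**/*.md'
```

A longer timeout may help with slow sites that only fail with a timeout, while an exclude pattern skips hosts that block automated requests.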
On a separate note, I encountered a permissions issue when trying to open new issues.
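I did not dig into the exact cause, but one common reason is that the workflow's `GITHUB_TOKEN` is read-only by default in some repositories. A minimal sketch, assuming that is the problem here, is to grant the job write access to issues:

```yaml
# Sketch: job-level permissions for the workflow's GITHUB_TOKEN.
# Assumes the failure was caused by a read-only token.
jobs:
  linkChecker:
    runs-on: ubuntu-latest
    permissions:
      contents: read   # lets actions/checkout read the repository
      issues: write    # lets create-issue-from-file open the report issue
```

The same thing can also be changed repository-wide under the Actions workflow permissions setting, but scoping it to the job keeps the token narrower.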
I hope this helps.
@JuanPabloDiaz do you wanna work on it?
Sure.
How can I test it? I understand that GitHub Actions workflows need to be merged before they can be seen in action. Is there a way to test them in development, @rupali-codes?
Yeah, we can test them in the forked repo
You are absolutely right lol
But does it work from any branch, or does it have to be in the main branch of the forked repo, @rupali-codes?
I figured it out. It has to be in the main branch of the forked repo.
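That restriction applies to the `schedule` and `workflow_dispatch` triggers, which are read from the default branch. A possible workaround, sketched here with a hypothetical branch name, is to add a temporary `push` trigger so the workflow also runs when commits land on a test branch of the fork:

```yaml
# Temporary trigger for testing only; remove it before opening the PR.
on:
  workflow_dispatch:
  push:
    branches:
      - link-checker-test   # hypothetical branch name
```

Actions may also need to be enabled on the fork first, since forks often start with workflows turned off.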
Errors in CONTRIBUTING.md
[404] https://github.com/CBID2/LinksHub-my-version-/blob/main/CODE_OF_CONDUCT.md | Failed: Network error: Not Found
[404] https://github.com/rupali-codes/LinksHub/blob/maintainers_info/README.md#maintainers- | Failed: Network error: Not Found
71? That number seems low.
I include JSON files in the link checker and the number went up to:
https://github.com/JuanPabloDiaz/LinksHub/actions/runs/9306836967
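For reference, including the JSON files can be sketched roughly like this. It assumes the action's default arguments do not already cover JSON and that the resources live under `database/`, as the report paths suggest:

```yaml
# Sketch: pass explicit file globs so the JSON resource files are scanned too.
- name: Link Checker
  id: lychee
  uses: lycheeverse/lychee-action@v1
  with:
    args: --verbose --no-progress './**/*.md' './database/**/*.json'
```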
There are some false positives. After double-checking the list of 29 errors, here are some resources that should be removed:
- Errors in database/resources/e-book.json
  "name": "Learning REACT"
  [404] https://drive.google.com/file/d/1AZwshgVyazeIJ95ng6Pg1zUbVQoYX93t/view?usp=share_link | Failed: Network error: Not Found
- Errors in database/youtube/web-development.json
  "name": "Easy Tutorials"
  [404] https://www.youtube.com/@EasyTutorialsVideo/ | Failed: Network error: Not Found
- Errors in database/ai_tools/design.json
  "name": "Adobe Firefly"
  [TIMEOUT] https://www.adobe.com/in/products/firefly.html | Timeout
  "name": "Designs.ai",
  [ERR] https://designs.io/ | Failed: Network error: error sending request for url (https://designs.io/)
- Errors in database/youtube/web3-metaverse.json
  "name": "LearnWeb3 DAO"
  [404] https://www.youtube.com/@LearnWeb3DAO | Failed: Network error: Not Found
- Errors in database/frontend/colors.json
  "name": "ColorWave AI"
  [ERR] https://www.colorwave.dev/ | Failed: Network error: error sending request for url (https://www.colorwave.dev/)
  "name": "UI Color Picker"
  [ERR] https://uicolorpicker.com/ | Failed: Network error: error sending request for url (https://uicolorpicker.com/)
- Errors in database/resources/blogs.json
  "name": "JavaScript Tricks-Raj's Blog"
  [ERR] https://resourcegallery.live/blog/JavaScript%20Top%2010%20Tips%20&%20Tricks/ | Failed: Network error: error sending request for url (https://resourcegallery.live/blog/JavaScript%20Top%2010%20Tips%20&%20Tricks/)
- Errors in database/cybersecurity/web_application_security.json
  "name": "Certified Web Application Penetration Tester (CWAPT) Certification"
  [404] https://mile2.com/web-application-penetration-tester-cwapt.html | Failed: Network error: Not Found
It would be best to do it from the forked repository.