Michael-F-Bryan / mdbook-linkcheck

A backend for `mdbook` which will check your links for you.

Home Page: https://michael-f-bryan.github.io/mdbook-linkcheck/

Tests assume network connectivity

tv42 opened this issue

commented
running 5 tests
test emit_valid_suggestions_on_absolute_links ... ok
test skip_web_links ... ok
test check_all_links_in_a_valid_book ... FAILED
test detect_when_a_linked_file_isnt_in_summary_md ... ok
test correctly_find_broken_links ... FAILED

failures:

---- check_all_links_in_a_valid_book stdout ----
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning chapter_1.md
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning nested/README.md
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning nested/sibling.md
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "../chapter_1.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "../chapter_1.md" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "/chapter_1.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "/chapter_1.md" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "../chapter_1.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "../chapter_1.md" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z WARN  linkcheck::validation::filesystem] Not checking that the "Subheading" section exists in "/build/source/tests/all-green/src/chapter_1.md" because fragment resolution isn't implemented
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "/chapter_1.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "/chapter_1.md" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z WARN  linkcheck::validation::filesystem] Not checking that the "Subheading" section exists in "/build/source/tests/all-green/src/chapter_1.md" because fragment resolution isn't implemented
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "sibling.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "sibling.md" resolved to "/build/source/tests/all-green/src/nested/sibling.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./sibling.md" in the context of "/build/source/tests/all-green/src/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "./sibling.md" resolved to "/build/source/tests/all-green/src/nested/sibling.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./chapter_1.md" in the context of "/build/source/tests/all-green/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "./chapter_1.md" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./chapter_1.html" in the context of "/build/source/tests/all-green/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "./chapter_1.html" resolved to "/build/source/tests/all-green/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "nested/README.md" in the context of "/build/source/tests/all-green/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "nested/README.md" resolved to "/build/source/tests/all-green/src/nested/README.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "nested/" in the context of "/build/source/tests/all-green/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "nested/" resolved to "/build/source/tests/all-green/src/nested/README.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::web] Checking "https://www.google.com/" on the web
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::web] Checking "https://crates.io/crates/mdbook-linkcheck" on the web
[2021-06-01T21:27:47Z DEBUG linkcheck::validation] Ignoring "https://nonexistent.forbidden.com/"
thread 'check_all_links_in_a_valid_book' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
 [
     "../chapter_1.md",
     "../chapter_1.md#Subheading",
     "./chapter_1.html",
     "./chapter_1.md",
     "./sibling.md",
     "/chapter_1.md",
     "/chapter_1.md#Subheading",
<    "https://crates.io/crates/mdbook-linkcheck",
<    "https://www.google.com/",
     "nested/",
     "nested/README.md",
     "sibling.md",
 ]

', tests/smoke_tests.rs:201:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- correctly_find_broken_links stdout ----
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning chapter_1.md
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Found a (possibly) broken link to [incomplete link] at 268..285
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning deeply/nested/index.md
[2021-06-01T21:27:47Z DEBUG mdbook_linkcheck::links] Scanning second/directory.md
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./chapter_1.md" in the context of "/build/source/tests/broken-links/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "./chapter_1.md" resolved to "/build/source/tests/broken-links/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::web] Checking "http://this-doesnt-exist.com.au.nz.us/" on the web
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./foo/bar/baz.html" in the context of "/build/source/tests/broken-links/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "../../../../../../../../../../../../etc/shadow" in the context of "/build/source/tests/broken-links/src"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "../../chapter_1.md" in the context of "/build/source/tests/broken-links/src/deeply/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "../../chapter_1.md" resolved to "/build/source/tests/broken-links/src/chapter_1.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./chapter_1.md" in the context of "/build/source/tests/broken-links/src/deeply/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "./second/directory.md" in the context of "/build/source/tests/broken-links/src/deeply/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "../../second/directory.md" in the context of "/build/source/tests/broken-links/src/deeply/nested"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "../../second/directory.md" resolved to "/build/source/tests/broken-links/src/second/directory.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::web] Checking "https://github.com/Michael-F-Bryan/mdbook-linkcheck/issues/3#issuecomment-417400242" on the web
[2021-06-01T21:27:47Z WARN  linkcheck::validation::web] Fragment checking isn't implemented, not checking if there is a "issuecomment-417400242" header in "https://github.com/Michael-F-Bryan/mdbook-linkcheck/issues/3#issuecomment-417400242"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Checking "sibling.md" in the context of "/build/source/tests/broken-links/src/second"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] "sibling.md" resolved to "/build/source/tests/broken-links/src/second/sibling.md"
[2021-06-01T21:27:47Z DEBUG linkcheck::validation::filesystem] Custom validation reported "/build/source/tests/broken-links/src/second/sibling.md" as invalid because An OS-level error occurred
thread 'correctly_find_broken_links' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
 [
     "../../../../../../../../../../../../etc/shadow",
     "./chapter_1.md",
     "./foo/bar/baz.html",
     "./second/directory.md",
     "http://this-doesnt-exist.com.au.nz.us/",
<    "https://github.com/Michael-F-Bryan/mdbook-linkcheck/issues/3#issuecomment-417400242",
     "sibling.md",
 ]

', tests/smoke_tests.rs:201:5


failures:
    check_all_links_in_a_valid_book
    correctly_find_broken_links

test result: FAILED. 3 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.03s

All the links with http: or https: destinations got lost. I'm guessing mdbook-linkcheck is trying to fetch them.

I'd suggest splitting external link checking into two phases: (1) do syntactic checks only and dump out the URLs, and (2) verify that all external URLs in that list are valid. There should also be a way to disable the tests for the latter.
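
A minimal sketch of what such a two-phase split could look like, using only the standard library. All names here (`Link`, `classify`, `collect`, `broken_web_links`) are hypothetical and are not part of mdbook-linkcheck's API; the network check is passed in as a closure so that it can be swapped out or disabled.

```rust
/// A link after the purely syntactic first phase (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Link {
    /// Something that should be resolved on the local filesystem.
    Local(String),
    /// An `http:` or `https:` URL that would need the network to verify.
    Web(String),
}

/// Phase 1: classify a raw link destination without touching the network.
pub fn classify(dest: &str) -> Link {
    let lower = dest.to_ascii_lowercase();
    if lower.starts_with("http://") || lower.starts_with("https://") {
        Link::Web(dest.to_string())
    } else {
        Link::Local(dest.to_string())
    }
}

/// Phase 1 over a whole book: returns (local links, web URLs).
pub fn collect(dests: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut local = Vec::new();
    let mut web = Vec::new();
    for dest in dests {
        match classify(dest) {
            Link::Local(l) => local.push(l),
            Link::Web(w) => web.push(w),
        }
    }
    (local, web)
}

/// Phase 2: verify the collected URLs with a caller-supplied checker.
/// Passing `None` skips the phase entirely and reports nothing as broken.
pub fn broken_web_links<F>(urls: &[String], checker: Option<F>) -> Vec<String>
where
    F: Fn(&str) -> bool,
{
    match checker {
        Some(is_reachable) => urls
            .iter()
            .filter(|url| !is_reachable(url.as_str()))
            .cloned()
            .collect(),
        None => Vec::new(),
    }
}
```

With this shape, a test for phase 1 needs no network at all, and a sandboxed build could call `broken_web_links(&web, None::<fn(&str) -> bool>)` to skip phase 2.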

There is no way to tell whether you can't retrieve a URL because you have no internet access or the website is down, so it makes sense that the links are reported as broken.

If you are running in an environment with no internet you can already set follow-web-links = false under the [mdbook-linkcheck] section of your book.toml and it will only check local URLs.

If you are running this in CI, you should be able to override the configuration option using an environment variable. I don't recall the exact form mdbook uses for this, but it'll be something like setting MDBOOK__LINKCHECK__FOLLOW_WEB_LINKS to false.

commented

I thought follow-web-links=false was the default.

I see it is not:

const CONFIG: &str = r#"follow-web-links = true

What led me to believe that:

follow-web-links = false

commented

For posterity, the correct form seems to be MDBOOK_LINKCHECK__follow_web_links=false.
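
To make the naming rule concrete, here is a small sketch of how an `MDBOOK_`-prefixed variable name maps to a configuration key. It assumes the rule as I understand it from mdbook's documentation (strip the prefix, lowercase, double underscore becomes a nested-key separator, single underscore becomes a dash); it has not been checked against the mdbook version used in this thread, and `env_to_config_key` is a hypothetical helper, not mdbook's own function.

```rust
/// Translate an environment variable name into a dotted `book.toml` key.
/// Returns `None` when the name lacks the `MDBOOK_` prefix.
/// Assumed rule: lowercase, `__` separates nested keys, `_` becomes `-`.
pub fn env_to_config_key(name: &str) -> Option<String> {
    let rest = name.strip_prefix("MDBOOK_")?;
    // Replace the double underscores first so they are not turned into "--".
    Some(rest.to_lowercase().replace("__", ".").replace('_', "-"))
}
```

Under this assumed rule, `MDBOOK_LINKCHECK__follow_web_links` maps to `linkcheck.follow-web-links`, while a setting that lives in an `[output.linkcheck]` table would be spelled `MDBOOK_OUTPUT__LINKCHECK__FOLLOW_WEB_LINKS`. Which table the backend actually reads should be confirmed against its README.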

However, that is not sufficient to make tests not attempt network connections:

$ strace -f -e connect env MDBOOK_LINKCHECK__follow_web_links=false cargo test
[...]
[pid 382988] connect(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("13.35.89.25")}, 16) = 0
[...]
[pid 382920] connect(5, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("13.35.89.25")}, 16) = -1 EINPROGRESS (Operation now in progress)

So, while that may help me with running mdbook itself, that doesn't help me run mdbook-linkcheck tests.

(I have no idea why port 0 makes an appearance there; that's probably some sort of a bug.)

Also, I still posit that tests should not use the internet by default; I often work with a slow-to-none internet connection, and such timebombs are really frustrating to troubleshoot.
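
One possible way to make network use opt-out in a test suite is sketched below. This is not something mdbook-linkcheck provides: the variable name `LINKCHECK_OFFLINE` and both helper functions are assumptions made up for illustration.

```rust
/// Decide whether network-dependent tests should run, given the value of a
/// hypothetical opt-out variable (here called `LINKCHECK_OFFLINE`).
/// Unset, empty, "0" and "false" all mean "the network may be used".
pub fn network_tests_enabled(offline_var: Option<&str>) -> bool {
    match offline_var.map(str::trim) {
        None | Some("") | Some("0") | Some("false") => true,
        Some(_) => false,
    }
}

/// Convenience wrapper that reads the variable from the real environment.
pub fn network_tests_enabled_from_env() -> bool {
    network_tests_enabled(std::env::var("LINKCHECK_OFFLINE").ok().as_deref())
}
```

A network-dependent test could then begin with `if !network_tests_enabled_from_env() { return; }`. An alternative that needs no helper is to mark such tests with `#[ignore]` and run them explicitly with `cargo test -- --ignored`.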

Sorry, that configuration option is for when you are using mdbook to build your book and test that all of its links are correct. For the integration tests it is hard coded to always be on.

> Also, I still posit that tests should not use the internet by default

The test suite for this crate needs the internet because we need to test that we can check things on the internet. I don't expect normal users to be running the mdbook-linkcheck test suite because that's only needed by the developers of mdbook-linkcheck, so running the crate's tests without an internet connection isn't a supported use case.

Why do you need to run the crate's test suite?

commented

I'm writing a Nix/NixOS file for it, https://nixos.org/.
Essentially packaging it for a Linux distribution.

Nix builds are run in a hermetically sealed sandbox, for reproducibility.

If you are just packaging the program, then shouldn't you only need to do a `cargo build`?

The majority of this program's functionality is connecting to the internet and seeing if certain links are valid. If you want to skip those tests (which are the majority), it would be like releasing a new Rust compiler without ever using it to compile a program.

However, if all you want is a sanity check to make sure the code isn't completely broken then you can run the crate's unit tests with cargo test --lib.

commented

Need just cargo build? Sure. Want to make sure it also works? Yes.

Also, for the record, what I value most is detecting broken internal links. Internet-wide linkrot is a disease I don't really wish to fight, and the only thing I could imagine myself really doing in most cases is replacing links with archive.org versions.

I'll dig around the packaging infrastructure to see how to change what it runs.