Y2Z / monolith

⬛️ CLI tool for saving complete web pages as a single HTML file

Home Page: https://crates.io/crates/monolith

Feature request: Rate-limiting CLI option

olivier-fs opened this issue

Hi,
Just tried it with en.wikipedia.org and it works great for me: CSS, images, etc. are all there, and the saved page is a perfect clone.

I just noticed there is no request rate-limiting option.
monolith emitted the 92 HTTP requests required to complete its task.
Of course en.wikipedia.org accepted them, since 92 is not "too many".
But it would be nice to have a rate-limiting option, e.g. -r [--rate-limit], so as not to hammer the target site.

Again, thanks for this nice tool.

Greetings Olivier,

Thank you very much for the suggestion!

Currently monolith makes synchronous requests one after another, and I've never experienced errors associated with making too many requests myself. After all, they are requests to different resources, and unlike monolith, our browsers make them all simultaneously, usually without hitting any limits.

But I completely understand that some servers may have rate limiting in place, and such a feature could come in handy in those cases.

Could you please give examples of such servers, if you have found any? I'd also like to know whether any popular CLI tools you may be aware of provide rate limiting, just so I can copy those flags and make this feature more familiar to users.
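For context, here is a minimal sketch of what such a flag could do in the current synchronous, one-after-another request model. This is not monolith's actual code: the flag name, the helper function, and the use of reqwest's blocking client (which needs the crate's "blocking" feature) are all assumptions for illustration.

```rust
// Hypothetical usage the sketch assumes (not an existing monolith flag):
//
//     monolith --rate-limit 500 https://en.wikipedia.org/ > page.html
//
use std::thread::sleep;
use std::time::Duration;

fn fetch_assets(urls: &[&str], delay: Duration) -> Result<(), reqwest::Error> {
    let client = reqwest::blocking::Client::new();
    for url in urls {
        let response = client.get(*url).send()?;
        println!("{} -> {}", url, response.status());
        // Pause between sequential requests so the target site never sees
        // more than one request per `delay` interval from us.
        sleep(delay);
    }
    Ok(())
}

fn main() -> Result<(), reqwest::Error> {
    // e.g. 500 ms between requests, i.e. at most ~2 requests per second
    fetch_assets(
        &["https://example.com/style.css", "https://example.com/logo.png"],
        Duration::from_millis(500),
    )
}
```

Because the requests are already serialized, a fixed sleep between them is enough to cap the request rate; no concurrency control is needed.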

Hi
You're absolutely right.

  • I didn't notice the requests were sync and serialized.
  • This tool is great for single-page downloads, and is not a spider that will dig into hrefs,
    so the sync-requests design choice is fine.
    Guess I'm way too used to thinking in parallel, as I'm writing lots of axum handlers/requests... But I should have checked all the CLI parameters. Sorry for the noise.

No worries at all! I'll add this flag once monolith is advanced enough to make async calls; it's more of a scraper than a browser right now.
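For completeness, one way requests could be paced once they become async is with a tokio interval. This is a hedged sketch, assuming tokio and reqwest's async client; the rate value and URLs are made up, and none of this is monolith's code.

```rust
use std::time::Duration;
use tokio::time::interval;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    let urls = ["https://example.com/style.css", "https://example.com/logo.png"];

    // Tick every 500 ms, i.e. a hypothetical rate limit of ~2 requests/second.
    // The first tick completes immediately, so the first request is not delayed.
    let mut ticker = interval(Duration::from_millis(500));

    for url in urls {
        ticker.tick().await; // wait for the next slot before firing the request
        let response = client.get(url).send().await?;
        println!("{} -> {}", url, response.status());
    }
    Ok(())
}
```

An interval keeps the pacing correct even if individual responses arrive faster or slower than the tick period, which a plain sleep after each await would not.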