Y2Z / monolith

⬛️ CLI tool for saving complete web pages as a single HTML file

Home Page: https://crates.io/crates/monolith

How to extract files in html

wankio opened this issue

With MHT I can still extract the files inside the archive, but when I tried to extract files from the HTML created by this app, I couldn't.

Hi @wankio,

Could you elaborate a little more on what exactly you mean by "extract"? Are you talking about going into the source of the saved HTML page and copying strings from there? The assets are embedded as base64 data URLs, and some editors highlight them so they are clickable. Also, if you open the file in a browser, you should be able to use "save image as..." and access the embedded assets via the inspection tool.
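To illustrate the data URL point, here is a minimal sketch of pulling embedded assets back out of a saved page. It is not part of monolith; the names `DATA_URL_RE`, `extract_data_urls` and `save_assets` are made up for this example, and the regular expression is a simplification that assumes data URLs appear inside quoted attributes or CSS `url(...)` values.

```python
import base64
import binascii
import mimetypes
import re
from pathlib import Path
from urllib.parse import unquote, unquote_to_bytes

# Simplified pattern (an assumption, not a full RFC 2397 parser):
# data:[<mime>][;param=value]*[;base64],<payload>
DATA_URL_RE = re.compile(
    r"data:(?P<mime>[\w.+-]+/[\w.+-]+)?"
    r"(?:;[\w-]+=[\w.-]+)*"
    r"(?P<b64>;base64)?,"
    r"(?P<payload>[^\"'\s)>]*)"
)


def extract_data_urls(html):
    """Return a list of (mime_type, raw_bytes) for each data URL found."""
    assets = []
    for match in DATA_URL_RE.finditer(html):
        payload = match.group("payload")
        if not payload:
            continue  # skip empty payloads such as a bare "data:,"
        mime = match.group("mime") or "text/plain"
        try:
            if match.group("b64"):
                data = base64.b64decode(unquote(payload))
            else:
                data = unquote_to_bytes(payload)
        except (binascii.Error, ValueError):
            continue  # not a decodable payload, ignore it
        assets.append((mime, data))
    return assets


def save_assets(html_path, out_dir):
    """Write every embedded asset to out_dir as asset_<n><ext>."""
    html = Path(html_path).read_text(encoding="utf-8", errors="replace")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (mime, data) in enumerate(extract_data_urls(html)):
        ext = mimetypes.guess_extension(mime) or ".bin"
        target = out / f"asset_{index}{ext}"
        target.write_bytes(data)
        written.append(target)
    return written
```

Calling `save_assets("page.html", "extracted")` would then give a folder of files to inspect, roughly like unpacking an MHT archive. Original file names are not recoverable this way, because a data URL only carries the MIME type and the content.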

Before, I was using WebScrapBook and an MHT extension that let me download a page and save it. I could simply right-click the saved file and use 7-Zip or WinRAR to extract its contents. That way I could check which files were missing.

With monolith, being a CLI tool, archiving pages feels much faster, but it lacks:

  1. the ability to extract embedded files
  2. automatically setting the download directory from the URL's domain, e.g.
    https://github.com/Y2Z/monolith/issues/311 > saved in a github.com dir, or as github.com/Y2Z/monolith/issues/311/pagetitle.extension (filename taken from the page title)
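The second request could be approximated outside the tool with a small wrapper script. The sketch below only computes the target path from a URL and a page title; `output_path` is a hypothetical helper, not a monolith feature, and the set of characters replaced in the title is an assumption aimed at Windows-safe file names.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def output_path(url, title, extension=".html"):
    """Build <domain>/<url path>/<page title><extension> from a URL."""
    parts = urlsplit(url)
    # Drop empty segments and anything that could climb out of the base dir.
    segments = [s for s in parts.path.split("/") if s and s not in (".", "..")]
    # Replace characters that are not allowed in file names on common systems.
    safe_title = re.sub(r'[\\/:*?"<>|]+', "_", title).strip() or "index"
    return PurePosixPath(parts.hostname or "unknown-host", *segments, safe_title + extension)


print(output_path("https://github.com/Y2Z/monolith/issues/311", "pagetitle"))
```

A wrapper could create the parent directory of that path and then pass the path to the archiver as its output file.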