Scraping?

Question

xbc5 opened this issue 2 years ago

Is this a good fit for scraping?

I am looking to scrape some fully rendered web pages, so JavaScript support is needed. I have struggled to find something stable and well supported, and I thought that remote control of a popular web browser would be the best bet.

I need text content (headers, titles, paragraphs) and links to images, and possibly to preserve anchors in a structured way (e.g. so I can render footnotes).

Is this possible with this lib in a generic way (i.e. without foreknowledge of the page structure)?
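
To make "generic" concrete, here is a rough sketch of the kind of extraction meant, assuming the fully rendered HTML has already been pulled out of the browser as a string (how to do that is the library-specific part and is not shown). It uses only Python's standard `html.parser`; the names `GenericExtractor` and `extract` are made up for illustration and are not part of any library.

```python
from html.parser import HTMLParser

# Tags whose text is collected as a "block"; chosen for illustration only.
TEXT_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6", "p"}


def _squash(chunks):
    """Join text chunks and collapse runs of whitespace."""
    return " ".join("".join(chunks).split())


class GenericExtractor(HTMLParser):
    """Hypothetical helper: collects text blocks, image links, and anchors."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []    # (tag, text) in document order
        self.images = []    # src attribute of each <img>
        self.anchors = []   # (href, link text), e.g. footnote references
        self._block = None  # (tag, chunks) for the open text tag, if any
        self._link = None   # (href, chunks) for the open <a>, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in TEXT_TAGS:
            self._block = (tag, [])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self._link = (attrs["href"], [])

    def handle_data(self, data):
        # Text inside a link within a paragraph is recorded in both places.
        if self._block is not None:
            self._block[1].append(data)
        if self._link is not None:
            self._link[1].append(data)

    def handle_endtag(self, tag):
        if self._block is not None and tag == self._block[0]:
            text = _squash(self._block[1])
            if text:
                self.blocks.append((tag, text))
            self._block = None
        elif tag == "a" and self._link is not None:
            self.anchors.append((self._link[0], _squash(self._link[1])))
            self._link = None


def extract(html):
    """Parse an HTML string and return the collected pieces as a dict."""
    parser = GenericExtractor()
    parser.feed(html)
    parser.close()
    return {
        "blocks": parser.blocks,
        "images": parser.images,
        "anchors": parser.anchors,
    }
```

This only handles flat markup (no nested text tags, no relative URL resolution), so it is a starting point rather than a robust scraper. Anchors with fragment hrefs such as `#fn1` are kept as-is so that footnotes could be matched up later.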

Thanks.

xbc5 · Answer 1 · Mon Feb 07 2022 17:16:01 GMT+0800 (China Standard Time)

Sorry. RTM. FS.