amoilanen / js-crawler

Web crawler for Node.js

amoilanen/js-crawler Issues

Crawler is not a function
Updated 3 years ago · 3 comments
stop crawling
Updated 3 years ago · 5 comments
Usage
Closed 4 years ago · 2 comments
Link crawling gets stuck in Wordpress sites
Updated 4 years ago
Crawler completes then cancels the output of "crawledUrls"?
Updated 4 years ago
How to deal with basic auth?
Updated 4 years ago · 1 comment
Crawler stopped without reason and any error
Updated 4 years ago · 1 comment
freeze and defrost for saving and resuming a big crawl? [enhancement]
Updated 5 years ago · 5 comments
Crawler stopped without reason and any error
Updated 5 years ago
How to assign encoding of response content?
Updated 6 years ago · 6 comments
js-crawler seems to crawl the same url multiple times
Closed 7 years ago · 3 comments
I think shouldCrawl code example is incorrect
Closed 6 years ago · 2 comments
Is it possible to just crawl images using this package?
Updated 6 years ago · 1 comment
`knownUrls` processing logic is incorrectly using underscore
Updated 7 years ago · 3 comments
Pair js-crawler with PhantomJS
Updated 7 years ago · 1 comment
How to deal with ETIMEDOUT error and pending forever?
Updated 7 years ago · 5 comments
Basics
Closed 7 years ago · 3 comments
How to deal with shortened URLs
Updated 7 years ago · 3 comments
Follow redirects
Updated 7 years ago · 1 comment
robots.txt
Updated 7 years ago · 1 comment
path in variable
Closed 8 years ago · 2 comments
shouldCrawl doesnt call onAllFinished
Closed 8 years ago · 3 comments
forgetCrawled method
Closed 8 years ago · 2 comments
Evaluate selectors
Updated 8 years ago · 3 comments
Page that linked to current page
Closed 8 years ago · 4 comments
Can we promisfy js-crawler
Updated 8 years ago · 2 comments
getting unknown encoding error on some pages
Closed 8 years ago · 2 comments
Is it can use by only JavaScript?
Closed 8 years ago · 6 comments
bug empty response
Closed 8 years ago · 5 comments
Publishing latest fixes?
Closed 8 years ago · 3 comments
Add <base> tag support for relative urls
Closed 8 years ago · 2 comments
Getting every type of url from the page source
Updated 8 years ago · 1 comment
What the content exactly is when the requested resource are binary, e.g., images or pdf file?
Closed 8 years ago · 9 comments
When run asynchronous by Executor, depth lost its scope
Closed 8 years ago · 3 comments
Feature to crawl up to a limited number of pages
Updated 8 years ago · 1 comment
Ajax crawling
Closed 9 years ago · 2 comments
Would be awesome to apply a selector to limit scope of crawled links
Updated 9 years ago · 1 comment
The "depth" for crawling a website completely
Closed 9 years ago · 4 comments
How can we crawl local websites?
Closed 9 years ago · 2 comments
Follows Redirects Outside shouldCrawl Function
Closed 9 years ago · 6 comments
Network Challenges with Depth > 2
Closed 9 years ago · 6 comments
How do you slow this down so it's not hammering the server you are crawling?
Closed 9 years ago · 3 comments
Please modify the readme at the configuration examples
Closed 9 years ago · 1 comment
Ability to pass custom user agents to crawler requests.
Closed 9 years ago · 4 comments
Know When All Crawling is Complete
Closed 9 years ago · 4 comments
buggy url resolve
Closed 9 years ago · 3 comments