Why class instead of instances?
wasabigeek opened this issue · comments
Genuinely curious: it seems a bit unusual, as it's not straightforward to change the start_urls at runtime. If I understood correctly, class-level instance variables are not thread-safe, so changing them at runtime might wreak havoc in something like Sidekiq?
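To make the concern concrete, here is a minimal sketch of why class-level state is risky when several jobs share one process. ClassLevelSpider and its methods are hypothetical stand-ins written for this illustration, not Kimurai's actual implementation, and the two assignments simulate two jobs interleaving rather than using real threads.

```ruby
# Hypothetical stand-in for a spider configured at the class level.
# This is NOT Kimurai's code; it only illustrates the shared-state problem.
class ClassLevelSpider
  # Class-level instance variable: a single value shared by every caller.
  @start_urls = ["https://example.com/default"]

  class << self
    attr_accessor :start_urls
  end

  # Returns the URLs this crawl would visit (no HTTP requests are made).
  def self.crawl!
    start_urls.dup
  end
end

# Job A configures the spider, but job B overwrites the same class-level
# value before job A gets to crawl.
ClassLevelSpider.start_urls = ["https://example.com/a"]
ClassLevelSpider.start_urls = ["https://example.com/b"]
urls_seen_by_a = ClassLevelSpider.crawl!
```

In this sketch job A ends up crawling job B's URLs, which is the kind of surprise that concurrent workers could run into.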
Do you happen to have a link to the relevant code at hand?
I am a casual user myself, so I don't know much about the internals of Kimurai. It did what I needed, but there were a few strange warnings, so perhaps the API could be improved and the warnings reduced.
I agree, it makes much more sense to write:
class GithubSpider < Kimurai::Base
  def initialize
    @start_urls = ["https://github.com/search?q=Ruby%20Web%20Scraping"]
  end

  def parse(response, url:, data: {})
    ...
  end
end

GithubSpider.new.crawl!
It could enable things like GithubSpider.new(start_urls: ["https://github.com/search?q=Ruby%20Web%20Scraping"]).crawl!
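A rough sketch of what such an instance-based design might look like, assuming a hypothetical InstanceSpider base class invented for this example (it is not Kimurai::Base, and the parse method here is a placeholder that makes no HTTP requests). Each instance carries its own start_urls, so concurrent jobs do not share mutable state.

```ruby
# Hypothetical base class for illustration only; not Kimurai's actual API.
class InstanceSpider
  DEFAULT_START_URLS = [].freeze

  attr_reader :start_urls

  # Each instance gets its own frozen copy of the URLs, falling back to the
  # subclass's DEFAULT_START_URLS constant when none are passed in.
  def initialize(start_urls: self.class::DEFAULT_START_URLS)
    @start_urls = start_urls.dup.freeze
  end

  # Calls parse once per start URL and collects the results.
  def crawl!
    start_urls.map { |url| parse(url) }
  end

  def parse(url)
    raise NotImplementedError, "#{self.class} must implement #parse"
  end
end

class GithubSearchSpider < InstanceSpider
  DEFAULT_START_URLS = ["https://github.com/search?q=Ruby%20Web%20Scraping"].freeze

  # Placeholder parse step: returns a string instead of fetching the page.
  def parse(url)
    "parsed #{url}"
  end
end

# Two concurrent jobs, each with its own spider instance and its own URLs.
threads = ["https://example.com/a", "https://example.com/b"].map do |url|
  Thread.new { GithubSearchSpider.new(start_urls: [url]).crawl! }
end
results = threads.map(&:value)
```

Because the URLs live on the instance, a Sidekiq-style worker could build a fresh spider per job without touching anything shared, while GithubSearchSpider.new with no arguments would still use the class default.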