roach-php / core

The complete web scraping toolkit for PHP.

Home Page: https://roach-php.dev

Using Proxies, possible?

Benoit1980 opened this issue

Hi,

I cannot find this in your documentation, but I would like to know if there is a way to change the IP address before running:

Roach::collectSpider();

Scraping without changing IP is not possible in most cases.

Thank you,

Create a middleware to handle outgoing requests.

Link to official documentation:
https://roach-php.dev/docs/spider-middleware#request-middleware

In my free time, I want to create a ProxyMiddleware that you can pass an array of proxies to.
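The "array of proxies" idea can be sketched independently of Roach. The snippet below is only an illustration: `ProxyRotator` is a hypothetical helper (it is not part of Roach or of the pull request mentioned in this thread), and the proxy hosts are placeholders. It hands out proxy URLs in round-robin order, which is one possible selection strategy a middleware could consult for each outgoing request.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of Roach: cycles through a fixed
 * list of proxy URLs in round-robin order.
 */
final class ProxyRotator
{
    /** @var list<string> */
    private array $proxies;

    /** Index of the proxy that next() will return. */
    private int $cursor = 0;

    /** @param array<string> $proxies Proxy URLs such as "http://IP:PORT". */
    public function __construct(array $proxies)
    {
        if ($proxies === []) {
            throw new InvalidArgumentException('At least one proxy is required.');
        }

        // Re-index so the cursor arithmetic works for any input keys.
        $this->proxies = array_values($proxies);
    }

    /** Returns the current proxy URL and advances the cursor, wrapping at the end. */
    public function next(): string
    {
        $proxy = $this->proxies[$this->cursor];
        $this->cursor = ($this->cursor + 1) % count($this->proxies);

        return $proxy;
    }
}

$rotator = new ProxyRotator([
    'http://proxy-a.example:3128', // placeholder hosts
    'http://proxy-b.example:3128',
]);

$first = $rotator->next();
$second = $rotator->next();
$third = $rotator->next(); // wraps around to the first proxy again
```

Round-robin is chosen here only because it is simple and deterministic; random selection or skipping proxies that recently failed would be equally valid designs.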

Thank you. I checked the link but there is no mention of proxies using a middleware.

Also looking for that. Proxies are really important in web scraping ;)

Yes, for sure. I am waiting for them to add some kind of example. I am not sure why there is no example of this, as proxy support is as important as the spider itself.

Upvoting this. I am also looking for an easy way to use proxies with Roach.

I need help! For testing and development I need a list of more than 5 proxies. Contact me by email (in my profile) or on Telegram (same nickname).

I created a pull request that adds ProxyMiddleware, currently for a single proxy.

Usage:

	public array $downloaderMiddleware = [
		[ProxyMiddleware::class, ['proxyList' => [
			'http://xxx.xxx.xxx.156:3128',  // http://IP:PORT
		]]]
	];

Next, I want to add support for:

  • an array of proxies,
  • loading proxies from a file,
  • loading proxies from a database
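Of the planned sources above, loading from a file is the easiest to sketch without any framework code. The following is a minimal illustration, not the implementation from the pull request: `loadProxiesFromFile` is a hypothetical function, and the file format (one proxy URL per line, `#` for comments) is an assumption made for this example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of Roach: reads one proxy URL per line,
 * skipping blank lines and lines that start with "#".
 *
 * @return list<string>
 */
function loadProxiesFromFile(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);

    if ($lines === false) {
        throw new RuntimeException("Cannot read proxy file: {$path}");
    }

    $proxies = [];

    foreach ($lines as $line) {
        $line = trim($line);

        // Ignore empty lines and comment lines.
        if ($line === '' || str_starts_with($line, '#')) {
            continue;
        }

        $proxies[] = $line;
    }

    return $proxies;
}

// Demo with a temporary file so the snippet runs on its own.
$path = tempnam(sys_get_temp_dir(), 'proxies');
file_put_contents(
    $path,
    "# test proxies\nhttp://proxy-a.example:3128\n\n  http://proxy-b.example:3128  \n"
);

$proxies = loadProxiesFromFile($path);
unlink($path);
```

The returned array has the same shape as the `proxyList` value in the usage snippet above: a flat list of `http://IP:PORT` strings.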

I think we are all pretty much looking for this solution.

This is part of the 3.0.0 release. See the documentation here: https://roach-php.dev/docs/downloader-middleware#proxy-middleware