Cookies in clientOptions
davidgeisler1998 opened this issue · comments
Hey there :)
I'm using the Spatie crawler to collect cookies from a website. To receive all cookies I need to "accept" the cookie consent, so I would like to set a cookie before the crawler visits the website.
I found out that the crawler swaps the "RequestOptions::COOKIES => true" option for an empty CookieJar.
After that I tried to instantiate a CookieJar and pass it as the value instead of true, but that didn't work.
Another try was to set a Header option like this:
RequestOptions::HEADERS => [
    'User-Agent' => Crawler::DEFAULT_USER_AGENT,
    'Set-Cookie' => '{testName=testValue; path=/; secure; HttpOnly}',
]
But this still doesn't work. Do you have an idea how I can set cookies before the crawler visits a website?
Hi,
RequestOptions is part of the GuzzleHttp library, and so is the CookieJar. You can check here https://github.com/guzzle/guzzle/blob/429cb6702659329819fb40c9487eac3132bdd80b/src/Client.php#L260 where the cookies config is converted from true to a CookieJar. You can pass an instantiated CookieJar instead of true.
There is a static method on CookieJar (CookieJar::fromArray) to help instantiate it. There is also a static method on the SetCookie class (SetCookie::fromString) to help instantiate a cookie, so you can create a new CookieJar from an array of SetCookie objects. I personally prefer the latter, as it allows me to configure each cookie.
You can read more about cookies in Guzzle here: https://docs.guzzlephp.org/en/stable/quickstart.html?highlight=cookie#cookies
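The two ways of building a jar described above can be sketched roughly as follows. This is only a minimal sketch: it assumes guzzlehttp/guzzle is installed through Composer, and example.com and the cookie name and value are placeholders. Note that Guzzle drops a cookie that has no Domain when it is added to a jar, which may be one reason a hand-built jar appears not to work.

```php
<?php
// Assumption: Guzzle was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Cookie\CookieJar;
use GuzzleHttp\Cookie\SetCookie;
use GuzzleHttp\RequestOptions;

// Option 1: name => value pairs, all bound to a single domain.
$simpleJar = CookieJar::fromArray(['testName' => 'testValue'], 'example.com');

// Option 2: one SetCookie object per cookie, so each can be configured.
// The Domain attribute is required; a Secure cookie is only sent over HTTPS.
$cookie = SetCookie::fromString(
    'testName=testValue; Domain=example.com; Path=/; Secure; HttpOnly'
);
$jar = new CookieJar(false, [$cookie]);

// Client options that use the pre-filled jar instead of `true`.
$clientOptions = [
    RequestOptions::COOKIES => $jar,
];
```

The $clientOptions array would then be handed to the crawler in the place where "RequestOptions::COOKIES => true" was used before.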
Finally, Set-Cookie is a server response header (see the MDN docs). For a request you should use the Cookie header instead (see the MDN docs).
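For the header approach, a Cookie request header carries plain name=value pairs separated by "; ", without attributes such as path or HttpOnly. A small sketch follows; buildCookieHeader is a hypothetical helper written for this example, not part of Guzzle or the crawler, and it assumes the values are already safe to use in a cookie.

```php
<?php
// Hypothetical helper: turns ['name' => 'value', ...] into a Cookie header value.
// Assumption: names and values contain no ';', ',' or whitespace.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }

    return implode('; ', $pairs);
}

// 'headers' is the value of the RequestOptions::HEADERS constant in Guzzle.
$clientOptions = [
    'headers' => [
        'Cookie' => buildCookieHeader(['testName' => 'testValue']),
    ],
];
```

Unlike a CookieJar, a fixed Cookie header is sent on every request as-is and is not updated from the responses the crawler receives.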
Regards
Hi again :)
Thanks for your answer.
Now I have another task: I want to click something with the Browsershot::click() method.
Is it possible for the click to be executed only once, at the start of the crawl, and not be repeated for the pages crawled afterwards?
E.g. I want to accept the cookie consent once so that it stays accepted, but my problem is that none of the other pages know about the cookie consent.