sjdirect / abot

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

User Agent config value appears to be getting split on spaces and sending requests with multiple user-agent headers

entryspace opened this issue

I was debugging an implementation using Abot2 because I was seeing a lot of 500 errors. To get more visibility into the problem, I started outputting the requests sent by my crawler. That showed that the user agent string I had set in the configuration, which contains two spaces, was being split into three separate User-Agent headers in every request.

As a temporary workaround, I've removed the spaces, but that won't work when trying to match the user agent of an actual browser.

I just tested this and am not able to replicate it. This is what I see being sent for my test:

GET http://yahoo.com/ HTTP/1.1
User-Agent: Test 1 Test 2 Test 3 Test 4
Accept: */*
Host: yahoo.com

What platform are you running on? Is it possible you have some line break characters in the string somehow? Can you isolate this down to a unit test?
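
For anyone trying to isolate this, here is a minimal sketch of one way to check what actually goes over the wire. It does not use Abot2 at all: it assumes the crawler ultimately sends requests through `HttpClient`, and uses a plain `HttpClient` plus a throwaway loopback `TcpListener` as a stand-in. The class and method names (`UserAgentWireCheck`, `CaptureRequestLinesAsync`) are made up for illustration. It is worth comparing this raw view against whatever logging produced the split headers, since enumerating header values in code and reading the bytes on the wire are not necessarily the same thing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper, not part of Abot2.
public static class UserAgentWireCheck
{
    // Sends one GET to a loopback listener and returns the raw
    // request line and header lines exactly as they arrived.
    public static async Task<string[]> CaptureRequestLinesAsync(string userAgent)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0 = any free port
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var serverTask = Task.Run(async () =>
        {
            using var client = await listener.AcceptTcpClientAsync();
            using var stream = client.GetStream();
            using var reader = new StreamReader(stream, Encoding.ASCII, false, 4096, leaveOpen: true);

            // Read until the blank line that ends the header section.
            var lines = new List<string>();
            string? line;
            while (!string.IsNullOrEmpty(line = await reader.ReadLineAsync()))
                lines.Add(line);

            // Reply with an empty 200 so the client call can complete.
            byte[] reply = Encoding.ASCII.GetBytes(
                "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n");
            await stream.WriteAsync(reply, 0, reply.Length);
            return lines.ToArray();
        });

        try
        {
            using var http = new HttpClient();
            using var request = new HttpRequestMessage(HttpMethod.Get, $"http://127.0.0.1:{port}/");
            // Set the header without validation so the string is passed as given.
            request.Headers.TryAddWithoutValidation("User-Agent", userAgent);
            using var httpResponse = await http.SendAsync(request);
            return await serverTask;
        }
        finally
        {
            listener.Stop();
        }
    }

    public static async Task Main()
    {
        string[] lines = await CaptureRequestLinesAsync("Test 1 Test 2 Test 3");
        foreach (string l in lines)
            Console.WriteLine(l);

        int userAgentLines = lines.Count(
            l => l.StartsWith("User-Agent:", StringComparison.OrdinalIgnoreCase));
        Console.WriteLine($"User-Agent header lines on the wire: {userAgentLines}");
    }
}
```

A count above one here would back up the original report. A count of exactly one would suggest the splitting happens in how the request is being logged or inspected, not in what is sent.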

Unable to repro. Closing unless a reproducible example is provided.