ruippeixotog / scala-scraper

A Scala library for scraping content from HTML pages

Browser.post() doesn't accept form params with the same name

karol-brejna-i opened this issue

post() accepts a Map[String, String], so each form parameter name can appear only once.

A form data set is a sequence of control-name/current-value pairs constructed from successful controls
(https://www.w3.org/TR/html401/interact/forms.html#h-17.13.3.2)

Some web services use this feature to accept array values.

For example, I've encountered something like this:

<form action="gestion.php" method=post>
<input type="text" size="20" maxlength="20" name="pNomPartie" value="">
<input type="text" size="10" maxlength="20" name="pInvite[]" value="">
<input type="text" size="10" maxlength="20" name="pInvite[]" value="">
</form>

With the current implementation, sending this form is impossible, because a Map cannot hold two entries for the same pInvite[] key.
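
To illustrate the problem, here is a minimal sketch in plain Scala (no scala-scraper calls). The field values are made up for illustration; only the field names come from the form above.

```scala
// Hypothetical values for the form fields shown above.
val fields: Seq[(String, String)] = Seq(
  "pNomPartie" -> "game1",
  "pInvite[]"  -> "alice",
  "pInvite[]"  -> "bob"
)

// A Map keeps one value per key, so the first pInvite[] entry is lost:
// converting the pairs to a Map retains only the last value for a repeated key.
val asMap: Map[String, String] = fields.toMap

println(fields.size) // 3
println(asMap.size)  // 2
println(asMap("pInvite[]")) // bob
```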

Yes, you're right; the Browser API provides very simple ways to get a page, but they do not cover every possibility of an HTTP request.

The parameter type should be a Seq[(String, String)], and I think we can change it right away for the next release, since we're replacing it with a wider type. The formData extractor (I don't know whether you're using it) is a different matter: users may rely on it returning a Map, and I don't want to break source compatibility. For that case, one solution is to add an arrayFormData extractor that returns a Seq[(String, String)], and then make it the only formData extractor in 3.0.
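
As a rough sketch of why a sequence of pairs is enough, the snippet below encodes a Seq[(String, String)] as an application/x-www-form-urlencoded body using only the JDK's URLEncoder. The encodeForm helper and the sample values are hypothetical and are not part of scala-scraper; how the library builds the request internally may differ.

```scala
import java.net.URLEncoder

// Hypothetical helper: encodes name/value pairs in order,
// keeping repeated names as separate entries.
def encodeForm(params: Seq[(String, String)]): String =
  params
    .map { case (name, value) =>
      URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8")
    }
    .mkString("&")

// Made-up values for the repeated pInvite[] field.
val invites: Seq[(String, String)] = Seq(
  "pInvite[]" -> "alice",
  "pInvite[]" -> "bob"
)

val body: String = encodeForm(invites)
println(body) // pInvite%5B%5D=alice&pInvite%5B%5D=bob
```

Code that already holds its parameters in a Map can produce the same pair representation with `.toSeq`.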

Thanks for the info.

Is there any roadmap presented somewhere?

Hi @karol-brejna-i. I don't have a roadmap for scala-scraper; I implement some of the requested features and improve the library as I find the time. I'll try to implement this particular feature next week. Pull requests are also very welcome :)