gruns / furl

🌐 URL parsing and manipulation made easy.

Adding dates as strings to url introduces extra characters

UGuntupalli opened this issue:

Hello,
Thanks for the great work that went into putting this package together. I just came across it and have started using it. I ran into an issue and was hoping you could clarify whether I am missing something or whether it is a bug.
I am using Python 3.7.5 and furl 2.1.0 and am getting this odd behavior. Could you help?

   from furl import furl

   my_param_arg = dict()
   my_param_arg["start_date"] = '20200301T08:00-0000'

   my_url = furl("http://www.test.com/")
   my_url.add(my_param_arg).url

Expected Behavior:
'http://www.test.com/?start_date=20200301T08%3A00-0000&start_date=20200301T08

Actual Behavior:
'http://www.test.com/?start_date=20200301T08%3A00-0000&start_date=20200301T08%3A00-0000'

Can you please explain why the extra characters (%3A00-0000) are being added to the URL? I would appreciate it if you could offer a workaround as well.

Hey Uday!

I can't reproduce this issue with Python v3.7 and furl v2.1.0:

>>> from furl import furl
>>> d = dict()
>>> d["start_date"] = '20200301T08:00-0000'
>>> d
{'start_date': '20200301T08:00-0000'}

>>> f = furl("http://www.test.com/")
>>> f.add(d).url
'http://www.test.com/?start_date=20200301T08%3A00-0000'

%3A appears in the final URL because : needs to be URL encoded. %3A is the URL encoding of :, and the 00-0000 in %3A00-0000 is the tail portion of '20200301T08:00-0000' after the :.
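As a small illustration of the encoding described above, here is a sketch that uses only the standard library's urllib.parse, not furl itself. The variable names are made up for the example, and passing safe="" is a choice made here so that no reserved characters are exempted from encoding.

```python
from urllib.parse import quote, unquote

value = "20200301T08:00-0000"

# safe="" means no reserved characters are exempt, so ':' becomes %3A.
# Digits, letters, and '-' are unreserved and pass through unchanged.
encoded = quote(value, safe="")
print(encoded)  # 20200301T08%3A00-0000

# Decoding restores the original value, so nothing is lost or added.
decoded = unquote(encoded)
print(decoded)  # 20200301T08:00-0000
```

The 00-0000 after %3A is simply the rest of the original string, which round-trips intact.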

@gruns,
Just for my understanding, why should it be encoded? Additionally, is there a way to bypass that behavior? I ask because if I try the exact same thing with the requests package, it builds the URL as expected, without encoding the : in the date.

In short, : can be optionally encoded in the URL Query. From RFC 3986's grammar:

query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

furl encodes it. requests doesn't. For full details, see https://en.wikipedia.org/wiki/Percent-encoding and https://tools.ietf.org/html/rfc3986.
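To make the "optionally encoded" point concrete, here is a minimal sketch using the standard library's urllib.parse.urlencode. It is not a description of how furl or requests work internally. The params dict and variable names are hypothetical, and safe=":" is the assumption that exempts the colon from encoding.

```python
from urllib.parse import parse_qs, urlencode

params = {"start_date": "20200301T08:00-0000"}

# Default behavior: ':' is percent-encoded as %3A.
encoded_query = urlencode(params)

# safe=":" leaves ':' literal, which the query grammar above also permits.
literal_query = urlencode(params, safe=":")

print(encoded_query)  # start_date=20200301T08%3A00-0000
print(literal_query)  # start_date=20200301T08:00-0000

# Both spellings decode to the same parameters.
decoded_encoded = parse_qs(encoded_query)
decoded_literal = parse_qs(literal_query)
print(decoded_encoded == decoded_literal)  # True
```

A server that follows the RFC should treat both query strings as equivalent. The difference only matters when the receiving end compares the raw string instead of decoding it.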

Is it important for your use case that : remain unencoded in the URL? If so, why?

@gruns,
Yes, I am trying to build a scraper for the CAISO API (http://www.caiso.com/Documents/OASIS-InterfaceSpecification_v5_1_7Clean_Independent2019Release.pdf). The API's URLs do not expect ":" to be encoded, hence my question. I found furl to be very clean and wanted to use it for my application. So, if I understand correctly, I can't do this with furl. Is that fair?