clean url with nested scheme results in invalid url
Jetinho opened this issue · comments
Julien Jet commented
Hello,
I noticed an error in the clean method for an input URL that contains a second, nested URL scheme, as in the example below.
In that case the nested https:// gets transformed into https:/, and the cleaned URL is not valid.
url = 'https://thumbor.forbes.com/thumbor/600x315/smart/https://specials-images.forbesimg.com/dam/imageserve/1038149341/960x0.jpg?fit=scale'
PostRank::URI.clean(url)
#=> "https://thumbor.forbes.com/thumbor/600x315/smart/https:/specials-images.forbesimg.com/dam/imageserve/1038149341/960x0.jpg?fit=scale"
I am going to add a workaround to my project for this case, since a proper fix would require digging deeper into the method's implementation, but I thought it would be worth reporting here :).
Julien Jet commented
Actually, the problem comes from normalize, in postrank-uri.rb:155:
u.path = u.path.squeeze('/')
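To illustrate why that line breaks the nested URL, here is a minimal sketch in plain Ruby. String#squeeze collapses every run of slashes, including the one right after the embedded "https:". The squeeze_path_slashes helper is hypothetical and not part of postrank-uri; it only shows one possible approach, assuming it is acceptable to leave any "//" that directly follows a colon untouched.

```ruby
# Hypothetical helper, not part of postrank-uri: collapse runs of slashes
# unless the run directly follows a colon (as in an embedded "https://").
def squeeze_path_slashes(path)
  path.gsub(/(?<!:)\/{2,}/, '/')
end

path = '/thumbor/600x315/smart/https://specials-images.forbesimg.com/dam/imageserve/1038149341/960x0.jpg'

# String#squeeze collapses every run of '/', including the one after "https:".
squeezed = path.squeeze('/')

# The helper keeps the embedded "https://" but still collapses other runs.
kept      = squeeze_path_slashes(path)
collapsed = squeeze_path_slashes('/a//b///c')

puts squeezed
puts kept
puts collapsed
```

The trade-off is that any colon followed by a double slash in the path is preserved, even when it is not a real scheme, so this may be too permissive for some inputs.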
Ilya Grigorik commented
Heh, that's a fun edge case. What's your workaround? If you're up for contributing a PR, I'd be happy to review and help land it :)