zhllxt / asio2

Header-only C++ network library based on asio. Supports tcp, udp, http, websocket, rpc, ssl, icmp, serial_port, socks5.

Always get a 404 page on HTTP requests

daohuangpai opened this issue

In the test demo ssl_http_client, line 40:
auto reprss = asio2::https_client::execute("https://github.com/freefq/free", std::chrono::seconds(60));
always returns a 404 page.

The reason is that we intend to request https://github.com/freefq/free, but the execute function actually sends a request for https://github.com/freefq%2Ffree.
In Chrome, https://github.com/freefq/free loads correctly, while https://github.com/freefq%2Ffree also fails with a 404 page.

Cause in the code:
In http_util.hpp, lines 231 to 241 percent-encode the entire remainder of the URL.
Here are my thoughts:
A character needs percent-encoding only to keep it from being confused with its special meaning.
A URL is like a filesystem path, where '/' separates levels.
Compare github.com/freefq/free with C:/test/scene1.
Changing C:/test/scene1 to C:/test%2Fscene1 does not yield a valid address.
Percent-encoding makes sense only when a folder name itself contains a '/'.
If we assume 'sce/ne1' is a folder name, percent-encoding turns the path into C:/test/sce%2Fne1.
Percent-encoding a URL should be something the user does deliberately, so the first parameter of asio2::https_client::execute should be a URL that the user has already percent-encoded.
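
The distinction drawn above, between '/' as a separator and '/' as part of a name, can be sketched by encoding each path segment separately and joining the results with literal '/' characters. This is only an illustration: `is_unreserved`, `encode_segment` and `build_path` are hypothetical helpers, not part of asio2.

```cpp
#include <string>
#include <string_view>
#include <vector>

// RFC 3986 section 2.3 unreserved set: letters, digits, '-', '.', '_', '~'.
inline bool is_unreserved(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper: percent-encodes a single path segment.
// A '/' inside the segment is data here, so it becomes %2F.
inline std::string encode_segment(std::string_view segment)
{
	static constexpr char hex[] = "0123456789ABCDEF";
	std::string out;
	for (char ch : segment)
	{
		const auto c = static_cast<unsigned char>(ch);
		if (is_unreserved(c))
		{
			out += ch;
		}
		else
		{
			out += '%';
			out += hex[c >> 4];   // high nibble
			out += hex[c & 0x0F]; // low nibble
		}
	}
	return out;
}

// Hypothetical helper: joins encoded segments with literal '/' separators.
inline std::string build_path(const std::vector<std::string>& segments)
{
	std::string path;
	for (const std::string& s : segments)
	{
		path += '/';
		path += encode_segment(s);
	}
	return path;
}
```

With these helpers, build_path({"test", "scene1"}) yields /test/scene1, while build_path({"test", "sce/ne1"}) yields /test/sce%2Fne1: only the '/' that belongs to a name is encoded.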

My current workaround is simply to disable the percent-encoding in http_util.hpp, lines 231 to 241, changing it
from
	if (/*std::isalnum(c) || */unreserved_char[c])
	{
		r += static_cast<rvalue_type>(c);
	}
	else
	{
		r += static_cast<rvalue_type>('%');
		rvalue_type h = rvalue_type(c >> 4); // high nibble
		r += h > rvalue_type(9) ? rvalue_type(h + 55) : rvalue_type(h + 48); // 55 = 'A' - 10, 48 = '0'
		rvalue_type l = rvalue_type(c % 16); // low nibble
		r += l > rvalue_type(9) ? rvalue_type(l + 55) : rvalue_type(l + 48);
	}
to
	r += static_cast<rvalue_type>(c);
With that change I get the correct page.
I don't know whether this is correct, or whether it may cause errors in other cases.

Yes, this is a bug.
Such a simple bug, and I failed to consider it or catch it in testing. That is embarrassing.

A reasonable fix would look like this:
In http_util.hpp, line 85:

	static constexpr char unreserved_char[] = {
		//0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 1
		  0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, // 2 -- change the last entry of this row (0x2F, '/') from 0 to 1.
		  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, // 3
		  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // 4
		  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, // 5
		  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // 6
		  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, // 7
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 8
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 9
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // A
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // B
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // C
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // D
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // E
		  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0  // F
	};
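
To illustrate how a table like this drives the encoder, here is a minimal sketch. `make_pass_through` and `encode_with_table` are hypothetical names, and the set of pass-through characters (alphanumerics plus `& * - . / = ? _`) is one reading of the patched table above, not code taken from asio2.

```cpp
#include <array>
#include <string>
#include <string_view>

// Hypothetical helper: builds a 256-entry table where true means
// "copy this byte unchanged" and false means "percent-encode it".
inline std::array<bool, 256> make_pass_through()
{
	std::array<bool, 256> t{};
	for (unsigned c = '0'; c <= '9'; ++c) t[c] = true;
	for (unsigned c = 'A'; c <= 'Z'; ++c) t[c] = true;
	for (unsigned c = 'a'; c <= 'z'; ++c) t[c] = true;
	// Assumed extra pass-through characters, including '/' (0x2F).
	for (char c : std::string_view("&*-./=?_"))
		t[static_cast<unsigned char>(c)] = true;
	return t;
}

// Hypothetical encoder: looks every byte up in the table.
inline std::string encode_with_table(std::string_view in)
{
	static const std::array<bool, 256> table = make_pass_through();
	static constexpr char hex[] = "0123456789ABCDEF";
	std::string out;
	out.reserve(in.size());
	for (char ch : in)
	{
		// Index with unsigned char so bytes >= 0x80 never become negative.
		const auto c = static_cast<unsigned char>(ch);
		if (table[c])
		{
			out += ch;
		}
		else
		{
			out += '%';
			out += hex[c >> 4];
			out += hex[c & 0x0F];
		}
	}
	return out;
}
```

Under these assumptions, encode_with_table("/freefq/free") returns the path unchanged, while a space is still encoded as %20.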

Your workaround will certainly cause errors in other cases.

I have just merged a batch of improvements from develop into master, so you can use the latest master branch directly.

Thank you for your prompt reply!
Yes, it works!
But I still have a small question
about the comments above unreserved_char in http_util.hpp, lines 45 to 52:
// ---- RFC 3986 section 2.2 Reserved Characters (January 2005)
// !#$&'()*+,/:;=?@[]
//
// ---- RFC 3986 section 2.3 Unreserved Characters (January 2005)
// ABCDEFGHIJKLMNOPQRSTUVWXYZ
// abcdefghijklmnopqrstuvwxyz
// 0123456789-_.~
My understanding of this rule is that the unreserved_char array records whether each ASCII character is an unreserved character. For instance,
'A' is 0x41 and is unreserved, so unreserved_char[0x41] = 1. But position 0x2F, which represents '/', should be 0, because '/' is a reserved character according to the rule above (RFC 3986 section 2.2 Reserved Characters (January 2005)). Changing index 0x2F from 0 to 1 leaves me confused about whether '/' is a reserved or an unreserved character.

Also, ASIO2 is a great modern C++ project! I have learned a lot from it. Thanks for your awesome work and selfless sharing!

I think '/' should be treated as an unreserved character, so it should not be encoded.
Otherwise, https://github.com/freefq/free is encoded as https://github.com/freefq%2Ffree, which is not what we want.
So how can we test which characters should be encoded and which should not? Download a network debugging assistant, start a TCP server on 127.0.0.1:80, then enter http://127.0.0.1/freefq/free in the browser. The HTTP request shown in the assistant will look like this: GET /freefq/free HTTP/1.1
So we can be sure that the browser does not encode '/'.
I also tested C# System.Web.HttpUtility.UrlEncode and java.net.URLEncoder.encode. As you can see in http_util.hpp, their results differ from each other, and neither fully adheres to the RFC.
So I combined these results and implemented a url_encode function that I believe is more compatible.
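
One way to reconcile the RFC comment with the patched table is to separate two questions: which class a character belongs to, and whether an encoder for a path should leave it alone. The sketch below assumes the reserved and unreserved sets exactly as quoted from http_util.hpp earlier in this thread. `UriCharClass`, `classify` and `keep_in_path` are hypothetical names, not asio2 API.

```cpp
#include <string_view>

enum class UriCharClass { Unreserved, Reserved, Other };

// Classifies a character using the two sets quoted from
// RFC 3986 sections 2.2 and 2.3.
inline UriCharClass classify(char ch)
{
	constexpr std::string_view reserved = "!#$&'()*+,/:;=?@[]";
	constexpr std::string_view unreserved =
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"abcdefghijklmnopqrstuvwxyz"
		"0123456789-_.~";
	if (unreserved.find(ch) != std::string_view::npos)
		return UriCharClass::Unreserved;
	if (reserved.find(ch) != std::string_view::npos)
		return UriCharClass::Reserved;
	return UriCharClass::Other;
}

// Hypothetical policy for an already-assembled path: keep unreserved
// characters and the '/' delimiter, percent-encode everything else.
inline bool keep_in_path(char ch)
{
	return classify(ch) == UriCharClass::Unreserved || ch == '/';
}
```

Here '/' is still classified as reserved, yet keep_in_path('/') is true, because in a path it acts as a delimiter rather than as data.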

Sorry, my English is poor. It took me a long time to write these paragraphs, which I found very painful, and they probably contain some grammatical errors.

Yes, I understand. Real-world projects need more flexibility to adapt to actual usage.
Thank you for patiently explaining this in such detail!