HTTP requests do not support customized charsets
LukeWu2020 opened this issue · comments
Current behavior
Even if a charset is specified, an HTTP request with a string body is always encoded as ISO_8859_1. Why not use the charset that I have set? Is it a bug or are you lazy?
Can't you just add a parameter that represents the charset to the append method?
Expected behavior
jodd.http.HttpBase#populateHeaderAndBody
protected void populateHeaderAndBody(final Buffer target, final Buffer formBuffer, final boolean fullRequest) {
    for (String name : headers.names()) {
        List<String> values = headers.getAll(name);
        String key = capitalizeHeaderKeys ? HttpUtil.prepareHeaderParameterName(name) : name;
        target.append(key);
        target.append(": ");
        int count = 0;
        for (String value : values) {
            if (count++ > 0) {
                target.append(", ");
            }
            target.append(value);
        }
        target.append(CRLF);
    }
    if (fullRequest) {
        target.append(CRLF);
        if (form != null) {
            target.append(formBuffer);
        } else if (body != null) {
            target.append(body);
        }
    }
}
jodd.http.Buffer#append(java.lang.String)
public Buffer append(final String string) {
    ensureLast();
    try {
        byte[] bytes = string.getBytes(StringPool.ISO_8859_1);
        last.append(bytes);
        size += bytes.length;
    } catch (UnsupportedEncodingException ignore) {
    }
    return this;
}
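To make the effect concrete, here is a minimal JDK-only sketch (it does not use Jodd; the class and method names are made up for illustration). It shows what happens when a string containing characters outside Latin-1 is encoded as ISO_8859_1, compared with UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Jodd.
class CharsetLossDemo {

    // Encodes the way Buffer#append(String) above does: always ISO_8859_1.
    static byte[] encodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encodes with the charset the caller actually wanted.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String body = "\u4f60\u597d"; // two Chinese characters, written as escapes

        // Characters that ISO_8859_1 cannot represent are replaced with '?' (byte 63).
        System.out.println(Arrays.toString(encodeLatin1(body))); // prints [63, 63]

        // UTF-8 keeps the content; each of these characters takes three bytes.
        System.out.println(encodeUtf8(body).length); // prints 6
    }
}
```

So a body that was meant to be UTF-8 would reach the server as question marks if it were appended this way unchanged.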
Steps to Reproduce the Problem
/**
 * Appends string content to buffer, using the given encoding.
 */
public Buffer append(final String string, final String encoding) {
    ensureLast();
    try {
        // use the encoding passed by the caller instead of the hard-coded ISO_8859_1
        byte[] bytes = string.getBytes(encoding);
        last.append(bytes);
        size += bytes.length;
    } catch (UnsupportedEncodingException ignore) {
    }
    return this;
}
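The idea of a charset-aware append can also be tried outside the library. The following is only a sketch under assumptions: `SketchBuffer` is a hypothetical stand-in built on `ByteArrayOutputStream`, not Jodd's `Buffer`, and it takes a `java.nio.charset.Charset` rather than a charset name so that no checked exception has to be swallowed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a byte buffer; this is not Jodd's Buffer class.
class SketchBuffer {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Keeps the existing behavior as the default: ISO_8859_1.
    SketchBuffer append(String string) {
        return append(string, StandardCharsets.ISO_8859_1);
    }

    // Appends the string encoded with the charset chosen by the caller.
    SketchBuffer append(String string, Charset charset) {
        byte[] bytes = string.getBytes(charset);
        out.write(bytes, 0, bytes.length);
        return this;
    }

    int size() {
        return out.size();
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }

    public static void main(String[] args) {
        SketchBuffer buffer = new SketchBuffer()
                .append("X: y\r\n")                                 // header-like ASCII text
                .append("\u4f60\u597d", StandardCharsets.UTF_8);   // body in the caller's charset
        System.out.println(buffer.size()); // prints 12 (6 header bytes + 6 body bytes)
    }
}
```

Keeping the one-argument overload means existing callers, such as the header-writing loop, would not change behavior.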
Hi! This should work, we do handle the UTFs, but I will check it out!
@igr Yes. It just happens when the body string is appended to the buffer to be sent. It works well in other scenarios. Please see method jodd.http.Buffer#append(java.lang.String)
@lwu-gd-china Did you specify the charset using charset()? I mean, if you are sending strings, you can use charset() to specify the charset and the conversion. If you are sending a buffer, then it's just plain bytes, and it's up to you to read it on the receiving side.
Could you please give me a full example of the Http usage? The use case of sending content in different encodings is supported.
Hi, man. The charset was already set. The issue is that when the body string is appended to the buffer, the charset is not used! Here is a simple demo and screenshots.
import java.nio.charset.Charset;
import jodd.http.HttpRequest;

public class BugReview {
    public static void main(String[] args) {
        HttpRequest httpRequest = new HttpRequest();
        // set the request charset to UTF-8
        httpRequest.charset(Charset.forName("UTF-8").displayName());
        httpRequest.set("http://www.baidu.com");
        httpRequest.send();
    }
}
@igr Look at the latest screenshot. When the body string is appended to the buffer, ISO_8859_1 is used. The charset I set on the request should be used instead. This method should take a charset parameter. Do you know what I mean?
Aha, I see! @lwu-gd-china
First, thank you for the screenshots! Let me analyze!
@lwu-gd-china How do you specify the body? With body() or with bodyText()?
The key difference between body() and bodyText() is that the second one, bodyText(), actually takes into account both the media type and/or the encoding, if set.
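For readers wondering how a charset-aware method can coexist with a buffer that always encodes as ISO_8859_1, here is one possible mechanism, sketched with the JDK only. This is an assumption for illustration, since the thread does not show the implementation of bodyText(); the class and method names below are hypothetical. The trick is that ISO_8859_1 maps every byte value to exactly one character, so bytes produced in any charset can be carried inside a string and survive a later ISO_8859_1 encode unchanged.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; an assumed mechanism, not Jodd's actual code.
class Latin1RoundTrip {

    // Encodes the text in the target charset, then wraps those bytes
    // in a string where each character stands for exactly one byte.
    static String wrapForLatin1(String text, Charset charset) {
        return new String(text.getBytes(charset), StandardCharsets.ISO_8859_1);
    }

    // Mimics a buffer that always encodes strings as ISO_8859_1.
    static byte[] appendAsLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d";
        byte[] wanted = text.getBytes(StandardCharsets.UTF_8);

        // Wrapped first: the ISO_8859_1 append emits the original UTF-8 bytes.
        byte[] sent = appendAsLatin1(wrapForLatin1(text, StandardCharsets.UTF_8));
        System.out.println(Arrays.equals(sent, wanted)); // prints true

        // Not wrapped: the same append loses the content.
        System.out.println(Arrays.equals(appendAsLatin1(text), wanted)); // prints false
    }
}
```

Under this assumption, a raw string passed without the wrapping step would be the case that hits the ISO_8859_1 encoding directly, which matches the behavior reported above.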
Hi man, thank you very much for your reply! I see what happened: I should use bodyText() instead of body(). This issue will be closed.
No worries @lwu-gd-china
Sorry, I need to document this better!