UnicodeEncodeError: 'latin-1' codec can't encode character '\u2082' in position 289702: Body ('₂') is not valid Latin-1.
wlad opened this issue · comments
I'm trying to POST an XML file which has elements like <items id="text">SpO₂</items>. The request fails with the following error:
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2082' in position 289702: Body ('₂') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
Traceback (most recent call last):
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/RequestsLibrary/utils.py", line 138, in decorator
return func(*args, **kwargs)
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/RequestsLibrary/RequestsOnSessionKeywords.py", line 60, in post_on_session
response = self._common_request("post", session, url,
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/RequestsLibrary/RequestsKeywords.py", line 37, in _common_request
resp = method_function(
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/wlad/.venvs/ehrbase/lib/python3.9/site-packages/urllib3/connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/lib/python3.9/http/client.py", line 1257, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.9/http/client.py", line 1302, in _send_request
body = _encode(body, 'body')
File "/usr/lib/python3.9/http/client.py", line 164, in _encode
raise UnicodeEncodeError(
Here is how I send the request (${file} is loaded via the Get File keyword):
${resp}=    POST On Session    ${SUT}    /definition/template/adl1.4    expected_status=anything
...    data=${file}    headers=${headers}
If I remove ₂ from the payload, the request succeeds.
What am I missing? The XML file actually starts with
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
Hi @wlad ,
I was able to reproduce the error.
The cause of this behavior is in Python's http client (http/client.py, the last frames of your traceback).
There, we have the following code:
def _encode(data, name='data'):
    """Call data.encode("latin-1") but show a better error message."""
    try:
        return data.encode("latin-1")
    except UnicodeEncodeError as err:
        raise UnicodeEncodeError(
            err.encoding,
            err.object,
            err.start,
            err.end,
            "%s (%.20r) is not valid Latin-1. Use %s.encode('utf-8') "
            "if you want to send it encoded in UTF-8." %
            (name.title(), data[err.start:err.end], name)) from None
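The line return data.encode("latin-1") is where the error occurs.
As you can see, it tries to encode the data as latin-1, disregarding the <?xml version="1.0" encoding="utf-8" standalone="yes"?> declaration in the XML file. That declaration only tells an XML parser how to read the bytes; the HTTP client never looks at it.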
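To make this concrete, here is a minimal sketch in plain Python, without Robot Framework or requests. The sample string and the helper name try_latin1 are made up for illustration; only str.encode is the real call that http.client relies on.

```python
# Hypothetical payload fragment containing U+2082 (SUBSCRIPT TWO).
body = '<items id="text">SpO\u2082</items>'


def try_latin1(text):
    """Return the latin-1 bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError as err:
        return err


# latin-1 only covers code points 0-255, so U+2082 cannot be encoded.
result = try_latin1(body)

# Encoding to UTF-8 yourself always works and yields a bytes object,
# which http.client sends as-is instead of passing it through _encode().
utf8_body = body.encode("utf-8")
```

So as long as the body handed to the HTTP layer is a str with characters outside latin-1, the error is expected; passing bytes avoids the implicit encoding step.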
This issue has been raised in requests, too: psf/requests#1822 (comment)
There is a workaround. If you encode the file content to UTF-8 bytes yourself before sending it, the request should succeed (Get File decodes the file as UTF-8 by default):
${file}=    Get File    /path/to/file.xml
${file_utf8}=    Evaluate    $file.encode("utf-8")
${resp}=    POST On Session    ${SUT}    /definition/template/adl1.4    expected_status=anything
...    data=${file_utf8}    headers=${headers}
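Alternatively, read the file as latin-1. Every byte then maps to exactly one character, so the latin-1 encoding step in the http client reproduces the file's original bytes:
${file}=    Get File    /path/to/file.xml    encoding=latin-1
${resp}=    POST On Session    ${SUT}    /definition/template/adl1.4    expected_status=anything
...    data=${file}    headers=${headers}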
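If it helps to see why reading the file as latin-1 works, here is a small Python sketch of the round trip. It assumes the file on disk is UTF-8, as its XML declaration says; the variable names and the sample string are invented for the example.

```python
# Stand-in for the UTF-8 bytes stored in the XML file (illustrative content).
file_bytes = '<items id="text">SpO\u2082</items>'.encode("utf-8")

# Roughly what reading the file with encoding=latin-1 gives you:
# every byte becomes one character, so decoding can never fail.
text_as_latin1 = file_bytes.decode("latin-1")

# What the http client does with a str body before sending it.
wire_bytes = text_as_latin1.encode("latin-1")

# The bytes on the wire are identical to the bytes on disk.
round_trip_ok = wire_bytes == file_bytes

# Caution: encoding the latin-1-decoded text as UTF-8 instead would
# double-encode every non-ASCII character and change the payload.
double_encoded = text_as_latin1.encode("utf-8")
```

The string in between looks garbled if you log it (the ₂ shows up as three separate characters), but that is harmless here because it is only a carrier for the original bytes.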