crossbario / autobahn-testsuite

Autobahn WebSocket protocol testsuite

Home Page: https://crossbar.io/autobahn/

12.5.* test cases invalid utf8 splitting

rhatlapa opened this issue

When using the Autobahn testsuite installed via pip, my server rejects some requests because some characters in the text frame are considered not UTF-8 encoded (with autobahntestsuite 0.7.5, I don't see this issue). I see the issue with some of the websocket compression tests (some of 12.4.*, e.g. 12.4.5).

I don't see anything special in the git history that could trigger it; my only idea is that something weird happened during the autobahntestsuite release.

Do you have any idea what could cause this issue?

@rhatlapa Do you have your test reports online? I would like to look .. it sounds strange, but maybe there is something.

You did install the testsuite into a dedicated virtualenv of its own, right? Which Python? Please post the exact Python version here (python -V).

Besides the Python version and system locale, there are the deps, which we haven't all pinned: https://github.com/crossbario/autobahn-testsuite/blob/master/autobahntestsuite/setup.py#L87

Of the unpinned ones, Twisted is the only one really relevant (on non-TLS at least).

We probably should pin everything.

Repeatability is obviously highly important in this case (a test suite), even if that means we don't get fixes from newer versions of deps, as long as those issues don't affect the testsuite in a fundamental way.

I have it installed in docker.
Python version is Python 2.7.8
pip list shows Twisted (15.1.0) and txaio (2.1.0)

This is full pip list result:

asn1crypto (0.22.0)
autobahn (0.10.9)
autobahntestsuite (0.7.6)
cffi (1.10.0)
characteristic (14.3.0)
cryptography (1.8.1)
decorator (3.4.0)
enum34 (1.1.6)
idna (2.5)
incremental (16.10.1)
iniparse (0.4)
ipaddress (1.0.18)
Jinja2 (2.9.6)
klein (17.2.0)
linecache2 (1.0.0)
MarkupSafe (1.0)
packaging (16.8)
pip (9.0.1)
pyasn1 (0.1.7)
pyasn1-modules (0.0.5)
pycparser (2.17)
pycrypto (2.6.1)
pycurl (7.19.3.1)
pygobject (3.14.0)
pygpgme (0.3)
pyliblzma (0.5.3)
pyOpenSSL (16.2.0)
pyparsing (2.2.0)
pyserial (2.7)
pyxattr (0.5.3)
rpm-python (4.12.0.1)
service-identity (14.0.0)
setuptools (28.7.1)
six (1.7.3)
slip (0.6.0)
slip.dbus (0.6.0)
traceback2 (1.4.0)
Twisted (15.1.0)
txaio (2.1.0)
ujson (1.35)
unittest2 (1.1.0)
urlgrabber (3.10.1)
Werkzeug (0.12.1)
wheel (0.29.0)
wsaccel (0.6.2)
yum-metadata-parser (1.1.4)
zope.event (4.0.3)
zope.interface (4.1.1)

Regarding test reports, I only have them offline. I have pasted the HTML report of one of the failing tests with autobahn testsuite 0.7.6 to pastebin: https://pastebin.com/29wPTCxK, if it helps.

Looking at your reports, there seem to be some issues with unicode characters appearing (http://autobahn.ws/reports/servers/index.html), see 12.5.5, where the connection is closed with "UTF-8 text message payload ended within Unicode code point at payload octet index 4096". That seems similar to the issue I am seeing.

Please upload all your reports somewhere so I can look (everything in the generated reports folder). I have no time to wade through raw pasted HTML ;)

In my case I see the issue with 12.4.5; see also the JSON report https://pastebin.com/bZTp1KVR, where the important part is "remoteCloseReason": "UT002003: Text frame contains non UTF-8 data". I run the test against Wildfly with Undertow, but the close reason message still seems very similar to the one you are seeing.

report.zip

So looking at the logs and the code, I do think this is a bug in the test cases 12.5.*. These should not just stupidly split at possibly non-codepoint boundaries, because we don't want to test peer behavior in this respect here. We do want to roundtrip a bunch of largish UTF-8 text portions while using WebSocket compression. So the test case should probably use https://github.com/crossbario/autobahn-python/blob/master/autobahn/util.py#L82, or more likely just steal those bits, because we like to stay pinned on the old AB with the rest ..
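To make the boundary problem concrete, here is a minimal sketch of cutting an encoded payload only on a code point boundary. It is not the code from the linked util.py; `truncate_utf8` is a hypothetical helper, and it assumes the input bytes are valid UTF-8 to begin with.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Return at most `limit` bytes of `data`, cut on a code point boundary.

    Hypothetical helper for illustration; assumes `data` is valid UTF-8.
    """
    if limit <= 0:
        return b""
    if limit >= len(data):
        return data
    end = limit
    # UTF-8 continuation bytes have the form 0b10xxxxxx. If the first byte
    # that would be cut off is a continuation byte, the cut lies inside a
    # code point, so back up to that code point's lead byte.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


# U+20AC (euro sign) encodes to 3 bytes, so a cut at 4096 lands mid code point.
payload = ("\u20ac" * 2000).encode("utf-8")

try:
    payload[:4096].decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False  # the naive slice is not valid UTF-8

safe = truncate_utf8(payload, 4096)
safe_text = safe.decode("utf-8")  # decodes cleanly
```

With this input the naive slice fails to decode, while the boundary-aware cut returns 4095 bytes (1365 whole code points).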

@rhatlapa so until we have fixed the issue here, you might exclude cases 12.5.* from testing: exclude_cases = ["12.5.*"].
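As a sketch of that workaround, the snippet below builds a fuzzing client spec that excludes the affected cases. The key names (`exclude-cases` and friends) follow the `fuzzingclient.json` layout as I recall it, so treat them as assumptions and check them against your own spec file; the agent name, URL, and output directory are placeholders.

```python
import json


def build_spec(excluded):
    """Build a fuzzing client spec dict that skips the given case patterns.

    Key names are assumed from the usual fuzzingclient.json layout;
    agent, url, and outdir are placeholder values.
    """
    return {
        "outdir": "./reports/servers",
        "servers": [{"agent": "MyServer", "url": "ws://127.0.0.1:9001"}],
        "cases": ["*"],
        "exclude-cases": list(excluded),
        "exclude-agent-cases": {},
    }


spec = build_spec(["12.5.*"])
spec_json = json.dumps(spec, indent=2)  # text to save as the spec file
```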

@oberstet thanks, will do

I'm seeing the same problem in my Python websockets library.

@RouquinBlanc and I reached the same conclusion as you.

See python-websockets/websockets#178 (comment) if you're curious.

@oberstet Can we please fix this? The problem seems simple: the test splits a 3-byte code point, so the message is invalid. An easy way to fix it is for Autobahn to check for a split code point at the beginning or end of the outgoing message and just exclude it.
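A rough sketch of that suggestion: given a chunk sliced out of valid UTF-8 text, drop any partial code point at either end. `strip_partial_codepoints` is a hypothetical name, not a testsuite function, and the approach assumes the chunk was cut from well-formed UTF-8.

```python
def strip_partial_codepoints(chunk: bytes) -> bytes:
    """Drop incomplete UTF-8 sequences at the start and end of `chunk`.

    Hypothetical helper; assumes `chunk` is a slice of valid UTF-8 data.
    """
    # Leading continuation bytes (0b10xxxxxx) belong to a code point whose
    # lead byte was cut off, so skip them.
    start = 0
    while start < len(chunk) and (chunk[start] & 0xC0) == 0x80:
        start += 1

    # Walk back from the end to the last lead (or ASCII) byte.
    end = len(chunk)
    i = end - 1
    while i >= start and (chunk[i] & 0xC0) == 0x80:
        i -= 1

    if i >= start:
        lead = chunk[i]
        if lead >= 0xF0:
            need = 4
        elif lead >= 0xE0:
            need = 3
        elif lead >= 0xC0:
            need = 2
        else:
            need = 1
        # If fewer bytes remain than the lead byte announces, the final
        # code point is incomplete and is dropped.
        if end - i < need:
            end = i

    return chunk[start:end]


data = "a\u20acb\u20acc".encode("utf-8")  # 9 bytes: a, 3-byte euro, b, 3-byte euro, c
middle = data[2:7]                        # starts and ends inside a euro sign
cleaned = strip_partial_codepoints(middle)
```

Here `middle` starts and ends inside a euro sign, so only the `b` between them survives.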

The fix is to copy the code from https://github.com/crossbario/autobahn-python/blob/master/autobahn/util.py#L82 and use that to produce the test payloads. Easy, but right now, I just don't have time to get to it .. PRs would be welcome! ;)

For the record, we just hit what looks like the same issue in 12.1.11 (python-hyper/wsproto#54 (comment)). Not really a surprise, because I assume that these tests all use the same code for generating their random fixed-length utf8 strings. But since the issue log so far only mentions seeing it in 12.4.* and 12.5.*, I figured I'd leave a note here.

> I assume that these tests all use the same code for generating their random fixed-length utf8 strings

Not a good assumption. 12.1.11 is not broken, only 12.4.* and 12.5.*. These tests are deterministic, they use a corpus text which is the same every time. So the tests are entirely predictable and repeatable. I suspect you have an actual problem with your implementation.

See:
https://vinniefalco.github.io/BeastAssets/reports/autobahn/index.html

@vinniefalco I've seen a passing report before :-). We run these tests on every commit, and this is the first time this test has ever failed, and it passed when we re-ran it, so there's something nondeterministic going on.

On further investigation, I think the problem is that tox now sets PYTHONHASHSEED by default, and if you look at how the different test numbers are assigned, there's a dict and 12.1 is the first entry in the dict, 12.2 is the second entry in the dict, etc. So when hash randomization is turned on (as is default on 3.3+, and happens on 2.7 if you set PYTHONHASHSEED), then it actually permutes the names of the tests. I'll file a bug...
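To illustrate the numbering problem in isolation, here is a small sketch that assigns case numbers from an explicit, sorted order instead of relying on the iteration order of a hash-based container. The function and payload names are hypothetical and are not taken from the testsuite.

```python
def assign_case_numbers(section, names):
    """Map case ids like "12.1" to names in a stable order.

    Hypothetical helper. Sorting makes the result independent of the
    container's iteration order, which for sets of strings (and for dicts
    on old Pythons) can change with PYTHONHASHSEED.
    """
    ordered = sorted(names)
    return {"%d.%d" % (section, i): name for i, name in enumerate(ordered, 1)}


# Placeholder names; a set's iteration order is not guaranteed.
payloads = {"payload_b", "payload_a", "payload_c"}
cases = assign_case_numbers(12, payloads)
```

Whatever the hash seed, `cases` maps "12.1" to `payload_a`, "12.2" to `payload_b`, and "12.3" to `payload_c`.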

Likewise found some tests in 12.4 and 12.5 generating illegal ws TEXT messages with partial utf-8... I disabled them from our testing but obviously that's not an ideal situation.

This is fixed via #81.