t_decode_utf8_lenient fails on Windows without simdutf
Bodigrim opened this issue · comments
ˌbodʲɪˈɡrʲim commented
(Extracted from #528 (comment))
As witnessed in https://github.com/haskell/text/actions/runs/5507568869/jobs/10037691564?pr=528#step:6:228, cabal test -f-simdutf
can fail on Windows with
error recovery
t_decode_utf8_lenient: FAIL (0.01s)
*** Failed! Falsified (after 99 tests and 55 shrinks):
["!\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL","\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\128\NUL"]
"!\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\65533\NUL" /= "!\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\128\NUL"
Use --quickcheck-replay=572156 to reproduce.
Use -p '/t_decode_utf8_lenient/' to rerun this test only.
ˌbodʲɪˈɡrʲim commented
Looking at the structure of the offending string, it's most likely haskell/bytestring#575, fixed in bytestring-0.11.5.0
. We should raise the bound at which we switch between native validation and bytestring
:
text/src/Data/Text/Internal/Encoding.hs
Line 247 in 26e9cee