Help with tests using cmp.Diff and Chinese characters?
bashbunni opened this issue
Hey there,
I'm trying to reproduce an issue in one of the projects I maintain and am using cmp.Diff to show what went wrong when a test fails. The problem is that I can't read the output, so I can't do much with the information given.
I guess my question is: what can I do with the byte output shown below, and is the // +|.[0m.[38;5;252m| comment the string value of what changed?
glamour_test.go:279: got != want
-want +got:
diff:
string{
... // 78204 identical bytes
0x6d, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d, 0x1b, 0x5b, 0x30, 0x6d, // |m.[38;5;252m.[0m|
0x20, 0x20, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d, 0x31, 0x3a, 0x34, // | .[38;5;252m1:4|
+ 0x1b, 0x5b, 0x30, 0x6d, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d, // +|.[0m.[38;5;252m|
0x3a, 0x39, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d, 0x20, 0x1b, 0x5b, // |:9.[38;5;252m .[|
0x30, 0x6d, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d, 0x20, 0x1b, 0x5b, // |0m.[38;5;252m .[|
... // 340063 identical bytes
}
--- FAIL: TestWrapping (0.33s)
--- PASS: TestWrapping/english_short (0.00s)
--- PASS: TestWrapping/chinese_short (0.00s)
--- FAIL: TestWrapping/chinese_long (0.33s)
FAIL
cmp version: github.com/google/go-cmp v0.5.9
go version: go 1.17
Here's a link to the pull request and an example output we're comparing:
charmbracelet/glamour#249
testdata/issues/long-chinese-text.test
Thank you very much for your great project and I appreciate any guidance you're able to give :)
This output seems to be working as intended. You're comparing two strings that cmp.Diff has detected contain non-printable characters. For that reason, it switched to a mode where it diffs the raw byte values.
This particular output is saying that got has an additional substring inserted at some offset after byte 78204:
0x1b, 0x5b, 0x30, 0x6d, 0x1b, 0x5b, 0x33, 0x38, 0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d,
The best ASCII representation of this string is:
.[0m.[38;5;252m
(BTW, the output you are seeing is inspired by the hexdump utility, which prints the raw hex values on the left and the best ASCII representation on the right.)
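To make the right-hand column less mysterious, here is a minimal sketch of how a hexdump-style ASCII rendering can be produced. The helper name asciiColumn is hypothetical and this is not cmp's actual implementation; it only assumes the usual convention that printable ASCII bytes are shown as-is and everything else becomes a dot.

```go
package main

import "fmt"

// asciiColumn is a hypothetical helper (not part of go-cmp). It mimics the
// right-hand column of a hexdump: printable ASCII bytes (0x20..0x7e) are
// kept, and every other byte is replaced with '.'.
func asciiColumn(b []byte) string {
	out := make([]byte, len(b))
	for i, c := range b {
		if c >= 0x20 && c <= 0x7e {
			out[i] = c
		} else {
			out[i] = '.'
		}
	}
	return string(out)
}

func main() {
	// The bytes reported on the "+" line of the diff.
	inserted := []byte{
		0x1b, 0x5b, 0x30, 0x6d, 0x1b, 0x5b, 0x33, 0x38,
		0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d,
	}
	fmt.Println(asciiColumn(inserted)) // prints ".[0m.[38;5;252m"
}
```

The dots in the comment column are therefore placeholders for bytes such as 0x1b, not literal periods in the string.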
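This happens to be a pair of ANSI escape sequences of the kind commonly used for terminal styling. The 0x1b byte is ESC (shown as a dot in the ASCII column), ESC[0m resets text attributes, and ESC[38;5;252m selects foreground color 252 from the 256-color palette.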
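If you want to see the inserted bytes as a readable Go string rather than hex, one option is to paste them into a small program and print them with strconv.Quote from the standard library. This is only a sketch for inspecting the diff by hand; the helper name quoteBytes is hypothetical and not part of go-cmp.

```go
package main

import (
	"fmt"
	"strconv"
)

// quoteBytes is a hypothetical helper that renders raw bytes as a Go string
// literal, so control characters such as ESC (0x1b) appear as visible
// escapes instead of being hidden.
func quoteBytes(b []byte) string {
	return strconv.Quote(string(b))
}

func main() {
	// The bytes reported on the "+" line of the diff.
	inserted := []byte{
		0x1b, 0x5b, 0x30, 0x6d, 0x1b, 0x5b, 0x33, 0x38,
		0x3b, 0x35, 0x3b, 0x32, 0x35, 0x32, 0x6d,
	}
	fmt.Println(quoteBytes(inserted)) // prints "\x1b[0m\x1b[38;5;252m"
}
```

Seen this way, the extra output in got is a reset followed by a color change, with no visible text between them.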