vbatts / go-mtree

File systems verification utility and library, in likeness of mtree(8)

Vis and Unvis break on UTF-8

cyphar opened this issue · comments

Sigh. Okay, so if we have an especially well-named file such as AC_Raíz_Certicámara_S.A..pem, go-mtree will not handle it correctly when you call .Path() on an entry which has its name set to the above.

Effectively what happens is that a multi-byte encoded character gets passed to the lovely Vis and Unvis code -- which obviously breaks in horrible ways. The string is then mangled in a very ugly way.

IMO the only way of handling this is to rewrite Vis and Unvis in Go...

Thanks @vbatts for using C code written by BSD folks ~20 years ago. This is gonna be fun. 😸

Wait. The vis/unvis code is in Go now. The C version is only used if you compile with the build tag cvis.

@vbatts Right. The problem still exists, and it's because the port of Vis/Unvis is still holding on to the notion of bytes when doing a bunch of the operations...

Specifically, byte(some_rune) will lose information, because a rune is a full Unicode code point and can take more than a single byte to encode in UTF-8.
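
To make the byte(some_rune) problem concrete, here is a minimal standalone sketch. It is not the go-mtree code: truncate and encode are hypothetical helpers written only for this illustration, and the only library calls are from the standard unicode/utf8 package. It compares what a plain byte() conversion keeps against the real UTF-8 encoding for the non-ASCII runes in the filename above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncate is a hypothetical helper that mimics the lossy conversion:
// byte(r) keeps only the low 8 bits of the code point.
func truncate(r rune) byte {
	return byte(r)
}

// encode is a hypothetical helper that returns the full UTF-8
// encoding of r, which may be up to utf8.UTFMax (4) bytes long.
func encode(r rune) []byte {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

func main() {
	name := "AC_Raíz_Certicámara_S.A..pem"

	// The byte length and the rune count differ as soon as the
	// string contains a multi-byte character.
	fmt.Println("bytes:", len(name), "runes:", utf8.RuneCountInString(name))

	// Ranging over a string yields runes, not bytes.
	for _, r := range name {
		if r >= utf8.RuneSelf { // anything outside ASCII
			fmt.Printf("%q: byte(r)=%#x, UTF-8=% x\n", r, truncate(r), encode(r))
		}
	}
}
```

For a rune like í (U+00ED) the truncated byte happens to equal the code point, but it is still not the two-byte UTF-8 sequence, so writing it back into the string produces invalid UTF-8. For anything above U+00FF the high bits are dropped outright.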

@vbatts Don't worry, I've got it working now. But IMO Vis and Unvis should be moved to a library. I'm also adding test cases.