Not CentOS 6 compatible
Anapher opened this issue
I'm facing a problem with string serialisation on CentOS 6 when the library is compiled with unsafe code enabled. When compiled without unsafe code, it works just fine. Reading data from another system works, but writing produces invalid data, and the data it writes cannot be read back.
Here is a small code sample:
new Serializer(typeof(string)).Serialize("hallo welt")
Result on CentOS 6:
02150a0000000000000000000000000000000000000000
Result on Windows:
02150a680061006c006c006f002000770065006c007400
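For context, a minimal, self-contained sketch of how such a dump can be produced (the MemoryStream and hex-dump scaffolding are my additions; it assumes NetSerializer's instance API of new Serializer(IEnumerable<Type>) plus Serialize(Stream, object)):

using System;
using System.IO;
using NetSerializer;

class Repro
{
    static void Main()
    {
        var serializer = new Serializer(new[] { typeof(string) });

        using (var ms = new MemoryStream())
        {
            serializer.Serialize(ms, "hallo welt");

            // Print the raw serialized bytes as a hex string.
            var hex = BitConverter.ToString(ms.ToArray())
                .Replace("-", string.Empty)
                .ToLowerInvariant();
            Console.WriteLine(hex);
        }
    }
}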
The code works fine on CentOS 7 and Ubuntu. The program was run with Mono 3.2.8. It would be great if you could tell me what's going wrong, or fix it directly. I also looked into the code; this call:

encoder.Convert(src + p, totalChars - p, dst, buf.Length, true, out charsConverted, out bytesConverted, out completed);

might be the problem, but I'm not sure.
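To isolate it, the two Encoder.Convert overloads can be compared directly. A minimal sketch (Encoding.Unicode is an assumption on my part, matching the UTF-16 byte pattern in the dumps above; compile with /unsafe):

using System;
using System.Text;

class ConvertCheck
{
    static unsafe void Main()
    {
        const string value = "hallo welt";
        var buf = new byte[64];
        int charsUsed, bytesUsed;
        bool completed;

        // Pointer-based overload -- the suspect path.
        var encoder = Encoding.Unicode.GetEncoder();
        fixed (char* src = value)
        fixed (byte* dst = buf)
        {
            encoder.Convert(src, value.Length, dst, buf.Length, true,
                out charsUsed, out bytesUsed, out completed);
        }
        Console.WriteLine("unsafe:  " + BitConverter.ToString(buf, 0, bytesUsed));

        // Managed array-based overload -- reported to work everywhere.
        Array.Clear(buf, 0, buf.Length);
        encoder = Encoding.Unicode.GetEncoder();
        encoder.Convert(value.ToCharArray(), 0, value.Length, buf, 0, buf.Length, true,
            out charsUsed, out bytesUsed, out completed);
        Console.WriteLine("managed: " + BitConverter.ToString(buf, 0, bytesUsed));
    }
}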
Okay, I got it fixed: the bug comes from the unsafe Convert overload of Encoder. The managed overload works fine, so I wrote this code:
private static bool? _unsafeEncoderFailed = null;

public static unsafe void WritePrimitive(Stream stream, string value)
{
    // Length header: 0 = null, 1 = empty string, otherwise byte count + 1.
    if (value == null)
    {
        WritePrimitive(stream, (uint)0);
        return;
    }
    else if (value.Length == 0)
    {
        WritePrimitive(stream, (uint)1);
        return;
    }

    var helper = s_stringHelper;
    if (helper == null)
        s_stringHelper = helper = new StringHelper();

    var encoder = helper.Encoder;
    var buf = helper.ByteBuffer;

    int totalChars = value.Length;
    int totalBytes;

    fixed (char* ptr = value)
        totalBytes = encoder.GetByteCount(ptr, totalChars, true);

    WritePrimitive(stream, (uint)totalBytes + 1);
    WritePrimitive(stream, (uint)totalChars);

    int p = 0;
    bool completed = false;

    while (!completed)
    {
        int charsConverted;
        int bytesConverted;

        if (_unsafeEncoderFailed.HasValue && _unsafeEncoderFailed.Value)
        {
            // The unsafe path is known to be broken: use the managed overload.
            encoder.Convert(value.ToCharArray(p, totalChars - p), 0, totalChars - p,
                buf, 0, buf.Length, true,
                out charsConverted, out bytesConverted, out completed);
        }
        else
        {
            fixed (char* src = value)
            fixed (byte* dst = buf)
            {
                encoder.Convert(src + p, totalChars - p, dst, buf.Length, true,
                    out charsConverted, out bytesConverted, out completed);
            }
        }

        if (_unsafeEncoderFailed == null)
        {
            // First conversion ever: verify the unsafe path actually wrote something.
            var alsoSomethingElseThanZero = false;
            for (int i = 0; i < bytesConverted; i++)
            {
                if (buf[i] != 0)
                {
                    alsoSomethingElseThanZero = true;
                    break;
                }
            }

            if (!alsoSomethingElseThanZero)
            {
                // All output bytes are zero. That is only legitimate if every
                // input char is U+0000 as well.
                var onlyContainsUnicodeZeros = true;
                for (int i = 0; i < totalChars; i++)
                {
                    if (value[i] != '\u0000')
                    {
                        onlyContainsUnicodeZeros = false;
                        break;
                    }
                }

                if (!onlyContainsUnicodeZeros)
                {
                    // Non-zero input encoded to all zeros: the unsafe Convert
                    // is broken. Retry this chunk through the managed path.
                    _unsafeEncoderFailed = true;
                    completed = false;
                    continue;
                }
            }

            _unsafeEncoderFailed = false;
        }

        stream.Write(buf, 0, bytesConverted);
        p += charsConverted;
    }
}
Technically, I just added a check for whether the converted bytes are all zero. Since an all-zero output is also legitimate when every char of the input is U+0000, I additionally check whether the string contains any char with a non-zero code point. If the check detects a failure, it restarts the loop iteration and uses the managed method from then on. Pretty simple, and it shouldn't affect performance.
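Pulled out of the loop, the heuristic amounts to this (a hypothetical helper for illustration; LooksLikeFailedConvert is my name, not part of the patch above):

// Returns true only when Convert produced all-zero output for an input
// that contains at least one non-NUL char.
static bool LooksLikeFailedConvert(string value, byte[] buf, int bytesConverted)
{
    for (int i = 0; i < bytesConverted; i++)
        if (buf[i] != 0)
            return false; // real output -> the encoder worked

    foreach (var c in value)
        if (c != '\u0000')
            return true;  // non-zero input became all zeros -> failure

    return false; // input really was all U+0000; all-zero output is correct
}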
I didn't quite catch it. Where is the bug? In encoder.Convert()? And only on CentOS 6? That sounds like something that should be fixed on CentOS 6, not in NetSerializer.