coverxit / HeatshrinkDotNet

C# version (.NET Standard 2.0) of heatshrink data compression algorithm

The right size of 'decomp' array

McTALAO opened this issue · comments

I have an IoT sensor that sends an MQTT payload compressed with the heatshrink algorithm (window 11, lookahead 4), so I found this library very useful for developing the server-side decompression.

Following the CompressAndExpandAndCheck helper function, I extracted a function that runs only the decompression.

static bool Decode(int decoderInputBufferSize, int windowSz, int lookaheadSz, byte[] comp, byte[] decomp, out int polled)
{
	HeatshrinkDecoder decoder = new HeatshrinkDecoder(decoderInputBufferSize, windowSz, lookaheadSz);

	int compSz = comp.Length;
	int decompSz = decomp.Length;

	int sunk = 0;
	polled = 0;

	DecoderPollResult pres;
	while (sunk < compSz && polled < decompSz)
	{
		decoder.Sink(comp, sunk, compSz - sunk, out int count);
		sunk += count;

		do
		{
			pres = decoder.Poll(decomp, polled, decompSz - polled, out count);
			polled += count;
		} while (pres == DecoderPollResult.More && polled < decompSz);
	}

	return decoder.Finish() == DecoderFinishResult.Done;
}

First of all, I want to point out two things. The check polled < decompSz (in two places) is needed to prevent an infinite loop when the decomp array fills up. The check decoder.Finish() == DecoderFinishResult.Done at the end of the function reports whether the decoder actually finished.

I use this function in my test program to check the algorithm against some previously compressed payloads (I have both the original and the compressed payload, produced by the original heatshrink library).

static void Main(string[] args)
{
	Console.WriteLine("> Decode " + args[0] + " and compare with " + args[1] + "...");

	byte[] comp = File.ReadAllBytes(args[0]);
	byte[] decompRef = File.ReadAllBytes(args[1]);
	
	byte[] decomp = new byte[comp.Length * 4]; // please note that 4 is an empirical number 
	bool isDone = Decode(1024, 11, 4, comp, decomp, out int polled);
	if (isDone)
	{
		Console.WriteLine("size> " + comp.Length + " -> " + polled + " (" + decompRef.Length + ")" +
			" @ " + (polled / (double)comp.Length));

		Array.Resize(ref decomp, polled);
		bool isEqual = decompRef.SequenceEqual(decomp);
		Console.WriteLine("isEqual> " + isEqual);
	}
	else
	{
		Console.WriteLine("WARN> (maybe) the 'decomp' array is too short!");
	}

	Console.WriteLine("> See you later! [cit. Dogui]");
}

So the question now is: what is the right size of the decomp array?
As you can see in the code, at the moment I assume the decompressed data is less than 4 times the size of the compressed array (new byte[comp.Length * 4]). This generally holds because in my case the compression ratio is normally between 2.5 and 3.5 (the data is highly redundant).
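For reference, a worst-case bound can be sketched from the bitstream layout rather than from measurements. The sketch below assumes the upstream heatshrink format: a back-reference costs 1 tag bit plus windowSz index bits plus lookaheadSz count bits, and expands to at most 2^lookaheadSz bytes, while a literal costs 9 bits for a single byte. HeatshrinkBounds and MaxDecompressedSize are hypothetical names, not part of HeatshrinkDotNet.

```csharp
using System;

static class HeatshrinkBounds
{
    // Hypothetical helper, not part of HeatshrinkDotNet.
    // Assumption: standard heatshrink bitstream, where a back-reference takes
    // (1 + windowSz + lookaheadSz) bits and yields at most 2^lookaheadSz bytes.
    public static long MaxDecompressedSize(long compressedLength, int windowSz, int lookaheadSz)
    {
        long totalBits = compressedLength * 8;
        long backrefBits = 1 + windowSz + lookaheadSz;
        long maxBytesPerBackref = 1L << lookaheadSz;

        // Worst case for expansion: every bit belongs to a maximal back-reference.
        // Literals (9 bits for 1 byte) expand less, so they cannot exceed this bound.
        // Round up so the result stays conservative.
        return (totalBits * maxBytesPerBackref + backrefBits - 1) / backrefBits;
    }
}
```

Under these assumptions, the (11, 4) settings give 16 bytes per 16 bits at most, so comp.Length * 8 would be a safe size where comp.Length * 4 is only empirical.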

...mumble mumble...

To avoid guessing the size, we could use a MemoryStream, as in the following example.

static bool Decode(int bufferSize, int windowSz, int lookaheadSz, byte[] comp, MemoryStream stream)
{
	HeatshrinkDecoder decoder = new HeatshrinkDecoder(bufferSize, windowSz, lookaheadSz);

	int compSz = comp.Length;
	int sunk = 0;
	byte[] buffer = new byte[bufferSize];

	DecoderPollResult pres;
	while (sunk < compSz)
	{
		decoder.Sink(comp, sunk, compSz - sunk, out int count);
		sunk += count;

		do
		{
			pres = decoder.Poll(buffer, out int polled);
			stream.Write(buffer, 0, polled);
		} while (pres == DecoderPollResult.More);
	}

	return decoder.Finish() == DecoderFinishResult.Done;
}

So, the previous Main could be changed as follows.

using (MemoryStream stream = new MemoryStream())
{
	if (!Decode(1024, 11, 4, comp, stream))
		Console.WriteLine("WARN> the decoder did not finish cleanly!");
	byte[] decomp = stream.ToArray();

	Console.WriteLine("size> " + comp.Length + " -> " + decomp.Length + " (" + decompRef.Length + ")" +
		" @ " + (decomp.Length / (double)comp.Length));

	bool isEqual = decompRef.SequenceEqual(decomp);
	Console.WriteLine("isEqual> " + isEqual);
}
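Along the same lines, the compressed input could also be read in chunks, so that neither side has to fit in a single array. This is only a sketch: it uses the Sink, Poll and Finish calls exactly as they appear in the snippets above, while the StreamingDecode class and the HeatshrinkDotNet namespace in the using directive are assumptions.

```csharp
using System.IO;
using HeatshrinkDotNet; // assumed namespace of the library

static class StreamingDecode
{
    // Hypothetical helper: decodes from any readable Stream into any writable Stream.
    public static bool Decode(int bufferSize, int windowSz, int lookaheadSz, Stream input, Stream output)
    {
        var decoder = new HeatshrinkDecoder(bufferSize, windowSz, lookaheadSz);
        byte[] inBuf = new byte[bufferSize];
        byte[] outBuf = new byte[bufferSize];

        int read;
        while ((read = input.Read(inBuf, 0, inBuf.Length)) > 0)
        {
            int sunk = 0;
            while (sunk < read)
            {
                // Feed as much of the current chunk as the decoder accepts.
                decoder.Sink(inBuf, sunk, read - sunk, out int count);
                sunk += count;

                // Drain the decoder before sinking more input.
                DecoderPollResult pres;
                do
                {
                    pres = decoder.Poll(outBuf, out int polled);
                    output.Write(outBuf, 0, polled);
                } while (pres == DecoderPollResult.More);
            }
        }

        return decoder.Finish() == DecoderFinishResult.Done;
    }
}
```

With this shape, input could be a FileStream or a MemoryStream wrapping the MQTT payload, and output could be a MemoryStream as in the example above.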

Is there any other optimization that can be done?

Thanks.