101arrowz / fflate

High performance (de)compression in an 8kB package

Home Page: https://101arrowz.github.io/fflate

Occasional CRC Errors When Streaming Data into Zip using AsyncZipDeflate

Masty88 opened this issue

Discussed in #192

Originally posted by Masty88 December 1, 2023

Context
I am using fflate to fetch a list of 3D geographic files in various formats along with orthophotos in JPEG from Amazon S3. When retrieving the files, I use response.body.getReader() to stream the data into a ZIP folder.

Issue
When using AsyncZipDeflate or ZipDeflate (even with compression level set to 0), I encounter CRC errors intermittently - sometimes immediately, other times sporadically (about one in every two attempts). However, if I use the array buffer directly without streaming, or if I use ZipPassThrough for streaming, it works flawlessly 100% of the time.

Steps to Reproduce
1. Fetch a list of files from Amazon S3.
2. Stream the data into a ZIP folder using AsyncZipDeflate or ZipDeflate.
3. Occasionally encounter CRC errors in the resulting ZIP file.

Expected Behavior
The ZIP file should be created without CRC errors, similar to when using ZipPassThrough or directly passing the array buffer.

Actual Behavior
CRC errors occur intermittently when using AsyncZipDeflate or ZipDeflate for streaming data into a ZIP folder.

Additional Information
The files being fetched are 3D geographic files in various formats along with JPEG orthophotos.
The issue seems to be specific to the streaming process with AsyncZipDeflate or ZipDeflate.

StackBlitz Reproduction
I have created a StackBlitz project to demonstrate the issue:

https://js-7tnzqy.stackblitz.io
https://stackblitz.com/edit/js-7tnzqy?file=download.js

import { Zip, AsyncZipDeflate } from 'fflate';

async function downloadAndCompress(urlsToDownload) {
  console.log(urlsToDownload);
  let chunks = [];

  const zipFile = new Zip();

  const zipCompletionPromise = new Promise((resolve, reject) => {
    zipFile.ondata = (err, dat, final) => {
      if (err) {
        // Reject instead of throwing: a throw inside this callback would not reject the promise
        reject(err);
        return;
      }
      chunks.push(dat);
      if (final) {
        const blob = new Blob(chunks, { type: 'application/zip' });
        const url = URL.createObjectURL(blob);
        resolve(url);
      }
    };
  });

  const downloadAndStreamToZip = async (url) => {
    const response = await fetch(url);
    const fileName = url.split('/').pop();
    const fileStream = new AsyncZipDeflate(fileName, { level: 4 });
    zipFile.add(fileStream);

    const reader = response.body.getReader();
    let done = false;
    while (!done) {
      const { done: chunkDone, value } = await reader.read();
      done = chunkDone;
      if (value) {
        fileStream.push(new Uint8Array(value), done);
      }
    }

    // The loop only exits once the reader is done, so close the entry with an empty final chunk
    fileStream.push(new Uint8Array(0), true);
  };

  await Promise.all(urlsToDownload.map((url) => downloadAndStreamToZip(url)));

  zipFile.end();

  return await zipCompletionPromise;
}

export { downloadAndCompress };
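
For reference, the chunk-pushing pattern in the loop above can be isolated from fetch and fflate and exercised on its own. The following is a minimal sketch, not fflate code: `pumpChunks`, `exampleSource` and `makeRecordingSink` are hypothetical names, and the only assumption about the sink is that it exposes a `push(chunk, final)` method like the stream object in the snippet.

```javascript
// Hypothetical helper: forwards every chunk from an async iterable to a sink
// exposing push(chunk, final), then sends one empty final chunk.
async function pumpChunks(source, sink) {
  for await (const chunk of source) {
    // Copy the chunk so the sink can take ownership without affecting the source
    sink.push(new Uint8Array(chunk), false);
  }
  // Empty terminating chunk, mirroring the final push in the snippet above
  sink.push(new Uint8Array(0), true);
}

// Stand-in source for illustration only (not part of fflate)
async function* exampleSource() {
  yield new Uint8Array([1, 2, 3]);
  yield new Uint8Array([4, 5]);
}

// Stand-in sink that records each push so the call sequence can be inspected
function makeRecordingSink() {
  const calls = [];
  return {
    calls,
    push(chunk, final) {
      calls.push({ length: chunk.length, final });
    },
  };
}
```

Running `pumpChunks(exampleSource(), makeRecordingSink())` records two data pushes with `final` set to false, followed by a single empty push with `final` set to true. Swapping the recording sink for a real stream object is what the reproduction does.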

Sorry, I somehow missed your discussion post! This shouldn't be happening; have you verified this issue with other ZIP decompression software or is it only visible in Windows Explorer?

Thank you for getting back to me, and no worries about missing the discussion post! In the meantime, I have taken the opportunity to dive deeper into the code and conduct tests in various scenarios. However, I continue to encounter the same issue.

I've already attempted to decompress the resulting ZIP files using 7zip, but unfortunately, the problem persists with the same CRC errors. Additionally, I have checked the validity of the ZIP archive using an online verification tool, which also confirmed the presence of these errors.

From my observations, it seems that the higher the level of compression, the more frequently the errors occur. This pattern is consistent regardless of the decompression tool used, suggesting that the issue might be related to the compression process itself rather than decompression.
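
One way to narrow down where a mismatch comes from is to compute the CRC-32 of the original bytes independently and compare it with the value the archive tool reports. Below is a minimal sketch, assuming the standard CRC-32 used by ZIP (reflected polynomial 0xEDB88320). `crc32Update` and `crc32` are illustrative names, not fflate APIs.

```javascript
// Precompute the lookup table for the reflected CRC-32 polynomial 0xEDB88320
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

// Illustrative helper: fold one chunk into a running (pre-inverted) CRC state
function crc32Update(state, chunk) {
  let c = state;
  for (let i = 0; i < chunk.length; i++) {
    c = CRC_TABLE[(c ^ chunk[i]) & 0xff] ^ (c >>> 8);
  }
  return c >>> 0;
}

// Illustrative helper: CRC-32 of a list of Uint8Array chunks, taken in order
function crc32(chunks) {
  let state = 0xffffffff;
  for (const chunk of chunks) state = crc32Update(state, chunk);
  return (state ^ 0xffffffff) >>> 0;
}

// Hash the same bytes whole and split into two chunks
const bytes = new TextEncoder().encode('123456789');
const whole = crc32([bytes]);
const chunked = crc32([bytes.subarray(0, 4), bytes.subarray(4)]);
console.log(whole.toString(16), whole === chunked);
```

Feeding the data chunk by chunk gives the same checksum as hashing it whole, so chunking alone should not change the CRC. If the independently computed value matches the one stored in the archive but extraction still fails, the compressed data is the more likely suspect than the checksum.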

I hope this information might be helpful for further investigation. I am more than willing to provide any additional details that might aid in resolving this issue.

Thank you again for your support and your time.

I hit the same issue when opening the ZIP archives that @Masty88 produced on my MacBook Pro (macOS Sonoma 14.2) with the system's built-in archive tool: the archives could not be decompressed.

This issue is concerning. I'm really sorry I haven't gotten to it yet - I'll investigate and fix this as soon as I can.

Reproduced and fixed locally. Will test further to verify my change fully fixes the problem.

@101arrowz thank you, we will wait for the release :)

Should be fixed in v0.8.2; let me know if you run into any issues!