node-formidable / formidable

The most used, flexible, fast and streaming parser for multipart form data. Supports uploading to serverless environments, AWS S3, Azure, GCP or the filesystem. Used in production.

Files parsed by `IncomingForm().parse` are not automatically cleaned up or deleted.

theGOTOguy opened this issue

Support plan

  • Which support plan is this issue covered by? (Community, Sponsor, Enterprise): Community
  • Currently blocking your project/work? (yes/no): no
  • Affecting a production system? (yes/no): yes

Context

  • Node.js version: 18.12.0
  • Release Line of Formidable (Legacy, Current, Next): Current
  • Formidable exact version: 2.0.1
  • Environment (node, browser, native, OS): node
  • Used with (popular names of modules): next.js

What are you trying to achieve or the steps to reproduce?

Parsed files are not cleaned up automatically, which leads to a memory leak in production.
I couldn't find this behavior documented anywhere, so I'm unsure whether it's expected behavior that needs to be documented, or a bug and the parsed files should be deleted automatically.

import formidable from "formidable";
import fs from "fs";
import ndjson from "ndjson";

export default async function handler(req, res) {
  // snip...
  let form = new formidable.IncomingForm();
  let jsonStream;
  let in_file;
  try {
    let formfields = await new Promise(function (resolve, reject) {
      form.parse(req, function (err, fields, files) {
        if (err) {
          reject(err);
          return;
        }
        if (files?.file) {
          jsonStream = fs.createReadStream(files.file.filepath).pipe(ndjson.parse());
          in_file = files.file.filepath;
        }

        resolve(fields);
      });
    });
    // snip...
  } finally {
    // My first attempt to fix the memory leak. This didn't work.
    if (jsonStream) {
      jsonStream.destroy();
    }

    // Without what follows, a memory leak appears.
    if (in_file) {
      fs.unlink(in_file, (err) => {
        if (err) {
          // Whatever. Maybe something else cleaned it up.
          return;
        }
      });
    }
  }
}

What was the result you got?

[Screenshot from 2024-01-20 16-39-58: production memory-usage graph]

The effect is best seen in the production graphs.
On the left you can see the container reboot caused by the out-of-memory killer, then a reboot after which I attempted to fix the leak with the `jsonStream.destroy()` logic, and finally the fix noted above, where I use `fs.unlink` to delete the files emitted by `form.parse`.

What result did you expect?

I expected the parsed files to be cleaned up automatically when `form` or `formfields` went out of scope. Instead, I have to add separate logic to delete them, or a memory leak occurs.
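For anyone hitting the same thing, here is a minimal cleanup sketch along the lines of the workaround above. `removeUploadedFiles` is a hypothetical helper, and it assumes the formidable v2 `files` shape, where each entry (or each element of an array of entries) exposes a `filepath`:

```js
// Minimal sketch (not part of formidable's API): delete every temp file that
// form.parse wrote, once the request handler is done with them.
import fs from "fs/promises";

async function removeUploadedFiles(files) {
  // With `multiples: true` a field maps to an array of files, otherwise to a single file.
  const entries = Object.values(files).flat();
  await Promise.all(
    entries.map((file) =>
      fs.unlink(file.filepath).catch(() => {
        // Ignore errors: the file may already have been removed.
      })
    )
  );
}
```

Called from the `finally` block, this replaces the single `fs.unlink` call in the handler above.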

Files are not removed

Can you help me understand how keeping files on disk means it's necessary to keep the data in memory? Keeping files seems super reasonable. Keeping things in memory seems somewhat less reasonable.

@bensonk In this case, I believe the reason is that the files end up in memory as an artifact of the host (Cloud Run), whose writable filesystem is backed by memory, so anything kept on disk counts against the container's memory.

@GrosSacASac Is this working as intended, then, or is it a bug? If it's working as intended, the examples should probably be updated to clean up the files when they're done, because otherwise I imagine a lot of your users have made the same mistake and are experiencing the same disk-space or memory-leak situation without knowing it.

If I upload a file I want it to stay

Ok, I see that the behavior is documented: https://github.com/node-formidable/formidable/blob/master/README.md#highlights

I was text-searching for different terms. From the perspective of my pipeline, which consumes the files but has no need to store them, this behavior is surprising, but I can see how it would make sense from a different perspective.
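For pipelines like that, formidable also exposes a `fileWriteStreamHandler` option, which lets you supply the Writable stream that receives each uploaded file's bytes, so nothing needs to be written to disk in the first place. A rough sketch, assuming a single uploaded file per request (the handler's exact signature differs between formidable v2 and v3):

```js
// Sketch: stream the upload straight into the ndjson consumer instead of writing a temp file.
// Assumes one file per request; with multiple files each handler call would need its own stream.
import formidable from "formidable";
import ndjson from "ndjson";
import { PassThrough } from "stream";

export default async function handler(req, res) {
  const upload = new PassThrough();
  const jsonStream = upload.pipe(ndjson.parse());

  const form = new formidable.IncomingForm({
    // Bytes of the uploaded file are written here instead of to a temp file on disk.
    fileWriteStreamHandler: () => upload,
  });

  form.parse(req, (err, fields, files) => {
    // snip: handle err / fields
  });

  // snip: consume jsonStream ...
}
```

With this approach there is nothing left on disk to clean up afterwards.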