PHPOffice / PHPWord

A pure PHP library for reading and writing word processing documents

Home Page:https://phpoffice.github.io/PHPWord/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[DOC] TemplateProcessor behavior changed with 1.2 - temporary file gets deleted while still being used

kevinpapst opened this issue · comments

Describe the Bug

Hi,

thanks for your work on this library!

A recent change introduced a bug that I can workaround, but I am not sure if this is intended behavior, as it comes pretty unexpected. It was introduced in #2475, which added TemplateProcessor::__destruct() in which the temporary file is deleted.

// ZipClass
if ($this->zipClass) {
try {
$this->zipClass->close();
} catch (Throwable $e) {
// Nothing to do here.
}
}
// Temporary file
if ($this->tempDocumentFilename && file_exists($this->tempDocumentFilename)) {
unlink($this->tempDocumentFilename);
}
}

This change basically disallows to use the temporary file while the process is still running, unless you keep the TemplateProcessor instance alive. Temporary files are, to my knowledge, always deleted by the engine after the scripts end, so this code seems to be superfluous.

Steps to Reproduce

This is my code:

$template = new TemplateProcessor('template.docx');
$tplFile = $template->save();
return $this->getFileResponse(new Stream($tplFile), 'export.docx');

This worked fine until I upgraded to 1.2

You save a Template and return the temporary filename, all from the context of a method that does uses a local instance of TemplateProcessor. As soon as your TemplateProcessor instance gets destructed (here when returning the response from the method) the temporary file now gets deleted.

I can work around the problem with this code:

$tplFile = @tempnam(sys_get_temp_dir(), 'foo-bar-docx');
$template->saveAs($tplFile);
return $this->getFileResponse(new Stream($tplFile), 'export.docx');

So basically using my own temporary file. I feel that this change makes the save() method kind of useless.

Expected Behavior

Do not delete the temporary file while I am still using it.
Trust the user of your library: I called save(), I can/will handle the file myself.
So my proposal is: remove the destruct method again.

Current Behavior

The temporary file is deleted upon destruct of the TemplateProcessor instance, which is likely before the file is used (here sent to the user).

Context

  • PHP Version: 8.2
  • PHPWord Version: 1.2

It is not the case that "Temporary files are ... always deleted by the engine after the scripts end". Without the unlink in this destructor, they will stay around, uselessly, forever (or until the operating system does its own cleanup). If you want them to persist, and are willing to take on responsibility for cleaning them up, you can, as you suggested, use saveAs instead of save. Adding the one extra line of code to generate the name doesn't seem especially burdensome, but, even so, it might be possible to change saveAs to generate its own file name (and return it to the caller) if, say, the file name parameter is supplied as a null string. Or, I suppose, you could add an unlink-at-destruct flag to the constructor.

Nope. See https://www.php.net/manual/en/function.tmpfile.php

The file is automatically removed when closed (for example, by calling fclose(), or when there are no remaining references to the file handle returned by tmpfile()), or when the script ends.

Yes, there are certain edge cases where the file might stay:

If the script terminates unexpectedly, the temporary file may not be deleted.

But that is entirely out of scope for the library (and your destruct wouldn't be called anyway in that case) and a much more tolerable potential issue than this BC break (whose intention might be good). With the current behavior you can also delete the entire save() methods, as it is not really usable anymore in modern world code.

I am just sharing my findings here, as this rather critical change was not mentioned anywhere in the CHANGELOG, commits or PR. I don't think that you (as library) need to care about the temp file being deleted. I have never seen such behavior.
But obviously it is your library and your decision.

I ran the following script against PhpWord 1.1. My file is still there, long after my script has ended. Do you get a different result?

$template = new TemplateProcessor($filename);
$tplFile = $template->save();
var_dump($tplFile);

But, reading over the documentation you sent, it describes function tmpfile, but PhpWord is using a different function tempnam. I think if PhpWord were to use tmpfile instead, it would suffer from the same problem you are already seeing - file would be deleted when TemplateProcessor is destroyed, with or without a destructor. (And I'm not sure how you'd find the file's name.)

I am sorry for misleading with the tmpfile vs tempnam functions. It's wasn't my intention to waste your time!

After further investigations I have to admit that I was partially mislead by your change. I am using Symfony and their BinaryFileResponse has a method deleteFileAfterSend() which I am using, so this is the main reason why I never had any leftover files.

I still think that the new behavior is an undocumented BC break, but now I understand why you added it.
So I am rephrasing my issue here to [DOCS] and propose to add a docblock like that:

    /**
     * Saves the result document and returns the temporary filename.
     * This file will be deleted as soon as the current TemplateProcessor instance is destructed.
     * If you need to further process the file, use saveAs() instead. 
     *
     * @return string
     */
    public function save() ...