benfulcher / hctsa

Highly comparative time-series analysis

Home Page: https://time-series-features.gitbook.io/hctsa-manual/

TISEAN d2 leaves temporary files undeleted

sdvillal opened this issue · comments

On a side note, I think there is an unlikely but real danger of temporary filename collisions when running in parallel (it has already happened to me once). We should probably create those files in the system temporary directory and make sure their names are unique.

Yeah, good point. Files would have to be created in the same millisecond to collide, which is rare but possible. Adding details of the function inputs to the filename would make these files unique (avoiding collisions between TISEAN temp files written by the same function, such as d2, when it is run in parallel with different input parameters).

Fixed -- tmp files are now better named (with the parameters used) and placed in the system temp directory.
There was an issue with TISEAN not writing the full file name (for c1) because the new tmp filename was too long, so I shortened the filename and added a check for the appropriate file.
Updates are in the OperationChanges branch.

That fix still leaves room for filename collisions when using multiprocessing, so it is not fully correct.

What about using tempname, which should be safer than just using the clock (as long as the JVM is in use), or prepending the process_id, or both?

I read that the process id can be found like this:

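function pid = portable_pid()
% Return the ID of the current process, in either Octave or MATLAB.

% OCTAVE_VERSION exists as a built-in function only in Octave
isOctave = exist('OCTAVE_VERSION', 'builtin') ~= 0;

if isOctave
  pid = getpid;
else
  pid = feature('getpid');  % undocumented MATLAB call
end

end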
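
To make the "or both" option concrete, here is a minimal sketch of how tempname and the process id might be combined into one file name. The function name unique_tmp_name, the tag argument and the .dat extension are hypothetical, chosen only for illustration. This is not the code that went into hctsa.

```matlab
function fn = unique_tmp_name(tag)
% Hypothetical helper (not part of hctsa): build a temp file path that is
% unlikely to collide across parallel MATLAB/Octave processes.
% tag is a short label such as 'd2'.

% Get the process id in a way that works in both Octave and MATLAB
if exist('OCTAVE_VERSION', 'builtin') ~= 0
  pid = getpid;
else
  pid = feature('getpid');  % undocumented MATLAB call
end

% tempname already returns a path inside the system temp directory;
% appending the tag and pid separates processes that happen to get the
% same base name
fn = sprintf('%s_%s_%d.dat', tempname, tag, pid);

end
```

A caller could then do something like fn = unique_tmp_name('d2'); and, once the file has been written, cleaner = onCleanup(@() delete(fn)); so the file is removed when the calling function exits. Given the TISEAN problem with long file names mentioned above, the tag would presumably need to stay short.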

Yeah, thanks -- you're right. When running as normal, with one time series at a time, you never risk running the same function with the same input parameters at the same time, but in other set-ups it could happen (running multiple instances of Matlab, for example). I'll try tempname.

Now implemented here: bd0d10a