Remove extended filesystem attributes on Mac/Linux (xattr/mdls) such as Source/Quelle
tayfuuun opened this issue · comments
Thanks, good catch! I'll check it out. In the meantime, if you could find the exiftool command line options that remove it, it will be even easier for me to modify ExifCleaner to use those options automatically. The easiest way to do this would be to run the exiftool command and verify that it removed the Source/Quelle field in your sample PDF. That will help me get this feature ready faster. If not, it's OK, I can figure it out. It just might take me a bit longer.
@szTheory sorry no time for this one.
Good luck and thank you.
@szTheory can you please check the source for the other file formats too? JPG, PNG.
@szTheory any update?
Sorry no, can you provide some generic example documents and images with a Source/Quelle that is not being erased? I also recently released a new version of ExifCleaner. It's a long shot but maybe you can download that and see if it helps, since I did fix a few bugs.
@szTheory files and tests with the latest version 3.4.0
[Edit: files removed]
Results
Even with the newest version, the Source is not removed from PDF and PNG files.
Interesting, even if I run exiftool directly on those files, even with the -v verbose flag, it's not picking up the Source/Quelle metadata, but when I tested it on a Mac it shows the field in the file info window for both the PDF and the PNG. I'll have to look into this more.
If you run mdls myfile.png it shows what looks like some Mac-specific metadata, like kMDItemProfileName and kMDItemWhereFroms, that exiftool is not picking up on. I will have to see how to add support: whether it's built into exiftool already and just needs some different command line flags, or whether ExifCleaner has to bolt on extra functionality. In the meantime you can remove the metadata with xattr -c myfilehere.pdf (the -c flag means clear) and confirm afterwards by running mdls again on the file. See this link for more info.
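The manual workaround above (list, clear with xattr -c, list again) can be sketched in Python as follows. This is only an illustration, not ExifCleaner's implementation: the helper names run_cmd, list_xattrs and clear_and_verify are hypothetical, and the sketch assumes the macOS xattr tool is on the PATH and prints one attribute name per line when given a single file.

```python
import subprocess
from typing import Callable, List, Tuple


def run_cmd(args: List[str]) -> str:
    """Run a command and return its stdout as text (raises if it fails)."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def list_xattrs(path: str, run: Callable[[List[str]], str] = run_cmd) -> List[str]:
    """Return the attribute names printed by `xattr <path>`, one per line."""
    return [line.strip() for line in run(["xattr", path]).splitlines() if line.strip()]


def clear_and_verify(
    path: str, run: Callable[[List[str]], str] = run_cmd
) -> Tuple[List[str], List[str]]:
    """List attributes, clear them with `xattr -c`, then list them again.

    Returns (before, after) so the caller can check that `after` is empty.
    The `run` parameter is injectable only to make the sketch testable.
    """
    before = list_xattrs(path, run)
    run(["xattr", "-c", path])
    after = list_xattrs(path, run)
    return before, after
```

On a Mac, calling clear_and_verify("myfilehere.pdf") would return the attribute names found before and after clearing. A non-empty second list would mean xattr -c left something behind.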
@szTheory xattr -c myfilehere.pdf is working! Nice. If you implement this in your tool, you would make me very happy.
Current plan for Mac
- Spin up an extra mdls process to read extended filesystem attributes for the "# exif before" column, then an xattr process with the -c flag to clear them, then another mdls process to populate the "# exif after" column.
- If possible, figure out a way to keep these processes alive in a process pool so they can handle multiple files and minimize process overhead, like is done with exiftool.
- Or pass multiple files at once to a single process per CPU core.
- Investigate whether there are any extended filesystem attributes that xattr -c still leaves behind and how to deal with them.
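The "multiple files per process, one process per CPU core" option from the Mac plan could look roughly like this. It is a sketch under assumptions: split_into_batches and clear_commands are hypothetical names, and it relies on xattr -c accepting several file arguments in one invocation. It only builds the command lines; running them (for example with subprocess.run) is left to the caller.

```python
import os
from typing import List, Optional, Sequence


def split_into_batches(
    paths: Sequence[str], workers: Optional[int] = None
) -> List[List[str]]:
    """Distribute paths round-robin over at most `workers` batches.

    `workers` defaults to the number of CPU cores reported by the OS.
    """
    workers = workers or os.cpu_count() or 1
    workers = max(1, min(workers, len(paths)))
    batches: List[List[str]] = [[] for _ in range(workers)]
    for index, path in enumerate(paths):
        batches[index % workers].append(path)
    return [batch for batch in batches if batch]


def clear_commands(
    paths: Sequence[str], workers: Optional[int] = None
) -> List[List[str]]:
    """Build one `xattr -c file [file ...]` command per batch."""
    return [["xattr", "-c", *batch] for batch in split_into_batches(paths, workers)]
```

With three files and two workers, this produces two commands, so only two processes are spawned instead of three.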
Current plan for Linux
- Research the extended filesystem attributes more. There is probably variation between the Linux filesystems.
- If possible, find a single tool that deals with all the Linux filesystems uniformly.
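As a possible starting point for the Linux research, Python's standard library exposes os.listxattr and os.removexattr (available on Linux only). The sketch below is an assumption about what a cleanup could look like, not a vetted solution: clear_user_xattrs is a hypothetical helper, and limiting removal to the user. namespace is a choice made for this example so that system-managed attributes are left alone. Filesystems without extended attribute support will raise OSError.

```python
import os
from typing import Callable, Iterable, List, Optional, Tuple

# Assumption for this sketch: only attributes in the "user." namespace are
# treated as removable metadata; other namespaces are left untouched.
REMOVABLE_PREFIXES: Tuple[str, ...] = ("user.",)


def clear_user_xattrs(
    path: str,
    lister: Optional[Callable[[str], Iterable[str]]] = None,
    remover: Optional[Callable[[str, str], object]] = None,
    prefixes: Tuple[str, ...] = REMOVABLE_PREFIXES,
) -> List[str]:
    """Remove matching extended attributes from `path`; return removed names.

    `lister` and `remover` default to os.listxattr and os.removexattr and are
    injectable only so the sketch can be exercised without a real filesystem.
    """
    lister = lister or os.listxattr
    remover = remover or os.removexattr
    removed: List[str] = []
    for name in lister(path):
        if name.startswith(prefixes):
            remover(path, name)
            removed.append(name)
    return removed
```

The returned list could feed a before/after count similar to the "# exif before" and "# exif after" columns.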
Current plan for Windows
- Find an existing command line tool, perhaps written in C/C++ or PowerShell, that cleans Windows extended filesystem attributes.
Current plan for all OS targets
- Extract the extended filesystem attribute cleanup into a single NPM package, or a C/C++ tool with a Node C API extension.
Without help, this is likely going to take more than a year of low-time-commitment work. If someone provides a drop-in solution like the one that exists with exiftool, it will go faster.
@szTheory one small update: the command xattr -c myfilehere.pdf works fine for images (.png, .jpg), but when I use it on PDF files, not all of the information is deleted on macOS.
Thanks, yeah, I noticed that too. I'm not sure what to do about it. Maybe it's a bug in xattr. I couldn't find any guide that mentions this failing. Everything just recommended using xattr, which clearly is not doing enough, even after I played around with all of its command line options.
I don't know enough about these filesystems to find a comprehensive solution, so hopefully someone can recommend a starting point. Ideally there would be one tool that gets rid of all the extended filesystem attributes. Better yet, one that works for all filesystem types, across all the major operating systems. Then that tool could be vetted and integrated with ExifCleaner for a single drag and drop that gets rid of everything, instead of being a patchwork process that depends on your environment.
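One way to narrow down what xattr -c leaves behind on PDFs is to capture the mdls output before and after clearing and compare the key names. The sketch below is a diagnostic aid only, with hypothetical helper names (mdls_keys, compare_mdls). It assumes mdls prints each key at the start of a line followed by "=", with multi-line values indented or parenthesized on the following lines.

```python
import re
from typing import Set, Tuple

# A key line looks like: kMDItemWhereFroms   = (
_KEY_LINE = re.compile(r"^(\w+)\s*=")


def mdls_keys(output: str) -> Set[str]:
    """Extract the metadata key names from captured `mdls` output."""
    keys: Set[str] = set()
    for line in output.splitlines():
        match = _KEY_LINE.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def compare_mdls(before: str, after: str) -> Tuple[Set[str], Set[str]]:
    """Return (removed, remaining) key names between two mdls outputs."""
    before_keys = mdls_keys(before)
    after_keys = mdls_keys(after)
    return before_keys - after_keys, before_keys & after_keys
```

Keys that show up in the remaining set after xattr -c are candidates for metadata that is not stored as an extended attribute, which may help decide where to look next.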