PDF support

Question

oliver021 opened this issue 3 years ago

Hello friends, I was wondering whether it would be a good idea to include metadata extractors for other types of files, such as PDFs, Excel spreadsheets, and Word documents, since these files contain a lot of metadata as well. I have not seen any metadata extraction library that covers them. It would be a valuable addition, since the title of this library does not limit it to the metadata of multimedia files.

Drew Noakes · Answer 1 · Thu Mar 18 2021 08:05:15 GMT+0800 (China Standard Time)

The library is open to the addition of support for other kinds of data, with the following guidelines:

- No dependencies on external libraries (we have only one exception to this, for XMP processing)
- Metadata must be representable using the directory/tag structure we use throughout

Support for PDF is being tracked in the sibling Java library in drewnoakes/metadata-extractor#327. I have no issue with supporting other document types as you suggest.
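
For readers unfamiliar with the directory/tag structure named in the guidelines above, here is a minimal, self-contained sketch of the idea in Python. It is not the library's API: the `Directory` class, the `PDF_INFO_TAGS` table, the integer tag IDs, and `directory_from_pdf_info` are all hypothetical names invented for illustration. The only outside fact it relies on is that a PDF's document information dictionary uses keys such as Title, Author, and Producer.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical tag table: integer tag IDs mapped to display names.
# The IDs are arbitrary; the names are keys from a PDF's info dictionary.
PDF_INFO_TAGS: Dict[int, str] = {1: "Title", 2: "Author", 3: "Producer"}


@dataclass
class Directory:
    """A named group of tags, each identified by an integer ID (illustrative model)."""
    name: str
    tag_names: Dict[int, str]
    values: Dict[int, object] = field(default_factory=dict)

    def set_value(self, tag_id: int, value: object) -> None:
        # Only tags declared in the directory's tag table may be stored.
        if tag_id not in self.tag_names:
            raise KeyError(f"Unknown tag ID {tag_id} for directory {self.name}")
        self.values[tag_id] = value

    def tags(self) -> List[str]:
        # One printable line per stored tag, ordered by tag ID.
        return [
            f"[{self.name}] {self.tag_names[tag_id]} - {self.values[tag_id]}"
            for tag_id in sorted(self.values)
        ]


def directory_from_pdf_info(info: Dict[str, str]) -> Directory:
    """Map already-parsed info-dictionary entries onto a directory, skipping unknown keys."""
    ids_by_name = {name: tag_id for tag_id, name in PDF_INFO_TAGS.items()}
    directory = Directory("PDF", PDF_INFO_TAGS)
    for key, value in info.items():
        tag_id = ids_by_name.get(key)
        if tag_id is not None:
            directory.set_value(tag_id, value)
    return directory


if __name__ == "__main__":
    sample = {"Title": "Annual Report", "Author": "Jane Doe", "Custom": "ignored"}
    for line in directory_from_pdf_info(sample).tags():
        print(line)
```

The sketch assumes the PDF has already been parsed into key/value pairs; the parsing itself, which is the hard part, is out of scope here. The point is only that each new file format would contribute one or more directories whose entries are simple tag/value pairs.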

Oliver Valiente · Answer 2 · Thu Mar 18 2021 11:28:18 GMT+0800 (China Standard Time)

Okay, I'll make a pull request now. Thanks for responding!

Drew Noakes · Answer 3 · Thu Mar 18 2021 11:31:46 GMT+0800 (China Standard Time)

@oliver021 fantastic, thanks.

Vincent Marnier · Answer 4 · Tue Jun 22 2021 15:53:47 GMT+0800 (China Standard Time)

Hello,
Is there any update on this?
I can't find the mentioned pull request.

Oliver Valiente · Answer 5 · Wed Jun 23 2021 02:33:49 GMT+0800 (China Standard Time)

Hello, I have not been able to do anything about it. I had a drastic change of plans in my schedule, and I now have very little time.