neilharvey / FileSignatures

A small library for detecting the type of a file based on header signature (also known as magic number).

RTFD Files

abbottmw opened this issue · comments

Is there a way to identify an RTFD file? They are basically a package, but are used on Apple/Macs with TextEdit.
I am not currently on a Mac, but FileSignatures identifies the one RTFD file I found on the web as a zip. Looking at the bytes, it does have the same signature as a zip, but I don't have more examples to test, so I don't know how to tell an RTFD file apart from a plain zip file.

RTFD File extension

Unless RTFD files have a unique set of magic bytes, it's probably going to be hard to detect them reliably.
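
To illustrate the problem, here is a minimal Python sketch (not the library's C# API). It builds two in-memory archives, one plain zip and one laid out like the zipped RTFD described in this thread, and shows that both start with the same zip local-file-header signature `PK\x03\x04`. The helper names `starts_with` and `make_zip` are hypothetical and exist only for this example.

```python
import io
import zipfile

# Local file header signature found at the start of a non-empty zip archive.
ZIP_MAGIC = b"PK\x03\x04"


def starts_with(data: bytes, signature: bytes, offset: int = 0) -> bool:
    """Return True if `signature` appears in `data` at `offset`."""
    return data[offset:offset + len(signature)] == signature


def make_zip(entries: dict) -> bytes:
    """Build an in-memory zip archive from a {name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buf.getvalue()


plain_zip = make_zip({"notes.txt": b"hello"})
zipped_rtfd = make_zip({"rtfdtest.rtfd/TXT.rtf": b"{\\rtf1\\ansi hello}"})

# Both archives carry the same leading bytes, so the header alone
# cannot tell them apart.
print(starts_with(plain_zip, ZIP_MAGIC), starts_with(zipped_rtfd, ZIP_MAGIC))
```

Since the leading bytes are identical, any distinction has to come from the archive contents rather than from the header.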

It might be possible to leverage the RTF format, https://github.com/neilharvey/FileSignatures/blob/master/src/FileSignatures/Formats/RichTextFormat.cs, since that's mostly what RTFD is - an RTF and some attachments.

Do you have an example file which you could attach?

It sounds similar to how Office documents work, which are zip archives that contain a specific set of files.

For example, if you treat a Word document as an archive and list the contents you'll get something similar to the following:

> tar -tf FileSignatures\test\FileSignatures.Tests\Samples\test.docx
[Content_Types].xml
_rels/.rels
word/_rels/document.xml.rels
word/document.xml
word/theme/theme1.xml
word/settings.xml
word/fontTable.xml
word/webSettings.xml
docProps/app.xml
docProps/core.xml
word/styles.xml

We detect those by looking for a unique entry which always exists: for Word that would be word/document.xml, for Excel it's xl/workbook.xml, for PowerPoint it's ppt/presentation.xml, and so on.
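
As a rough illustration of that idea, here is a minimal Python sketch. It is not how FileSignatures implements it (the library is written in C#); the function name `detect_office_type` and the `OFFICE_MARKERS` mapping are hypothetical, and only the three entry names come from the comment above.

```python
import io
import zipfile

# Marker entries taken from the comment above; the mapping itself is
# an illustrative assumption, not the library's data structure.
OFFICE_MARKERS = {
    "word/document.xml": "Word",
    "xl/workbook.xml": "Excel",
    "ppt/presentation.xml": "PowerPoint",
}


def detect_office_type(data: bytes):
    """Return the Office type whose marker entry is present, else None."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = set(zf.namelist())
    except zipfile.BadZipFile:
        # Not a readable zip archive at all.
        return None
    for marker, kind in OFFICE_MARKERS.items():
        if marker in names:
            return kind
    return None
```

Called with the bytes of a .docx file, this should return "Word"; a zip without any of the marker entries yields None.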

You might be able to do something along the same lines - but you'll need some example files to look for common patterns.

Attaching the only example file I have. I'll see about getting other examples. Looks like it has a folder named after the file.
Rename this to .rtfd if you like; I couldn't upload an RTFD file. I'll try to do something similar to how you handle docx etc.

convert_rtfd_to_rtf2.zip

Looks like it is just an archive with a folder of the same name as the file, and inside that folder there is a TXT.rtf file. I was able to get detection to work, but it seems like it's easily spoofed if you just create a zip file with that structure.

I need to get a Mac haha

> tar -tf rtfdtest.rtfd
rtfdtest.rtfd/
rtfdtest.rtfd/TXT.rtf
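
Based on the listing above, a detection along these lines might look like the following Python sketch. It is not the detection written for this issue. It assumes the archive has a top-level folder whose name ends in .rtfd containing TXT.rtf, which is inferred from a single sample, and it additionally checks that TXT.rtf starts with the RTF header `{\rtf1`. The function name `looks_like_zipped_rtfd` is hypothetical.

```python
import io
import posixpath
import zipfile

# An RTF document starts with this header.
RTF_MAGIC = b"{\\rtf1"


def looks_like_zipped_rtfd(data: bytes) -> bool:
    """Heuristic: zip with <name>.rtfd/TXT.rtf whose content starts as RTF."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for name in zf.namelist():
                folder, leaf = posixpath.split(name)
                # Assumption: the bundle folder sits at the archive root.
                is_bundle_entry = (
                    leaf == "TXT.rtf"
                    and "/" not in folder
                    and folder.lower().endswith(".rtfd")
                )
                if not is_bundle_entry:
                    continue
                with zf.open(name) as entry:
                    if entry.read(len(RTF_MAGIC)) == RTF_MAGIC:
                        return True
    except zipfile.BadZipFile:
        return False
    return False
```

The header check only raises the bar slightly: anyone can still build a zip with this layout and a valid RTF file inside, so the spoofing concern remains.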

Ah yeah that makes sense. I found this article which seems to confirm what you've found:

"An RTFD file is a bundle, containing multiple files. It contains a Rich Text file called TXT.rtf that contains Rich Text formatting commands, as well as commands for including images or other attachments contained within the bundle. Images used in the document are stored in the bundle in their native formats."

https://fileformats.fandom.com/wiki/Rich_Text_Format_Directory

Glad you managed to get it working!

Does anyone have an idea how to add CSV and TXT files to this utility?
Do CSV or TXT files have any magic numbers?

@Harmeet94Singh No, unfortunately a CSV/TXT file does not have any file signature.