Cisco-Talos / clamav

ClamAV - Documentation is here: https://docs.clamav.net

Home Page: https://www.clamav.net/

pdf with "pseudo" encryption

JAF84 opened this issue

hello,

I also tested this with the latest release (1.3.0).

See the attached PDF as a sample; I have a lot of samples like this.

52n31op9ob2on.pdf

In this case the PDF is encrypted, but it does not ask for a password, and all images are visible in any PDF viewer.
So some objects are encrypted, but no special password is necessary to decrypt them.

ClamAV extracts every object of the PDF, but the extracted objects are still encrypted, so they are useless for finding anything inside.
You can see the objects with "clamscan --debug --leave-temps=yes --tempdir=1.tmp ..."

So ClamAV should also decrypt these files in order to scan the parts.

br johannes

I believe this object means that the key is saved in the PDF file...

36 0 obj
<<
/CF <</StdCF <</CFM /AESV2
/Length 16
/Type /CryptFilter>>>>
/EncryptMetadata false
/Filter /Standard
/Length 128
/O <12C8E19723067F3F573A569162793847A399164D0ABD07C378E264D04385DE6C>
/P -3904
/R 4
/StmF /StdCF
/StrF /StdCF
/U <8DB1952FBC37B941D71F1E81F508A629A50EDB71F2423300B31F50D70AF2A721>
/V 4

endobj
xref
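For readers unfamiliar with the /P entry in the dictionary above: it is a signed 32-bit integer whose bits grant or deny user permissions. The following is a minimal sketch of decoding it, assuming the bit layout from the PDF 1.7 reference (standard security handler, revision 3 and later). The names `PERMISSION_BITS` and `decode_permissions` are made up for illustration and are not part of ClamAV.

```python
# Permission bits as listed in the PDF 1.7 reference; positions are 1-based.
PERMISSION_BITS = {
    3: "print",
    4: "modify contents",
    5: "copy or extract text and graphics",
    6: "add or modify annotations",
    9: "fill in form fields",
    10: "extract for accessibility",
    11: "assemble document",
    12: "print in high quality",
}


def decode_permissions(p: int) -> dict:
    """Map each permission name to True (allowed) or False (denied)."""
    flags = p & 0xFFFFFFFF  # /P is signed; view it as an unsigned 32-bit value
    return {
        name: bool(flags & (1 << (bit - 1)))
        for bit, name in PERMISSION_BITS.items()
    }


if __name__ == "__main__":
    # /P value taken from the encryption dictionary quoted above.
    for name, allowed in decode_permissions(-3904).items():
        print(f"{name}: {'allowed' if allowed else 'denied'}")
```

For -3904 every listed permission comes out as denied, which fits the idea that the file is only "protected" against printing and editing rather than against opening.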

Hi,

Thank you for notifying us about this, and I am sorry for the delay in responding to you.

Looking at our metadata, ClamAV recognizes that this file contains an encrypted image that is decryptable, but the image appears to be extracted without being decrypted.

According to pdfimages, this image is of type portable pixmap (ppm).

I am opening a ticket internally to track this issue and get it scheduled for the future. We'll update this issue when it is scheduled.

If you could provide some of your other samples, we would appreciate it.

Thanks,
Andy

Hello Andy,

Attached are some more samples.

ce2kg7bptpo7e.pdf
v554fqz6krwme.pdf
7a2ljrwiskbk.pdf
eezo7xs89c.pdf
zikjqtdw51x5uuk.pdf

These are of course unwanted spam PDFs.

But there are also legitimate PDFs that have this "pseudo-encryption",
so using this PDF feature does not generally mean that the PDF is a bad one...

br johannes

hello,

Now I can tell you more about this: the encryption is applied when you protect the PDF, e.g. to make it non-printable.

See the attached samples:

1.pdf => without encryption
2.pdf => encryption, but no password necessary => so it could/should be checked...
3.pdf => encryption, password necessary

These can easily be created with pdftk on Linux:
pdftk 1.pdf output 2.pdf owner_pw 1234
pdftk 1.pdf output 3.pdf owner_pw 1234 user_pw 4321

br johannes

That's great, thank you for the samples and for explaining where this comes from. We have some other PDF tasks planned, so hopefully we can get this addressed as part of that work.

Thanks,
Andy

By the way, this is also very interesting:

clamscan.exe --alert-encrypted=yes *pdf

1.pdf: OK
2.pdf: OK
3.pdf: Heuristics.Encrypted.PDF FOUND

So ClamAV already detects a difference between 2.pdf and 3.pdf...

I haven't had a chance to play with the new files yet, but I would imagine 3.pdf would not have 'decryptable' in the JSON output.

Just checked. 2.pdf is decryptable, 3.pdf is not.

hello Andy,

I have now also checked with ClamAV 1.3.0.
When I scan 2.pdf you are right, it shows "pdf_find_and_extract_objs: encrypted pdf found, decryptable!"

LibClamAV debug: cli_pdf: U: : a95f5a7083f9fb99bb158fcd70e503db00000000000000000000000000000000
LibClamAV debug: cli_pdf: O: : dd027d75bab3642ffd6d1b4a2020e2df0022ff603ae18bfb6769f36dd5800bfa
LibClamAV debug: check_owner_password: Unknown or unsupported encryption version. R: 3
LibClamAV debug: check_owner_password: encrypted PDF found but cannot decrypt with empty owner password
LibClamAV debug: cli_pdf: U: : a95f5a7083f9fb99bb158fcd70e503db00000000000000000000000000000000
LibClamAV debug: cli_pdf: O: : dd027d75bab3642ffd6d1b4a2020e2df0022ff603ae18bfb6769f36dd5800bfa
LibClamAV debug: cli_pdf: md5: f57ac02ebae3c6f4fd80ca480c0db974
LibClamAV debug: cli_pdf: Candidate encryption key: f57ac02ebae3c6f4fd80ca480c0db974
LibClamAV debug: cli_pdf: fileID: 2bc8cb8f258e5c34c306e9bdf5ac31e7
LibClamAV debug: cli_pdf: computed U (R>=3): a95f5a7083f9fb99bb158fcd70e503db
LibClamAV debug: check_user_password: user password is empty
LibClamAV debug: pdf_find_and_extract_objs: encrypted pdf found, decryptable!
LibClamAV debug: Bytecode executing hook id 258 (0 hooks)
LibClamAV debug: Bytecode: no logical signature matched, no bytecode executed
LibClamAV debug: pdf_find_and_extract_objs: (parsed hooks) returned 0
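The "md5", "Candidate encryption key" and "computed U (R>=3)" lines in this log correspond to the key derivation and user-password check of the PDF standard security handler. The following is a minimal sketch of those steps for revisions 2 to 4, written from the PDF 1.7 reference (Algorithms 3.2, 3.4 and 3.5) and not taken from ClamAV's source. All function names are made up for illustration, and the 16-byte key length is an assumption based on the key length shown in the log.

```python
import hashlib
import struct

# 32-byte password padding string defined by the PDF reference.
PAD = bytes.fromhex(
    "28bf4e5e4e758a4164004e56fffa0108"
    "2e2e00b6d0683e802f0ca9fe6453697a"
)


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def compute_key(password, o, p, file_id, r, key_len=16, encrypt_metadata=True):
    """Algorithm 3.2: derive the file encryption key from a user password."""
    n = key_len if r >= 3 else 5  # revision 2 always uses a 40-bit key
    md5 = hashlib.md5((password + PAD)[:32])
    md5.update(o)  # the /O string
    md5.update(struct.pack("<I", p & 0xFFFFFFFF))  # /P, low-order byte first
    md5.update(file_id)  # first element of the trailer /ID array
    if r >= 4 and not encrypt_metadata:
        md5.update(b"\xff\xff\xff\xff")
    digest = md5.digest()
    if r >= 3:  # extra work for revision 3 and later
        for _ in range(50):
            digest = hashlib.md5(digest[:n]).digest()
    return digest[:n]


def compute_u(key, file_id, r):
    """Algorithms 3.4 (R2) and 3.5 (R>=3): expected /U for a given key."""
    if r == 2:
        return rc4(key, PAD)
    out = rc4(key, hashlib.md5(PAD + file_id).digest())
    for i in range(1, 20):
        out = rc4(bytes(b ^ i for b in key), out)
    return out  # the stored /U is these 16 bytes plus 16 arbitrary bytes


def user_password_is_empty(u, o, p, file_id, r, **kwargs):
    """True if the empty user password reproduces the stored /U value."""
    expected = compute_u(compute_key(b"", o, p, file_id, r, **kwargs), file_id, r)
    return u[: len(expected)] == expected
```

To reproduce the log for 2.pdf, one would pass the O, U and fileID values printed above together with the /P and /Length values from the file's encryption dictionary, which the log does not show. When the check succeeds, the derived key is what a scanner would need in order to decrypt the extracted streams.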

When I use --leave-temps=yes with 1.pdf, I see the "hello world" object in the temp files.

But with 2.pdf the extracted temp files are all still encrypted, and so useless.
So it's not possible to create signatures for the PDF parts...

I have now also tested a PDF with an image.
I used --leave-temps to get the image file and created a hash-based signature of it.

After that, the unencrypted file was marked as infected.
Then I used "pdftk 1.pdf output 2.pdf owner_pw 1234" to encrypt it.

ClamAV told me "decryptable", but did not mark the file as infected.

So maybe ClamAV is able to decrypt it, but does not use the decrypted parts for some reason?

br johannes
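To illustrate why the hash-based signature stops matching: a ClamAV .hdb entry has the form MD5:FileSize:MalwareName, so it only matches when the extracted object is byte-for-byte identical to the file that was hashed. The following is a minimal sketch that builds such a line; the function name and the signature name are made up for illustration, and in practice sigtool would normally be used for this.

```python
import hashlib


def hdb_signature(data: bytes, name: str) -> str:
    """Return one line in .hdb format: MD5:FileSize:MalwareName."""
    return f"{hashlib.md5(data).hexdigest()}:{len(data)}:{name}"


if __name__ == "__main__":
    # Placeholder bytes standing in for an object extracted with --leave-temps.
    plain = b"hello world"
    still_encrypted = bytes(b ^ 0x5A for b in plain)  # stand-in for ciphertext
    print(hdb_signature(plain, "Test.ExtractedObject"))
    print(hdb_signature(still_encrypted, "Test.ExtractedObject"))
```

The two lines differ in their hash field, which is the situation described above: a signature made from the plaintext object cannot match an object that is extracted while still encrypted.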

I think the 'LibClamAV debug: check_owner_password: Unknown or unsupported encryption version. R: 3' message is the problem. When that statement is printed in our PDF parser, the parser does not attempt to decrypt that block, but the decryptable flag is still printed because we should be able to decrypt it.

We have some other planned work to do on the pdf parser, so hopefully we can get this implemented as part of that.

Thank you for digging into this!

I have a lot of different samples here, but many of them contain business data,
which I cannot post here.

But if there is a beta version to test, please let me know...

In any case, if a file is encrypted like 2.pdf,
even if the encryption only denies printing, and ClamAV fails to decrypt it,
then it should also be marked as "Heuristics.Encrypted.PDF" or something similar...

There may be other encryption schemes that ClamAV fails to decrypt,
and in the future there may be new ways to encrypt PDF files...

Can you also think about this?

By the way, if you think this is the problem:
"Unknown or unsupported encryption version. R: 3"

This should be easy to fix if revision 4 is already working?
See the PDF reference 1.7:
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf
pages 125 and 126.

There are some additional steps to do
if the revision is 3+ or 4+...

br johannes
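The additional steps for revision 3 and later on the owner-password side can be sketched as follows. This is a minimal illustration of Algorithms 3.3 and 3.7 from the PDF 1.7 reference, not ClamAV's check_owner_password; the function names are made up, and the 16-byte key length is an assumption. Compared with revision 2, revision 3+ adds 50 extra MD5 rounds and 19 extra RC4 passes with a key XORed by the pass counter.

```python
import hashlib

# 32-byte password padding string defined by the PDF reference.
PAD = bytes.fromhex(
    "28bf4e5e4e758a4164004e56fffa0108"
    "2e2e00b6d0683e802f0ca9fe6453697a"
)


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def owner_key(owner_pw: bytes, r: int, key_len: int = 16) -> bytes:
    """RC4 key derived from the owner password (first steps of Algorithm 3.3)."""
    digest = hashlib.md5((owner_pw + PAD)[:32]).digest()
    if r >= 3:  # revision 3+: re-hash the digest 50 more times
        for _ in range(50):
            digest = hashlib.md5(digest).digest()
    return digest[: key_len if r >= 3 else 5]


def compute_o(owner_pw: bytes, user_pw: bytes, r: int, key_len: int = 16) -> bytes:
    """Algorithm 3.3: the /O value a PDF writer would store."""
    key = owner_key(owner_pw, r, key_len)
    out = rc4(key, (user_pw + PAD)[:32])
    if r >= 3:  # revision 3+: 19 more passes, key XORed with 1..19
        for i in range(1, 20):
            out = rc4(bytes(b ^ i for b in key), out)
    return out


def recover_padded_user_password(o: bytes, owner_pw: bytes, r: int,
                                 key_len: int = 16) -> bytes:
    """Algorithm 3.7: undo the /O encryption with a candidate owner password."""
    key = owner_key(owner_pw, r, key_len)
    if r == 2:
        return rc4(key, o)
    out = o
    for i in range(19, -1, -1):  # counter runs from 19 down to 0
        out = rc4(bytes(b ^ i for b in key), out)
    return out
```

With the pdftk example from earlier in the thread (owner password 1234, no user password), `recover_padded_user_password(compute_o(b"1234", b"", 3), b"1234", 3)` returns the padding string itself, i.e. an empty user password. The recovered value would then be validated against /U with the user-password check.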

Unfortunately, we have a few other high-priority tasks that we need to address before we can get started on this. There is some other PDF work we need to do, so we plan on fixing this as part of that work.

I'll definitely let you know when there is something to test on your other samples.

Andy