nymann / pdf-scrub

PDF Scrub

Scrubs text watermarks and metadata from encrypted, compressed PDF files.

  1. Decrypts the PDF if it's encrypted
  2. Uncompresses the PDF
  3. Removes metadata (Xpacket)
  4. Tries to naively remove text-based watermarks by matching objects whose number of occurrences equals the PDF page count. If multiple objects match, produces a PDF for each.
  5. Compresses the PDF again, unless --no-compress is given as a command line argument.
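
The matching heuristic in step 4 can be sketched roughly as follows. This is an illustration only, not pdf_scrub's actual code: the `watermark_candidates` helper is hypothetical, and it assumes "objects" means text-showing operators (`(...) Tj`) found in an already uncompressed PDF.

```python
import re
from collections import Counter

# Hypothetical pattern: a PDF literal string followed by the Tj (show text) operator.
TEXT_SHOW = re.compile(rb"\((?:\\.|[^\\()])*\)\s*Tj")


def watermark_candidates(pdf_bytes: bytes, page_count: int) -> list[bytes]:
    """Return text operators that occur exactly once per page.

    Anything repeated exactly page_count times is treated as a possible
    watermark. This is naive: a running header would match as well.
    """
    counts = Counter(TEXT_SHOW.findall(pdf_bytes))
    return [op for op, n in counts.items() if n == page_count]
```

Each returned candidate would correspond to one output PDF with that candidate removed, which is why several files can be produced when more than one object matches.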

Usage

$ pdf_scrub --help
Usage: pdf_scrub [OPTIONS] FILES...

Arguments:
  FILES...  [required]

Options:
  --compress / --no-compress      Compress the final pdf to reduce file size greatly  [default: compress]

Dependencies

Requires qpdf and pdftk.
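
A quick way to check that both tools are installed, plus a sketch of how they might be invoked. Which tool pdf_scrub uses for which step is not stated here, so the split below (qpdf for decryption, pdftk for uncompressing) is an assumption, and the helper names are hypothetical.

```python
import shutil


def missing_tools(tools=("qpdf", "pdftk")):
    """Return the names of tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def decrypt_cmd(src, dst):
    # Assumption: decryption is done with qpdf's --decrypt option,
    # which writes an unencrypted copy of src to dst.
    return ["qpdf", "--decrypt", src, dst]


def uncompress_cmd(src, dst):
    # Assumption: uncompressing is done with pdftk's "uncompress"
    # output option, which expands page streams so they can be edited.
    return ["pdftk", src, "output", dst, "uncompress"]
```

The command lists can be passed to `subprocess.run(cmd, check=True)` once `missing_tools()` returns an empty list.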

Development

For help getting started with development, see DEVELOPMENT.md.