Ability to retrieve page size and rotation information in JSON
xelan opened this issue
Andreas Erhard commented
Using pdfcpu info, the page sizes and rotation info can be shown:
$ pdfcpu info -u mm -p 0-99999 pagesizes.pdf
pages: 0,1,2,3
pagesizes.pdf:
Source: pagesizes.pdf
PDF version: 1.6
Page count: 3
Page 1: rot=+0 orientation:portrait
MediaBox (mm) (0.00, 0.00, 210.01, 297.00) w=210.01 h=297.00 ar=0.71 = CropBox, TrimBox, BleedBox, ArtBox
Page 2: rot=+0 orientation:landscape
MediaBox (mm) (0.00, 0.00, 297.00, 210.01) w=297.00 h=210.01 ar=1.41 = CropBox, TrimBox, BleedBox, ArtBox
Page 3: rot=+0 orientation:landscape
MediaBox (mm) (0.00, 0.00, 420.00, 297.00) w=420.00 h=297.00 ar=1.41 = CropBox, TrimBox, BleedBox, ArtBox
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Title:
Author:
Subject:
PDF Producer: LibreOffice 7.0
Content creator: Writer
Creation date: D:20240227100816+01'00'
Modification date:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Tagged: No
Hybrid: No
Linearized: No
Using XRef streams: No
Using object streams: No
Watermarked: No
Thumbnails: No
Form: No
Outlines: No
Names: No
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Encrypted: No
Permissions: Full access
However, the page size and rotation information is not included in the JSON:
$ pdfcpu info -j -u mm -p 0-99999 pagesizes.pdf
pages: 0,1,2,3
{
  "header": {
    "version": "pdfcpu v0.6.0 dev",
    "creation": "2024-02-27 09:09:25 UTC"
  },
  "Infos": [
    {
      "source": "pagesizes.pdf",
      "version": "1.6",
      "pages": 3,
      "title": "",
      "author": "",
      "subject": "",
      "producer": "LibreOffice 7.0",
      "creator": "Writer",
      "creationDate": "D:20240227100816+01'00'",
      "modificationDate": "",
      "keywords": [],
      "properties": {},
      "tagged": false,
      "hybrid": false,
      "linearized": false,
      "usingXRefStreams": false,
      "usingObjectStreams": false,
      "watermarked": false,
      "thumbnails": false,
      "form": false,
      "signatures": false,
      "appendOnly": false,
      "bookmarks": false,
      "names": false,
      "encrypted": false,
      "permissions": 0
    }
  ]
}
Furthermore, it would be useful to have an option that selects all pages, instead of specifying a hardcoded range such as 0-99999.
Thank you very much, best regards
Andreas
Horst Rutter commented
Use 1- to select all pages, as documented in pdfcpu selectedpages.
More exhaustive JSON output is in the pipeline.
Andreas Erhard commented
Thank you very much 👍
Horst Rutter commented
This is fixed with https://github.com/pdfcpu/pdfcpu/releases/tag/v0.7.0
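
For anyone on a version whose JSON output still lacks per-page details, one possible stopgap is to parse the plain-text output instead. The following is only a sketch: it assumes the text layout matches the "Page N: rot=..." and "MediaBox (unit) ..." lines quoted earlier in this issue, which is not a stable interface. The function name parse_page_info and the JSON key names are made up for illustration and are not part of pdfcpu.

```python
import json
import re

# Patterns modelled on the text output quoted in this issue (an assumption,
# not a documented format):
#   Page 1: rot=+0 orientation:portrait
#   MediaBox (mm) (0.00, 0.00, 210.01, 297.00) w=210.01 h=297.00 ar=0.71 = ...
PAGE_RE = re.compile(r"^\s*Page (\d+): rot=([+-]?\d+) orientation:(\w+)")
BOX_RE = re.compile(r"^\s*MediaBox \((\w+)\) \([^)]*\) w=([\d.]+) h=([\d.]+)")


def parse_page_info(text):
    """Collect page number, rotation, orientation and MediaBox size per page.

    Hypothetical helper: the returned key names are chosen for this sketch.
    """
    pages = []
    for line in text.splitlines():
        m = PAGE_RE.match(line)
        if m:
            # A "Page N:" line starts a new page record.
            pages.append({
                "page": int(m.group(1)),
                "rotation": int(m.group(2)),
                "orientation": m.group(3),
            })
            continue
        m = BOX_RE.match(line)
        if m and pages:
            # A MediaBox line belongs to the most recent page record.
            pages[-1].update(
                unit=m.group(1),
                width=float(m.group(2)),
                height=float(m.group(3)),
            )
    return pages


# Sample input copied from the output shown in this issue.
SAMPLE = """\
Page 1: rot=+0 orientation:portrait
MediaBox (mm) (0.00, 0.00, 210.01, 297.00) w=210.01 h=297.00 ar=0.71 = CropBox, TrimBox, BleedBox, ArtBox
Page 2: rot=+0 orientation:landscape
MediaBox (mm) (0.00, 0.00, 297.00, 210.01) w=297.00 h=210.01 ar=1.41 = CropBox, TrimBox, BleedBox, ArtBox
Page 3: rot=+0 orientation:landscape
MediaBox (mm) (0.00, 0.00, 420.00, 297.00) w=420.00 h=297.00 ar=1.41 = CropBox, TrimBox, BleedBox, ArtBox
"""

if __name__ == "__main__":
    print(json.dumps(parse_page_info(SAMPLE), indent=2))
```

In practice the input text would be captured from a command such as pdfcpu info -u mm -p 1- pagesizes.pdf, using only the flags and the 1- page selection mentioned in this thread. Upgrading to the release linked above is the more robust route, since text parsing breaks as soon as the output layout changes.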