pdfcpu / pdfcpu

A PDF processor written in Go.

Home Page: http://pdfcpu.io/

Working with bytes (slices or buffers) instead of files

dc0d opened this issue

Are there any plans to add the capability of working with bytes - instead of files - to the API?

More context:
I created a tiny tool that converts an 8-page PDF to PocketMod in WebAssembly. Many of the tools I needed (rotating pages, rearranging them, merging them, etc.) that can operate on bytes were already there. The only missing part was ExtractPages(...) - for my use case.

I also experimented with the idea of working with a "File System" inside WebAssembly (e.g., messing around with the JavaScript file included in Go WebAssembly to make it possible to use IndexedDB or something as the storage), and it turned out to be a far greater challenge than just transforming byte streams in memory.

This issue aims to determine whether there are any plans for, or interest in, this capability and better WebAssembly support in general, and whether it is worth pursuing, because it is not a small change.

Many API commands already work with readers.
Can you be more specific about what you are missing?
Thank you!

func ExtractPages(rs io.ReadSeeker, outDir, fileName string, selectedPages []string, conf *model.Configuration) error { ... } is one such example. It accepts an io.ReadSeeker as input, but it writes its output to one or more files in a directory.

So, I created a function (a method on a struct) with the signature ExtractPages(rs io.ReadSeeker, selectedPages []string, conf *pdfxmodel.Configuration) ([]*bytes.Buffer, error) that does the same thing. It is not part of pdfcpu. The only difference is the output target.
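
One way to picture the difference in output target is the sketch below. It is not pdfcpu code and not the function described above: Sink, MemorySink, and writeParts are hypothetical names, and writeParts merely stands in for an extraction routine. The idea is that the routine asks a caller-supplied function for a writer per output, so the same logic could write to files or to in-memory buffers.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Sink returns a writer for the output with the given name.
// Hypothetical type for illustration; not part of pdfcpu.
type Sink func(name string) (io.Writer, error)

// MemorySink keeps every output in its own buffer, in creation order.
type MemorySink struct {
	Names   []string
	Buffers []*bytes.Buffer
}

// Open allocates a new buffer for the named output and records it.
func (m *MemorySink) Open(name string) (io.Writer, error) {
	buf := &bytes.Buffer{}
	m.Names = append(m.Names, name)
	m.Buffers = append(m.Buffers, buf)
	return buf, nil
}

// writeParts stands in for an extraction routine: it writes each part to a
// writer obtained from the sink instead of creating files itself.
func writeParts(parts [][]byte, open Sink) error {
	for i, p := range parts {
		w, err := open(fmt.Sprintf("page_%d.pdf", i+1))
		if err != nil {
			return err
		}
		if _, err := w.Write(p); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	mem := &MemorySink{}
	// Placeholder payloads standing in for extracted single-page PDFs.
	parts := [][]byte{[]byte("first"), []byte("second")}
	if err := writeParts(parts, mem.Open); err != nil {
		panic(err)
	}
	for i, b := range mem.Buffers {
		fmt.Printf("%s: %d bytes\n", mem.Names[i], b.Len())
	}
}
```

A file-backed Sink could wrap os.Create in the same way, which would leave the existing file-based behavior available while making in-memory use (for example under WebAssembly) possible.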

Try this in the api package:

func ExtractImagesRaw(rs io.ReadSeeker, selectedPages []string, conf *model.Configuration) ([]map[int]model.Image, error)

Misread your comment - page extraction just writes to files.
Sorry for the confusion!