ENHANCEMENT idea - support file objects, not just filenames

clach04 opened this issue 2 years ago

clach04 commented 2 years ago

Would depend on #23 being implemented.

Support file objects for:

the input text file (filename param)
the output PDF file (output param). NOTE: reportlab Canvas already supports file objects

clach04 · Answer 1 · Sat Jun 04 2022 03:01:13 GMT+0800 (China Standard Time)

NOTE even with #25 (a quick fix for #23), more refactoring is required to allow file objects to be passed in.

One quick hack that would not require a major refactoring: instantiate the PDFCreator object, then set the input and/or output file object on it before calling generate().

clach04 · Answer 2 · Sat Jun 04 2022 05:50:29 GMT+0800 (China Standard Time)

Writing the output PDF to a file(-like) object is possible via:

from io import BytesIO
....
pdf_file_name = ':memory:'
args = txt2pdf.parser.parse_args([source_file_name, '--encoding=' + source_encoding, '--output=' + pdf_file_name, '--media=letter', '--author=me', '--quiet'])
pdf_file_object = BytesIO()  # in-memory, file-like object that receives the PDF
args.output = pdf_file_object
PDFCreator(args, Margins(
    args.margin_right,
    args.margin_left,
    args.margin_top,
    args.margin_bottom)).generate()
pdf_file_object.getvalue()  # to get PDF bytes

clach04 · Answer 3 · Sat Apr 08 2023 07:57:21 GMT+0800 (China Standard Time)

Passing a file object for the input needs workarounds in _readDocument() and _process().
The main obstacle is that the data parameter of _process() must currently be a real file: the code looks up its file descriptor (fileno()) and then uses that to get the file size.
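To illustrate the limitation, here is a minimal sketch using only the standard library. The helper name size_via_fstat is hypothetical and is not part of txt2pdf; it only repeats the same fstat lookup on its own.

```python
import io
import os
import tempfile


def size_via_fstat(data):
    # Same style of lookup as the current _process(): needs an OS-level descriptor.
    return os.fstat(data.fileno()).st_size


# A real file has a descriptor, so the lookup works.
with tempfile.TemporaryFile() as real_file:
    real_file.write(b"hello")
    real_file.flush()
    real_size = size_via_fstat(real_file)

# An in-memory buffer has no descriptor, so fileno() raises.
try:
    size_via_fstat(io.BytesIO(b"hello"))
    memory_error = None
except io.UnsupportedOperation as exc:
    memory_error = exc
```

This is why an io.BytesIO (or similar) object cannot simply be passed in as data today.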

Proposal

Instead of:

def _process(self, data):
    flen = os.fstat(data.fileno()).st_size

allow the file length to be passed in; if it is omitted, do the existing file length check:

def _process(self, data, flen=None):
    if flen is None:  # explicit check, so flen=0 (empty input) is honoured
        flen = os.fstat(data.fileno()).st_size
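A rough, standalone sketch of how a caller might use such a parameter with an in-memory buffer. resolve_length is a hypothetical stand-in for the first line of _process(), not actual txt2pdf code, and it tests for None explicitly so that a length of 0 is not mistaken for "omitted".

```python
import io
import os


def resolve_length(data, flen=None):
    """Hypothetical helper mirroring the proposed _process() signature."""
    if flen is None:
        # Fall back to the existing behaviour for real files.
        flen = os.fstat(data.fileno()).st_size
    return flen


payload = "some text\n".encode("utf-8")
buffer = io.BytesIO(payload)

# The caller knows the size of its own buffer, so no descriptor is needed.
length = resolve_length(buffer, flen=len(payload))
```

Existing callers that pass a real file and no flen would keep the current fstat-based behaviour.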

@baruchel any strong (negative ;-)) thoughts on this?

Not urgent; this came up while investigating a bug in the file IO code.