ENHANCEMENT idea - refactor so it can be used as a library as well as from the command line
clach04 opened this issue · comments
clach04 commented
I.e. so that import txt2pdf is an option.
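A minimal sketch of the usual shape for a module that works both ways, not a description of how txt2pdf is actually organised. The names build_parser, convert and main are hypothetical and the "conversion" is a toy stand-in: the point is only that nothing runs at import time and that main() accepts an explicit argument list.

```python
import argparse


def build_parser():
    """Hypothetical helper: all command line options are declared here."""
    parser = argparse.ArgumentParser(description="Toy text converter")
    parser.add_argument("filename")
    parser.add_argument("--output", "-o", default="output.txt")
    return parser


def convert(text):
    """Hypothetical library entry point: no argv and no file access."""
    return text.upper()


def main(argv=None):
    """Command line entry point; argv=None makes argparse read sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    with open(args.filename) as src:
        result = convert(src.read())
    with open(args.output, "w") as dst:
        dst.write(result)
    return 0
```

A script would then finish with the usual `if __name__ == "__main__":` guard calling `main()`, while library users import the module and call `convert()` (or `main([...])`) directly.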
clach04 commented
I'm doing this today (well, for the last year or so) with code like the below:
try:
    #raise ImportError
    from io import BytesIO as FakeFile  # py3
except ImportError:
    try:
        from cStringIO import StringIO as FakeFile  # NOTE only use when not using .write() method - does not support Unicode
    except ImportError:
        from StringIO import StringIO as FakeFile
import txt2pdf # from https://github.com/baruchel/txt2pdf - NOTE requires https://github.com/baruchel/txt2pdf/pull/25
from txt2pdf import parser, Margins, PDFCreator
....
args = txt2pdf.parser.parse_args([
    source_file_name, '--encoding=' + source_encoding,
    '--output=' + intermediate_pdf_file_name,
    '--media=letter',  # default 2cm margins
    '--author=Actian',  # NOTE this ends up in intermediate pdf file, later stage can/will change this
    '--quiet',
    '--tab-size=4',
    #'--font=\\Downloads\\courier-prime-code\\ttf\\CourierPrimeCode-FF.ttf', #
    #'--font=\\Downloads\\courier-prime-code\\ttf\\CourierPrimeCode-symbol-Form-Feed.ttf', #
    #'--break-on-blanks',  # THIS DOES NOT WORK - proven with EA 2.1 code - I'm not convinced NOT using this works properly, looks like the logic for page-break support and --minimum-page-length is mixed up (and I don't think it works correctly)
    # break on blanks also not working. TODO prefix input text file with line numbers and visually inspect with both options
    # setting breakonly emits 17 pages!?
    #'--break-on-blanks',  # do not force a page break on form-feed character ## DEBUG getting 17 pages not 2976 for EA 2.1 code!?
    # '--minimum-page-length' defaults to 10
    #u'--tab-replacement=\N{RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK}',  # DEBUG
    #u'--tab-replacement=\N{RIGHTWARDS ARROW}',  # DEBUG # requires https://github.com/baruchel/txt2pdf/pull/27
])
# TODO title
if intermediate_pdf_file_name == ':memory:':
    # special case, don't write to filesystem
    pdf_file_object = FakeFile()
    args.output = pdf_file_object
print('Reading plain text file %s, using encoding %s' % (source_file_name, source_encoding))
print('Generating PDF of all pages, %s' % intermediate_pdf_file_name)
pdf_creator = PDFCreator(args, Margins(
    args.margin_right,
    args.margin_left,
    args.margin_top,
    args.margin_bottom))
# hard coded character replacement, could instead use command line flags and a filename
# NOTE requires https://github.com/baruchel/txt2pdf/pull/29
pdf_creator.character_replacement = {
    "12": ""  # form-feed/0x0c/\f replaced with empty string (i.e. remove form-feeds)
}
try:
    pdf_creator.generate()
except ......
I cheat slightly by reusing the argument parsing that txt2pdf already has and feeding its result into PDFCreator().
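The trick can be shown in isolation with a self-contained sketch. Everything here is a stand-in so it runs without txt2pdf installed: the parser object, its options and the Creator class are hypothetical and only mimic the general shape of the snippet above, not the real txt2pdf API.

```python
import argparse
from io import BytesIO

# Stand-in for a module-level parser such as the one imported above;
# the options declared here are illustrative only.
parser = argparse.ArgumentParser()
parser.add_argument("filename")
parser.add_argument("--output", default="output.pdf")
parser.add_argument("--quiet", action="store_true")


class Creator(object):
    """Hypothetical stand-in for a class that reads settings off the namespace."""

    def __init__(self, args):
        self.output = args.output
        self.quiet = args.quiet

    def generate(self, data):
        # Accept either a path or an already open binary file-like object.
        if hasattr(self.output, "write"):
            self.output.write(data)
        else:
            with open(self.output, "wb") as handle:
                handle.write(data)


# Passing a list to parse_args() means sys.argv is never consulted.
args = parser.parse_args(["input.txt", "--output=:memory:", "--quiet"])

if args.output == ":memory:":
    # Special case: replace the path with an in-memory buffer after parsing.
    args.output = BytesIO()

Creator(args).generate(b"%PDF-fake")
```

The namespace returned by parse_args() is a plain object, so attributes can be overwritten after parsing. The obvious downside is that library callers have to spell options as command line strings, which is what a proper library API would avoid.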