htdebeer / pandocomatic

Automate the use of pandoc

Home Page: https://heerdebeer.org/Software/markdown/pandocomatic/

passing input file to pandoc when pdf output is specified

ZBiener opened this issue

On the latest macOS Catalina, using the latest pandoc, pandocomatic, and paru, I get the following issue.

When I compile my file from .md to .tex, using the default latex-refs setting, I get the .tex file I am expecting. It compiles properly. However, when I compile using a setting that outputs from md to tex to pdf, the execution hangs.

This happens with every predefined format I have that outputs to PDF through tex. Here is an example of the relevant part of pandocomatic.yaml:

#----------------------------------------------------------
  refs:
    extends: [user-info]
    pandoc:
      standalone: true
      verbose: true # verbose by default
      citeproc: true
      bibliography: "/Users/xxx/Library/texmf/bibtex/bib/BibDeskLibrary.json" 
      csl: csl/chicago-author-date.csl
      citation-abbreviations: cite-abbr.json # my journal abbreviations
      reference-links: true
    metadata:
      reference-section-title: References
      notes-after-punctuation: false
      link-citations: true 
      csl-hanging-indent: true
#-----------------------------------------------------------------------------
  syllabus-tex:
    extends: ['refs']
    pandoc:
      from: markdown+multiline_tables
      to: latex
      standalone: true
      template: templates/custom/syllabus.latex   

#-----------------------------------------------------------------------------
  syllabus-pdf:
    extends: ['syllabus-tex']
    pandoc:
      to: pdf
      pdf-engine: xelatex

If the input file specifies syllabus-tex, it is compiled to latex. If the input file specifies syllabus-pdf, the command hangs.

The debug output of pandocomatic with syllabus-pdf specified in the file is:

pandoc	--to=latex \
	--pdf-engine=xelatex \
	--standalone \
	--verbose \
	--citeproc \
	--csl=/Users/xxx/.pandoc/csl/chicago-author-date.csl \
	--citation-abbreviations=/Users/xxx/.pandoc/cite-abbr.json \
	--reference-links \
	--from=markdown+raw_tex \
	--filter=/Users/zvb1/.pandoc/filters/assimilateMetadata.rb \
	--template=/Users/zvb1/.pandoc/templates/custom.latex \
	--output=/Volumes/Data/paodf\ -\ xdf\ of\ sdf/wef/sdf\ 21.1\ 23r/test.pdf

Curiously, if I paste this back into the command line, and supply the file to be processed as the last argument, the file is processed properly.

Is it possible that the input file is somehow not being passed to pandoc? Does this indicate some issue in my install or ruby/paru/etc? When I interrupt the hung process, I get the following output:

	18: from /Users/xxx/.rbenv/versions/2.7.1/bin/pandocomatic:23:in `<main>'
	17: from /Users/xxx/.rbenv/versions/2.7.1/bin/pandocomatic:23:in `load'
	16: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/bin/pandocomatic:3:in `<top (required)>'
	15: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/pandocomatic.rb:105:in `run'
	14: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_multiple_command.rb:98:in `execute'
	13: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_multiple_command.rb:98:in `each'
	12: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_multiple_command.rb:99:in `block in execute'
	11: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/command.rb:129:in `execute'
	10: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_command.rb:86:in `run'
	 9: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_command.rb:135:in `convert_file'
	 8: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_command.rb:184:in `pandoc'
	 7: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_command.rb:184:in `chdir'
	 6: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/pandocomatic-0.2.7.4/lib/pandocomatic/command/convert_file_command.rb:218:in `block in pandoc'
	 5: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/paru-0.4.2.1/lib/paru/pandoc.rb:157:in `convert'
	 4: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/paru-0.4.2.1/lib/paru/pandoc.rb:318:in `run_converter'
	 3: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/2.7.0/open3.rb:101:in `popen3'
	 2: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/2.7.0/open3.rb:219:in `popen_run'
	 1: from /Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/paru-0.4.2.1/lib/paru/pandoc.rb:321:in `block in run_converter'
/Users/xxx/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/paru-0.4.2.1/lib/paru/pandoc.rb:321:in `read': Interrupt

Thanks.

Thanks for sending in this bug report. As it is bedtime in my locale, I'll look at it sometime tomorrow. My first step will be to try to reproduce the issue. To do that, I might need a minimal input file to convert. Or does this issue occur with any input file? E.g., would a file containing this **markdown** text also run into this issue?

No rush, of course. Thank you. This happens with every file I've tried, all of which are markdown. However, they all contain relevant metadata, like this:

---
title: The Problem of Induction
pandocomatic_:
  use-template:
    - syllabus-pdf
---

# The Problem of Induction #

According to one formulation, the problem of induction is that our tendency to project past regularities into the future is not, and cannot, be justified with deductive certainty. 

I have trouble reproducing the issue with pandoc 2.11.2, pandocomatic 0.2.7.4, and paru 0.4.2.1. I made the following changes to your pandocomatic.yaml file because I had neither your csl file, your cite-abbr.json file, nor your BibDeskLibrary.json file. I replaced the latter with a simple bib file, removed the citation-abbreviations line, and used the apa.csl file. I also used the default LaTeX template instead of yours. These changes should not matter much unless there are some errors in these files:

      bibliography: /home/dir/test/pandocomatic/bib.bib
      csl: apa.csl
      #citation-abbreviations: cite-abbr.json

and I added the user-info template:

user-info:
  metadata:
    author: A. Uthor

bib.bib has content:

@book {author2020a,
author = {A. Uthor},
title = {Title},
year = {2020}
}

This works as expected. Can you:

  • Check you have indeed this same version of pandoc? (2.11.2)
  • Try my example, just to make sure. Call the test file test.md
  • Try again, but now with a path for input and output files that does not contain any spaces or other "odd" characters? I tried the example also in directory pan do coma tic, but that worked here as well.

Thanks! Sorry for not supplying those files or the rest of my pandocomatic.yaml -- I assumed, as you said, that they wouldn't matter.

Sadly, the problem persists with the simplified setup. But I also diagnosed it better. By going into pandoc.rb, I tried to directly feed all sorts of pandoc options to run_converter. It turns out the pandoc "--verbose" flag is to blame. If I remove it, everything's fine.

With the "--verbose" flag in the command, I can see that pandoc is launched, but it just hangs. The verbose flag doesn't cause problems on the command line, however; only when launched through ruby.

Very odd, but simple to overcome. I'm way out of my depth here... is this a pandoc issue? Should I report it there?

BTW, these are my versioning data:

pandoc 2.11.2
Compiled with pandoc-types 1.22, texmath 0.12.0.3, skylighting 0.10.0.3,
citeproc 0.2, ipynb 0.1.0.1

Pandocomatic version 0.2.7.4
paru (0.4.2.1)

This certainly is a bug. I am not sure if it is something that is going wrong in how I use Open3#popen3 in paru, or if something is going awry with how this popen3 behaves on your system. I created a minimal example. Put the following ruby code in file verbose_pandoc_issue.rb:

#!/usr/bin/env ruby
require "open3"

input = "Hello **world**"
command = "pandoc --verbose --from markdown --to html"
output = ""
error = ""
status = 0

Open3.popen3(command) do |stdin, stdout, stderr, thread|
    stdin << input
    stdin.close
    output << stdout.read
    error << stderr.read
    status = thread.value.exitstatus
end

puts "input:        '#{input}'"
puts "converted by: '#{command}'"
puts "into:         '#{output}'"
puts "with errors:  '#{error}'"
puts "and status:   '#{status}'"

Make verbose_pandoc_issue.rb executable and run it. If I do so, I get the output:

input:        'Hello **world**'
converted by: 'pandoc --verbose --from markdown --to html'
into:         '<p>Hello <strong>world</strong></p>
'
with errors:  ''
and status:   '0'

What is your output? And what happens when you remove the option --verbose?

Thinking about this a bit further, the --verbose option does not seem to do anything unless I convert something to PDF with LaTeX. Can you change the script verbose_pandoc_issue.rb to run command pandoc --verbose --from markdown --to latex -o out.pdf instead?

If I do so, all the LaTeX output is put in the error variable. That is, the output variable stays empty because the output is written to a file. What happens on your system?

Yes, the issue was always just limited to generating a PDF by means of LaTeX. HTML output works just fine with the --verbose flag.

On my system the command pandoc --verbose --from markdown --to latex -o out.pdf (from within verbose_pandoc_issue.rb) just hangs. No output whatsoever. If I kill it, the traceback reads:

Traceback (most recent call last):
	4: from ./verbose_pandoc_issue.rb:10:in `<main>'
	3: from /Users/zvb1/.rbenv/versions/2.7.1/lib/ruby/2.7.0/open3.rb:101:in `popen3'
	2: from /Users/zvb1/.rbenv/versions/2.7.1/lib/ruby/2.7.0/open3.rb:219:in `popen_run'
	1: from ./verbose_pandoc_issue.rb:13:in `block in <main>'
./verbose_pandoc_issue.rb:13:in `read': Interrupt

I've tried this using ruby 2.7.1 and 2.6.6. No difference, modulo the directories in the traceback.
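
A possible explanation, not confirmed anywhere in this thread: the script reads stdout to the end before it touches stderr. The Ruby Open3 documentation warns that this pattern can deadlock when the child writes more to stderr than the pipe buffer holds, and with --verbose and PDF output pandoc sends the whole LaTeX log to stderr. The sketch below shows one way around that, using Open3.capture3, which drains both streams concurrently. The child process is a hypothetical stand-in for pandoc (a Ruby one-liner that floods stderr before answering on stdout), so the example runs without pandoc or LaTeX installed. It is an illustration only, not paru's actual code, and the names CHILD_SCRIPT and run_draining_both are made up for this example.

```ruby
#!/usr/bin/env ruby
require "open3"
require "rbconfig"

# Hypothetical stand-in for `pandoc --verbose ... -o out.pdf`: a child that
# writes a large amount to stderr before it writes anything to stdout.
CHILD_SCRIPT = '$stderr.write("x" * 200_000); $stdout.write($stdin.read.upcase)'

# Run the child, feed it `input` on stdin, and collect stdout, stderr and the
# exit status. capture3 reads stdout and stderr concurrently, so a full stderr
# pipe cannot stall the child while the parent is still waiting on stdout.
def run_draining_both(input)
  output, error, status = Open3.capture3(RbConfig.ruby, "-e", CHILD_SCRIPT,
                                         stdin_data: input)
  [output, error, status.exitstatus]
end

output, error, status = run_draining_both("hello")

puts "stdout:       '#{output}'"
puts "stderr bytes: #{error.bytesize}"
puts "status:       #{status}"
```

If the cause is indeed a full stderr pipe, swapping the real pandoc command into a call like this should let the PDF conversion finish with --verbose left on. That is an assumption to test on the affected system, not a verified fix.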

This might be related to the implementation of popen or ruby on your operating system, or some behavior of your operating system. What operating system are you using?

Wait, you already mentioned that at the top of your issue! I see you are using MacOS, which is a system I am not familiar with nor have access to. I fear I cannot investigate it further on my end. However, pull requests are welcome!

Thanks. I'm way out of my depth, so don't hold your breath :-) Thanks for looking into this enough to help me sort it out.

You certainly found some obscure issue! My advice: don't use the --verbose option with pandocomatic. I understand the extra output can be handy when you are improving your templates or some such, but I suppose most pandocomatic runs do not need it.

So, I'll close this issue for now. If there are any updates, don't hesitate to reopen it.

Thanks for sharing this. Removing --verbose also worked for me on macOS. When I last tried this, earlier in 2020, I resorted to tex output and my own shell scripts.