ReScience / template

Template for article submission

difficulty converting metadata file yaml -> tex

mcbaneg opened this issue · comments

Setup: Windows 10 (64-bit), MiKTeX, Python 2.7 with ruamel.yaml installed, Cygwin

I modified the metadata.yaml file.
I first tried 'make' as recommended, but I have Python 2.7 installed and the command failed at python3.

I then tried running it directly (within Windows PowerShell), and got

PS C:\virial\rescience-c\template-master> python ./yaml-to-latex.py -i metadata.yaml -o metadata.tex
Traceback (most recent call last):
  File "./yaml-to-latex.py", line 71, in <module>
    from article import Article
  File "C:\virial\rescience-c\template-master\article.py", line 4, in <module>
    import yaml
ImportError: No module named yaml

Evidently the module name in ruamel.yaml (which seems to be the most up-to-date yaml kit for python) is different from that expected. I don't see documentation on which module (or which python version) authors are expected to be using. Suggestions?

Thanks,
G.

The YAML package that you need is PyYAML. I had never heard of ruamel.yaml before. It seems to be a fork of PyYAML, but I have no idea how compatible it is. I also don't know whether a sufficiently recent version of either YAML package supports Python 2.7.
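A quick way to check for both problems reported above (wrong Python major version, missing `yaml` module) before running the conversion script could look like the sketch below. It is not part of the template: the function name `check_environment` and its two parameters are hypothetical and exist only so the check can be exercised without changing the real interpreter. Note that ruamel.yaml is imported as `ruamel.yaml`, so it does not satisfy `import yaml`.

```python
import sys


def check_environment(version_info=sys.version_info, importer=__import__):
    """Return a list of human-readable problems; an empty list means OK.

    Both parameters are hypothetical hooks for illustration and testing:
    `version_info` defaults to the running interpreter, `importer` to the
    built-in import machinery.
    """
    problems = []

    # The template's makefile calls python3, and the script was only
    # designed for Python 3 according to the discussion in this thread.
    if version_info[0] < 3:
        problems.append(
            "Python 3 is required, found %d.%d" % (version_info[0], version_info[1])
        )

    # article.py does `import yaml`, which is the module provided by PyYAML.
    try:
        importer("yaml")
    except ImportError:
        problems.append(
            "module 'yaml' not found; install PyYAML (python -m pip install pyyaml)"
        )

    return problems


# Typical use: print whatever is wrong with the current interpreter.
# for problem in check_environment():
#     print(problem)
```

With Python 2.7 and only ruamel.yaml installed, as in the original report, this would list both problems.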

We should have clearer instructions for the template. I added them in other ReScience repositories, but not so far on this one. Unfortunately reproducibility is becoming a problem even for article submissions!

That's a clear but unfortunate illustration of the reproducibility problem. I think I designed the yaml-to-latex.py script with Python 3 and never tested it with Python 2. Even though the PyYAML package is quite standard, it might be good to add a pointer in the template repository.

By the way, if your entry is for the Ten Years Reproducibility Challenge, we'll soon give authors some proposals on what to write in the article. We're a bit late on that.

@mcbaneg Please do submit your work! Your conclusion is just the kind of outcome we'd like to see from the challenge: what has worked in the past to maintain code in working state, and what hasn't. If people submit only the difficult-to-reproduce cases, we might erroneously conclude that nothing has ever worked!

@mcbaneg As @khinsen suggests, you can submit and your submission will serve as a testbed for others. Among the things that might be interesting are:

  1. How did you conserve the sources?
  2. Did you take care to record the RNG seed (if you use one)?
  3. Did you save command-line options (if you need any)?
  4. Did you need to adapt your sources?
  5. Did you need to adapt your libraries?
  6. What guided your choice of Fortran among other languages at that time?
  7. etc.

Good points @rougier! I'd like to emphasize the utility of communicating the choices (and the motivations behind them) made at the time of publication, even if they risk being distorted by hindsight. That's something we can only get out of authors doing reproductions of their own work. For example, I realized that I never preserved or published code for reproducibility, but only to make it available for reuse by others. As a consequence, I am always missing the last small steps: command-line arguments, that five-line script that ties computations together, etc.

This discussion really belongs in ReScience/ten-years#4.

Yes, and we should start an author-instructions.md document. @mcbaneg Feel free to close the issue and let's continue the discussion at ReScience/ten-years#4.

Okay, I have submitted a paper for the Ten Years challenge (3882888). I'm comfortable with it serving as a trial case for working out both (1) what you hope for in a paper for this challenge, and (2) what submission conventions you want to use.

Two technical points:

  1. I tried to drag-and-drop the metadata.yaml file into the submission issue box, but was given a "we do not accept that file type" message. It felt silly to retype into the box several bits of information that I had already put into the .yaml file.

  2. I feel like your expectations for what tools authors will have immediately available are high enough to discourage some potential authors. They include:

make
LaTeX, including latexmk
Python 3.x with PyYAML (2.x can run PyYAML but its treatment of locales is different and it doesn't work with the template makefile for that reason)
perl (necessary for latexmk)

Many potential authors will be familiar with all these tools, but may well not have them installed on the computers they normally use for preparing publications. I am familiar with them all, and use some (including make and LaTeX) daily. However, I had to install latexmk, perl, Python 3 (I had 2.7, and spent the time to find out it didn't work for this purpose), and PyYAML on my laptop to complete the submission.

Thanks!

Yes, good point about the large set of tools needed just to compile. Note that there's an Overleaf template (that may need to be updated) that simplifies things. The only problem is that you still need to generate the YAML file. We could also have a simple web form to create the YAML file, but I'm a bit clueless on how to do that.

As for uploading the metadata, I'm not sure I get your point.

@mcbaneg Thanks for the feedback on our submission procedure! We should probably simplify the required toolchain, ideally to the point that a TeX Live installation is sufficient to do everything. On the other hand, if that means reading YAML from TeX I'll probably change my mind!

@rougier If I understand our current setup correctly, make and latexmk are needed only for streamlining the PDF generation. We could perhaps provide a shell script that runs a worst-case scenario, assuming everything needs to be redone. But then, would a shell script work under Windows?

Yes, make and latexmk are not really needed. You can run xelatex, biber, xelatex, xelatex and you're done. The YAML and LaTeX metadata files can also be filled in manually in case of problems: there is no need to use the Python script to create the LaTeX metadata from the YAML metadata, it's just more convenient (if it runs).
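On the question of whether a shell script would work under Windows: since Python is already needed for the metadata step, the worst-case build sequence could be driven from Python instead, which avoids depending on a particular shell. The sketch below is only an illustration, not something shipped with the template. The function names are hypothetical, and the default base name `article` is an assumption about the main file name; adjust it to your actual .tex file.

```python
import subprocess


def build_commands(basename="article"):
    """Return the full rebuild sequence: xelatex, biber, xelatex, xelatex.

    `basename` is the main .tex file without its extension; the default
    is an assumption, not something confirmed in this thread.
    """
    return [
        ["xelatex", basename],
        ["biber", basename],
        ["xelatex", basename],
        ["xelatex", basename],
    ]


def build(basename="article", run=subprocess.check_call):
    """Run every step in order, stopping at the first failing command.

    `run` is a hypothetical hook so the sequence can be inspected without
    a TeX installation; by default each command is executed for real and
    a non-zero exit status raises subprocess.CalledProcessError.
    """
    for command in build_commands(basename):
        run(command)
```

Calling `build()` from the directory that contains the article redoes everything unconditionally, which is slower than latexmk but requires neither make nor perl.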