Add figures in markdown output
dolfim-ibm opened this issue · comments
Michele Dolfi commented
The current markdown output skips figure objects.
We should allow users to include the images in the output as well.
Proposed format
...
<image>
Figure 2: Distribution of DocLayNet pages across document categories.
...
where <image> is a placeholder and the text (if present) is the respective caption.
The placeholder should be customizable, e.g. using <!-- image --> (an HTML comment, which markdown renderers do not display). The initial choice of <image> is motivated by the requirements of the LLaVA input format.
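To make the proposed format concrete, here is a minimal sketch of an exporter with a customizable placeholder. The `Paragraph` and `Figure` classes, the `render_markdown` function, and the `image_placeholder` parameter are all hypothetical names used for illustration; they are not the project's actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paragraph:
    """Hypothetical stand-in for a text item in the document."""
    text: str


@dataclass
class Figure:
    """Hypothetical stand-in for a figure item; the caption is optional."""
    caption: Optional[str] = None


def render_markdown(items, image_placeholder="<image>"):
    """Join items into markdown, emitting a placeholder line for each figure."""
    blocks = []
    for item in items:
        if isinstance(item, Figure):
            lines = [image_placeholder]
            if item.caption:
                # The caption, if present, goes on the line after the placeholder.
                lines.append(item.caption)
            blocks.append("\n".join(lines))
        else:
            blocks.append(item.text)
    # Blocks are separated by a blank line, as in regular markdown paragraphs.
    return "\n\n".join(blocks)
```

Calling `render_markdown(items, image_placeholder="<!-- image -->")` would keep the figure positions in the output without showing anything in rendered markdown.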
The export_figures.py example will then be updated to show how to replace the <image> placeholder with an actual markdown image pointing to the exported files.
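The replacement step could look roughly like the sketch below. It is plain string handling and does not use any library API: the function name `replace_image_placeholders`, the `Figure` alt text, and the file names in the usage example are assumptions, and it presumes the exported image paths are given in the same order as the figures appear in the markdown.

```python
def replace_image_placeholders(markdown, image_paths, placeholder="<image>"):
    """Replace the n-th placeholder with a markdown image for the n-th path."""
    parts = markdown.split(placeholder)
    found = len(parts) - 1
    if found != len(image_paths):
        raise ValueError(
            f"found {found} placeholder(s) but got {len(image_paths)} image path(s)"
        )
    out = [parts[0]]
    for path, rest in zip(image_paths, parts[1:]):
        # Assumption: paths contain no spaces or parentheses, so no escaping is done.
        out.append(f"![Figure]({path})")
        out.append(rest)
    return "".join(out)


# Usage with hypothetical file names:
md = "Some text.\n\n<image>\nFigure 2: Distribution of DocLayNet pages.\n"
print(replace_image_placeholders(md, ["figure-1.png"]))
```

Raising an error on a count mismatch is a design choice in this sketch; it avoids silently pairing figures with the wrong files.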