brendonh / pyth

Python text markup and conversion

RTF control word "\f0" not recognized

joka opened this issue

I'm using RTF files generated by pandoc. They contain a lot of "\f0" control words (I have no idea why).

/plugins/rtf15/reader.py cannot read these files because of this "\f0" word.

For a general solution, could you skip unknown control words?

Example rtf:
{\rtf\ansi\deff0{\fonttbl{\f0\froman Tms Rmn;}{\f1\fdecor
Symbol;}{\f2\fswiss Helv;}}{\colortbl;\red0\green0\blue0;
\red0\green0\blue255;\red0\green255\blue255;\red0\green255
\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255
\green255\blue0;\red255\green255\blue255;}{\stylesheet{\fs20
\snext0Normal;}}{\info{\author John Doe}
{\creatim\yr1990\mo7\dy30\hr10\min48}{\version1}{\edmins0}
{\nofpages1}{\nofwords0}{\nofchars0}{\vern8351}}\widoctrl\ftnbj \sectd\linex0\endnhere \pard\plain \fs20 This is plain text.\
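
The "skip unknown control words" idea requested above could be sketched roughly as follows. This is a toy scanner, not Pyth's reader: `KNOWN_WORDS`, `TOKEN`, and `scan` are hypothetical names, and the whitelist is an assumption made for illustration. It ignores group nesting and destinations such as the font table, so it only shows the tolerant handling of unrecognized words.

```python
import re

# Hypothetical whitelist: the control words this toy scanner has handlers for.
KNOWN_WORDS = {"rtf", "ansi", "pard", "plain", "par", "fs"}

TOKEN = re.compile(
    r"\\(?P<word>[a-zA-Z]+)(?P<param>-?\d+)? ?"  # control word, optional number, optional delimiter space
    r"|(?P<symbol>\\[^a-zA-Z])"                  # control symbol such as \{ or \~
    r"|(?P<brace>[{}])"                          # group delimiters
    r"|(?P<text>[^\\{}]+)"                       # plain text run
)


def scan(rtf, known=KNOWN_WORDS):
    """Return (handled, skipped, text) for an RTF string."""
    handled, skipped, text = [], [], []
    for match in TOKEN.finditer(rtf):
        word = match.group("word")
        if word is not None:
            param = match.group("param")
            entry = (word, int(param) if param is not None else None)
            # Unknown control words are recorded and ignored instead of raising.
            (handled if word in known else skipped).append(entry)
        elif match.group("text") is not None:
            text.append(match.group("text"))
    return handled, skipped, "".join(text)


handled, skipped, text = scan(r"{\rtf\ansi\pard\plain \f0\fs20 This is plain text.\par}")
```

Here `\f0` is not in the whitelist, so it ends up in `skipped` while the surrounding text is still extracted. A real reader would also need to decide what to do with the contents of unknown destinations, which this sketch does not attempt.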

Hi Joka. Sorry for the slow response, I'm away from home.

\f0 is a standard control word that the RTF reader normally handles. From your example, I'm not sure what problem it's having.

I don't see a way to attach files here, so could you email your whole RTF file to me at brendonh@gmail.com? I'll figure out what's tripping it up.

Cheers,
Brendon

OK, fine,
and thank you for pyth. It's really nice to have a Pythonic RTF reader.

I think I've fixed this (in trunk). Pyth was ignoring font declarations that didn't have a \fcharset. Now they default to the reader's charset (e.g. from the initial \ansi) instead, which I think is the right thing to do -- the spec isn't clear.

It seems to work for your example doc, anyway.
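
For readers curious about the fallback described above, here is a minimal sketch of the idea. It is not Pyth's actual code: `parse_font_table`, `FCHARSET_CODECS`, and the `reader_codec` parameter are hypothetical names, the codec table covers only a few `\fcharset` values, and mapping `\ansi` to `cp1252` is an assumption.

```python
import re

# Assumed mapping from a few \fcharsetN values to Python codec names
# (0 = ANSI, 128 = Shift-JIS, 204 = Russian); illustrative, not exhaustive.
FCHARSET_CODECS = {0: "cp1252", 128: "cp932", 204: "cp1251"}

# One font declaration: {\fN ...} with no nested groups.
FONT_ENTRY = re.compile(r"\{\\f(?P<num>\d+)(?P<body>[^{}]*)\}")
FCHARSET = re.compile(r"\\fcharset(?P<value>\d+)")


def parse_font_table(fonttbl, reader_codec="cp1252"):
    """Map font number -> codec name, falling back to the reader's codec."""
    fonts = {}
    for entry in FONT_ENTRY.finditer(fonttbl):
        charset = FCHARSET.search(entry.group("body"))
        if charset is None:
            # No \fcharset: keep the font and use the document-level charset.
            codec = reader_codec
        else:
            # Unlisted charset values also fall back to the reader's codec.
            codec = FCHARSET_CODECS.get(int(charset.group("value")), reader_codec)
        fonts[int(entry.group("num"))] = codec
    return fonts


sample = r"{\fonttbl{\f0\froman Tms Rmn;}{\f1\fdecor Symbol;}{\f2\fswiss\fcharset204 Helv;}}"
fonts = parse_font_table(sample)
```

In this sketch, fonts 0 and 1 carry no `\fcharset` and so inherit `reader_codec`, while font 2 resolves through the lookup table. Dropping such fonts instead would leave a later `\f0` pointing at nothing, which matches the symptom reported in this issue.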