lxml error
Palisand opened this issue
Using lxml version 4.6.3, I get the following error when trying to simplify a document with a bulleted or numbered list:
.../lib/python3.9/site-packages/simplify_docx/__init__.py in simplify(doc, options)
31 __set_options__(_options)
32
---> 33 out = document(doc.element).to_json(doc, _options)
34
35 if _options.get("friendly-name", True):
.../lib/python3.9/site-packages/simplify_docx/elements/base.py in to_json(self, doc, options, super_iter)
104 out.update({
105 "TYPE": self.__type__,
--> 106 "VALUE": [ elt.to_json(doc, options) for elt in self],
107 })
108 return out
.../lib/python3.9/site-packages/simplify_docx/elements/base.py in <listcomp>(.0)
104 out.update({
105 "TYPE": self.__type__,
--> 106 "VALUE": [ elt.to_json(doc, options) for elt in self],
107 })
108 return out
.../lib/python3.9/site-packages/simplify_docx/elements/body.py in to_json(self, doc, options, super_iter)
23 iter_me = peekable(self)
24 for elt in iter_me:
---> 25 JSON = elt.to_json(doc, options, iter_me)
26
27 if (
.../lib/python3.9/site-packages/simplify_docx/elements/paragraph.py in to_json(self, doc, options, super_iter)
165
166 if options.get("include-paragraph-indent", True):
--> 167 _indent = get_paragraph_ind(self.fragment, doc)
168 if _indent is not None:
169 out["style"] = {"indent": indentation(_indent).to_json(doc, options)}
.../lib/python3.9/site-packages/simplify_docx/utils/paragrapy_style.py in get_paragraph_ind(p, doc)
54 num_style = get_num_style(p, doc)
55 if num_style is not None and \
---> 56 num_style.pPr is not None and \
57 num_style.pPr.ind is not None:
58 return num_style.pPr.ind
AttributeError: 'lxml.etree._Element' object has no attribute 'pPr'
Can you share a minimal example? It's hard to reproduce an error without the inputs.
- Create a .docx file with the following contents:
  - foo
  - bar
  - foo
  - bar
- In your Python environment, install lxml 4.6.3, simplify-docx, and the other dependencies specified in setup.py::setup.install_requires
- Run simplify(docx.Document("your-file.docx")) and observe that the error is raised

Your stack trace will be different, but you should see the same error:
AttributeError: 'lxml.etree._Element' object has no attribute 'pPr'
Here is a partial stack trace. The input .docx is not simple, and it was generated by Microsoft Word.
File "C:\Users\zoo42\AppData\Roaming\Python\Python39\site-packages\simplify_docx\elements\body.py", line 25, in to_json
JSON = elt.to_json(doc, options, iter_me)
File "C:\Users\zoo42\AppData\Roaming\Python\Python39\site-packages\simplify_docx\elements\paragraph.py", line 167, in to_json
_indent = get_paragraph_ind(self.fragment, doc)
File "C:\Users\zoo42\AppData\Roaming\Python\Python39\site-packages\simplify_docx\utils\paragrapy_style.py", line 56, in get_paragraph_ind
num_style.pPr is not None and \
AttributeError: 'lxml.etree._Element' object has no attribute 'pPr'
I should do a bit of debugging.
@rleir I just accepted a PR that should deal with this. Can you let me know if this fix helped you? If so I'll close this issue.
(Sorry, I don't have time to investigate this myself. I'm not associated with the Office team at all, and I only maintain this in my free time, which is hard to come by lately...)