microsoft / Simplify-Docx

Simplify DOCX files to JSON

AttributeError: 'lxml.etree._Element' object has no attribute 'val'

chris-park opened this issue

Hi Simplify-Docx team,

I tried to run simplify() on a sample Word document and ran into an error: AttributeError: 'lxml.etree._Element' object has no attribute 'val'. I've included a fully reproducible example below, which I ran on Google Colab using Python 3.7.13. Would you be able to help?

Thanks for your help.

Setup

python -m pip install git+https://github.com/jdthorpe/python-docx.git
pip install folium==0.2.1
python -m pip install git+https://github.com/microsoft/Simplify-Docx.git

Reproducible example

import docx
import requests
from simplify_docx import simplify

fpath = "https://www.nidcr.nih.gov/sites/default/files/2017-12/reportable-events-table.docx"
fname = "sample.docx"
with open(fname, "wb") as f:
    f.write(requests.get(fpath).content)

doc = docx.Document(fname)
simplify(doc)

# Error
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-21-a6ef8e30e39f> in <module>()
----> 1 simplify(doc)

4 frames
/usr/local/lib/python3.7/dist-packages/simplify_docx/elements/table.py in to_json(self, doc, options, super_iter)
     71         _desc = self.fragment.tblPr.find(qn("w:tblDescription"))
     72         if _desc is not None:
---> 73             if (not _desc.val) and options.get("ignore-empty-table-description", True):
     74                 pass
     75             else:

AttributeError: 'lxml.etree._Element' object has no attribute 'val'

I've also added the full sample notebook here for your reference.

Thanks for the clear and reproducible example! Fixed.
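
For anyone hitting the same traceback on an older install: the error message suggests that find() returned a plain XML element, which exposes XML attributes through .get() rather than as Python properties such as .val. The sketch below illustrates that difference and one possible workaround. It is not the fix that was applied in the repository (the thread does not show it). It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the qn() and table_description() helpers are hypothetical names written for this example only.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in DOCX XML).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag):
    """Hypothetical stand-in for a qualified-name helper:
    expands 'w:name' to Clark notation '{namespace}name'."""
    prefix, name = tag.split(":")
    if prefix != "w":
        raise ValueError("only the 'w' prefix is handled in this sketch")
    return "{%s}%s" % (W_NS, name)


def table_description(tbl_pr):
    """Return the w:val of <w:tblDescription>, or None if the element is absent.

    Reads the XML attribute with .get() instead of relying on a .val property,
    which a plain element does not have.
    """
    desc = tbl_pr.find(qn("w:tblDescription"))
    if desc is None:
        return None
    return desc.get(qn("w:val"))


# A made-up table-properties fragment for illustration.
xml = (
    '<w:tblPr xmlns:w="%s">'
    '<w:tblDescription w:val="Reportable events"/>'
    "</w:tblPr>" % W_NS
)
tbl_pr = ET.fromstring(xml)

desc = tbl_pr.find(qn("w:tblDescription"))
has_val_attr = hasattr(desc, "val")          # a plain element has no .val property
description = table_description(tbl_pr)      # read via the XML attribute instead

print(has_val_attr, description)
```

With real python-docx objects the same idea would presumably apply to the element returned by self.fragment.tblPr.find(...), but that is an assumption and has not been checked against the library's current code.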