danfickle / openhtmltopdf

An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!

Home Page: https://danfickle.github.io/pdf-templates/index.html

RuntimeException occurring on InlineText.setSubstring

LAlves91 opened this issue

Greetings!

We're using openhtmltopdf in our project. We recently upgraded from version 1.0.0 to 1.0.4 to fix a bug (reported in issue 420).

However, this upgrade caused another bug in our project. Now some documents break with the following stacktrace:

Caused by: java.lang.RuntimeException: set substring length too long (start = 1, end = 0): InlineText: ()
at com.openhtmltopdf.render.InlineText.setSubstring(InlineText.java:133)
at com.openhtmltopdf.layout.InlineBoxing.layoutText(InlineBoxing.java:1114)
at com.openhtmltopdf.layout.InlineBoxing.startInlineText(InlineBoxing.java:414)
at com.openhtmltopdf.layout.InlineBoxing.layoutContent(InlineBoxing.java:196)
at com.openhtmltopdf.render.BlockBox.layoutInlineChildren(BlockBox.java:1220)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1201)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableRowBox.layoutCell(TableRowBox.java:452)
at com.openhtmltopdf.newtable.TableRowBox.layoutChildren(TableRowBox.java:206)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableRowBox.layout(TableRowBox.java:95)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.newtable.TableSectionBox.layoutChildren(TableSectionBox.java:137)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableSectionBox.layout(TableSectionBox.java:278)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.newtable.TableBox.layoutChildren(TableBox.java:319)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.newtable.TableBox.layoutTable(TableBox.java:284)
at com.openhtmltopdf.newtable.TableBox.layout(TableBox.java:243)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableRowBox.layoutCell(TableRowBox.java:452)
at com.openhtmltopdf.newtable.TableRowBox.layoutChildren(TableRowBox.java:206)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableRowBox.layout(TableRowBox.java:95)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.newtable.TableSectionBox.layoutChildren(TableSectionBox.java:137)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.newtable.TableSectionBox.layout(TableSectionBox.java:278)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.newtable.TableBox.layoutChildren(TableBox.java:319)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.newtable.TableBox.layoutTable(TableBox.java:284)
at com.openhtmltopdf.newtable.TableBox.layout(TableBox.java:243)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:321)
at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:299)
at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:90)
at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1204)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1058)
at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:973)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.layout(PdfBoxRenderer.java:344)
at com.openhtmltopdf.pdfboxout.PdfRendererBuilder.run(PdfRendererBuilder.java:41)
... 208 more

After a few tests, we've reached two conclusions:

  • The error didn't happen in versions 1.0.0 and 1.0.1. It occurs in all versions starting from 1.0.2;
  • In our code, there's a base style which is appended to the documents:
    body { word-wrap: break-word; font-family: 'Courier New', Courier, monospace; }
    After removing the word-wrap style, it works again.

Is this expected behaviour?

Cheers!

Hi @LAlves91, it's most likely a bug. Could you provide an example that reproduces the error?

Thank you.

Hi @syjer !
I'm really sorry! I must've forgotten to attach an HTML file. The code below generates the reported error:

<html>
   <head>
      <style> body { word-wrap: break-word;}</style>
   </head>
   <body>
      <table>
         <td>
            <div align= "center">&nbsp;</div>
         </td>
         <td width="18">
            <div style="text-indent: 45pt;">&nbsp;</div>
         </td>
      </table>
   </body>
</html>

So far in my debugging efforts, what I've seen is:

  • During InlineBoxing.layoutContent@L152, a new LineBreakContext is created (with initial _end as 0);
  • A few lines later (at line 175), the method trimLeadingSpace is called. As &nbsp; is interpreted as " ", the _start value of LineBreakContext is updated to 1;
  • As there's no more content, there are no further updates to _start or _end;
  • Finally, at InlineBoxing.layoutText@L1114, setSubstring is called, using a value 1 as the start and 0 as the end, causing the error I reported.
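
The sequence above can be modelled in a few lines. The sketch below is a simplified stand-in, not the library's code: `Ctx`, `trimLeadingSpace`, `setSubstring` and `reproduce` are hypothetical names that only mimic the start/end bookkeeping described in the list, and the exception message is modelled on the one in the stack trace.

```java
class SubstringRangeDemo {

    /** Hypothetical stand-in for the start/end offsets kept by LineBreakContext. */
    static final class Ctx {
        final String text;
        int start = 0;
        int end = 0; // the end offset starts at 0, as described above

        Ctx(String text) {
            this.text = text;
        }

        /** Skips leading spaces by advancing start only; end is left untouched. */
        void trimLeadingSpace() {
            while (start < text.length() && text.charAt(start) == ' ') {
                start++;
            }
        }
    }

    /** Hypothetical range check modelled on the reported exception. */
    static String setSubstring(String text, int start, int end) {
        if (end < start) {
            throw new RuntimeException(
                "set substring length too long (start = " + start + ", end = " + end + ")");
        }
        return text.substring(start, end);
    }

    /** Runs the steps from the list; returns the exception message, or null if none was thrown. */
    public static String reproduce(String text) {
        Ctx ctx = new Ctx(text);
        ctx.trimLeadingSpace();
        try {
            setSubstring(ctx.text, ctx.start, ctx.end);
            return null;
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A single space stands in for the content that gets trimmed.
        System.out.println(reproduce(" "));
    }
}
```

With a one-space input, start is advanced to 1 while end stays at 0, so the range check fails in the same way as in the report. An empty input leaves both offsets at 0 and passes.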

I don't have much experience with the openhtmltopdf core code, so my best attempt at a solution may not be good at all. My initial thought would be to also update the end value in trimLeadingSpace (I don't know whether having an end smaller than the start makes sense in other use cases).
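
A minimal sketch of that idea, under the assumption that an empty range (start equal to end) is acceptable to the calling code. `Range`, `trimLeadingSpace` and `trimmedSubstring` are hypothetical names for illustration and do not reflect the library's actual classes or the fix that was eventually merged.

```java
class TrimLeadingSpaceFixSketch {

    /** Hypothetical holder for the two offsets; not the library's LineBreakContext. */
    static final class Range {
        int start;
        int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Advances start past leading spaces and keeps end from falling behind start. */
    public static void trimLeadingSpace(String text, Range r) {
        while (r.start < text.length() && text.charAt(r.start) == ' ') {
            r.start++;
        }
        if (r.end < r.start) {
            // Assumption: clamping to an empty range is safe for callers.
            r.end = r.start;
        }
    }

    /** Returns the substring selected after trimming, starting from an empty range. */
    public static String trimmedSubstring(String text) {
        Range r = new Range(0, 0);
        trimLeadingSpace(text, r);
        return text.substring(r.start, r.end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimmedSubstring(" ") + "]");
    }
}
```

The clamp only fires when end has fallen behind start, so a range whose end already lies beyond the trimmed start is left unchanged.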

Hi @LAlves91,

Thanks for debugging. I have made a fix similar to the one you suggested in PR. However, I was not able to reproduce with your sample HTML (or other variations I tried on 1.0.4 and the current 1.0.5 branch). Perhaps you could post your builder configuration, just so I can make a regression test for safety? Also, do you pre-process the HTML with Jsoup?

Thanks again.

Hello @danfickle !

Thank you for looking into my problem! About the sample, you're correct! I ended up posting the HTML without pre-processing. Here's the actual input we send to openhtmltopdf:

<!DOCTYPE html [
   <!ENTITY nbsp "&#160;">
   ]>
<html>
   <head>
      <style>  body { word-wrap: break-word; font-family: 'Courier New', Courier, monospace; } table { width: 100% !important }  @page {margin-bottom: 1.5cm; margin-top: 4.5cm; margin-left: 2.6cm; margin-right: 1cm;}</style>
      <style> @page { size: A4; }</style>
      <style> body { word-wrap: break-word;}</style>
   </head>
   <body>
      <table>
         <tbody>
            <tr>
               <td>
                  <div align="center">
                     &nbsp;
                  </div>
               </td>
               <td width="18">
                  <div style="text-indent: 45pt;">
                     &nbsp;
                  </div>
               </td>
            </tr>
         </tbody>
      </table>
   </body>
</html>
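
One visible difference from the earlier sample is the internal DTD subset that declares `nbsp`, plus the whitespace around each entity. The sketch below uses the JDK's standard XML parser (an assumption for illustration; it is not necessarily how openhtmltopdf or the pre-processing step reads the document) to show what the resulting text node contains. `NbspEntityDemo` and `divText` are hypothetical names, and the markup is a cut-down version of the input above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NbspEntityDemo {

    /** Cut-down version of the input: same DOCTYPE entity, one div. */
    static final String XHTML =
        "<!DOCTYPE html [\n"
      + "   <!ENTITY nbsp \"&#160;\">\n"
      + "]>\n"
      + "<html><body><div>\n   &nbsp;\n</div></body></html>";

    /** Parses the XML and returns the text content of the first div element. */
    public static String divText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("div").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Print each character's code point to make the invisible characters visible.
        for (char c : divText(XHTML).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The text node ends up holding ordinary whitespace (newlines and spaces) on both sides of a single U+00A0 character, which may matter when comparing this input with the shorter sample.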