JohannesKaufmann / html-to-markdown

⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.


🐛 Bug: code tag nested inside pre tag is not recognized

zhang19523zhao opened this issue


This code doesn't seem to work

URL: https://www.bookstack.cn/read/the-way-to-go_ZH_CN/eBook-04.3.md

<pre class="prettyprint linenums prettyprinted" style=""><button class="btn btn-danger btn-sm btn-copy"><i class="fa fa-copy"></i> 复制代码</button><ol class="linenums"><li class="L0"><code class="lang-go"><span class="kwd">const</span><span class="pln"> </span><span class="typ">Pi</span><span class="pln"> </span><span class="pun">=</span><span class="pln"> </span><span class="lit">3.14159</span></code></li></ol></pre>

turns into

```lang-go
<button class="btn btn-danger btn-sm btn-copy"><i class="fa fa-copy"></i> 复制代码</button><ol class="linenums"><li class="L0" data-converter-list-prefix="1. "><code class="lang-go"><span class="kwd">const</span><span class="pln"> </span><span class="typ">Pi</span><span class="pln"> </span><span class="pun">=</span><span class="pln"> </span><span class="lit">3.14159</span></code></li></ol>
```

<code class="lang-go"><span class="kwd">const</span><span class="pln"> </span><span class="typ">Pi</span><span class="pln"> </span><span class="pun">=</span><span class="pln"> </span><span class="lit">3.14159</span></code>

turns into

`const Pi = 3.14159`

It is happening because the code tag is not a direct child of pre. That is a case that I missed... Thanks a lot for reporting this 🙏

The bug is in this function that removes the html tags that are used for code highlighting.
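To illustrate the cause, here is a minimal sketch of looking for a `code` element at any depth inside `pre`, rather than only as a direct child. It is not the library's implementation: `extractNestedCode` and `languageFromClass` are hypothetical helpers, the standard library's `encoding/xml` decoder stands in for a real HTML parser (so it only copes with well-formed markup like the snippet above), and stripping the `lang-`/`language-` prefix is an assumption made for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// languageFromClass is a hypothetical helper: it returns the first class
// name with a "language-" or "lang-" prefix, minus that prefix.
func languageFromClass(class string) string {
	for _, name := range strings.Fields(class) {
		for _, prefix := range []string{"language-", "lang-"} {
			if strings.HasPrefix(name, prefix) {
				return strings.TrimPrefix(name, prefix)
			}
		}
	}
	return ""
}

// extractNestedCode returns the plain text and language of the first <code>
// element found at any depth in the markup. Highlighting tags such as <span>
// inside <code> are dropped; only their text is kept.
func extractNestedCode(markup string) (string, string, error) {
	dec := xml.NewDecoder(strings.NewReader(markup))
	dec.Strict = false // tolerate some HTML-isms; still not a full HTML parser

	var text strings.Builder
	lang := ""
	depth := 0 // greater than zero while inside the <code> element

	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if depth > 0 {
				depth++ // nested tag inside <code>, e.g. <span>
			} else if t.Name.Local == "code" {
				depth = 1
				for _, attr := range t.Attr {
					if attr.Name.Local == "class" {
						lang = languageFromClass(attr.Value)
					}
				}
			}
		case xml.EndElement:
			if depth > 0 {
				depth--
				if depth == 0 {
					return text.String(), lang, nil // closed the <code> element
				}
			}
		case xml.CharData:
			if depth > 0 {
				text.Write(t) // keep text, ignore everything outside <code>
			}
		}
	}
	return text.String(), lang, nil
}

func main() {
	// Same shape as the reported snippet: pre > ol > li > code > span.
	sample := `<pre class="prettyprint"><button>copy</button><ol><li>` +
		`<code class="lang-go"><span>const</span><span> </span><span>Pi</span></code>` +
		`</li></ol></pre>`

	text, lang, err := extractNestedCode(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("language=%q code=%q\n", lang, text)
}
```

The point of the sketch is only that the search descends through `ol` and `li` before reaching `code`, and that the copy button's text outside `code` is ignored.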

@zhang19523zhao Unfortunately, I do not have the time to fix this until next month. Sorry about that 🤷‍♂️

Thanks♪(・ω・)ノ

Hi, is there any update?

@k1ng440 Unfortunately, it turns out to be trickier than I thought, especially since people use pre/code tags very differently. So this is going to take some time...

@zhang19523zhao @k1ng440 This should now be fixed in the latest version. Please let me know if you find any code snippets that don't work as expected.

Make sure to update the library by running:

go get -u github.com/JohannesKaufmann/html-to-markdown