k3a / html2text

Simple Go package to convert HTML to plain text

Advice on Modifying html2text Behavior to Preserve Link Text

devstein opened this issue · comments

Hi 👋

Thanks for making and maintaining this package! It's almost exactly what I was looking for. The only difference I need is for the conversion to preserve the link text and append the link URL as a suffix.

For example, the behavior I want is:

So(HTML2Text(`click <a href="javascript:void(0)">here</a>`), ShouldEqual, "click here")
So(HTML2Text(`click <a href="test"><span>here</span> or here</a>`), ShouldEqual, "click here or here <test>")
So(HTML2Text(`click <a href="http://bit.ly/2n4wXRs">news</a>`), ShouldEqual, "click news <http://bit.ly/2n4wXRs>")
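
To make the requested behavior concrete, here is a minimal, standard-library-only sketch. It is not the html2text implementation: `linkText` and `hrefAttr` are hypothetical helpers, and the sketch assumes well-formed tags, a lowercase double-quoted `href` attribute, and no entity decoding or whitespace handling.

```go
package main

import (
	"fmt"
	"strings"
)

// linkText is a hypothetical helper (not part of k3a/html2text).
// It drops all tags, keeps the text inside <a>...</a>, and appends the
// href in angle brackets when the anchor closes. javascript: links get
// no suffix.
func linkText(html string) string {
	var out strings.Builder
	href := "" // href of the currently open anchor, if any
	for i := 0; i < len(html); {
		if html[i] != '<' {
			out.WriteByte(html[i])
			i++
			continue
		}
		end := strings.IndexByte(html[i:], '>')
		if end < 0 { // unterminated tag: emit the rest as text
			out.WriteString(html[i:])
			break
		}
		tag := html[i+1 : i+end] // text between '<' and '>'
		i += end + 1
		fields := strings.Fields(tag)
		if len(fields) == 0 {
			continue
		}
		switch strings.ToLower(fields[0]) {
		case "a":
			href = hrefAttr(tag)
		case "/a":
			if href != "" && !strings.HasPrefix(strings.ToLower(href), "javascript:") {
				out.WriteString(" <" + href + ">")
			}
			href = ""
		}
	}
	return out.String()
}

// hrefAttr returns the value of a lowercase, double-quoted href
// attribute, or "" if none is found (an assumption of this sketch).
func hrefAttr(tag string) string {
	const key = `href="`
	start := strings.Index(tag, key)
	if start < 0 {
		return ""
	}
	rest := tag[start+len(key):]
	end := strings.IndexByte(rest, '"')
	if end < 0 {
		return ""
	}
	return rest[:end]
}

func main() {
	fmt.Println(linkText(`click <a href="http://bit.ly/2n4wXRs">news</a>`))
}
```

The suffix is written when the closing tag is seen, so nested inline tags such as `<span>` inside the anchor do not affect it.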

I understand a change like this would be breaking for users and would change the interface, so I am planning to fork the repo to make it.

Do you have advice on how to make this change? It's clear that changes would need to be made here. Any advice is appreciated!

Took a stab at it 😁 PR is here

@k3a The only part I'm unsure of is this line.

Do I need to also look for tagNameLowercase == "a", or will the closing tag always be tagNameLowercase == "/a"?
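
One way to reason about this question is to look at how a tag name might be extracted. The sketch below is an assumption, not the library's code: `tagName` is a hypothetical helper that takes the text between `<` and `>` and returns the first whitespace-delimited token, lowercased, with any leading slash kept. Whether html2text computes `tagNameLowercase` the same way should be confirmed against its source.

```go
package main

import (
	"fmt"
	"strings"
)

// tagName is a hypothetical helper: given the text between '<' and '>',
// it returns the lowercased tag name, keeping a leading slash if present.
func tagName(inner string) string {
	fields := strings.Fields(inner)
	if len(fields) == 0 {
		return ""
	}
	return strings.ToLower(fields[0])
}

func main() {
	for _, inner := range []string{`a href="test"`, `/a`, `/A `} {
		name := tagName(inner)
		fmt.Printf("%q -> %q (opening: %t, closing: %t)\n",
			inner, name, name == "a", name == "/a")
	}
}
```

Under that assumption, an opening anchor yields "a" and a closing anchor yields "/a", so the two comparisons never match the same tag.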

commented

Thanks for your contribution! I integrated it into the master in a backward-compatible way :) and kept the CI as well.

Amazing! Glad I could help