Bug - Converting an <img> tag with a hyphen in src and a src greater than 74 characters adds a newline after the hyphen in the output
mkmoisen opened this issue · comments
I noticed an odd bug when converting an <img> tag containing:
- A hyphen in the src
- A src longer than 74 characters
Converting an <img> tag with a src of 74 characters or less works fine:
>>> # Note the missing "y" in the last word, "supply"
>>> img = '<img src="http://matthewmoisen.com/blog/wp-content/matthew_moisen_tractor_suppl.jpg">'
>>> html2text.html2text(img)
u'![](http://matthewmoisen.com/blog/wp-content/matthew_moisen_tractor_suppl.jpg)\n\n'
>>> # Note the addition of the "y" in the last word, "supply"
>>> img = '<img src="http://matthewmoisen.com/blog/wp-content/matthew_moisen_tractor_supply.jpg">'
>>> html2text.html2text(img)
u'![](http://matthewmoisen.com/blog/wp-\ncontent/matthew_moisen_tractor_supply.jpg)\n\n'
See how a \n character has been added after wp-?
@mkmoisen you might wanna have a look at this #91
The project has been moved to https://github.com/Alir3z4/html2text/
Hi, I have the same problem as you. How did you resolve it? Thanks
@JQ-K Sorry I do not remember.
I got around this issue by avoiding wrapping altogether, using the bodywidth argument:
html2text(html=str(soup), bodywidth=0)
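For anyone wondering why the break lands right after wp-, here is a minimal sketch using only Python's standard textwrap module. It assumes the wrapping behaves like textwrap.wrap at a width of 78; that is my assumption, since the thread does not show the library's wrapping code. The helper names wrap_default and wrap_keep_hyphens and the BODY_WIDTH constant are made up for illustration. Note that it is the whole Markdown line, including the leading ![]( and the closing ), that gets measured against the width, not just the src.

```python
import textwrap

# Assumed wrap width for this sketch; the thread does not state the real value.
BODY_WIDTH = 78


def wrap_default(line, width=BODY_WIDTH):
    """Wrap with textwrap's defaults, which permit a break after a hyphen."""
    return textwrap.wrap(line, width)


def wrap_keep_hyphens(line, width=BODY_WIDTH):
    """Wrap without breaking at hyphens or inside words longer than width."""
    return textwrap.wrap(
        line, width, break_on_hyphens=False, break_long_words=False
    )


# The two Markdown lines from the report: 78 and 79 characters long.
short = "![](http://matthewmoisen.com/blog/wp-content/matthew_moisen_tractor_suppl.jpg)"
long_ = "![](http://matthewmoisen.com/blog/wp-content/matthew_moisen_tractor_supply.jpg)"

print(len(short), wrap_default(short))   # fits in the width, stays on one line
print(len(long_), wrap_default(long_))   # one character too long, split after "wp-"
print(wrap_keep_hyphens(long_))          # left intact even though it is too long
```

If that assumption holds, it would explain why a single extra character triggers the split, and why setting bodywidth=0 (no wrapping at all) sidesteps it.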