Can't split getText() into paragraphs
liusiqi43 opened this issue
Hello, I'm trying to get the main text from articles, but the string I get back has no newline characters. Is there a way to extract the text while retaining all newlines? Otherwise each article ends up as one single paragraph.
Or is there a switch to retain certain HTML tags during extraction, for example all <a> and <br> tags?
By the way, thanks for your great work!
Snacktory uses jsoup under the hood for this. You might look there, but I fear jsoup does not offer an option to preserve newlines.
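
One possible workaround, sketched here with plain Java and no library calls, is to replace `<br>` and closing `</p>` tags with a text marker before extraction and turn the marker back into newlines afterwards. The class name `ParagraphPreserver`, the marker `__NL__`, and the `extractor` parameter are all hypothetical. The extractor stands in for whatever call produces the article text, and the sketch assumes the marker passes through that extraction unchanged, which has not been verified against Snacktory.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

final class ParagraphPreserver {
    // Hypothetical marker: any token unlikely to occur in real article text.
    private static final String MARKER = "__NL__";

    // Matches <br>, <br/>, <br /> and closing </p> tags, case-insensitively.
    private static final Pattern BREAKS =
            Pattern.compile("(?i)<br\\s*/?>|</p\\s*>");

    // Step 1: put a plain-text marker wherever a line break should survive.
    static String markBreaks(String html) {
        return BREAKS.matcher(html).replaceAll(" " + MARKER + " ");
    }

    // Step 3: turn each run of markers (plus surrounding whitespace)
    // back into a single newline and trim the ends.
    static String restoreBreaks(String text) {
        return text.replaceAll("\\s*(?:" + MARKER + "\\s*)+", "\n").trim();
    }

    // Step 2 is the extractor itself, passed in as a function so the
    // sketch does not assume any particular extraction API.
    static String extractWithBreaks(String html,
                                    Function<String, String> extractor) {
        return restoreBreaks(extractor.apply(markBreaks(html)));
    }
}
```

The result can then be split on `"\n"` to get one string per paragraph. Whether this works in practice depends on the extractor keeping the marker text intact, so it is worth testing on a few real articles first.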