How can I restore the content with the extracted metadata
goddyZhao opened this issue
Hi, I tried this tool and it is awesome but I have a question here:
Say I want to extract the main page content (the article) and show a restyled version of it (like Pocket's text view). unfluff only extracts the whole text into the JSON object, with the images and videos split out separately. How can I restore the content?
I don't even know how many paragraphs it has or where the images are positioned in the article. Thanks!
You can't do that with this package, unfortunately. It's not really designed to do that. It's designed to just grab the plain text and discard the original document structure.
You could modify the code to work differently, but that's not really what it's designed to do right now. Sorry :(
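For illustration only, here is a minimal sketch of what "keeping the structure" could look like. It is not unfluff's code or API: it uses Python's standard `html.parser`, and the names `OrderedBlockParser` and `extract_blocks` are hypothetical. It assumes you have already isolated the article's HTML; it does not try to find the main content the way unfluff does. It only records paragraphs and images in the order they appear.

```python
from html.parser import HTMLParser


class OrderedBlockParser(HTMLParser):
    """Hypothetical helper: collect <p> text and <img> sources in document order."""

    def __init__(self):
        super().__init__()
        self.blocks = []            # ordered list of paragraph/image dicts
        self._in_paragraph = False
        self._buffer = []           # text pieces of the paragraph being read

    def _flush(self):
        # Join buffered text, collapse whitespace, and store it if non-empty.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.blocks.append({"type": "paragraph", "text": text})
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()
            self._in_paragraph = True
        elif tag == "img":
            # Flush pending text first so the image keeps its position.
            self._flush()
            src = dict(attrs).get("src")
            if src:
                self.blocks.append({"type": "image", "src": src})

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_paragraph = False

    def handle_data(self, data):
        # Text outside <p> elements is ignored in this sketch.
        if self._in_paragraph:
            self._buffer.append(data)


def extract_blocks(article_html):
    """Return paragraphs and images in the order they occur in article_html."""
    parser = OrderedBlockParser()
    parser.feed(article_html)
    parser.close()
    parser._flush()  # keep text from an unclosed final paragraph
    return parser.blocks


if __name__ == "__main__":
    sample = '<p>Intro text.</p><img src="cover.png"><p>More text.</p>'
    for block in extract_blocks(sample):
        print(block)
```

With an ordered list like this, the paragraph count and each image's position are both available, so a restyled view can be rebuilt block by block. Videos, headings, and nested markup would need their own handling.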
well, ok! Thanks @ageitgey