Usage

Have a peek at the example main.py:

from workspace.zhi2xml import from_url_to_xml, build_tex

urls = [
    # "list the urls which"
    # "you want to download"
    # "here ... for example:"
    "https://zhuanlan.zhihu.com/p/331738362",
    "https://zhuanlan.zhihu.com/p/359082678",
]

for url in urls:
    doc_xml, meta = from_url_to_xml(url)
    basename = '_'.join(meta).replace('/', '_').replace(' ', '_')
    print(basename)
    with open(file= 'xml/' + basename + '.xml', mode= 'w') as f:
        f.write(doc_xml)
    with open(file= 'tex/' + basename + '.tex', mode= 'w') as f:
        f.write(build_tex(doc_xml, meta))

Now run the main.py script and check generated .tex files in tex/ directory, they are almost ready to compile.

PS: If any picture occurred in the original posts, corresponding \figure{} block and \includegraphic{} block will be generated but commented. Download all these pictures according to the comment then uncomment these blocks to revive figures. PPS: html <table> hasn't been supported yet.

MizurenNanako / zhitex

Usage

About

Languages