ChenghaoMou / paper2audio

Convert research papers to audio files.

paper2audio

Convert research papers to audio files. Currently, it reads the paper one paragraph at a time while generating audio files for the entire paper in the cache directory.

Important

The repo is hosted on Codeberg, a true FOSS alternative to GitHub. The GitHub version is a mirror. Please contribute or open issues on Codeberg when possible. See "Why give up on GitHub?"

Features

  • Layout analysis for noise removal (native PDF files only). The following elements are removed by default:
    • Footnote
    • Page-header
    • Page-footer
    • Table
    • Formula
    • Picture
  • Rule based noise reduction:
    • References
    • Citations
  • TTS with GCP (this requires an active GCP project with the TTS feature enabled; make sure you are aware of the cost of doing so)
  • Convert the paper to a beautiful yet minimal HTML file so you can use whatever TTS you prefer
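
The two noise-removal features above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the `(label, text)` block format, the `Text` and `Section-header` labels, the regular expressions, and all function names are assumptions made for illustration. Only the set of removed labels comes from the list above.

```python
import re

# Layout labels listed above as removed by default.
REMOVED_LABELS = {"Footnote", "Page-header", "Page-footer", "Table", "Formula", "Picture"}

# Hypothetical heuristics; the real rules in paper2audio may differ.
NUMERIC_CITATION = re.compile(r"\s*\[\d+(?:\s*[,-]\s*\d+)*\]")      # e.g. " [1, 2]"
AUTHOR_YEAR_CITATION = re.compile(r"\s*\([A-Z][^()]*?\d{4}[a-z]?\)")  # e.g. " (Chen et al., 2024)"
REFERENCES_HEADING = re.compile(r"^(references|bibliography)$", re.IGNORECASE)


def drop_noisy_blocks(blocks):
    """Keep only (label, text) blocks whose label is not in REMOVED_LABELS."""
    return [(label, text) for label, text in blocks if label not in REMOVED_LABELS]


def strip_citations(text):
    """Remove inline numeric and author-year citations from one paragraph."""
    text = NUMERIC_CITATION.sub("", text)
    return AUTHOR_YEAR_CITATION.sub("", text)


def cut_references(paragraphs):
    """Drop everything from the first 'References' heading onward."""
    for index, paragraph in enumerate(paragraphs):
        if REFERENCES_HEADING.match(paragraph.strip()):
            return paragraphs[:index]
    return list(paragraphs)


def clean(blocks):
    """Apply layout filtering, then the rule-based steps, in that order."""
    paragraphs = [text for _, text in drop_noisy_blocks(blocks)]
    return [strip_citations(p) for p in cut_references(paragraphs)]
```

Filtering by layout label first keeps the regular expressions simple, since tables and formulas never reach the citation rules.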

Usage

Convert a paper to audio files

python -m paper2audio to-audio "/Users/chenghao/Zotero/storage/QFKMKFMV/Chen et al. - 2024 - Orion-14B Open-source Multilingual Large Language Models.pdf"

Convert a paper to an HTML file

python -m paper2audio to-html "/Users/chenghao/Zotero/storage/QFKMKFMV/Chen et al. - 2024 - Orion-14B Open-source Multilingual Large Language Models.pdf" --output "output.html"
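
To convert a whole folder of papers, the `to-html` command shown above can be wrapped in a small script. The following is a sketch under stated assumptions: the helper names and the `papers` and `html` directory names are hypothetical, and it relies only on the `to-html` subcommand and `--output` option documented above.

```python
import subprocess
import sys
from pathlib import Path


def build_to_html_command(pdf_path, output_dir):
    """Build the argument list for one `to-html` invocation."""
    pdf_path = Path(pdf_path)
    # Name the output after the PDF, e.g. "paper.pdf" -> "<output_dir>/paper.html".
    output = Path(output_dir) / (pdf_path.stem + ".html")
    return [
        sys.executable, "-m", "paper2audio", "to-html",
        str(pdf_path), "--output", str(output),
    ]


def convert_all(pdf_dir, output_dir):
    """Run `to-html` once per PDF found directly inside pdf_dir."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True stops the batch at the first failing conversion.
        subprocess.run(build_to_html_command(pdf_path, output_dir), check=True)
```

Calling `convert_all("papers", "html")` would then process every PDF in `papers`. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, such as the Zotero path above.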

Examples

Feel free to check out the HTML output in examples. Here are some example preview links:

  • Orion-14B: Open-source Multilingual Large Language Models (HTML, PDF)

Acknowledgement

Thanks to pierreguillou for the layout model.

License

MIT
