sharad1126 / compressor

Because we don't have enough time to read everything

Compressor

Because we do not have time to read everything.

Daily arXiv summaries: https://yobibyte.github.io/arxiv_compressed.html

Compressor is an LLM-based project for summarising scientific literature and talks, started by yobibyte. It relies heavily on llama.cpp and HuggingFace models.

Compressor is under active development; you are entering uncharted waters when using it.

I would be happy to receive any feedback or feature requests, and please send PRs.

Use cases

  1. Get an arXiv link, summarise.

  2. Get all papers submitted to arXiv on a given date (usually today). Summarise each.

  3. Get a PDF, summarise. Not yet implemented.

  4. Get the audio of a talk, get a transcript, summarise. WIP.

  5. Summarise all papers accepted to some conference on OpenReview.

  6. Summarise all talks of a particular conference. Future plans.

Architecture

Crawler -> Compressor -> Reporter
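
The three stages above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the names `Paper`, `Summary`, `crawl`, `compress`, `report` and `run_pipeline` are hypothetical, the crawler returns a hard-coded entry instead of fetching anything, and `first_sentence` stands in for the LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Paper:
    title: str
    abstract: str


@dataclass
class Summary:
    title: str
    text: str


def crawl() -> List[Paper]:
    # Hypothetical crawler: a real one would fetch arXiv or OpenReview entries.
    return [Paper(title="An Example Paper", abstract="We study X. We find Y.")]


def first_sentence(text: str) -> str:
    # Placeholder for the LLM call: keeps only the first sentence.
    return text.split(". ")[0].rstrip(".") + "."


def compress(paper: Paper, summarise: Callable[[str], str]) -> Summary:
    # Compressor stage: turn one crawled paper into one summary.
    return Summary(title=paper.title, text=summarise(paper.abstract))


def report(summaries: Iterable[Summary]) -> str:
    # Reporter stage: render summaries as plain text, one per line.
    return "\n".join(f"{s.title}: {s.text}" for s in summaries)


def run_pipeline(summarise: Callable[[str], str] = first_sentence) -> str:
    # Crawler -> Compressor -> Reporter
    return report(compress(paper, summarise) for paper in crawl())
```

Passing the summariser in as a function keeps the stages independent, so the placeholder can be swapped for a real model call without touching the crawler or the reporter.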

Big TODOs

  • The current version summarises from abstracts only. Add full-text support.
  • Better exception handling. Right now, postprocessing of LLM outputs fails from time to time, which requires rerunning the compressor.
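
One possible direction for the exception-handling item is to retry the generation when postprocessing fails, instead of rerunning the whole compressor. The sketch below assumes, purely for illustration, that the model is asked to return JSON with a `summary` key; the real output format is not described here. `summarise_with_retry` and `generate` are hypothetical names.

```python
import json
from typing import Callable, Optional


def summarise_with_retry(
    generate: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call `generate` until its output parses as the assumed JSON format."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as err:
            # Malformed output: remember the error and ask the model again.
            last_error = err
            continue
        if isinstance(parsed, dict) and "summary" in parsed:
            return parsed
        last_error = ValueError("output has no 'summary' key")
    raise RuntimeError(
        f"no usable output after {max_attempts} attempts"
    ) from last_error
```

With this shape, a single bad generation costs one extra model call for that paper, and a paper that keeps failing raises one clearly labelled error that the caller can log and skip.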

About

License: MIT License


Languages

Python 96.0%, Shell 4.0%