raceychan / learncpp_pdf

An advanced tool that instantly turns "www.learncpp.com" into a PDF book (in a minimalistic style)

LearnCPP_PDF

Disclaimer

All content comes directly from the learncpp.com website. No content is changed; only some decorative elements and the comment section are removed for better readability.

Please consider supporting the website here: learncpp-about

The website specifically states that a PDF version should not be distributed by anyone and that people should instead make a PDF on their own. This tool was created for that purpose.

Usage

1. Clone the repo

git clone 'git@github.com:raceychan/learncpp_pdf.git'

2. cd into the project folder

cd learncpp_pdf

3. Install wkhtmltopdf. For Ubuntu/Debian users:

sudo apt-get install wkhtmltopdf

If you use a different OS, see the wkhtmltopdf-download page for details.

4. Run the application

make install && make run

Configuration

You can create a '.env' file under the project root; the program will read the settings below from it.

key                        type   default
DOWNLOAD_CONCURRENT_MAX    int    200
COMPUTE_PROCESS_MAX        int    os.cpu_count()
COMPUTE_PROCESS_TIMEOUT    int    60
PDF_CONVERTION_MAX_RETRY   int    3
BOOK_NAME                  str    'learncpp.pdf'
REMOVE_CACHE_ON_SUCCESS    bool   False

Note: setting DOWNLOAD_CONCURRENT_MAX to a higher number might boost download speed, but some requests might fail because it puts more pressure on the website.
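
To illustrate how the settings above could be overridden, here is a minimal sketch that overlays '.env'-style text on the documented defaults. The names SCHEMA, parse_env_text and load_settings are hypothetical, and the parsing rules (KEY=VALUE lines, optional quotes, '#' comments) are assumptions; the project's real loader may behave differently.

```python
import os

# Keys, types and defaults are taken from the table above.
# Everything else in this sketch is an assumption for illustration.
SCHEMA = {
    "DOWNLOAD_CONCURRENT_MAX": (int, 200),
    "COMPUTE_PROCESS_MAX": (int, os.cpu_count()),
    "COMPUTE_PROCESS_TIMEOUT": (int, 60),
    "PDF_CONVERTION_MAX_RETRY": (int, 3),
    "BOOK_NAME": (str, "learncpp.pdf"),
    "REMOVE_CACHE_ON_SUCCESS": (bool, False),
}


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip().strip("'\"")
    return pairs


def load_settings(env_text=""):
    """Start from the defaults and overlay any known keys found in env_text."""
    settings = {key: default for key, (_, default) in SCHEMA.items()}
    for key, raw in parse_env_text(env_text).items():
        if key not in SCHEMA:
            continue  # ignore keys the table does not list
        kind = SCHEMA[key][0]
        if kind is bool:
            settings[key] = raw.lower() in ("1", "true", "yes")
        else:
            settings[key] = kind(raw)
    return settings


# Example '.env' content that lowers the download concurrency.
example_env = """
# be gentle with the website
DOWNLOAD_CONCURRENT_MAX=50
BOOK_NAME='cpp_book.pdf'
REMOVE_CACHE_ON_SUCCESS=True
"""
settings = load_settings(example_env)
```

Keys that are absent from the '.env' text keep their defaults, so a file with a single line is enough to change one setting.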

CLI

You can use the CLI with the following options to force-redo an action.

pixi run python -m book --help
options:
  -h, --help      show this help message and exit
  -D, --download  Downloading articles from learcpp.com, ignore cache
  -C, --convert   Converting downloaded htmls to pdfs, ignore cache
  -M, --merge     Merging Chapters into a single book, ignore cache
  -R, --rmcache   Remove the cache folder
  -A, --all       Download, convert and merge
  -S, --showerrors show error log in the console

Example: re-run the convert process and remove the cache folder:

pixi run python -m book --convert --rmcache

If no option is specified, all actions are run (the cache is used to avoid unnecessary requests).
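
The option handling described above can be sketched with argparse. The flags mirror the help output; build_parser and selected_actions are hypothetical names, and the rule that a bare --rmcache does not trigger the other actions is an assumption, not confirmed behaviour of the tool.

```python
import argparse

ACTIONS = ("download", "convert", "merge")


def build_parser():
    """Recreate the documented flags; help texts are shortened."""
    parser = argparse.ArgumentParser(prog="book")
    parser.add_argument("-D", "--download", action="store_true",
                        help="download articles, ignore cache")
    parser.add_argument("-C", "--convert", action="store_true",
                        help="convert downloaded htmls to pdfs, ignore cache")
    parser.add_argument("-M", "--merge", action="store_true",
                        help="merge chapters into a single book, ignore cache")
    parser.add_argument("-R", "--rmcache", action="store_true",
                        help="remove the cache folder")
    parser.add_argument("-A", "--all", action="store_true",
                        help="download, convert and merge")
    parser.add_argument("-S", "--showerrors", action="store_true",
                        help="show error log in the console")
    return parser


def selected_actions(argv):
    """Return (list of actions to run, whether to remove the cache)."""
    args = build_parser().parse_args(argv)
    chosen = [name for name in ACTIONS if getattr(args, name)]
    # Assumption: --all, or no action flag at all, means "run everything".
    if args.all or (not chosen and not args.rmcache):
        chosen = list(ACTIONS)
    return chosen, args.rmcache


# Mirrors the example command: --convert --rmcache
actions, remove_cache = selected_actions(["--convert", "--rmcache"])
```

Keeping the decision in one small function makes the "no option means everything" rule easy to check without running any download.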

Use-Tips

  • The download and/or convert process might fail for various reasons, for example when the target site is overloaded. In most cases, simply re-running the program solves the problem. If you think it is a bug, feel free to post an issue.

  • You might want to compress the PDF book to improve performance and save storage; check out pdfsizeopt.
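
Since transient failures are expected, a retry loop around the conversion step is one way to absorb them before a manual re-run is needed. The sketch below is an assumption about how a setting like PDF_CONVERTION_MAX_RETRY could be applied; convert_with_retry and wkhtmltopdf_convert are hypothetical helpers, and the limit is treated here as the total number of attempts.

```python
import subprocess


def wkhtmltopdf_convert(html_path, pdf_path, timeout=60):
    """Run the wkhtmltopdf command once; raises on a non-zero exit or timeout."""
    subprocess.run(
        ["wkhtmltopdf", str(html_path), str(pdf_path)],
        check=True,
        timeout=timeout,
    )


def convert_with_retry(html_path, pdf_path, convert=wkhtmltopdf_convert, max_retry=3):
    """Call convert up to max_retry times and return the attempt that succeeded."""
    last_error = None
    for attempt in range(1, max_retry + 1):
        try:
            convert(html_path, pdf_path)
            return attempt
        except (subprocess.SubprocessError, OSError) as exc:
            last_error = exc  # remember the failure and try again
    raise RuntimeError(f"conversion failed after {max_retry} attempts") from last_error
```

The convert parameter is injectable so the loop can be exercised with a fake converter, without wkhtmltopdf installed.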

Features

  • Ultra fast: uses concurrency for scraping and parallelism for making PDFs; the whole process is expected to finish within a few minutes.
  • Rich CLI showing real-time progress of the application.
  • Cache on fail: you can just re-run the application without worrying about redundant IO or calculation.
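
As a rough illustration of the first and third features, the sketch below limits the number of in-flight downloads with a semaphore and skips pages that are already cached. It is not the project's actual code: fetch_all, fake_fetch and the in-memory dict used as a cache are hypothetical, and the download is simulated with a short sleep instead of a real request. The parallel PDF conversion step is not shown.

```python
import asyncio


async def fetch_all(urls, fetch, cache, concurrent_max=200):
    """Fetch every URL missing from cache, at most concurrent_max at a time."""
    semaphore = asyncio.Semaphore(concurrent_max)

    async def fetch_one(url):
        if url in cache:
            return  # cache hit: no redundant IO
        async with semaphore:
            cache[url] = await fetch(url)

    await asyncio.gather(*(fetch_one(url) for url in urls))
    return cache


async def demo():
    in_flight = 0
    peak = 0
    fetched = []

    async def fake_fetch(url):
        # Stand-in for an HTTP request; tracks how many run at once.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        fetched.append(url)
        return f"<html>{url}</html>"

    cache = {"page-0": "<html>cached</html>"}  # left over from an earlier run
    urls = [f"page-{i}" for i in range(10)]
    await fetch_all(urls, fake_fetch, cache, concurrent_max=3)
    return peak, len(fetched), cache


peak, fetched_count, cache = asyncio.run(demo())
```

A lower concurrent_max trades speed for fewer failed requests, which matches the note on DOWNLOAD_CONCURRENT_MAX in the Configuration section.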

Alternatives

This alternative does not use concurrent requests or multiprocessing, so it takes substantially more time to do the job.

About

License: GNU General Public License v3.0


Languages

Python 99.3%, Makefile 0.7%