Tools and data for improving some (specific) PDF files. The command pdfstuff here might be useful for those who want to add tocs to existing PDFs. == How to use == $ apt install build-essential wine ghostscript libpodofo-dev Windows NT Programmer's Reference: - Get the SDK ISO here https://winworldpc.com/product/windows-sdk-ddk/nt-3x - Copy the DOC/SDK/WIN32API directory to source/WIN32API (or if you have bsdtar installed you can just place Disk01.iso in source and run make extract) - Run make Windows 1.03 SDK Docs: - Get the documentation PDFs from the same place as NT - Copy the files into source/win103/ - Run make win103 MS-DOS 2.0 Programmer's Reference: - curl -o source/Microsoft_Programmers_Reference_Manual_MSDOS_2.0.pdf "http://www.bitsavers.org/pdf/microsoft/dos/Microsoft_Programmers_Reference_Manual_MSDOS_2.0.pdf" - make dos2.pdf C90 standard: - Place it here: source/ansi-iso-9899-1990-1.pdf - Run make c90.pdf Dijkstra's Cooperating sequential processes: - curl -o source/EWD123.pdf https://www.cs.utexas.edu/users/EWD/ewd01xx/EWD123.PDF - make ewd123.pdf K&R C: - curl -Lo "source/The C Programming Language First Edition [UA-07].pdf" "https://archive.org/download/TheCProgrammingLanguageFirstEdition/The%20C%20Programming%20Language%20First%20Edition%20%5BUA-07%5D.pdf" make tcpl - make knrc1.pdf == pdfstuff usage == The program runs commands from its arguments from left to right. List of commands: --debug : enable debug output --read [file] : read a pdf file into memory --write [file] : write the current file --append [file] : append pages from another file to the current one (acts as --read if you don't have one open yet) --title [title] : sets the document title --num-dump : output existing num to stdout --num [file] : reads page labels from a file --toc-dump : output existing toc to stdout --toc-clear : removes the toc (must be used before --toc is used) --toc [file] : reads toc from a file --pagemode [mode] : set pagemode --box n l b w h : same as a podofobox command num files look like: prefix<TAB>page number<TAB>page index Where page index is 1-based index of the page, and page number is the user-visible page number. You only need to specify the start of each range, everything else will be extrapolated. Supported number types types: none (only prefix is used) 28 decimal XXVIII uppercase roman xxviii lowercase roman AB uppercase letters ab lowercase letters See section 7.3.1 of PDF 1.3 standard for more details. toc files looks like: <a bunch of tabs>title<TAB>page number The amount of tabs at the start indicates the depth of a given entry. The page number is the user visible one, and relies on the loaded num file. If you don't have that, it'll default to 1-based index. == Useful stuff == The following oneliner was used for extracting the TOC cat expand/WIN32API/VOLI/FRONT1.PS | sed -n 's/^[^(]*(\([^.].*\)).*/\1/p' | awk 'BEGIN{x=0} /\/93$/{x=0} x==1{print} /Contents/{x=1}' Vim users might benefit from the following. :set spell list ft=