# SuttaCentral books: make HTML, EPUB, PDF

## Requirements

- Python 3.10
- Docker and docker-compose
## Setup

Clone the repo, then install the pre-commit git hooks:

- First install the dependencies (the libraries responsible for formatting). You will find them in `Makefile#lint`, e.g.:

  ```shell
  pip install black isort mypy bandit autoflake
  ```

- Then install the actual pre-commit hook:

  ```shell
  # Go to the project root
  cd publications
  # Make sure the file is executable
  chmod +x pre-commit
  # Install it
  cp pre-commit .git/hooks
  ```
## Development

The development image mounts the project root into the container:

```shell
# Build the dev docker image
make build IMAGE_TARGET=development

# Run a dev console to fiddle
make run-bash
```

## Tests

```shell
# Build the dev image first if needed:
make build IMAGE_TARGET=development
# then run:
make test
# or
make test-ci
```

## Lint code

```shell
# Build the dev image first if needed:
make build IMAGE_TARGET=development
# then run:
make lint
```
## Run

```shell
# Go to the project root
cd publications
# Run the script with the given args
make run <personal_access_token> <publication_number>
```
## Dependency management

The project uses `pip-tools` to handle the dependencies pinned in `requirements/*.txt` files. To manage requirements you need `pip-tools` installed in your environment (or run the dev docker container: `make build IMAGE_TARGET=development; make run-dev bash`). The package names used in this project are stored in `requirements/*.in` files.

See the `pip-tools` guide on how to resolve conflicts and/or update dependencies.

To add a dependency:

- Add the package name to the suitable `requirements/*.in` file
- Run this command to propagate the change and pin the package version:

  ```shell
  # Compile requirements/*.txt files based on *.in file content
  make compile-deps
  ```

To update all packages, periodically re-run:

```shell
make recompile-deps
make sync-deps
```
## Docker images

```shell
# Build the production docker image
make build IMAGE_TARGET=production
# Push the production docker image to GitHub
make push-docker-image IMAGE_TARGET=production

# Build the development docker image
make build IMAGE_TARGET=development
# Push the development docker image to GitHub
make push-docker-image IMAGE_TARGET=development
```
- The project uses heavy `texlive` packages to generate books and covers, therefore we decided to use a prebuilt Docker image for production.
- Any `.tex` template or the `.env_public` file can be updated between application runs. There is an entrypoint in `Dockerfile` (production stage) to ensure that the latest project source files from the GitHub repo are used.
## GitHub Actions

The project can be run via GitHub Actions in three ways:

1. As a scheduled cron job (currently every Monday). It uses the `suttacentral/sc_data/bilara_data` repo to detect whether any publication included in the `suttacentral.net/api/publication/editions` response was modified since the previous run. The detector uses the `EDITION_FINDER_PATTERNS` environment variable to match files to specific editions.
2. Manually by the user, without any input. The app automatically detects modified editions as above (see point 1).
3. Manually by the user, with input containing publication number(s). The app accepts a single value or a list of values separated by commas, for example: `scpub1` or `scpub1,scpub2,scpub3`.
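Splitting the manual input from point 3 can be sketched as follows (a minimal illustration; the function name `parse_publication_numbers` is a hypothetical stand-in, not the app's actual code):

```python
def parse_publication_numbers(raw: str) -> list[str]:
    """Split a workflow input like 'scpub1,scpub2,scpub3' into publication numbers."""
    return [part.strip() for part in raw.split(",") if part.strip()]
```

A single value such as `scpub1` simply yields a one-element list.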
## Publications

- Each publication may have different editions (currently HTML, EPUB and PDF are supported). The mapping of finished publications: `suttacentral.net/api/publication/editions`
- Each edition has its own JSON config, `suttacentral.net/api/publication/edition/<edition_id>`, which contains:
  - basic information about the publication's author, language, title, etc.
  - the number and details of individual volumes
  - the order and content of individual matters included in the frontmatter, mainmatter and backmatter
  - the depth of the main table of contents, and of a secondary table of contents if specified
- There are a few kinds of front and back matters:
  - `./matter/<name>.html` files are processed in unchanged form
  - others (e.g. `titlepage`, `halftitlepage`, etc.) use jinja2 templates included in `src/sutta_publisher/templates/html/`
  - `main-toc` is generated after the mainmatter is ready and uses a jinja2 template
- Mainmatter parts are composed of segments taken from the SuttaCentral API. There are two types of segments:
  - `branch` - segments which have no actual content. These are the headings (titles) of whole books, parts, chapters, etc. included in a given mainmatter part
  - `leaf` - segments with actual content: HTML markup, main text verses, notes, references
- The SuttaCentral API does not provide any information about the depth of a given segment. The structure is strictly based on `super_tree.json` and the adequate `<text_uid>-tree.json`.
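The branch/leaf split above can be illustrated with a short sketch (the segment dictionaries and the function are hypothetical illustrations, not the project's actual data model):

```python
def render_segment(segment: dict, depth: int) -> str:
    """Render one mainmatter segment: a branch yields only a heading at the
    given tree depth, while a leaf carries its actual content through."""
    if segment["type"] == "branch":
        level = min(depth, 6)  # HTML only supports h1..h6
        return f"<h{level}>{segment['title']}</h{level}>"
    return segment["content"]
```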
## LaTeX configuration

Since the `pylatex` package adds default `documentclass` and `document_options` commands to the files it generates, this config must be changed via the `LATEX_DOCUMENT_CONFIG` variable in the `.env_public` file.
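For context, `pylatex` writes a preamble line of this shape at the top of each generated file (the class and options here are illustrative values, not the project's actual configuration):

```latex
\documentclass[11pt,twoside]{book}
```

`LATEX_DOCUMENT_CONFIG` is what controls which class and options end up in that line.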
SuttaCentral's custom styling classes like `namo`, `uddana-intro`, `pli`, `san` can be added to the project via the following variables included in the `.env_public` file:

- `SANSKRIT_LANGUAGES`: `class="pli"` --> `\textsanskrit`
- `FOREIGN_SCRIPT_MACRO_LANGUAGES`: `lang="lzh"` --> `\langlzh`
- `STYLING_CLASSES`: `class="uddana-intro"` --> `\scuddanaintro` (please note that `sc` is added at the beginning and the `-` hyphen is removed)
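The `STYLING_CLASSES` renaming rule can be sketched as follows (the function name is a hypothetical illustration of the documented rule, not the project's actual code):

```python
def styling_class_to_macro(css_class: str) -> str:
    """Map a SuttaCentral styling class to its LaTeX macro name:
    prepend 'sc' and drop hyphens, e.g. 'uddana-intro' -> '\\scuddanaintro'."""
    return "\\sc" + css_class.replace("-", "")
```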
If a given edition's depth does not allow using LaTeX sections, we can force the project to use LaTeX chapters via the `TEXTS_WITH_CHAPTER_SUTTA_TITLES` variable in the `.env_public` file.
Titles with IDs ending in `pannasaka` are converted to custom LaTeX `\pannasa` commands. Additional branches can be added via the `ADDITIONAL_PANNASAKA_IDS` variable in the `.env_public` file.