rstudio / tinytex

A lightweight, cross-platform, portable, and easy-to-maintain LaTeX distribution based on TeX Live

Home Page: https://yihui.org/tinytex/


Automatically install hyphenation patterns when using polyglossia

svraka opened this issue

Commit ceb2001 added support for automatic installation of some babel language packages. The same issue arises with polyglossia, used with XeLaTeX, as polyglossia only throws warnings if hyphenation patterns are not available.

Here's a minimal reproducible document:

\documentclass{article}

\usepackage{polyglossia}
\setmainlanguage{hungarian}

\begin{document}

\dots

\end{document}

And here's the relevant part from the logs:

Package polyglossia Warning: No hyphenation patterns were loaded for `hungarian
'
(polyglossia)                I will use \language=\l@nohyphenation instead on i
nput line 10.

In this case the required package for hyphenation patterns is hyphen-hungarian, which can be deduced from the logs. I haven't tested all languages but my understanding is that it should work for most languages.
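As a rough illustration of that deduction, here is a minimal Python sketch (not the R code used in tinytex). The function name `missing_hyphen_packages` and the regular expression are assumptions based only on the warning text quoted above, and the `hyphen-<language>` naming rule is the guess described in this issue, so it will not hold for every language. The character class `[^']+` also matches a newline, which lets the sketch cope with a language name that TeX has wrapped onto the next log line.

```python
import re

# Assumed pattern, modelled on the polyglossia warning quoted above.
# [^']+ also matches "\n", so a name split by TeX's log wrapping is captured.
WARNING = re.compile(r"No hyphenation patterns were loaded for `([^']+)'")


def missing_hyphen_packages(log_text):
    """Return guessed hyphen-* package names for polyglossia warnings."""
    packages = []
    for match in WARNING.finditer(log_text):
        # Remove the line break that TeX may have inserted inside the name.
        language = match.group(1).replace("\n", "")
        package = "hyphen-" + language  # naming rule is an assumption
        if package not in packages:
            packages.append(package)
    return packages


log = (
    "Package polyglossia Warning: No hyphenation patterns were loaded for `hungarian\n"
    "'\n"
    "(polyglossia)                I will use \\language=\\l@nohyphenation instead on i\n"
    "nput line 10.\n"
)
print(missing_hyphen_packages(log))  # ['hyphen-hungarian']
```

A warning that does not quote the language name between a backtick and an apostrophe is simply not matched by this pattern.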

English seems to be a special case, though, if you want to use British spelling.

Package polyglossia Warning: No hyphenation patterns were loaded for British En
glish
(polyglossia)                I will use the patterns for US English instead on 
input line 187.

This requires the hyphen-english package. However, according to its description:

Additional hyphenation patterns for American and British English in ASCII encoding. The American English patterns (usenglishmax) greatly extend the standard patterns from Knuth to find many additional hyphenation points. British English hyphenation is completely different from US English, so has its own set of patterns.

I would argue TinyTeX should even include this package by default, as it would be useful for anyone writing in English. The size is 201k.


By filing an issue to this repo, I promise that

  • I have fully read the issue guide at https://yihui.org/issue/.
  • I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('tinytex'). I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: remotes::install_github('yihui/tinytex').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

In this case the required package for hyphenation patterns is hyphen-hungarian, which can be deduced from the logs

That sounds simple enough to implement (should be similar to ceb2001). Do you want to try a pull request?

English seems to be a special case, though, if you want to use British spelling.

It seems the implementation won't be very clean. I tend to let users manually install the package.

I would argue TinyTeX should even include this package by default, as it would be useful for anyone writing in English. The size is 201k.

I'm not familiar with polyglossia or the hyphen-* packages, and I'm not sure if they should be included by default. The size doesn't look bad, though.

Thanks!

That sounds simple enough to implement (should be similar to ceb2001). Do you want to try a pull request?

Sure, I can give it a crack in ~2-3 weeks after I come back from holiday.

I'm not familiar with polyglossia or the hyphen-* packages, and I'm not sure if they should be included by default. The size doesn't look bad, though.

If I'm not mistaken, the hyphen-* packages are not only used by polyglossia but babel too. Maybe even if one doesn't load babel?

Just started digging into this, and immediately ran into a problem: LaTeX wraps logfiles. It came up in my original examples but there are even longer language names (not to mention what to do with English variants).

Package polyglossia Warning: No hyphenation patterns were loaded for `portugues
e'

Warnings for babel are similar, and the current regex looks for the same pattern but apparently line breaks cannot occur there?

Have you ever had to deal with line breaks in logfiles? I haven't found any logic for that in the package. LaTeX can be told to wrap only at a much longer line length by raising the max_print_line value in texmf.cnf, but that requires a change either to the TinyTeX installation or to each call to the LaTeX engine (see this TeX.SE answer and comments).
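For comparison, the alternative of undoing the wrapping after the fact could look roughly like the Python sketch below. It is only a heuristic under stated assumptions: it assumes the default max_print_line of 79 characters, and the helper name `unwrap_log` is hypothetical. A line that genuinely ends at exactly 79 characters would be joined to the next one by mistake, which is one reason raising max_print_line is the more robust route.

```python
def unwrap_log(log_text, width=79):
    """Rejoin log lines that TeX hard-wrapped at `width` characters.

    Heuristic: a line of exactly `width` characters is assumed to continue
    on the following line.
    """
    joined = []
    pending = ""
    for line in log_text.split("\n"):
        pending += line
        if len(line) == width:
            continue  # probably wrapped, so keep accumulating
        joined.append(pending)
        pending = ""
    if pending:
        joined.append(pending)
    return "\n".join(joined)


wrapped = (
    "Package polyglossia Warning: No hyphenation patterns were loaded for `portugues\n"
    "e'\n"
)
print(unwrap_log(wrapped))
```

After unwrapping, a single-line regex for the warning can be applied without special handling for line breaks.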

We can configure max_print_line in TinyTeX to a large value, e.g.,

tlmgr conf texmf max_print_line 10000

We need to run that in these places (after the lines):

https://github.com/yihui/tinytex/blob/10dd9361a97e6c1d56e0b42c58f08c5dd9db45c4/R/install.R#L382-L383

https://github.com/yihui/tinytex/blob/10dd9361a97e6c1d56e0b42c58f08c5dd9db45c4/tools/install-base.sh#L48

https://github.com/yihui/tinytex/blob/10dd9361a97e6c1d56e0b42c58f08c5dd9db45c4/tools/install-windows.bat#L59

@svraka You can include these changes in your PR or send a separate PR. If you are not familiar with the codebase of tinytex, I can also let @cderv do it since it's simple enough. Thanks!


Just for the record, it's also possible to leave texmf.cnf alone and set max_print_line on the command line via the -cnf-line argument, e.g.,

pdflatex -cnf-line=max_print_line=10000 test.tex

Personally I prefer LaTeX not to hard-wrap the log because that makes the log slightly trickier to parse. I tend to just change max_print_line permanently in the config file. If users do not like it, they can delete this config with:

tlmgr conf texmf --delete max_print_line

Since I haven't heard back, I just added the max_print_line config by myself in 8e6dc53.

Thanks! Unfortunately I didn't have any time to work on a PR, but I hope to pick it up soon.

That's okay. I just did the work by myself. You may test the current dev version if you are interested. Thanks!