koalaman / shellcheck

ShellCheck, a static analysis tool for shell scripts

Home Page:https://www.shellcheck.net

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

new Rule: Avoid journalctl unit filtering

Dmole opened this issue · comments

commented

For new checks and feature suggestions

[Edited to reflect further testing]

Avoid using glob in journalctl -u --unit as it can be 3 orders of magnitude slower.
(4 orders of magnitude slower than grepping a log file)

Use the full service name

journalctl -u $SERVICE

as listed in

systemctl list-units --type=service | grep $APP

instead of a glob

journalctl -u $APP*

matching regex:

grep -P 'journalctl[^|>)]* -(u|-unit) [^ |>)]*\*'

Pretty niche feature ey?

There has been a number of requests like this where someone want to see one command or other get special treatment.
Maybe someone can propose a way to disable a rule given a certain command and let teams have that shellcheck setting in a shared repository to make their life easier or something.

commented

systemd is standard, not niche.
Rules are already easy to disable, but there is no reason to ever disable this rule.

I think you understand what I mean, but for others:
systemd is not part of any shell internals and have it standout in front of other external programs it need a case. To me it lacks one really.

commented

I don't understand what you mean.

There is no such thing as "shell internals", there are "shell builtins" and POSIX,
but the code and goals of ShellCheck do not limit themselves to any minimal set,
and explicitly encourage addressing "corner cases" even less likely to be hit than this.

Ok teacher. Present the PR and cross your fingers.

I don't understand what you mean.

I understood what he meant.

On my system, journalctl -u cron > /dev/null takes 1 second while journalctl | grep cron > /dev/null takes 48 seconds.

The log is mostly ssh bot traffic, and journalctl -u ssh > /dev/null takes 63 seconds while journalctl | grep ssh > /dev/null takes 51 seconds, which is a slight improvement but nowhere near 2 orders of magnitude.

Is this a known issue or something you discovered? It's expected that whichever command you try first will be orders of magnitude slower than the second due to disk caching, so does it hold up under repeated timing?

commented

@koalaman It looks like there are at least 5 ways to get logs ranging from 139.021s to 0.050s.

cron and ssh are system services which journalctl optimizes, services that are not system (or journalctl is unsure of) like postfix* (or anything with a glob) are slow:

# time journalctl -u postfix* | grep -c .
47065
real	2m19.021s

# time journalctl | grep " postfix/" | grep -c .
47484
real	0m20.148s

# time journalctl --system -u postfix* | grep -c .
47065
real	0m7.845s

time journalctl -u postfix@-.service  | grep -c .
47121
real	0m0.727s

# time cat /var/log/maillog* | grep -c " postfix/"
194559
real	0m0.050s

So it looks like glob is the worst offender that should be avoided or at least mitigated with --system.

commented

Relatedly the --quiet option should always be used in scripts to avoid the "-- No entries --" spam.

I don't know much about systemd, but grep is obviously less robust and gives different answers in all your example queries. From what I can tell from the source code and setting SYSTEMD_LOG_LEVEL=debug, journalctl already expands the glob into matching units rather than doing a different type of scan; -u ssh.service -u ssh.socket produces the same filter expression as journalctl -u 'ssh*'. It also doesn't seem like ShellCheck should be suggesting that the user only ever search the system log.

If journalctl's filtering is objectively suboptimal, then that should likely be fixed on the journalctl side instead.

Thanks for the suggestion!