new Rule: Avoid journalctl unit filtering

Question

new Rule: Avoid journalctl unit filtering

Dmole opened this issue 6 months ago · comments

For new checks and feature suggestions

https://www.shellcheck.net/ (i.e. the latest commit) currently gives no useful warnings about this
I searched through https://github.com/koalaman/shellcheck/issues and didn't find anything related

[Edited to reflect further testing]

Avoid using glob in journalctl -u --unit as it can be 3 orders of magnitude slower.
(4 orders of magnitude slower than grepping a log file)

Use the full service name

journalctl -u $SERVICE

as listed in

systemctl list-units --type=service | grep $APP

instead of a glob

journalctl -u $APP*

matching regex:

grep -P 'journalctl[^|>)]* -(u|-unit) [^ |>)]*\*'

Martin Bagge / brother · Answer 1 · Wed Apr 03 2024 15:26:37 GMT+0800 (China Standard Time)

Pretty niche feature ey?

There has been a number of requests like this where someone want to see one command or other get special treatment.
Maybe someone can propose a way to disable a rule given a certain command and let teams have that shellcheck setting in a shared repository to make their life easier or something.

Tim · Answer 2 · Wed Apr 03 2024 20:08:48 GMT+0800 (China Standard Time)

systemd is standard, not niche.
Rules are already easy to disable, but there is no reason to ever disable this rule.

Martin Bagge / brother · Answer 3 · Wed Apr 03 2024 22:09:49 GMT+0800 (China Standard Time)

I think you understand what I mean, but for others:
systemd is not part of any shell internals and have it standout in front of other external programs it need a case. To me it lacks one really.

Tim · Answer 4 · Wed Apr 03 2024 22:38:35 GMT+0800 (China Standard Time)

I don't understand what you mean.

There is no such thing as "shell internals", there are "shell builtins" and POSIX,
but the code and goals of ShellCheck do not limit themselves to any minimal set,
and explicitly encourage addressing "corner cases" even less likely to be hit than this.

Martin Bagge / brother · Answer 5 · Thu Apr 04 2024 17:10:23 GMT+0800 (China Standard Time)

Ok teacher. Present the PR and cross your fingers.

JonasDoe · Answer 6 · Fri Apr 05 2024 02:07:16 GMT+0800 (China Standard Time)

I don't understand what you mean.

I understood what he meant.

Vidar Holen · Answer 7 · Tue Apr 09 2024 11:47:17 GMT+0800 (China Standard Time)

On my system, journalctl -u cron > /dev/null takes 1 second while journalctl | grep cron > /dev/null takes 48 seconds.

The log is mostly ssh bot traffic, and journalctl -u ssh > /dev/null takes 63 seconds while journalctl | grep ssh > /dev/null takes 51 seconds, which is a slight improvement but nowhere near 2 orders of magnitude.

Is this a known issue or something you discovered? It's expected that whichever command you try first will be orders of magnitude slower than the second due to disk caching, so does it hold up under repeated timing?

Tim · Answer 8 · Tue Apr 09 2024 20:24:22 GMT+0800 (China Standard Time)

@koalaman It looks like there are at least 5 ways to get logs ranging from 139.021s to 0.050s.

cron and ssh are system services which journalctl optimizes, services that are not system (or journalctl is unsure of) like postfix* (or anything with a glob) are slow:

# time journalctl -u postfix* | grep -c .
47065
real	2m19.021s

# time journalctl | grep " postfix/" | grep -c .
47484
real	0m20.148s

# time journalctl --system -u postfix* | grep -c .
47065
real	0m7.845s

time journalctl -u postfix@-.service  | grep -c .
47121
real	0m0.727s

# time cat /var/log/maillog* | grep -c " postfix/"
194559
real	0m0.050s

So it looks like glob is the worst offender that should be avoided or at least mitigated with --system.

Tim · Answer 9 · Thu Apr 25 2024 21:30:01 GMT+0800 (China Standard Time)

Relatedly the --quiet option should always be used in scripts to avoid the "-- No entries --" spam.

Vidar Holen · Answer 10 · Sun May 05 2024 09:33:13 GMT+0800 (China Standard Time)

I don't know much about systemd, but grep is obviously less robust and gives different answers in all your example queries. From what I can tell from the source code and setting SYSTEMD_LOG_LEVEL=debug, journalctl already expands the glob into matching units rather than doing a different type of scan; -u ssh.service -u ssh.socket produces the same filter expression as journalctl -u 'ssh*'. It also doesn't seem like ShellCheck should be suggesting that the user only ever search the system log.

If journalctl's filtering is objectively suboptimal, then that should likely be fixed on the journalctl side instead.

Thanks for the suggestion!