kubernetes-sigs / scheduler-plugins

Repository for out-of-tree scheduler plugins based on scheduler framework.

[NodeResourceTopology] review and overhaul internal logging

ffromani opened this issue

The logging inside the NodeResourceTopology code grew over time without a real design or a clear architecture.
The plan is to fix this. I'm proposing to review all the usages of klog across the codebase (roughly 110 at my last count) and:

  1. make sure either the namespace/name pair or the `UID` of the pod being processed is logged, to make it as easy as possible to track the flows while reviewing the logs (by humans or by machines, online or offline, manually or using log ingestion tools)
  2. extending the above, make the logs as consistent as possible. Always use structured logging (this means getting rid of the few instances of Warningf in the codebase)
  3. enable log pluggability. The code should use a logr.Logger instance, not klog directly. Doing it this way, we lose nothing because klog integrates natively with logr (see: https://pkg.go.dev/k8s.io/klog/v2#pkg-overview) and by default the code will still log through klog (using https://pkg.go.dev/k8s.io/klog/v2#Background). But we make it trivial for integrators to replace the logger should they need to, for example for better auditing of the logging decisions.

Note: point 3 will not require any extra dep, since both scheduler-plugins and k/k already depend on go-logr, and AFAICT there are no plans on the horizon to remove this dependency.
Note: point 3 will also be an enabler for point 1 above (see: https://pkg.go.dev/k8s.io/klog/v2#LoggerWithValues)

I volunteer to do this work

tagging other maintainers/reviewers for awareness and comments/discussion: @Huang-Wei @Tal-or @swatisehgal @PiotrProkop

+1 to this effort

@ffromani just one thing about the ETA - I may start the v0.28 release cut this weekend. Do you want to postpone the cut date to next week so you can incorporate this?

Thanks @Huang-Wei for sharing. Yes, better to wait. I happened to have time to start working on the PR, but this work is not very urgent and can wait.