kubewharf / katalyst-core

Katalyst aims to provide a universal solution that improves resource utilization and reduces overall costs in the cloud. This repository contains the core components of the Katalyst system, including multiple agents and centralized components.

Promote node resource over-commitment to GA

caohe opened this issue

Why is this needed?

In v0.4, we released the MVP version of node resource over-commitment and implemented some basic features.

In v0.5, we plan to enhance this feature and bring it to GA status.

What would you like to be added?

  • Dynamic over-commitment ratio adjustment: In order to make the amount of over-committed resources more accurate, we will combine long-term and short-term prediction algorithms to calculate the amount of resources that can be over-committed. #472
  • Interference detection and mitigation: In order to avoid resource competition caused by over-commitment, we will introduce multi-dimensional interference detection strategies, including CPU load/usage, memory usage, the reclaiming rate of kswapd, etc. Furthermore, we will introduce multi-tiered mitigation measures, including scheduling prevention, eviction, etc. #518
  • Compatibility with core binding: Exclude bound cores from over-commitment, so that the scheduler does not place more core-binding Pods on a node than it has cores for, which would cause those Pods to fail to start. #472
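
To make the first and third items concrete, here is a minimal sketch in Go. It is not Katalyst's implementation: the function names, the rule of taking the larger of the two predictions, and the `maxRatio` cap are all assumptions chosen for illustration. It derives an over-commitment ratio from a long-term and a short-term usage prediction, then applies that ratio only to the cores that are not bound.

```go
package main

import (
	"fmt"
	"math"
)

// OvercommitRatio is a hypothetical helper, not a Katalyst API.
// It assumes the node can be over-committed by allocatable / predictedPeak,
// where predictedPeak is the larger of the long-term and short-term
// predictions, and clamps the result to [1, maxRatio].
func OvercommitRatio(allocatable, longTermPeak, shortTermPeak, maxRatio float64) float64 {
	predicted := math.Max(longTermPeak, shortTermPeak)
	if predicted <= 0 || allocatable <= 0 {
		// No usable prediction: stay conservative and do not over-commit.
		return 1.0
	}
	ratio := allocatable / predicted
	return math.Min(math.Max(ratio, 1.0), maxRatio)
}

// SchedulableCPU is also hypothetical. Bound cores are counted once
// (never over-committed); only the remaining shared cores are scaled.
func SchedulableCPU(totalCores, boundCores int, ratio float64) float64 {
	if boundCores < 0 {
		boundCores = 0
	}
	if boundCores > totalCores {
		boundCores = totalCores
	}
	shared := float64(totalCores - boundCores)
	return float64(boundCores) + shared*ratio
}

func main() {
	// 32 allocatable cores, predicted peaks of 12 (long-term) and 16 (short-term).
	ratio := OvercommitRatio(32, 12, 16, 3.0)
	// 8 of the 32 cores are bound to Pods and excluded from over-commitment.
	fmt.Printf("ratio=%.2f schedulable=%.1f\n", ratio, SchedulableCPU(32, 8, ratio))
	// prints "ratio=2.00 schedulable=56.0"
}
```

Taking the maximum of the two predictions is the conservative choice: a short-term spike lowers the ratio immediately, while the long-term value keeps the ratio from rising too far during a quiet period.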