CPU allocatable value in CustomNodeResource status is not updated
flpanbin opened this issue · comments
What happened?
I installed and deployed katalyst following the documentation at https://gokatalyst.io/docs/getting-started/colocation-quick-start/, then created the shared-normal-pod application. Observing the value of resource.katalyst.kubewharf.io/reclaimed_millicpu in the kcnr before and after creating the application, it did not change.
root@ubuntu:~/katalyst/examples# kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.6.202.151 Ready control-plane 19h v1.24.6-kubewharf.8 10.6.202.151 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic containerd://1.4.12
node1 Ready <none> 19h v1.24.6-kubewharf.8 10.6.202.152 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic containerd://1.4.12
node2 Ready <none> 19h v1.24.6-kubewharf.8 10.6.202.153 <none> Ubuntu 20.04.5 LTS 5.4.0-125-generic containerd://1.4.12
root@ubuntu:~/katalyst/examples# helm list -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
katalyst-colocation katalyst-system 1 2024-05-24 09:28:44.44903291 +0000 UTC deployed katalyst-colocation-orm-0.5.0 v0.5.0
malachite malachite-system 1 2024-05-24 09:16:19.208333849 +0000 UTC deployed malachite-0.1.0 0.1.0
Resource usage on node2 shows it is indeed using 2 CPU cores.
shared-normal-pod was scheduled onto node2. That node has 4 cores and 8G of memory, but neither cpu nor memory under status.resources.allocatable in its kcnr changed. The information is the same on all nodes.
root@ubuntu:~/katalyst/examples# kubectl get kcnr node2 -oyaml
apiVersion: node.katalyst.kubewharf.io/v1alpha1
kind: CustomNodeResource
metadata:
  annotations:
    katalyst.kubewharf.io/cpu_overcommit_ratio: "1.00"
    katalyst.kubewharf.io/guaranteed_cpus: "0"
    katalyst.kubewharf.io/memory_overcommit_ratio: "1.00"
    katalyst.kubewharf.io/overcommit_cpu_manager: none
    katalyst.kubewharf.io/overcommit_memory_manager: None
  creationTimestamp: "2024-05-24T02:01:18Z"
  generation: 2
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: node2
    kubernetes.io/os: linux
  name: node2
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Node
    name: node2
    uid: 6a7c0a4b-451a-4c96-a580-c0e792772077
  resourceVersion: "131467"
  uid: 69d3adc7-7709-4e36-aa30-554ad7d6e1be
spec:
  nodeResourceProperties:
  - propertyName: numa
    propertyQuantity: "1"
  - propertyName: nbw
    propertyQuantity: 10k
  - propertyName: cpu
    propertyQuantity: "4"
  - propertyName: memory
    propertyQuantity: 8148204Ki
  - propertyName: cis
    propertyValues:
    - avx2
  - propertyName: topology
    propertyValues:
    - '{"Iface":"ens160","Speed":10000,"NumaNode":0,"Enable":true,"Addr":{"IPV4":["10.6.202.153"],"IPV6":null},"NSName":"","NSAbsolutePath":""}'
status:
  resources:
    allocatable:
      resource.katalyst.kubewharf.io/reclaimed_memory: 5Gi
      resource.katalyst.kubewharf.io/reclaimed_millicpu: 4k
    capacity:
      resource.katalyst.kubewharf.io/reclaimed_memory: 5Gi
      resource.katalyst.kubewharf.io/reclaimed_millicpu: 4k
  topologyPolicy: None
  topologyZone:
  - children:
    - attributes:
      - name: katalyst.kubewharf.io/netns_name
        value: ""
      - name: katalyst.kubewharf.io/resource_identifier
        value: ens160
      name: ens160
      resources:
        allocatable:
          resource.katalyst.kubewharf.io/net_bandwidth: 9k
        capacity:
          resource.katalyst.kubewharf.io/net_bandwidth: 9k
      type: NIC
    - name: "0"
      resources:
        allocatable:
          cpu: "4"
          memory: "8343760896"
        capacity:
          cpu: "4"
          memory: "8343760896"
      type: Numa
    name: "0"
    resources: {}
    type: Socket
[screenshot: node2 resource usage]
What did you expect to happen?
The kcnr status values on node2 should be updated.
How can we reproduce it (as minimally and precisely as possible)?
Follow the documentation: https://gokatalyst.io/docs/getting-started/colocation-quick-start/
Software version
$ <software> version
# paste output here
@flpanbin
Do you mean that after putting load on the shared cores pods, the available reclaimed resource on the kcnr did not shrink as expected?
@pendoragon Yes, but after I upgraded the node spec (from 4 cores/8G to 8 cores/16G), I found the values updated again.
That is likely because, when the node spec is small, the reported reclaimed resources are always dominated by the default MinReclaimedResourceForReport (4C5Gi) floor.
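The floor behavior described above can be sketched roughly as follows. This is a minimal illustration, not katalyst's actual code: the function name and constants here are hypothetical, assuming a reporter that clamps computed reclaimed resources to the default MinReclaimedResourceForReport of 4 cores / 5Gi.

```go
package main

import "fmt"

// reportedReclaimed is a hypothetical sketch of the clamping: any computed
// reclaimed value below the MinReclaimedResourceForReport floor is reported
// as the floor itself. On a 4-core/8G node the floor (4000m / 5Gi) already
// covers the whole node, so the kcnr status never appears to change.
func reportedReclaimed(computedMilliCPU, computedMemoryGi int64) (int64, int64) {
	const minMilliCPU = 4000 // default floor: 4 cores
	const minMemoryGi = 5    // default floor: 5Gi
	if computedMilliCPU < minMilliCPU {
		computedMilliCPU = minMilliCPU
	}
	if computedMemoryGi < minMemoryGi {
		computedMemoryGi = minMemoryGi
	}
	return computedMilliCPU, computedMemoryGi
}

func main() {
	// A 2-core/3Gi computed value on a small node is clamped up to the floor.
	cpu, mem := reportedReclaimed(2000, 3)
	fmt.Println(cpu, mem) // 4000 5

	// On an 8-core/16G node the computed value can exceed the floor,
	// so changes become visible — matching what was observed after the upgrade.
	cpu, mem = reportedReclaimed(6000, 12)
	fmt.Println(cpu, mem) // 6000 12
}
```

This would explain why upgrading the node to 8 cores/16G made the reported values start moving: only then could the computed reclaimed resources rise above the floor.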
Understood, thanks!