榴莲榴莲's starred repositories
everyone-can-use-english
Everyone Can Use English
cube-studio
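Cube Studio is an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models. It supports SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference services with vGPU, edge computing, serverless, a labeling platform, automated labeling, dataset management, large-model fine-tuning, vLLM large-model inference, LLMOps, private knowledge bases, and an AI model app store. It also supports one-click model development, inference, and fine-tuning; Chinese domestic CPU/GPU/NPU chips; RDMA; and distributed frameworks including pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.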
argo-events
Event-driven Automation Framework for Kubernetes
gpu-operator
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
argocd-image-updater
Automatic container image update for Argo CD
clusterpedia
The Encyclopedia of Kubernetes clusters
k8s-vgpu-scheduler
The OpenAIOS vGPU device plugin for Kubernetes originated from the OpenAIOS project. It virtualizes GPU device memory so that applications can access more memory than the GPU's physical capacity, and it is designed to make extended device memory easy to use for AI workloads.
katalyst-core
Katalyst aims to provide a universal solution that improves resource utilization and reduces overall costs in the cloud. This repository contains the core components of the Katalyst system, including multiple agents and centralized components.
godel-scheduler
A unified scheduler for online and offline tasks
mig-parted
MIG Partition Editor for NVIDIA GPUs
Kubernetes-practical-exercises-Hands-on
A repo that helps you learn Kubernetes from the ground up through practical exercises and teaches you how to use Kubernetes to deploy, manage, and scale containerized applications.
blackdagger
Blackdagger is a DAG-based automation tool specifically used in DevOps, DevSecOps, MLOps, MLSecOps, and Continuous Red Teaming (CART).
quick-debug
Quickly debug a program running in a Kubernetes pod
argo-workflow-multicluster
Enable Argo Workflow multi-cluster capabilities using Open Cluster Management (OCM). For more information, please visit: https://argoproj.github.io/argo-workflows/ and https://open-cluster-management.io/