elotl / kip

Virtual-kubelet provider running pods in cloud instances

kip host cluster requirement

ddysher opened this issue

Hi, thanks for open sourcing the project. We have a use case where the cluster is running in a private cloud but we want to extend the workload to the public cloud. Specifically, we want to train a model using cloud GPUs. The data is already present in S3 and we just need to train the model and save it back to S3, so I'd expect we won't need most of the features in Kubernetes. Does kip support this use case? If not, we would have to run the cluster in the cloud where we launch the GPU instances.

@ddysher : thanks a lot for checking out kip!

Can you please confirm the following?

  1. Your on-prem cluster is running a Kubernetes control plane
  2. The model training workload that you would like to burst to AWS is containerized, and you would like to schedule it via the on-prem Kubernetes control plane

If the answers to the above two questions are "yes", then yes, your use case is supported. You will not need to run a separate Kubernetes control plane on AWS.

We would love to help you set up your kip environment. Please let me know if you'd like a Zoom call with the kip team (no strings attached). Thanks!

@ddysher : here is a demo of the on-prem Kubernetes control plane shipping pods to AWS. We will be doing a live demo of the same on Wednesday at 10am PT during the CNCF webinar, if you are interested. You can also deploy a test setup using https://github.com/elotl/kip/tree/master/deploy/terraform-vpn to try it out. Thanks!

@myechuri Thanks for the detailed response. The answers to the above questions are "yes", I'll take a look at the demo, thanks!

@ddysher : awesome! We are super excited for your trial - please do let us know if you have any questions or run into issues. Thanks!

@ddysher : checking in to see whether you had a chance to try out kip for bursting your workload to AWS, and whether you have any questions, concerns, or blocking issues. Thanks!

@myechuri Hi, thanks for checking in and sorry for the late response. The solution is a bit overkill for us, so we decided to manually scale the infra when needed. kip remains an option whenever we need more automation!