cloudfoundry / cloud-service-broker

OSBAPI service broker that uses Terraform to provision and bind services. Derived from https://github.com/GoogleCloudPlatform/gcp-service-broker

[FR] : Ability to replay the provision process using the embedded Terraform present in the CSB in case of disaster recovery

wdesplas opened this issue · comments

Is your feature request related to a problem? Please describe.
I'm currently working on the CSB to find a way to restore the infrastructure after a crash of the platform, keeping in mind that the only reference we have is the database.

I found that it is easy to retrieve the tfstate, but not the provision.tf with all the variables that were used during the provision phase.

Describe the solution you'd like
It would be helpful to have a feature that replays the provision by applying the original provision.tf and tfstate, either for a specific instance ID or for all instances in a success state.

Describe alternatives you've considered
An alternative is to retrieve the tfstate and the provision.tf that was used, and then run Terraform manually against both.
This alternative would work, but it needs a lot of effort too.
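To make the manual alternative concrete, here is a minimal sketch, assuming the tfstate, the provision.tf contents, and the variables have already been recovered as strings and a dict. How to extract them from the CSB database is not covered, and `build_restore_workspace` is a hypothetical helper, not part of the CSB. It relies only on standard Terraform behaviour: a local `terraform.tfstate` in the working directory is used as the state, and `terraform.tfvars.json` is loaded automatically.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def build_restore_workspace(
    tfstate: str, provision_tf: str, variables: Dict[str, object], workdir: Path
) -> List[List[str]]:
    """Write the saved state, HCL and variables into workdir.

    Returns the Terraform commands to run inside workdir; nothing is executed here.
    Hypothetical helper for illustration only.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    # Default local state file name picked up by Terraform.
    (workdir / "terraform.tfstate").write_text(tfstate)
    (workdir / "provision.tf").write_text(provision_tf)
    # terraform.tfvars.json is loaded automatically, so no -var-file flag is needed.
    (workdir / "terraform.tfvars.json").write_text(json.dumps(variables, indent=2))
    return [
        ["terraform", "init", "-input=false"],
        ["terraform", "apply", "-input=false", "-auto-approve"],
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        commands = build_restore_workspace(
            tfstate='{"version": 4}',               # placeholder state
            provision_tf="# provision.tf contents",  # placeholder HCL
            variables={"instance_name": "example"},  # placeholder variable
            workdir=Path(tmp) / "instance",
        )
        for command in commands:
            print(" ".join(command))
```

Each returned command could then be run with `subprocess.run(command, cwd=workdir, check=True)`. Doing this per instance, with the right provider binaries and credentials in place, is the part that takes the effort.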

Additional Context
This would ease adoption of the CSB, as platform restoration would be managed by the CSB in case of a crash by simply reapplying the provision process that was already run.

Priority
Medium

Priority Context
It's only a nice-to-have, as the cloud-service-broker works fine without it, but it would ease CSB adoption in most cases.

Platform
N/A

Applicable Services
N/A

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/183246736

The labels on this github issue will be updated when the story is started.

Hi @wdesplas, thanks for reaching out.

I would like to clarify the scenario you are tackling here. If I understand correctly this would be when either

  • Cloud Foundry (or any other platform the broker runs in) crashes or
  • The broker app crashes

and the crash happens while terraform was already invoked and provision was already requested to the IaaS.

The state after reestablishing normal operation would be resources partially (or completely) created in the IaaS but an operation in progress in CF and the broker that currently cannot be resumed and requires manual cleanup.

Is the scenario above correct?
Thanks,
Marcela

Thanks for your answer. Your scenario does not exactly match what I mean.

The scenario is much closer to:

As a service provider, when the backing service (the IaaS, for example) crashes, it is difficult on the service side to recreate resources that match the ones the CSB created with Terraform.
That means the service is no longer available to the customer, who has to recreate it with a different instance ID, etc.

I suggest finding a way to easily re-run the Terraform used during the provision process, so that the service that is no longer available can be recreated identically.
That way we would be able to improve the customer experience when a backend service crashes.

I will try to implement this feature if you are OK with it, but I didn't find any way to perform these actions using the elements present in the CSB database.

Regards,
Willy

Hi @pivotal-marcela-campo ,

It appears that with terraform version > 1.13, the tfstate is refreshed before each apply.
That means that, when a backend service is restored after a crash, a simple update allows the user to recreate the missing resources.
This does not work with terraform <1.13.
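The reasoning can be illustrated with a toy model. This is not Terraform's actual implementation; `resources_to_create` and the resource names are hypothetical. With a refresh, resources that no longer exist in the backend are dropped from the recorded state, so the next apply plans to create them again. Without a refresh, the stale state still lists them and nothing is planned.

```python
from typing import Set


def resources_to_create(
    desired: Set[str], recorded_state: Set[str], actual: Set[str], refresh: bool
) -> Set[str]:
    """Toy model of planning: return the resources an apply would create.

    desired        -- resources declared in the configuration
    recorded_state -- resources listed in the saved tfstate
    actual         -- resources that really exist in the backend
    refresh        -- whether the state is reconciled with the backend first
    """
    # Refreshing drops resources from the state that have vanished from the backend.
    state = recorded_state & actual if refresh else recorded_state
    return desired - state


if __name__ == "__main__":
    desired = {"resource_a", "resource_b"}
    recorded = {"resource_a", "resource_b"}  # state saved before the crash
    actual = {"resource_a"}                  # resource_b was lost in the crash

    print(sorted(resources_to_create(desired, recorded, actual, refresh=True)))
    # prints ['resource_b']
    print(sorted(resources_to_create(desired, recorded, actual, refresh=False)))
    # prints []
```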

I have tested this using CSB version 0.12, with consul and haproxy backends.

Therefore, I am closing this issue, and I will create a new one for a feature that adds a restore command under cloud-service-broker>cmd.