minamijoyo / tfmigrate

A Terraform / OpenTofu state migration tool for GitOps

Feature request: support actions spanning multiple HCL files targeting the same projects

mdb opened this issue · comments

Currently, tfmigrate history mode allows users to:

Keep track of which migrations have been applied and apply all unapplied migrations in sequence.
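For context, history mode is enabled through a `.tfmigrate.hcl` configuration file. The following is only a minimal sketch, assuming local history storage; the `migration_dir` and `path` values are placeholders and not taken from any real setup.

```hcl
# .tfmigrate.hcl (illustrative values only)
tfmigrate {
  # Directory that holds the migration HCL files (placeholder path).
  migration_dir = "./tfmigrate"

  history {
    # Local file storage for the applied-migration history (placeholder path).
    storage "local" {
      path = "tmp/history.json"
    }
  }
}
```

With a configuration along these lines, running tfmigrate plan without a file argument plans every migration in the migration directory that is not yet recorded in the history.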

However, this can result in tfmigrate plan errors whenever multiple tfmigrate HCL files target the same Terraform root module projects. For example, consider a tfmigrate plan invocation (using history mode) that targets the following unapplied migrations:

# migration-1.hcl
migration "multi_state" "foo" {
  from_dir = "project-1"
  to_dir   = "project-2"

  actions = [
    "mv null.foo null.foo",
  ]
}

# migration-2.hcl
migration "multi_state" "foo" {
  from_dir = "project-1"
  to_dir   = "project-2"

  actions = [
    "mv null.bar null.bar",
  ]
}

In this scenario, despite the migrations' validity -- and despite the fact that both project-1's and project-2's *.tf configuration reflects the desired post-migration end state -- tfmigrate plan fails with a "terraform plan command returns unexpected diffs" error. This happens because it plans each migration HCL file serially (tfmigrate plan migration-1.hcl, then tfmigrate plan migration-2.hcl), so each individual plan still sees the resources that only the other migration would move as an outstanding diff.

Could it be reasonable for tfmigrate plan to have the capability of discovering and merging multiple migration HCL files in such scenarios, and performing a single tfmigrate plan (and thereby a single terraform plan) against a single, combined migration?
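To illustrate, the combined migration that such a merge would effectively need to produce from the two example files might look roughly like the following. This is only a sketch of the intended result, not something tfmigrate generates today, and the migration name "foo_and_bar" is hypothetical.

```hcl
# Hypothetical merged equivalent of migration-1.hcl and migration-2.hcl
migration "multi_state" "foo_and_bar" {
  from_dir = "project-1"
  to_dir   = "project-2"

  # Both moves are planned together, so a single terraform plan
  # is evaluated against the fully migrated state.
  actions = [
    "mv null.foo null.foo",
    "mv null.bar null.bar",
  ]
}
```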

Perhaps this seems like an unexpected use case. However, in my experience using tfmigrate to perform non-trivial, large-scale migrations (> 1K resources, for example), circumstantial factors -- and the sheer volume of targeted resources -- often warrant codifying the migrations across multiple, distinct migration HCL files.

Thank you for your proposal!

The current implementation is limited to defining only one migration per file, but how about using a file boundary as a transaction boundary to group multiple migrations as an atomic transaction? I think having clear, declarative transaction boundaries on a file-by-file basis is better than implicitly merging migrations depending on the state of the migration history. In particular, with implicit merging it is not apparent to a reviewer, from the code changes alone, how the merge will take place, because it depends on the runtime state rather than on what is visible at review time.

While redesigning the Migrator interface would be inevitable, it could support more generic use cases. For example, migrations splitting dir1 into dir2 and dir3 should be grouped into a single transaction. The transaction should track the state of all affected directories and roll them back if something goes wrong.
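As a rough sketch of that split under the current one-migration-per-file model, it would have to be written as two separate multi_state migrations, each planned and applied independently. The file names, migration names, and resource addresses below are hypothetical.

```hcl
# split-dir1-to-dir2.hcl (hypothetical file)
migration "multi_state" "split_dir1_to_dir2" {
  from_dir = "dir1"
  to_dir   = "dir2"

  actions = [
    "mv null_resource.a null_resource.a",
  ]
}

# split-dir1-to-dir3.hcl (hypothetical file)
migration "multi_state" "split_dir1_to_dir3" {
  from_dir = "dir1"
  to_dir   = "dir3"

  actions = [
    "mv null_resource.b null_resource.b",
  ]
}
```

Grouping both migrations in one file as a single transaction would let dir1, dir2, and dir3 be checked together instead of one pair at a time.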

> The current implementation is limited to defining only one migration per file, but how about using a file boundary as a transaction boundary to group multiple migrations as an atomic transaction?

@minamijoyo Understood, and in the most common use cases, I think you're right. However, at non-trivial scale, it becomes a bit challenging given...

  • A migration HCL file containing tens of thousands of actions is difficult to reliably create, reason about, and code review.
  • Yet it's also impractical to subdivide the actions across distinct pull requests and individual tfmigrate apply invocations, as this can take a prohibitively long time to execute at scale.

So, to offset these challenges, I was imagining a feature through which users could optionally batch actions expressed across multiple files as a single transaction, perhaps something like a transaction_group field...

# migration-1.hcl
migration "multi_state" "foo" {
  transaction_group = "project-one-to-project-two"
  from_dir          = "project-1"
  to_dir            = "project-2"

  actions = [
    "mv null.foo null.foo",
  ]
}

# migration-2.hcl
migration "multi_state" "foo" {
  transaction_group = "project-one-to-project-two"
  from_dir          = "project-1"
  to_dir            = "project-2"

  actions = [
    "mv null.bar null.bar",
  ]
}

However, I fully respect that you may feel this is an unusual use case catering to an anti-pattern, and that you ultimately prefer the present-day tfmigrate pattern that treats individual migration HCL files as distinct tfmigrate transactions. I figured it couldn't hurt to pitch you on the idea, though :)

I don't fully understand what environment you are trying to migrate, but a migration of 10k lines is impossible to review anyway, even if you divide it into several files. If you are not writing 10k lines by hand but generating them in some script, why not review the script and generate the 10k lines in one file?

> I don't fully understand what environment you are trying to migrate, but a migration of 10k lines is impossible to review anyway, even if you divide it into several files. If you are not writing 10k lines by hand but generating them in some script, why not review the script and generate the 10k lines in one file?

@minamijoyo These are good questions. In fact, our migration HCL files are programmatically generated (and even programmatically validated prior to tfmigrate apply-time). Nonetheless, we're still facing some other unique constraints, hence pitching you on the idea outlined in this issue.

However, I think you're ultimately right: our use case is unusual and caters to an unfortunate anti-pattern, and so I'll close this issue. Thank you for at least entertaining the discussion! :)