Improve the speed of recursive inspections
wata727 opened this issue · comments
Introduction
After merging #1918, modules that called many child modules experienced a noticeable performance degradation. The company I work for has over 200 root modules with calls to over 40 child modules. Previously, recursive inspections took about 1 minute to complete, but now that local modules are being called, it now takes nearly 5 minutes.
Below are the minimal steps to reproduce this issue:
plugin "terraform" {
enabled = false
}
plugin "google" {
enabled = true
version = "0.26.0"
source = "github.com/terraform-linters/tflint-ruleset-google"
}
module "module" {
source = "./module"
count = 50
}
$ time tflint
real 0m1.642s
user 0m1.779s
sys 0m1.753s
$ time tflint --call-module-type none
real 0m0.056s
user 0m0.037s
sys 0m0.065s
Although the performance degradation in this example is small, it can add up to significant slowdowns.
Proposal
There are 3 solutions to this kind of issue.
Improve the speed of inspection for a single module
We can improve the speed of recursive inspections by speeding up inspecting on a single module. This can mainly be achieved by parallelizing runners.
In the current design, each module call starts a new gRPC server, and the ruleset checks for each one sequentially. So if a ruleset has 100 rules and a module calls 50 child modules, the rules will be evaluated 5000 times in total. This design has the advantage that the ruleset does not need to distinguish between root and child modules, but it is too much waste in terms of performance.
This design issue has also been tracked in terraform-linters/tflint-plugin-sdk#193, which may solve it, but in the short term processing them in parallel may improve performance. Specifically, the following implementation could potentially be performed in goroutines:
Lines 179 to 184 in 95926c2
However, we need to think about thread safety issues introduced by parallelism.
EDIT: Merged in #1944
Parallelization of recursive inspection
By performing recursive inspection in parallel, the overall inspection time is shortened. This is what we expected when we designed the recursion inspection, but the current design requires switching working directories, so parallelization with goroutines is not possible. See #1622 (review)
However, parallelization by process may be possible. In environments where multiple CPU cores are available, inspections can be sped up by starting worker processes and aggregating the results. Terragrunt's approach is helpful here. See https://terragrunt.gruntwork.io/docs/features/execute-terraform-commands-on-multiple-modules-at-once/#limiting-the-module-execution-parallelism
EDIT: Merged in #2021
Efficient recursive inspection
The current recursive inspection checks all directories except hidden directories. However, it may be possible to narrow down the target more efficiently. Also, it may be possible to skip unnecessary initialization based on the inspection results of other modules.
As of this writing, I don't have an idea yet, but I think this possibility is worth exploring.
References
- #1918
- https://github.com/terraform-linters/tflint/blob/v0.49.0/cmd/inspect.go#L175-L180
- terraform-linters/tflint-plugin-sdk#193
- #1622
- https://terragrunt.gruntwork.io/docs/features/execute-terraform-commands-on-multiple-modules-at-once/#limiting-the-module-execution-parallelism
- https://github.com/terraform-linters/tflint/blob/v0.49.0/cmd/cli.go#L146-L161