grafana / k6-operator

An operator for running distributed k6 tests.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Add test coverage

JorTurFer opened this issue · comments

Feature Description

Working on this PR, I have tried to add some integration test for this changes but there isn't any wait to do it with current setup because envtest doesn't have built-in controllers (such as job controller) and currently there isn't any e2e test as far as I can see :/

envtest allows to use a real cluster for executing the integration tests, maybe we can add a documentation about the topic and use the envtest feature to execute an integration test suite. For local development and CI, we can use Kind and that should be all.

Currently any change in the operator can literally break the whole operator because there is just a single e2e test with the happy path. This is an example from KEDA HTTP Add-on, where just using kind, we execute multiple e2e test cases easily and where newcomers can add new e2e test cases easily. It can be executed on GH Runners without any other requirement and in our case, we execute the e2e test on different k8s versions and architectures super easy

I'd like to see something like that in k6-operator because as I said, any change is potentially a bug because the e2e isn't tested but either the integrations (at least not publicly, maybe internally it's covered with other tests)

Suggested Solution (optional)

No response

Already existing or connected issues / PRs (optional)

No response

Thanks for taking the time to write up the issue! We've had these in the backlog for quite a while actually:

Let's leave this issue open as well for now, for the sake of public record.

any change is potentially a bug because the e2e isn't tested but either the integrations (at least not publicly, maybe internally it's covered with other tests)

Yes, there are more internal tests happening on each change. But indeed, automating this is a problem that I've been wanting to get to for a long time. Now that we are spending more time on k6-operator, there's some hope that we'll be able to start resolving this.

I can port our KEDA's approach to this repo if you want. It doesn't require external resources and newcomers can add e2e test cases.
The main issue I see is that you need a real k8s cluster (a local cluster is enough) because testenv doesn't have built-in controllers, so there isn't any good way to test that jobs are working properly. This can be done using a real cluster for testenv or directly using e2e tests.
If you choose the direction (e2e test or functional test with a real local cluster), I could help

Hi @JorTurFer, Thanks for the offer to help!

I've looked a bit into the approach you linked and worked on: it could certainly work but this is not exactly what I had in mind for e2e tests here... I see this is a multi-layer problem which shouldn't be done ad-hoc; otherwise, I'll just end up re-doing it in a couple of quarters.

A part of what I've wanted to setup is not public at the moment (might become public soon-ish if I'm lucky). OTOH, it'd also bring additional complexity so there is some balancing act here of keeping complexity manageable. So I'm grokking now what kind of testing pipelines we would need and where.

Last but not least, my next most pressing TODO is actually a refactoring of the controller and that work will likely impact decision making here.

So there are multiple factors I try to take into account. I appreciate your offer to help, but I'll need a bit of time to be certain what can be done externally in this manner.

Either way, it's a WIP from my perspective. I'll probably have more info on the next week!

Hi @yorugac

Last but not least, my next most pressing TODO is actually a refactoring of the controller and that work will likely impact decision making here.

This is the most important reason for adding e2e/black box test cases, a refactor without testing is not a refactor, is the hoping of not breaking anything.
I totally understand your concerns and I totally agree that just e2e test cases don't solve the situation, but it's a guarantee for incoming refactors from usage pov.

In the mid term (once all those refactors are done), integration tests are a really good idea and envtest allows using a local cluster (such as kind) enabling UseExistingCluster during the initialization.

We are in the road of integrating this awesome tool, and now we're planning to fork and and use our own supply chain (adding e2e tests in our side) to prevent unexpected changes from the upstream, but honestly, I prefer to contribute here directly rather than forking the repo just to adding e2e tests.

unexpected changes from the upstream

@JorTurFer, how are you updating the operator? We've recently added new installation methods: bundle addition and Helm addition which should help prevent unexpected changes.

From my viewpoint, far worse is lack of understanding of all the possible ways people use the k6-operator. We had an incident of "broken" functionality once because nobody knew that the operator was being used in a certain specific way. No amount of tests, be it integration or E2E, could have prevented it as the cause was in lack of knowledge. We even have a public issue for that here: #194
On that note, can I ask you to describe which use cases you're expecting in your usage of k6-operator? Perhaps, it'd help to prioritize in context of this issue as well.
Similarly, since you mentioned newcomers adding new test cases: for your PR here, what would be the test(s) you'd add as conitbutor?

Similarly, since you mentioned newcomers adding new test cases: for your PR here, what would be the test(s) you'd add as conitbutor?

First of all, I'd include a tests for validating changes on generated manifests and informers/listers/clientset (I know that we don't use them currently, but as it's the best approach at scale, if we introduce them we should test them) because otherwise, I could have introduced the change in the json path but not in CRD. I'm not 100% sure because I don't remember all the tests, but I almost sure that current test suite wouldn't have detected that change. This could have been also detected with an e2e tests which validate the spec parameters, like deploying something with nodeSelector, affinities, etc (although, IMHO validating that generated code is up-to-date is most easy to detect this issue).

how are you updating the operator? We've recently added new installation methods: #4 and #277 which should help prevent unexpected changes.

It depends, you have asked about what I would tested on this #274. Let's say that we missed the part of CRD update and due to the lack of coverage, CI didn't detect it either. Any user updating the operator, doesn't matter if they use helm/kustomize/bundle would have faced with breaking change unexpected, and that's why I said that as user, I can face with unexpected changes introduced by error. In this case, I updated the CRD and you checked it, but what if not? that's my point

No amount of tests, be it integration or E2E, could have prevented it as the cause was in lack of knowledge

Sorry, but I have to disagree here, I don't think that this is a valid reason for not adding/improving test coverage. Any software can have edge cases and everybody can break something by error, but in that case (IMHO), the solution must cover that edge case with a test for the future.

On that note, can I ask you to describe which use cases you're expecting in your usage of k6-operator

I expect a reliable software as I can assume from other things such as nginx, prometheus, etc. Of course, it doesn't mean that it never fails (in fact, I think that in this topic, my attitude is always trying to help patching/fixing the issues), but currently, errors not coverable with unit tests aren't coverable, so we can face with regressions easily, which reduces the trust on this operator