Incremental Computation for Testing

Question

CMCDragonkai opened this issue 2 years ago · comments

Specification

Once the test suite reaches a certain size, we really want to run only the tests affected by the changed files/commits.

Integration testing will still run all the tests. That is, tests on the staging branch would run through everything, but during feature commits we can reduce cost by running only the tests that have been affected by the changes.

Jest has some ability to do this:

We could locally override this with a special [ci test] commit label that we can program ourselves.
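As a rough sketch of how the two ideas above could fit together: Jest's --changedSince flag restricts a run to tests related to files changed since a given git ref. The helper name planJestArgs, the staging base ref, and the meaning given to [ci test] (force the full suite) are assumptions made for illustration, not something the issue specifies.

```typescript
// Hypothetical helper: decides which Jest CLI arguments a CI job should use.
// `--changedSince` is a real Jest flag; the `[ci test]` handling is an assumption.
function planJestArgs(commitMessage: string, baseRef: string): string[] {
  // Assumed meaning of the label: force the full test suite for this commit.
  if (commitMessage.includes('[ci test]')) {
    return [];
  }
  // Otherwise only run tests related to files changed since the base ref.
  return [`--changedSince=${baseRef}`];
}

// Example: arguments for a feature commit compared against `staging`.
const jestArgs = planJestArgs('feat: add vault cloning', 'staging');
console.log(['jest', ...jestArgs].join(' ')); // jest --changedSince=staging
```

Keeping the decision in one small function would let the CI script and a local run share the same rule.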

Additional context

https://stackoverflow.com/questions/67008800/jest-is-there-a-solution-to-run-tests-only-for-changed-impacted-files
https://suncommander.medium.com/run-jest-for-unit-tests-of-modified-files-only-e39b7b176b1b
#54 - also relevant for benchmarking and profiling, but these are a little different

Tasks

...
...
...

Roger Qiu · Answer 1 · Sat Jun 04 2022 12:22:21 GMT+0800 (China Standard Time)

In PK, we are also generating a child pipeline of tests, so that tests can be run in parallel. This works fine if the GitLab CI/CD jobs have low startup overhead.

An alternative is test load balancing, where you split tests between workers, so one job may run multiple tests. We do this by directory, but it may need to be done more dynamically.

Roger Qiu · Answer 2 · Sat Jun 04 2022 12:24:39 GMT+0800 (China Standard Time)

If we are splitting by nominal tests, some jobs might not run any tests at all because nothing relevant was changed. It's far better to never start such a job in the first place.

Therefore the more scalable option is test load balancing, where a test planner at the beginning of the pipeline figures out which tests need to run based on the changes, and then distributes those tests across multiple test jobs.

We lose some nice separation of pipeline logs however, but perhaps we gain that back with GitLab's test reporting. Currently the unit test reports on the CI/CD don't really show all test execution information. This is because we have logs that go straight to STDERR, which aren't captured by jest and thus don't appear in the junit report. During debugging we prefer that they show up on stderr instead of being captured by jest, but I guess during CI/CD it makes sense for the entire log to show up instead.
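A minimal sketch of the distribution step, assuming the planner already knows which test files are affected and has a rough duration estimate for each. The PlannedTest shape, the distributeTests name, the file paths, and the durations are all hypothetical. It uses a simple greedy strategy (longest test first, placed on the least-loaded job), which is one possible approach rather than what PK does.

```typescript
// Hypothetical shape: a test file plus an estimated duration in seconds.
type PlannedTest = { file: string; seconds: number };

// Greedy balancing: longest tests first, each assigned to the least-loaded job.
function distributeTests(tests: PlannedTest[], jobCount: number): PlannedTest[][] {
  if (!Number.isInteger(jobCount) || jobCount < 1) {
    throw new RangeError('jobCount must be a positive integer');
  }
  const jobs: PlannedTest[][] = Array.from({ length: jobCount }, () => []);
  const loads: number[] = new Array(jobCount).fill(0);
  const sorted = [...tests].sort((a, b) => b.seconds - a.seconds);
  for (const test of sorted) {
    // Find the job with the smallest accumulated duration so far.
    let target = 0;
    for (let i = 1; i < jobCount; i++) {
      if (loads[i] < loads[target]) target = i;
    }
    jobs[target].push(test);
    loads[target] += test.seconds;
  }
  // Drop empty jobs so that no job is started with nothing to run.
  return jobs.filter((job) => job.length > 0);
}

// Example with made-up files and durations, split across two jobs.
const plan = distributeTests(
  [
    { file: 'tests/a.test.ts', seconds: 30 },
    { file: 'tests/b.test.ts', seconds: 20 },
    { file: 'tests/c.test.ts', seconds: 10 },
    { file: 'tests/d.test.ts', seconds: 5 },
  ],
  2,
);
console.log(plan.map((job) => job.map((t) => t.file)));
```

Filtering out empty jobs reflects the point above: when few tests are affected, fewer jobs get started at all.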

Roger Qiu · Answer 3 · Fri Jul 15 2022 23:52:05 GMT+0800 (China Standard Time)

Also see:

In another world, the tsc incremental compilation could work without needing us to delete the dist directory. It would maintain correctness and ensure any deleted files are also deleted in the target dist directory. It looks like this is actually possible with a new mode of tsc: https://www.typescriptlang.org/docs/handbook/project-references.html#tsc--b-commandline

Roger Qiu · Answer 4 · Sat Jul 30 2022 16:17:54 GMT+0800 (China Standard Time)

We should incorporate sanity checking/smoke testing into our testing phases.

The initial incremental tests act as a smoke test/sanity check on the changes in the software. If the incremental tests don't pass, there's no need to do full unit testing (check stage).

Then only after full unit testing has passed the check stage do we proceed to cross-platform build-stage testing, which runs all unit tests across all platforms. And only if this passes do we do cross-platform and staging deployment integration testing.

This diagram is useful: https://en.wikipedia.org/wiki/Bathtub_curve.
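The stage ordering described above (incremental smoke test, then check, then build, then integration) can be sketched as a runner that stops at the first failing stage. The Stage type and runStages function are hypothetical, and the stage bodies are stand-ins; in practice the CI system's own stage dependencies would do this gating.

```typescript
// Stage names follow the comment above; the runner itself is a hypothetical sketch.
type Stage = { name: string; run: () => boolean };

// Runs stages in order and stops at the first failure,
// so later (more expensive) stages never start.
function runStages(stages: Stage[]): string[] {
  const passed: string[] = [];
  for (const stage of stages) {
    if (!stage.run()) break;
    passed.push(stage.name);
  }
  return passed;
}

// Example: the full unit tests fail, so build and integration are never run.
const passedStages = runStages([
  { name: 'incremental', run: () => true },
  { name: 'check', run: () => false },
  { name: 'build', run: () => true },
  { name: 'integration', run: () => true },
]);
console.log(passedStages); // [ 'incremental' ]
```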

What's interesting is that by separating the components into different npm packages, we end up with potentially separate CI/CD stages for each component, which gives us some fine-grained analytics on the failure rates of each component.

Roger Qiu · Answer 5 · Sat Aug 06 2022 18:19:10 GMT+0800 (China Standard Time)

One way to reduce useless CI/CD runs is to skip not only commits marked [ci skip] but also any commit whose message has WIP as a prefix. This is because these commits are likely already broken and not worth running any jobs on.
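A small sketch of that rule as a predicate over the commit message. The function name shouldSkipCi is hypothetical, and treating the WIP prefix as case-insensitive and as a whole word on the subject line is an assumption.

```typescript
// Hypothetical predicate: should CI/CD be skipped for this commit message?
function shouldSkipCi(commitMessage: string): boolean {
  // The explicit label may appear anywhere in the message.
  if (commitMessage.includes('[ci skip]')) return true;
  // Assumption: "WIP" must be a whole-word prefix of the subject line,
  // so "WIP: refactor" matches but "WIPE old data" does not.
  const subject = (commitMessage.split('\n')[0] ?? '').trim();
  return /^WIP\b/i.test(subject);
}

console.log(shouldSkipCi('WIP: refactor vault locking')); // true
console.log(shouldSkipCi('fix: typo in docs [ci skip]')); // true
console.log(shouldSkipCi('feat: wipe cache on logout')); // false
```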

Roger Qiu · Answer 6 · Mon Jul 10 2023 21:01:12 GMT+0800 (China Standard Time)

This is sort of achieved already by factoring out libraries from Polykey. A big move is going to be factoring the CLI portion out to Polykey-CLI.

I find it difficult to determine which tests need to run for a given change because JS is very dynamic. So I think we will close this for now.