I’m curious: do others have a special way to organise the test execution of large suites in CI/CD, or do you just run one big regression suite?
For very large suites, I find it useful to mark different suites with tags for different purposes. I’ve outlined some strategies for this in Organising Your Tests in CI/CD: A Practical Guide, such as tagging tests and running layered suites.
Not dealing with a very large suite yet, but I’ve seen how quickly things can get out of hand without a strategy.
Tagging does help a lot. In bigger setups, I’ve noticed people often layer it: e.g. smoke tests on every commit, a broader functional suite on merges, and then full regression either nightly or on demand.
Ideally, the dev team could apply tags based on the areas their changes touch so CI/CD only pulls in the most relevant tests.
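That change-aware idea can be sketched in a few lines. This is a minimal illustration, not a real framework: the `tagged` decorator, `REGISTRY`, and `select` are hypothetical names, and the two example tests are made up.

```python
# Sketch of tag-based test selection: each test declares the areas it
# touches, and the pipeline picks only tests whose tags overlap the
# areas a change affects. All names here are illustrative.

REGISTRY = []  # (test_name, tags) pairs collected at import time

def tagged(*tags):
    """Decorator that records a test function together with its area tags."""
    def wrap(fn):
        REGISTRY.append((fn.__name__, set(tags)))
        return fn
    return wrap

@tagged("auth", "smoke")
def test_login():
    pass

@tagged("billing")
def test_invoice_totals():
    pass

def select(changed_areas):
    """Return names of tests whose tags intersect the changed areas."""
    changed = set(changed_areas)
    return [name for name, tags in REGISTRY if tags & changed]

# A commit touching only the auth module pulls in just the auth tests:
print(select(["auth"]))  # ['test_login']
```

In practice the “changed areas” list would come from something like a diff of touched directories, but the selection step itself stays this simple.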
Great question: how you organize test execution in CI/CD can be a huge differentiator for the speed and quality of feedback.
In my experience, running a massive regression suite on every build is impractical once your test base has grown. I use tag-based organization and layered test suites throughout, much as you describe in the article.
My general test splits would look like this:
Smoke or Quick Checks. Run on every commit to verify core functionality.
Integration or API Suites. Run nightly or on pre-release builds.
Full Regression. Run on major merges or right before production pushes.
That way you get fast feedback first and thorough coverage at the right time. Parallel execution and test impact analysis can further improve runtime.
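The split above boils down to a trigger-to-suite mapping that the pipeline consults. A rough sketch follows; the trigger names and `SUITES` table are invented for illustration, and the generated command assumes pytest’s real `-m` marker-expression flag:

```python
# Illustrative trigger-to-suite mapping for the layered approach
# described above; the trigger and tag names are made up.
SUITES = {
    "commit":      ["smoke"],                                  # fast checks on every push
    "nightly":     ["smoke", "integration", "api"],            # broader overnight run
    "pre_release": ["smoke", "integration", "api", "regression"],
}

def suites_for(trigger):
    """Return the suite tags to run for a given pipeline trigger."""
    return SUITES.get(trigger, ["smoke"])  # default to the cheapest layer

def command(trigger):
    """Build the test-runner invocation (pytest -m syntax, assumed)."""
    return 'pytest -m "' + " or ".join(suites_for(trigger)) + '"'

print(command("commit"))  # pytest -m "smoke"
```

Keeping this mapping in one place makes it easy to see, and to review, exactly which layer runs where.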
I’m very curious how others strike the balance between fast pipelines and all-encompassing coverage, especially when they lean toward flaky or data-heavy tests.
Yes, running massive suites takes time, and no matter how much you try to avoid it, they’ll always have a couple of flaky tests. I prefer it when developers actually trust what they’re running. If they always see green ticks, then the moment something fails they get suspicious and assume they’ve caused it. But if you run huge suites that can’t guarantee an all-pass even with no bugs, they start dismissing the results and get annoyed. That’s why I prefer running tests in smaller chunks: full regressions usually need some QA monitoring, while smaller suites can run more independently.
I started a new job 5 months ago. I’ve got over 100 tests in a 1.5-hour suite, and I’m about to break it up. I think I’m going to go three ways:
Installation/provision and un-provision tests (these are really slow)
Stress and lifecycle tests that are harder and slower to sandbox and set up
Simple “low isolation” tests and smoke tests
I’m still on a standalone CI/CD instance that triggers nightly. After 4 weeks it has settled down to just 8 failing tests, which are mainly small product issues, plus perhaps a handful of flaky tests that I hope to shift into the lifecycle group.
My biggest dilemma is that when I design tests, I design a vertical slice through a feature, and then later have to split it all apart into these three boxes. I’ll have to find a consistent way of doing so, as will we all, to develop that discipline I guess.
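One way to keep the vertical slice intact in source while still splitting execution is to label each step of the slice with its bucket and let CI filter on the label. A minimal sketch, assuming the three groups above; the step names, `SLICE` table, and `steps_for` helper are all hypothetical:

```python
# Sketch: one vertical slice through a feature, with each step labelled
# by execution bucket so CI can pull out just the layer it needs.
SLICE = [
    # (step, bucket)
    ("provision_environment", "provisioning"),
    ("smoke_check_feature",   "smoke"),
    ("stress_feature",        "lifecycle"),
    ("unprovision",           "provisioning"),
]

def steps_for(bucket):
    """Pick out the steps of the slice that belong to one bucket."""
    return [step for step, b in SLICE if b == bucket]

print(steps_for("provisioning"))  # ['provision_environment', 'unprovision']
```

The slice stays readable as one sequence, and the bucketing becomes a filtering decision at run time rather than a restructuring of the tests themselves.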