Are we running more regression tests than we actually need?

Most teams keep adding tests to their regression suite, which results in thousands of test cases over time. We rarely remove tests. Eventually, regression cycles take hours to run and require constant maintenance.

This situation raises an interesting question: are we actually improving quality, or just accumulating more tests?

Do we really need such large regression suites, or is there a better approach to regression testing?

You’re not improving quality unless those cases are generating useful information which is then acted upon by those with more direct influence over quality.

My guess is that a long regression suite will provide some value. The question really is: Is the regression suite worth the cost?

So you can look at the costs of regression, which are mainly time: the time taken to decide which cases to add; the time to write, debug, and test them; the time to run each case multiplied by the number of times you run it; the time to review the results; the time to update them whenever the software, the suite, or the tooling changes; and all the other conversations, decisions, meetings, and training that a software project (and automation is one) takes.

If the suite is run by people then… well, I have what I hope are well-known opinions about explicit test cases. I think they are not good. And the costs of them are astronomical in modern software; one simply cannot justify the monetary, emotional, and morale expense of paying a person to pretend to sort of follow some instructions, for reasons never properly explained. It would take an extreme situation for me to think otherwise.

Either way, the costs will also depend on how often the regression suite has to change, and so on. If the software and the market don’t change, then the tooling won’t have to change much either.

There’s also the cost of not knowing what it does. Does anyone really know how each of these checks serves the greater test strategy, given how limited they are?

Then you have the benefits. That really depends. If your software simply has to perform some basic, logical steps reliably, or people get hurt when it fails, then you will likely need a regression suite to help cover them.

That being said, I think there are many bad reasons that people have large regression suites, and chief amongst them is fear. People are scared that if they don’t run the suite they’ll miss something. This tends to come about because those people do not understand how hideously limited a check is. What actually happens is that some system simulates the data and state and interaction with a piece of software (in ways that often don’t reflect real-world use), in very limited ways, and makes extremely limited “observations” that are processed in a rigid, logical way. Then we call that “test_login_success” and pretend that the name describes what happened - that we’ve tested successful logins. Which we have not; we’ve done some things, simulated a click, and looked for one or two signs of success, while ignoring any other issues or bugs.

So people go around thinking “I’m sure glad there aren’t any problems! Look at all the green!”, while not realising that doesn’t mean the software is tested, or, gods help us, “100%” tested. Often they don’t know what the suite says it does. If they know what it says it does they don’t know what it actually does. And if they know what it actually does they don’t know why it does that. Not always, of course, but it’s pretty common. And if the expense of doing something nobody understands is getting that high without someone doing something about it… that says to me that it’s serving a purpose, even if that purpose is mainly ceremonial in nature.

If you’d like to sort this out and you have the political power to influence such things, then I recommend going through the suite and asking “what if we just threw this bit away?” Often suites are covered by unit tests, or repeated in other testing, or pieces are added “just because” and nobody puts an expiry date on them. Sometimes they are important. Sometimes they’re important and not covered by the check - and may require actual investigation of the release candidates. You could consider taking anything that’s lived there a long time and deciding on some sort of category - a golden check that will always live on because it’s important and unchanging and well-written, all the way down to “why is this still here?”. Purpose is a great nexus for this - what risks are these checks actually mitigating? Are they thought-through, or just added like another teaspoon on a mountain of washing up? Do they actually check anything close to what they claim to check? Are they defensible in light of their cost? Do they actually find important problems?

Those will be the real answers to your questions. Regression suites can be very valuable. They are frequently not.
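
If it helps to make that audit concrete, the categories above could be sketched as a toy triage. The bucket names and decision rules below are illustrative assumptions on my part; a real audit is a conversation, not a function:

```python
from enum import Enum

class CheckStatus(Enum):
    """Illustrative triage buckets, from golden down to 'why is this still here?'."""
    GOLDEN = "important, unchanging, well-written: keep"
    COVERED_ELSEWHERE = "duplicated by unit or other testing: candidate for removal"
    NEEDS_INVESTIGATION = "mitigates a real risk, but the check doesn't actually cover it"
    UNKNOWN_PURPOSE = "why is this still here?"

def triage(covered_elsewhere: bool, mitigates_known_risk: bool,
           well_written: bool) -> CheckStatus:
    # Toy decision rules, not a recommendation: ask whether the check is
    # redundant, whether it maps to a real risk, and whether it's defensible.
    if covered_elsewhere:
        return CheckStatus.COVERED_ELSEWHERE
    if mitigates_known_risk and well_written:
        return CheckStatus.GOLDEN
    if mitigates_known_risk:
        return CheckStatus.NEEDS_INVESTIGATION
    return CheckStatus.UNKNOWN_PURPOSE
```

Even a rough bucketing like this makes the "what if we just threw this bit away?" conversation easier to have per category instead of per check.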

You might also solve the problem with grids and parallel tests and so on. But you might just be laying expensive offers at a shrine to a force that doesn’t exist.

Still, and either way, best of luck.


@kinofrost Your point about cost vs. value is a critical consideration that I missed.

Many teams assume that a growing regression suite automatically means higher quality, but they never evaluate whether each of those checks still serves the purpose it was meant to.

In real-world scenarios, I have seen regression suites grow because tests are easy to add but hard to remove. Over time, suites become a mixture of critical validations, duplicated checks, and legacy tests that no one is confident enough to remove. The result is longer cycles, higher maintenance effort, and sometimes a false sense of coverage, costing not just time and effort but a significant amount of money.

What I find interesting is how few teams actively manage regression suites as living assets. Questions like these are rarely asked: Which tests actually catch defects? Which ones are redundant with unit or API tests? Which ones protect high-risk areas of the system? Which ones are simply historical artifacts? This is where I think approaches like risk-based regression, intelligent test selection, and observability of test outcomes become important. Instead of running everything every time, teams can prioritize tests based on change impact, failure history, and system risk. In other words, the goal should probably shift from “more tests” to “more signal.”
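
To make "more signal" slightly more concrete, here is a minimal sketch of change- and history-based test selection. The record fields, scoring weights, and function names are assumptions for illustration, not any real tool's API:

```python
from dataclasses import dataclass

@dataclass
class TestRecord:
    name: str
    runs: int                    # total executions on record
    failures: int                # runs that surfaced a real defect
    touches_changed_code: bool   # exercises code changed in this release?
    covers_high_risk_area: bool  # protects a high-risk part of the system?

def priority(t: TestRecord) -> float:
    """Toy risk score: failure history plus change impact plus system risk.
    The weights are illustrative, not a recommendation."""
    failure_rate = t.failures / t.runs if t.runs else 0.0
    score = failure_rate
    if t.touches_changed_code:
        score += 0.5
    if t.covers_high_risk_area:
        score += 0.3
    return score

def select(tests: list[TestRecord], budget: int) -> list[TestRecord]:
    """Run the highest-signal tests that fit the time budget,
    instead of running everything every time."""
    return sorted(tests, key=priority, reverse=True)[:budget]
```

The exact scoring hardly matters; the point is that selection becomes an explicit, inspectable decision rather than "run all 2,000 again."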

Curious how others approach this. Do you actively prune your regression suites, or do they mostly grow over time?

Interesting, I’d like to hear more perspectives from other people.

For us, we don’t sit still with our regression testing. We continually use risk-based testing and include the automated tests in that assessment. So each release asks the same question, “What do we need to test?”, and we are selective.

The benefit of doing that is you start to accumulate very interesting information about your test packs. Which tests haven’t been run for more than x releases? Which tests require the most regular maintenance? As time goes on you can start archiving (not deleting) test cases that were once very important but that haven’t been needed for the last 10 releases, where the roadmap isn’t showing any changes in the area in question.
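
A rough sketch of that archiving rule, assuming you track the last release in which each test was needed (the field names and the 10-release window are illustrative assumptions, not a real schema):

```python
def archive_candidates(tests, current_release, roadmap_areas, window=10):
    """Return names of tests that haven't been needed for `window` releases
    and whose area has no planned changes on the roadmap. These are
    candidates for archiving, not deletion."""
    candidates = []
    for t in tests:
        stale = (current_release - t["last_needed_release"]) >= window
        untouched = t["area"] not in roadmap_areas
        if stale and untouched:
            candidates.append(t["name"])
    return candidates
```

Because the output is only a candidate list, a person still makes the final call, which keeps the "true ownership" feel rather than automating the decision away.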

It’s quite liberating, as you feel true ownership of your testing assets and that you are evolving them with the product.

I think this is a very mature way of managing regression testing; you are treating test cases as an evolving asset. This is a practical way to keep your suite relevant while preserving historical coverage.

One challenge I’ve seen in larger systems is that maintaining this level of visibility across test packs becomes difficult when tests are spread across different tools and frameworks. It becomes harder to track things like test usage, maintenance effort, and actual value over time.

Curious how you are tracking those insights today. Is it something you manage through your test management system, or through custom reporting?


Picking just one angle of this.

I’ve encountered a lot of teams running all their tests on all browsers.

The alternative to this is to research the risks associated with browser differences and create a smaller specific set of tests to cover these risks.

The risk focus can be a big part of this: when a change is made, do you actively consider the specific regression risks of that change, or do you run everything every time? The latter carries a lot of waste, but it’s often a valid choice when teams opt not to spend that time on risk analysis.

Quick question worth asking your team: when did you last check which tests actually caught real bugs?

Most teams have this data somewhere (pass/fail history, defect logs) but never really look at it. If you’re running 2,000 checks and the same 40 keep finding actual problems, that’s worth a conversation. It doesn’t mean you axe the other 1,960 overnight, but it moves the discussion from gut feeling to something concrete.
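
A minimal sketch of mining that data, assuming you can export (check name, defect id) pairs from your tracker; the data shape and function names here are hypothetical:

```python
from collections import Counter

def defect_finders(defect_log):
    """Count how often each check caught a real, confirmed defect.
    `defect_log` is assumed to be (check_name, defect_id) pairs."""
    return Counter(check for check, _ in defect_log)

def worth_a_conversation(defect_log, suite, top_n=40):
    """Split the suite into the checks that actually find problems
    and the ones that have never caught anything."""
    finders = defect_finders(defect_log)
    productive = [check for check, _ in finders.most_common(top_n)]
    silent = [check for check in suite if check not in finders]
    return productive, silent
```

The silent list isn't a deletion list; a check can be valuable without having fired yet. But it tells you exactly where to point the "why is this still here?" question.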

The point probably isn’t to have fewer tests. It’s to have tests you can actually justify, where each one has a clear reason to exist and someone who genuinely owns it. Most suites are nowhere near that bar, and that’s the real issue.