Combinatorial testing in strengths greater than pairwise: worthwhile or no?

Have you ever done combinatorial testing in strengths greater than pairwise (3-way, 4-way, or more) AND the extra strength above and beyond pairwise combos paid off?

I’m writing up how to use the ACTS tool, which creates lists of test cases with strengths greater than pairwise.

But honestly I’m not convinced that testing beyond pairwise combos is all that worthwhile overall. I’d like to hear some anecdotes.

That was around 10 years ago and I don’t remember many more details.
You can ask questions, but I’m unsure if I will be able to answer them in sufficient detail.
I might even be mixing some things up.

We once used orthogonal arrays for automated checks.
We had a rendered image of statistics graphs, with multiple options that changed the image. We wanted a set of scenarios that covered multiple combinations of those options. For these scenarios we initially created the expected images, checked that they were fine, stored them in our repo, and used them for comparison.

The thing is that the options for this statistics view were a combinatorial explosion if you wanted to check every combination. That would have meant a long runtime to create and compare all the images.
Compared to the risk, this didn’t feel worth it, partly because only a few options were connected and most were independent of each other.
But we wanted to have some coverage: covering different scenarios, but not all of them.
There were 10-15 GUI options with different values. Some were simple booleans, some were lists where you choose one option.

Tools like the ACTS tool and Microsoft’s PICT pairwise test case generator allow you to generate what’s called a “covering array,” which is an “adequate” subset of “all possible combinations”: a set of test cases in which every combination of values for any t parameters (t = 2 for pairwise) shows up at least once. Pairwise tools are fairly well known in the combinatorial testing realm, but tools that allow greater numbers of combinations (3-way, 4-way, and more) seem not to be in wide use at all.
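
For anyone who hasn’t seen one built, here is a minimal sketch of how a t-way covering array can be generated. It is a naive greedy search in Python, not the algorithm that ACTS or PICT actually use, and the option names are invented for illustration. It enumerates every full combination as a candidate, so it is only practical for small models.

```python
from itertools import combinations, product


def greedy_covering_array(params, t=2):
    """Return test cases in which every t-way value combination occurs at least once."""
    names = list(params)
    column_sets = list(combinations(range(len(names)), t))

    # Every (columns, values) tuple that still has to show up in some test.
    todo = {
        (cols, values)
        for cols in column_sets
        for values in product(*(params[names[c]] for c in cols))
    }

    def tuples_in(row):
        # The t-way tuples that one full combination covers.
        return {(cols, tuple(row[c] for c in cols)) for cols in column_sets}

    candidates = list(product(*(params[n] for n in names)))
    tests = []
    while todo:
        # Pick the full combination that covers the most still-uncovered tuples.
        best = max(candidates, key=lambda row: len(tuples_in(row) & todo))
        todo -= tuples_in(best)
        tests.append(dict(zip(names, best)))
    return tests


# Hypothetical options for a chart dialog (names invented for this example).
options = {
    "chart_type": ["bar", "line", "pie"],
    "legend": ["off", "left", "right"],
    "scale": ["linear", "log", "percent"],
    "theme": ["light", "dark", "print"],
}

pairwise = greedy_covering_array(options, t=2)
three_way = greedy_covering_array(options, t=3)
print(len(pairwise), len(three_way), "versus", 3 ** 4, "exhaustive")
```

Raising t from 2 to 3 is the only change needed to go beyond pairwise; the cost is a larger array, since a 3-way array over these three-valued options needs at least 27 rows.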

Do I understand correctly that your team did NOT use a tool that could generate a covering array of test cases?

No, we did use a tool for that.
Feeding it the right parameters, so that its complex algorithm covered our options properly, was a demanding task.

We also built our automated check so that it could detect when the tool’s parameters needed to be changed because the options of the statistics dialog had changed.
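
A check like that could look roughly like the sketch below. This is only a guess at the idea, not the code we had back then; `model_drift` and the option names are hypothetical, and in practice the current options would be read from the application rather than hard-coded.

```python
def model_drift(model, current_options):
    """Compare the generator's parameter model with the options the dialog offers now."""
    added = set(current_options) - set(model)
    removed = set(model) - set(current_options)
    changed = {
        name
        for name in set(model) & set(current_options)
        if set(model[name]) != set(current_options[name])
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical data: the model the test cases were generated from ...
model = {"legend": ["off", "left", "right"], "scale": ["linear", "log"]}
# ... and what the dialog exposes today.
current = {"legend": ["off", "left", "right", "bottom"], "theme": ["light", "dark"]}

drift = model_drift(model, current)
if any(drift.values()):
    print("Covering array model is out of date:", drift)
```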

Understood.

To set the record straight on the overall effectiveness of 3-or-more-way combinatorial testing: after I came across the case studies section on the US NIST web site, I’m now convinced that the technique has a ton to offer.

You can read the studies for yourself. God willing I’ll mention some of them in the article I’m drafting as well.

I enjoy combinatorial testing, and for the most part I tend to cover it naturally in an exploratory way, unless it’s going into an automation script, where the more structured elements can be useful.

I do tend to find from experience that some of the theory behind it still holds water.

It’s worth tracking what the actual code fix was for the historic issues you have found: was it a single code fix at, say, unit level, or was it a complex fix involving five or six variables at the same time?

In most cases I’ve found it’s a single-line code fix, which leads to the idea that singles testing should be very common. The next most common type of fix is for the interaction between two things, hence pairwise being common.

I don’t have the numbers, but let’s say those two together give pretty high coverage, maybe ballpark 80 percent of fixes for issues found. There are papers on this with more statistical views.
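
If you do record how many variables were involved in each fix, the tally is easy to script. A minimal sketch in Python follows; the `history` list holds placeholder values purely to show the shape of the data, not real measurements.

```python
from collections import Counter


def cumulative_share(variables_per_fix):
    """Map interaction strength t to the share of fixes involving at most t variables."""
    counts = Counter(variables_per_fix)
    total = len(variables_per_fix)
    running = 0
    shares = {}
    for strength in sorted(counts):
        running += counts[strength]
        shares[strength] = running / total
    return shares


# Placeholder values only, NOT real measurements: one entry per past fix,
# giving how many variables had to interact to trigger the bug.
history = [1, 1, 2, 1, 3, 1, 2, 1, 2, 1]

for strength, share in cumulative_share(history).items():
    print(f"up to {strength}-way: {share:.0%} of fixes")
```

Run against your own bug tracker data, the cumulative column tells you what share of past issues a given test strength could have reached.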

On the other side of the argument, the more variables and combinations there are, the more complex things become. So whilst a root cause that is a 9-variable interaction is rarer, it is still a risk.

For low to medium risk products, singles, pairwise, and at least some combinations beyond that are usually sufficient, covering maybe above 95% of combinatorial issues.

So if it was 9 variables with multiple options for each, you can easily hit hundreds of thousands of combinations, but often picking, say, twenty 9-variable combinations will catch 9-variable issues.
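
To put rough numbers on that, here is a small Python sketch. The 4 options per variable is my assumption for the example; the function names are made up.

```python
from math import comb


def exhaustive_count(num_vars, values_each):
    """Number of full combinations when every variable has the same number of values."""
    return values_each ** num_vars


def tuples_to_cover(num_vars, values_each, t):
    """Number of distinct t-way value tuples a t-way covering array must contain."""
    return comb(num_vars, t) * values_each ** t


print(exhaustive_count(9, 4))  # 262144
for t in (2, 3, 4):
    print(t, tuples_to_cover(9, 4, t))  # 576, then 5376, then 32256
```

Each test case covers `comb(9, t)` of those tuples at once, which is why a covering array can be so much smaller than the exhaustive set.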

It remains a question of risk alongside efficient test coverage. If it’s a medical tool and people could die, then you may go for full coverage of all variables, and it’s going straight into an automated script as a tool-driven, high data coverage technique.

Free-entry variables complicate things, but I usually move those tests to single-variable coverage.

As a side note, I also use the question of whether it was a single-line code fix to decide where in the stack that risk should be covered. By the same theory, “most” issues can be caught at unit level, with only rarer issues further up the stack.

@rabia.brown I found this article from James Bach interesting, both in general and because it goes in the direction @andrewkelly2555 is hinting at.
Two interesting quotes (there is much more content in the article):

  • “Pairwise testing fails when you don’t have a good enough oracle”
  • “Pairwise testing fails when you don’t know how the variables interact”

What I take from it is that pairwise testing needs to be used with care and not as a blind “statistic”.

https://www.satisfice.com/download/pairwise-testing-a-best-practice-that-isnt
