Defining 'regression' in testing

Hi gang,

So I have been thinking about regression testing and its definition lately. For reference, my thoughts are based on @mb1’s webinar on regression testing and @friendlytester’s FART model, which I would encourage others to watch.

However, to summarise, when talking about regression testing we’re concerned with investigating if our product has ‘regressed’ back to an unacceptable or worse state. This is based upon definitions Michael cited from the dictionary:

I like this definition, and it especially helps me understand the role of automated checking as being about change detection. However, when I drill down into certain situations I find the definition doesn’t quite fit. For example:

Say we have a ‘regression’ bug found in a feature that has existed for a while, caused by newly introduced code in a different feature. Let’s say, for ease of discussion, that this bug never existed until the new code was added. If it never existed before, how can the feature have regressed to a previous state? The previous states never contained this bug.

So I wonder if I am maybe being too literal with this definition or am I defining a bug as a regression bug when it’s something else. Perhaps the term ‘regression bug’ is misleading and all that matters is that something has changed for the perceived worse. (If so, is ‘regression’ even the right term?)

I’d be interested in hearing others’ thoughts about this.


So I wonder if I am maybe being too literal with this definition or am I defining a bug as a regression bug when it’s something else.

Why does it matter? If you know why it matters then you know the purpose behind your definition. It could be that you’re talking about “regression” because you’ve covered some model of change risk in your system based on new screens/data/controls/something else, so you’re now interested in unframed risks around the rest of the software, especially where testing is important/critical/mandated/cheap. Then your definition of what is regression, and what is a regression bug, suits that end.

So I say start with why “regression testing” fulfils some part of your strategy and evolve the definition from that. Then if you argue over whether something is or is not a “regression bug”, for example, you know why you’re doing it.


So I agree with what you’re saying @kinofrost, context would define what the bug is ultimately in a real life situation. I am more questioning the abstract definition of Regression testing and its meaning.

I think something I have been missing out on is asking myself ‘what is regressing’? My feeling is that it would be quality and not specific functionality or features. Since regression requires a state to go back to, a freshly introduced bug means you’ve regressed to a lesser state of quality rather than a version or release that contained a specific bug.

Which brings up an interesting thought that if automated regression checking cannot determine quality, then the name is misleading…? Since those tools can only tell you what has changed and not whether it’s a regression or progression.


I am more questioning the abstract definition of Regression testing and its meaning.
I don’t think that, when considering the details, a suitable shared idea of regression exists. It’s the same as “automation” and “test case”, in that the definitions are varied depending on the worldview and purpose of the person who’s interpreting it.

I won’t go into what RST says about regression and the subjective nature of quality, since you sat next to me in the RST course, but yes, I think that a regression must be a relationship between one state of quality and another, and therefore a relationship between one relationship with a product and another, based on snapshots on a timeline. Introduction of something that threatens the value of the product (to that person) at some point shows us that we’re looking at two times for a product that share some subjective value and differ in subjective value. That ignores the added value between those time periods (refactors, bug fixes, new features, and so on).

So regression is where a more recent time period for a product has a person’s subjective value that is lower than an earlier subjective value, with respect to some limited model of the system. That model could be a product element, for example a function (“this button never used to make the system crash”). The model could be some quality criteria (“this program never used to run this slowly”).

If we look at automation in the sense of a manual process where tools are used (“tool assisted regression checking”) then we can only say that it’s an attempt at a regression check by identifying and then codifying the model we want to use. Or trying to.

Yes, an “apparent quality” decision can only be made by a human, but they can write a check in such a way that suggests facts that support the idea that the system is of at least the same quality (with respect to a limited model).


Regression is about the quality of something getting worse, going backwards in some sense and to some degree. So it’s relative to a) a person and that person’s perception of quality, and b) the something in question.

With respect to your uncertainty here, if you shift your focus from the feature to the system, I anticipate that all will be well.

By the way, replying directly to the notification that I received for this message returns a message titled “Unexpected Reply Address”, with the following text:

We’re sorry, but your email message to [“”] (titled RE: [The Club] [Ministry of Testing] Defining ‘regression’ in testing) didn’t work.

Your reply was sent from a different email address than the one we expected, so we’re not sure if this is the same person. Try sending from another email address, or contact a staff member.

—Michael B.


Thank you @mb1 and @kinofrost, this was helpful rubber ducking! I’ve been contextualising the idea of regression in the wrong way and need to bring it back to quality rather than a comparable code base or set of features.


“Need to bring it back to quality” is the name of my Metallica cover band.


So this is the quintessence, then? I love this.

Just one more question then: if this is the quintessence for regression testing, and test automation is mainly focused on regression testing … why do tools not focus on change?

The only tools I am aware of that do focus on change are ApprovalTests and TextTest, and now ReTest. Why aren’t there more?

Depends on what you call change. Selenium checks for a change by examining one particular fact at different times - the change is detected when the fact is no longer true.

Although that is technically true, it is not what I would call focus on change.

You could also program a version control system by writing a script that compares a specific file with a specific copy of that file and upon change you would have to manually update the copy of the file. I wouldn’t call that version control. Yet, that is what a manual Selenium check is for one particular fact. So I wouldn’t call that focus on change.
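To make the distinction in this exchange concrete, here is a minimal Python sketch using only the standard library. It is not how Selenium, ApprovalTests, TextTest, or ReTest actually work; `render_summary`, `single_fact_check`, and `change_check` are hypothetical names invented for illustration. The first check examines one fact, the second compares the whole output against a previously approved copy.

```python
import difflib


def render_summary(order):
    """Hypothetical function under test: formats an order as plain text."""
    lines = [f"Order {order['id']}"]
    for name, price in order["items"]:
        lines.append(f"{name}: {price:.2f}")
    total = sum(price for _, price in order["items"])
    lines.append(f"Total: {total:.2f}")
    return "\n".join(lines)


def single_fact_check(output):
    """Assertion-style check: examines one fact and ignores everything else."""
    return "Total: 12.50" in output


def change_check(approved, received):
    """Approval-style check: lists every line that differs from the approved copy."""
    return list(
        difflib.unified_diff(
            approved.splitlines(),
            received.splitlines(),
            fromfile="approved",
            tofile="received",
            lineterm="",
        )
    )


order = {"id": 7, "items": [("tea", 2.50), ("cake", 10.00)]}
approved = render_summary(order)

# Simulate a later version in which an item label changed but the total did not.
received = approved.replace("tea", "Tea")

print(single_fact_check(received))  # the one examined fact still holds
for line in change_check(approved, received):
    print(line)  # the label change shows up in the diff
```

In this sketch the single-fact check stays quiet about the label change, while the comparison reports it. Neither says whether the change is good or bad.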

Depends on what you call change. Selenium checks for a change by examining one particular fact at different times - the change is detected when the fact is no longer true.

Here lies the rub, and where I feel my misunderstanding came from. We are concerned with how a product regresses in quality, which is highly subjective and depends on who is consuming your product. If we measure potential regression risk by comparing states of functionality and/or code using automated checking (I have been experimenting with ApprovalTests for a little while), how do we connect the information that something has changed between functional states to a change in quality state?

This touches on Automation in Testing for me: we should use automated checking to flag potential changes and then explore around those changes to determine changes in quality.
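As a rough illustration of that split between flagging and evaluating, here is a small Python sketch. Everything in it is an assumption made up for the example: the `Change` record, the `detect_changes` helper, and the sample `before`/`after` snapshots. The check only reports what differs; the verdict field is left empty until a person fills it in.

```python
from dataclasses import dataclass
from typing import Any, Optional

_MISSING = object()  # marks a key that is absent from one of the snapshots


@dataclass
class Change:
    key: str
    before: Any
    after: Any
    verdict: Optional[str] = None  # set by a person after exploring, never by the check


def detect_changes(before, after):
    """Report every key whose value differs between two snapshots."""
    changes = []
    for key in sorted(set(before) | set(after)):
        old = before.get(key, _MISSING)
        new = after.get(key, _MISSING)
        if old != new:
            changes.append(Change(key, old, new))
    return changes


# Hypothetical snapshots of observable facts about two builds.
before = {"response_ms": 120, "button_label": "Save", "rows": 10}
after = {"response_ms": 480, "button_label": "Save changes", "rows": 10}

changes = detect_changes(before, after)
for change in changes:
    print(change)  # two flagged changes, both with verdict=None

# The evaluation happens outside the check, after someone has explored each change.
changes[0].verdict = "progression"  # e.g. the new label reads more clearly
changes[1].verdict = "regression"   # e.g. the slower response hurts this user
```

The same flagged change could get a different verdict from a different person, which is the subjective part the check cannot supply.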

Something that also occurs to me when thinking about this problem is that the more abstract your check is from the thing you are monitoring for change, i.e. navigating via the UI to check a difference in the database, the more exploration is required because the check has become diluted. I’ve always advocated for the need of targeted checks based on arguments for more reliable checking and faster checking, but talking about that connection of code change to quality change is really interesting.


I work with automated testing myself, and this distinction is very important to me, as I am only concerned with and responsible for regression errors. I think the important aspect is ‘change for the worse’. This is different from ‘change for the better’, which is a feature, not a bug (sorry, I couldn’t help it). There may be bugs in features, but these are not changes for the worse; they are insufficient changes for the better, or unwanted side-effects of changes for the better.

Another way of conceptualizing it, which is more in line with your semantic concerns, is to see regressions not as regression TO an error state, but FROM a functioning state: you had achieved the state you wanted, but have regressed from it, possibly in a different direction from the one by which you arrived at it in the first place.


What’s a non-regression bug?

I would say that a non-regression bug is a bug in new functionality: something that has never worked. Sometimes it takes a long time to discover the bug, but if the bug was in the system when the functionality was released, it is a non-regression bug. Of course there will be situations where new code unleashes a bug that could not be produced before that code was written. This opens questions like: Is the bug in the new code/functionality or the old code/functionality and is there a clear watershed between what is old and what is new? In these situations it must be argued from instance to instance whether it is a regression bug or non-regression bug (we just coined the term ‘feature-issue’ for this). That is if the distinction is important in each and every instance!

Tools don’t focus on anything. People use tools to help them (the people) focus on things. (Example: the microscope doesn’t focus on the bacteria; I use the microscope to help ME focus on the bacteria. The microscope has no idea what it’s looking at.)


We use our brains; our heuristics; our desires and other emotions; and we use tools to aid us in that. That’s the essence of evaluation; we ascribe value to something (not to be confused with the common notion of evaluation as calculation, or as the outcome of a programmed signal). We observe that something has changed (the check affords the observation), and we decide whether we like it or not (the observation and other factors afford the evaluation).


This is a pertinent question in one sense. When something that I want to work now doesn’t work now, I’m not happy. “Well, of course, it’s never worked before” seems like cold comfort.

In another sense, I’m allergic to defining something in terms of what it is not. A sofa is a non-regression bug.

—Michael B.


Totally agree, bad wording on my side. What I meant is that tools are geared to a certain usage. Of course you could use the microscope also to hammer a nail into the wall, but you would have a hard time doing so. I imprecisely meant that a microscope was constructed to focus on something (e.g. bacteria) and is especially helpful to people that try to do that.

In the same way, our current UI testing tools are not helpful for detecting changes. Defining checks is cumbersome and error-prone, and the checks require a lot of maintenance, so there are often far too few of them.

A sofa is a non-regression bug as in “¬(regression bug)” but a sofa is not a non-regression bug as in “¬(regression) bug”, because a sofa is not a bug. Probably not a bug.

Regression requires testing two times. Once to notice a problem or notice the lack of a problem, then once again to notice it (again, or for the first time). That means that any software I’ve never tested before doesn’t contain regression bugs. Unless we’re saying that anyone could have tested it before, in which case a regression bug is one where someone decided there wasn’t a problem with something according to some model and then someone else or the same person decided that there was.

That means that a non-regression bug must be one in something nobody has ever tested before (how something gets built without anyone doing any testing of any kind is a pertinent question), or, for bugs that have been long in production where it’s never worked, one where we must decide that the testing we did do didn’t find the problem and our coverage against whatever model we had would have been sufficient to find it.

So… a bug of regression is where via a check or test a human discovered there wasn’t a problem, then another human (or the same one) discovered there was, where the human in the latter case decides that the original check or test would have found the problem they just found. OR the timing order is reversed and we look for problems that would have existed if we had looked and found that they didn’t (according to some model), in an earlier version.

A bug of non-regression is where the latter human decides that any check or test done by any human that they know of either detected the “same problem” (via whatever criteria we’ve chosen to describe it being the same) or did not detect the problem. Or everyone only tested once, and knows what each other tested and the coverage they achieved.

Interestingly that means you can turn potential regressions into potential non-regressions by increasing the epistemic risk gap.

Also interestingly, if the bug appears later in time within the same version, would we call it a regression? Is something a regression if it goes wrong because the environment changes? Is there really such a thing as a regression bug, or just a regression “vulnerability” (in the sense of something that permits a problem under the right circumstances)? Should we categorise by causes rather than symptoms?

I don’t know what point I’m making, but I’m enjoying myself.