TestBash Spring 2023 - How (Not) to Measure Quality - Michael Kutz

During TestBash Spring, @mkutz gave a talk on How (Not) to Measure Quality.

We’ll use this Club thread to keep the conversation going, share resources and answer any questions we don’t get to during the live session.


Questions Answered Live

@david2023 - Shouldn’t we be concerned about the Severity of the bugs found and not the number of bugs found?

@lisa.crispin - Thinking back to Mohammed’s talk on testing infrastructure, do you have thoughts on measuring quality of infrastructure?

@karentestsstuff - How would you counsel management to use metrics in such a way that they will not undermine the psychological safety of a team?

@saib - Interested to know more about your zero bug policy, could you please elaborate?

Questions Not Answered Live

@diantris - My main question for all metrics ever would be: how to avoid Goodhart’s Law becoming reality?

Anonymous - How can we measure the effectiveness of our automation tests?

@cookietester - Amazing stuff, but how do you collate and report on all of these parts? Is it just one person doing it, or is it spread over multiple people and then brought together?

@lennysmith - Do you have any thoughts on how to mature and prioritize measuring quality / automation in monolithic projects? Perhaps something that we should (Not) do?

@testerfromleic - (Forgive me if I missed this). How do the company goals inform or feed into what the team measures? (OKRs or North Star metrics for example)

@ghawkes - How do you balance keeping collaborative metrics internal with senior leadership desiring targets on the same metrics?

@fullsnacktester - Do you have a good resource to look more into DORA metrics?

Resources Mentioned

Well, it is hard. There’s no silver bullet here that will prevent it.
I think following the goal-question-metric method helps, as it gives you the ability to go back and check whether the metric really answers the relevant question, or whether the question is still the right one to reach your goal.

I think Goodhart’s Law doesn’t apply that much if the people who are using a metric understand why they are using it, and if they are not only extrinsically motivated by the metric but also intrinsically motivated to reach the goal behind it.

The different domains of measuring quality have different owners:
Inner quality is owned by the development team. They measure it and they look into it to improve their inner quality. Nobody else needs a report there. If the team decides to share some insights to strengthen their point (e.g. we need more time to pay technical debt), they can do that. But management or product people should focus on the metrics they own.
Outer quality is owned by the product people. Let’s face it: a product is mostly successful because of its features. Bugs can ruin it, but the decisions on what it can and what it can’t do are much more important. This is of course also interesting for everybody else and probably should be shared in the company by default.
Process quality is the domain of the managers. They should use it to guide their decisions for hiring and funding. I think the measured team should be interested in its own performance as well, but these metrics should not be shared with anybody else IMHO, as that may have a negative impact on motivation and trigger Goodhart’s Law.

This is pretty broad. I think I’d need some more details here. But in general I’d recommend following the goal-question-metric method. What is the goal? Do you want to continuously improve quality? Which quality? Which aspects exactly?
Then come up with the questions you need answered to know whether you are getting closer to that goal.
Only then try to choose metrics that answer these questions as well (accurately, recently, locally) as possible.
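
To make the order of those steps concrete, here is a minimal sketch of how a team might write down a goal-question-metric tree. Everything in it is an assumption for illustration: the class names `Goal` and `Question`, the example questions, and the example metric are made up, not something from the talk. The only point is that a metric is always attached to a question, and a question is always attached to a goal.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """A question whose answer tells you if you are getting closer to the goal."""
    text: str
    metrics: list[str] = field(default_factory=list)  # chosen last


@dataclass
class Goal:
    """The goal comes first; questions are derived from it."""
    text: str
    questions: list[Question] = field(default_factory=list)

    def unanswered_questions(self) -> list[str]:
        # Questions that no metric answers yet: candidates for new metrics.
        return [q.text for q in self.questions if not q.metrics]


# Hypothetical example content, invented purely for illustration.
goal = Goal(
    "Continuously improve inner quality",
    [
        Question(
            "Is the code getting easier to change?",
            ["time from first commit to merge (hypothetical metric)"],
        ),
        Question("Do we trust our automated tests?"),
    ],
)

print(goal.unanswered_questions())
```

A structure like this also makes it easy to spot a question that nobody can answer yet, which is a better starting point for picking a new metric than the list of things your tools happen to measure.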

I must admit I had to think about this for a while (so no, you didn’t miss it).
In my experience company goals are mostly about money: maximizing revenue while minimizing cost (yes, I do live in a capitalistic society). Outer quality can help with the former, process quality with the latter. Inner quality contributes to both.

To north star metrics and OKRs I’d apply the goal-question-metric method in reverse to figure out whether they actually contribute to a real goal, or are merely opinionated measures that are harmful to motivation and collaboration and probably suffer from Goodhart’s Law.
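
Applying the method in reverse could look roughly like this: start from a metric somebody handed you and try to walk back to a question and then to a goal. This is only a sketch under my own assumptions. The function name `trace_to_goal`, the two lookup dictionaries, and all example entries are hypothetical.

```python
from typing import Optional


def trace_to_goal(
    metric: str,
    metric_to_question: dict[str, str],
    question_to_goal: dict[str, str],
) -> Optional[str]:
    """Walk metric -> question -> goal; return None if the chain breaks."""
    question = metric_to_question.get(metric)
    if question is None:
        return None  # nobody can say which question this metric answers
    return question_to_goal.get(question)


# Hypothetical example content, invented purely for illustration.
metric_to_question = {
    "weekly active users": "Do people find the product useful?",
}
question_to_goal = {
    "Do people find the product useful?": "Maximize revenue",
}

traced = trace_to_goal("weekly active users", metric_to_question, question_to_goal)
orphan = trace_to_goal("number of test cases", metric_to_question, question_to_goal)
print(traced, orphan)
```

A metric that comes back as `None` is the kind I would question first, because nobody can say which goal it serves.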

Hope this answers the question. Feel free to post follow-ups.

Give them metrics that work better for them. Ask them what they want to achieve, i.e. what their goal is. Try to suggest questions that tell them whether they are reaching that goal. Only then find metrics that answer these questions.

This should make them happier than looking at your metrics would anyway.

Also: don’t tell them your metrics :slight_smile:


I read the Accelerate book and recommend that. @lisa.crispin also recommended the DORA community mailing list, but I haven’t checked it out yet.
