How Are You Measuring AI Impact in Testing?

checkout_champion · 9 June 2026 10:11

My company is starting to look at how we measure and record AI usage for development and testing work. They want us to record something tangible that they can report on each month. Is anyone measuring time saved or quality improvements from AI? How are you doing it? Any ideas welcome.

ghawkes · 9 June 2026 12:57

Quite simply, I stick to DORA metrics because, at the end of the day, success is measured in outcomes. We adapted them slightly to meet our needs:

Planned deployment frequency
Unplanned deployment frequency
Change Failure Rate (or, for us, how often we are fixing production issues)
Time to Restore Service
Regression Testing Time taken (not a DORA metric, but one we want to measure)
Rate of feature enhancement (not a DORA metric again, but tracking our frequency of deploying new features)

The bottom line is, is your new way of working improving outcomes? Those are the metrics that matter.
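
As a rough sketch of how a few of the metrics above could be turned into a monthly number, assuming you can export one record per deployment: the `Deployment` fields, the `monthly_summary` name and the idea of flagging each deploy as planned or unplanned are all hypothetical here, not something taken from DORA tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    deployed_at: datetime
    planned: bool                           # False = unplanned release / hotfix
    failed_at: Optional[datetime] = None    # when a production issue from this change was detected
    restored_at: Optional[datetime] = None  # when service was restored


def monthly_summary(deployments: list[Deployment]) -> dict:
    """Summarise one reporting period of deployments (hypothetical record shape)."""
    total = len(deployments)
    planned = sum(1 for d in deployments if d.planned)
    failures = [d for d in deployments if d.failed_at is not None]

    # Change failure rate: share of deployments that led to a production issue.
    change_failure_rate = len(failures) / total if total else 0.0

    # Time to restore: only counted for failures that have been resolved.
    restore_times = [d.restored_at - d.failed_at for d in failures if d.restored_at]
    mean_restore = (
        sum(restore_times, timedelta()) / len(restore_times) if restore_times else None
    )

    return {
        "planned_deployments": planned,
        "unplanned_deployments": total - planned,
        "change_failure_rate": change_failure_rate,
        "mean_time_to_restore": mean_restore,
    }
```

Running the same function over each month before and after the new way of working gives a like-for-like comparison, as long as the records are captured the same way throughout.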

probe_runner · 10 June 2026 05:08

In our case, the clearest impact has been QA capacity.

We use AI to read developer-provided specs and code changes, then help design and execute the relevant tests. The useful output is not just generated test cases, but a go/no-go style report: what was checked, what risks were covered, what was not covered, and whether the change looks safe to proceed.

So I would not measure AI by prompts used or tests generated. I’d measure whether the QA team can assess more changes with the same people, while still producing useful evidence for release decisions.

The key metrics for us are: time from dev handoff to QA feedback, number of changes assessed per cycle, quality of the evidence, issues caught before release, and how much human review is still needed.
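
For the countable ones in that list, a minimal sketch of a per-cycle summary might look like this. It assumes you log a handoff timestamp and a first-feedback timestamp for each change, plus total human review hours for the cycle; the function and key names are made up for illustration, and "quality of the evidence" is left out because it needs human judgement rather than arithmetic.

```python
from datetime import datetime
from statistics import median


def handoff_to_feedback_hours(records: list[tuple[datetime, datetime]]) -> list[float]:
    """Each record is (dev_handoff, first_qa_feedback); returns elapsed hours per change."""
    return [(feedback - handoff).total_seconds() / 3600 for handoff, feedback in records]


def cycle_summary(records: list[tuple[datetime, datetime]], human_review_hours: float) -> dict:
    """Summarise one release cycle of assessed changes."""
    hours = handoff_to_feedback_hours(records)
    count = len(records)
    return {
        "changes_assessed": count,
        # Median rather than mean, so one slow change does not dominate the figure.
        "median_handoff_to_feedback_h": median(hours) if hours else None,
        "human_review_h_per_change": human_review_hours / count if count else None,
    }
```

Comparing `changes_assessed` and `human_review_h_per_change` across cycles with the same team size is one way to show capacity changing without counting prompts or generated tests.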

douglasdcm · 10 June 2026 12:22

First I’d use the metrics I already have, like number of bugs per KLOC, time to deliver, recurrence of bugs… and then, according to the results, I’d implement new metrics. I think using existing metrics first is a good approach, because you are able to compare before and after AI. With brand-new metrics you don’t have a baseline.
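
To make the before/after comparison concrete, here is a small sketch for bugs per KLOC. The helper names are hypothetical and the figures in the usage lines are invented purely to show the arithmetic, not real results.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect density: bugs per thousand lines of code delivered in the period."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def relative_change(before: float, after: float) -> float:
    """Fractional change from the baseline; negative means the metric went down."""
    if before == 0:
        raise ValueError("baseline must be non-zero to compute a relative change")
    return (after - before) / before


# Illustrative numbers only: a pre-AI baseline period and a later AI-assisted period.
baseline = defects_per_kloc(defects=12, lines_of_code=8000)
with_ai = defects_per_kloc(defects=9, lines_of_code=9000)
change = relative_change(baseline, with_ai)
```

The same two helpers work for any metric you already track, which is the point of starting from existing data: the baseline is already there.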

giorgos.s · 10 June 2026 14:25

The ROI of AI-assisted Software Development report released recently (by the DORA people) is also useful for how to tackle this, as it explicitly includes the cost dimension.

checkout_champion · 11 June 2026 13:19

Thanks for the reply, Gary, that really helps. I wasn’t familiar with DORA metrics before, but they make a lot of sense. It makes sense to measure whether AI is actually improving outcomes rather than just tracking “AI usage” for the sake of it.

I guess stakeholders will look at time saved in development and testing, because those are the easiest numbers to report. But I’d really like to shift the narrative toward quality improvements.

I really like your point about tracking the frequency of production fixes. It’s a simple, outcome‑focused way to show whether quality is improving over time.

I’m not sure how we can do this. If we want to compare AI‑assisted work with non‑AI‑assisted work, we can only do that by looking at historical baselines. Without that, it’s impossible to say whether AI is genuinely improving things or just changing the shape of the work.

checkout_champion · 11 June 2026 13:31

Thank you for your reply, I’m building up a really useful list of metrics I hadn’t considered before. I’ve actually been recording the time from dev handoff to QA for the last six months, so it’ll be interesting to see whether that improves as we start introducing AI into the process over the next six months.

I haven’t been tracking many of the other metrics you mentioned, which reinforces something I’m starting to realise: if we want to compare AI‑assisted vs non‑AI‑assisted work, the only fair way is to measure against our historical baselines. Otherwise we’re just guessing.

That said, it might also make sense to start capturing these metrics now and then see how future AI improvements shift them over time.

checkout_champion · 11 June 2026 13:37

It’s a good point: it’s difficult to find out anything meaningful about what AI is giving you without having the pre-AI data.

checkout_champion · 11 June 2026 13:41

Interesting, thank you for that, I’ll take a look at that report.
