We’re working on defining KPIs for our QA team, particularly in light of the rise of GenAI adoption in QA/testing activities.

I’ve outlined some possible KPIs below, but the challenge I see is that many of these vary in relevance depending on the project type, product lifecycle, or testing scope. I’d appreciate community feedback on how to refine and narrow this list to the best three universal KPIs.


Candidate KPIs

  1. Defect Detection Rate

    • Measures the effectiveness of QA in finding defects during testing.

    • Can be weighted by severity to reflect impact (see the sketch after this list).

  2. Test Coverage and Effectiveness

    • Assesses the extent of codebase or functionality covered by tests.

    • Measures how effective the test cases are at identifying issues and ensuring quality.

  3. Test Execution and Automation Rate

    • Tracks the percentage of test cases that are automated.

    • Evaluates how effectively automated and manual tests are executed during each phase.

  4. Escaped Defects

    • Tracks the number (and severity) of defects that pass QA undetected and reach production.
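
For concreteness, here’s a minimal sketch of how the severity weighting in KPI 1, the escaped-defect count in KPI 4, and the automation rate in KPI 3 might be computed. The defect log, field names, and severity weights below are illustrative assumptions, not a standard:

```python
from dataclasses import dataclass

# Illustrative severity weights -- an assumption for this sketch,
# not a standard; tune them to your own triage process.
SEVERITY_WEIGHT = {"critical": 5, "major": 3, "minor": 1}

@dataclass
class Defect:
    severity: str      # "critical" | "major" | "minor"
    found_in_qa: bool  # True if caught in testing, False if it escaped to production

def defect_detection_rate(defects: list[Defect]) -> float:
    """KPI 1: severity-weighted share of all known defects that QA caught."""
    caught = sum(SEVERITY_WEIGHT[d.severity] for d in defects if d.found_in_qa)
    total = sum(SEVERITY_WEIGHT[d.severity] for d in defects)
    return caught / total if total else 0.0

def escaped_defects(defects: list[Defect]) -> dict[str, int]:
    """KPI 4: count of production escapes, broken down by severity."""
    counts = {severity: 0 for severity in SEVERITY_WEIGHT}
    for d in defects:
        if not d.found_in_qa:
            counts[d.severity] += 1
    return counts

def automation_rate(automated_cases: int, total_cases: int) -> float:
    """KPI 3: share of test cases that are automated."""
    return automated_cases / total_cases if total_cases else 0.0

defects = [
    Defect("critical", found_in_qa=True),
    Defect("major", found_in_qa=True),
    Defect("major", found_in_qa=False),  # escaped to production
    Defect("minor", found_in_qa=True),
]
print(f"Weighted defect detection rate: {defect_detection_rate(defects):.0%}")  # 75%
print(f"Escaped defects by severity: {escaped_defects(defects)}")
print(f"Automation rate: {automation_rate(120, 200):.0%}")  # 60%
```

One design note: a severity-weighted rate moves more when a critical defect escapes than a raw count does, which is usually the behaviour you want from this KPI.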

Looking forward to your thoughts and community feedback!

Maybe I’m being dense, but the summary mentioned GenAI. Was that intentional? Unless you’re interested in GenAI uptake itself, I’d leave it out of the KPIs. It’s a tool. A powerful one, but a tool. The ability to meet KPIs should be independent of tools.

With any use of KPIs you really should have a goal: a problem you are trying to solve, a vision of the future, and very likely a supporting plan of action.

I’d highly recommend against individual KPIs.

So, back to the first question: why are you using KPIs?

What happens after you measure them? What difference do they make to the team and to individuals?

For AI, let’s say your KPI is about adoption: you want to measure how well your team is embracing AI. What AI activities are they doing? What tools are they looking at? Are they skeptically evaluating their usage well? And, importantly, what active support can be provided to help make sure adoption is suitable and returns value? Perhaps a good goal is recognizable day-to-day value from the use of AI.

There is quite a lot in there, so maybe start with learning AI’s potential as a starting KPI: what courses have they taken, have they built a library of knowledge, what experiments are they running? Track how many presentations they give on the results of those experiments, whether they write articles on AI, whether they can train others in the team, and so on.

You will need to provide time and support for this. If you are serious about it, make sure they have paid tools; maybe even make the first KPI about evaluating which tools those should be.

Later you might want to look at the productivity impact of the tools, but I would not rush that; start with suitable adoption of AI where appropriate.

Here are a few tool categories that may be worth evaluating as starting points: general AI for reports, test plans, test ideas, and root-cause suggestions; an LLM/MCP combo for automation; augmented IDE-level automation; DevTools’ built-in AI analysis; and perhaps agents and multi-agent usage, though that last one might be trying to run too fast. There is enough there that they will likely find something of interest that you can track and measure, and maybe find of value on a day-to-day basis.

Firstly, I’d take a step back and examine the acronym:

  • Key
    • They matter, all the time to any project, to any stakeholder
  • Performance
    • They should be a measure of outcome, not activity
  • Indicators
    • They report symptoms, not facts. They will guide your next steps, but they won’t have the answers

So look at each one of the KPIs you’ve listed and ask yourself:

  • Does it matter all the time to everyone?
  • Does it focus on outcomes and not activity?
  • Does it guide the next steps?

So my advice is: rather than plucking at KPIs to see if they matter (I’ve tried that and failed), start by establishing the desired outcomes and then ask the question: how do we know we’ve achieved that? For these KPIs to be successful you want everyone to buy into them, not just QA, so for me there needs to be buy-in from the wider project stakeholders.

I’d like to refer you to this reply I made in 2018 about metrics which is hopefully poetic and kind:

There are some truly excellent points above from others about leading with purpose, so I’ll mention three other things:

  1. Many described metrics are actually mathematically and philosophically unsound in fundamental ways. Often they abstract away reality until the numbers don’t describe anything at all. They will talk about “defects”, but what they mean is defect reports that end up in the tracking software, assuming all defects are reported this way (rather than being fixed in design or by dev pairing, for example), assuming customers report all defects (they do not), assuming that all testing happens in cases (none of it does), and so on. They’re also not comparable: one defect is more important and impactful to some people at a certain time than others. Knowing how many defect reports there are might be helpful; it’s just not necessarily helpful to then play maths with them and declare the result meaningful.
  2. Metrics should always drive questions, never decisions. They cannot be used algorithmically, nor to judge people. They indicate the possibility of an unknown problem, like a car engine check light. In Bach parlance: Avoid control metrics, embrace inquiry metrics.
  3. Measurement is hard. There are whole industries surrounding measurement because it is hard; science has spent years of philosophy on it. So you must proceed with humility and leave room for error bars (a small sketch of what that can look like follows below), as well as for the fact that a single simplified metric can correspond to many different possible realities.
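
To make the error-bars point concrete, here is a small sketch (my illustration, not something from the post above) that puts a 95% Wilson score interval around an escaped-defect rate; with small counts the interval is wide, which is exactly the humility the number deserves. The counts are made up:

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 for ~95% confidence)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

# Made-up numbers: 3 escaped defects out of 40 defects found this quarter.
low, high = wilson_interval(3, 40)
print(f"Escape rate: {3 / 40:.1%}, 95% CI roughly {low:.1%} to {high:.1%}")
# Prints an interval of roughly 2.6% to 19.9% -- far too wide to rank anyone on.
```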

Context: I’ve only worked in one place, a smallish startup, so this is very little experience talking.

I ran into a lot of unknowns trying to define “escaped defects”:

  • Is any issue an escaped defect? Missed requirements? Integration issues?
  • Who defines something as a defect, and who determines the responsibility?
  • Does a bug or escaped defect from 2+ sprints ago count against the current numbers, or retroactively?

We had this same problem of being asked for KPIs, and we constantly pushed back to the leadership team that they needed to tell us what problems we were trying to solve aside from “show us that the QEs are working”.

I think a good KPI for quality, depending on QA’s involvement in the planning process, is “tickets/bugs created after the estimated completion time”. The idea is that such tickets show the scope of the project or the understanding of the delivery was inaccurate, or that bugs showed up late, indicating problems in parallelization. It’s also trackable per project, so you can highlight when things are working well and compare two projects for analysis. A rough sketch of how this could be tracked follows below.
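
If it’s useful, here is that sketch, assuming hypothetical ticket records with a project tag, a creation date, and a bug flag, plus an estimated completion date per project; all names and dates are made up for illustration:

```python
from collections import Counter
from datetime import date

# Hypothetical data: estimated completion dates per project, and a ticket
# log of (project, created_on, is_bug). All names and dates are made up.
ESTIMATED_DONE = {"project-a": date(2024, 3, 1), "project-b": date(2024, 3, 15)}

tickets = [
    ("project-a", date(2024, 2, 20), True),   # before the estimate -- does not count
    ("project-a", date(2024, 3, 10), True),   # bug after the estimate -- counts
    ("project-a", date(2024, 3, 12), False),  # not a bug -- does not count
    ("project-b", date(2024, 3, 20), True),   # counts against project-b
]

# The KPI: bugs created after each project's estimated completion date.
late_bugs = Counter(
    project
    for project, created_on, is_bug in tickets
    if is_bug and created_on > ESTIMATED_DONE[project]
)

# Per-project numbers, comparable across two projects for analysis.
for project, deadline in ESTIMATED_DONE.items():
    print(f"{project}: {late_bugs[project]} bug(s) created after {deadline}")
```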