We’re working on defining KPIs for our QA team, especially given the rise of GenAI adoption in QA/testing activities.

I’d like to refer you to this reply I made in 2018 about metrics, which is hopefully poetic and kind:

There are some truly excellent points above from others about leading with purpose, so I’ll mention three other things:

  1. Many of the metrics described are mathematically and philosophically unsound in fundamental ways. They often abstract away reality until the numbers no longer describe anything at all. People talk about “defects”, but what they mean is defect reports that end up in the tracking software — assuming all defects are reported that way (rather than being fixed in design, or by dev pairing, for example), assuming customers report all the defects they encounter (they do not), assuming all testing happens in test cases (little of it does), and so on. Defects also aren’t comparable units: one defect matters more to some people, at some times, than others. Knowing how many defect reports exist might be helpful; it doesn’t follow that doing arithmetic on them produces anything of use.
  2. Metrics should always drive questions, never decisions. They cannot be used algorithmically, nor to judge people. They indicate the possibility of an unknown problem, like a car’s check-engine light. In Bach parlance: avoid control metrics, embrace inquiry metrics.
  3. Measurement is hard. Whole industries exist around measurement precisely because it is hard, and the philosophy of science has wrestled with it for centuries. So proceed with humility: leave room for error bars, and for the fact that a single simplified metric has to stand in for many different possible realities.
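To make the “error bars” point concrete, here is a minimal sketch using a Wilson score interval — a standard statistical construction, nothing QA-specific — showing how wide the uncertainty really is on a rate computed from small counts. The “escaped-report rate” and the defect numbers are invented purely for illustration:

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a proportion: a standard way to
    put error bars on any rate derived from a small count."""
    if trials == 0:
        return (0.0, 1.0)  # no data at all: total uncertainty
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

# Hypothetical numbers: 7 of 40 logged defect reports came from customers.
low, high = wilson_interval(7, 40)
print(f"escaped-report rate: 17.5% (95% CI roughly {low:.0%} to {high:.0%})")
```

With these made-up numbers the interval runs from under 10% to over 30% — a reminder that a single-figure “rate” on a dashboard hides most of what the data can actually tell you.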