Feature QA Scorecard - How can one measure the scorecards?

Hello Folks,

Hope you are well. I wanted to reach out and ask about QA scorecards. In my org, for each feature, we have to fill in QA scorecards that include entry and exit scorecards. It is like a checklist for entry and exit criteria. The intention is to provide a high-level assessment of the health of the feature. Some of the entry criteria in these scorecards are: What is the unit testing coverage percentage? Are user stories in Jira reported by PMs? Are tech specs and final requirements linked to Jira and finalized? Is there any automation testing done on the front end?

The problem is that it has been very subjective, and we would like to measure it. Currently, we are giving a % score per criterion, like (>90%, 50-89%, <50%). Everyone thinks differently: in the same situation, one QA will give a score of 50-89% and another will give >90%. Results vary from person to person. This way we are not able to measure it correctly. I am seeking help to see if someone has implemented this and whether there is any advice on it.

Thanks so much for helping.

Regards,
Puneet A

In the past I saw different models to assess software, like CMM and TPi. Whole books were written about them, and still scores would differ.

My approach would be:

  • make the calculations for the scoring transparent.
    E.g. pick a piece of code, have several people score an entry criterion, and compare how they arrived at their scores.
  • make agreements about the scoring.
  • review the scoring regularly, e.g. monthly, but not too frequently.
  • be specific in the calculation.
    If a function is supposed to do ten different things, are there at least ten unit tests for these things?
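To show how specific such a calculation can get, here is a small sketch in Python of the "ten things, ten unit tests" idea. Everything here is an assumption for illustration: the `test_<function>_<behavior>` naming convention, the behavior names, and the band thresholds (which mirror the >90% / 50-89% / <50% bands from the question).

```python
import re

def coverage_band(expected_behaviors, test_names, func_name):
    """Band a function by how many of its declared behaviors have a test.

    expected_behaviors: the things the function is supposed to do.
    test_names: collected test function names (e.g. from the test runner).
    Assumes a hypothetical naming convention: test_<func_name>_<behavior>.
    """
    pattern = re.compile(rf"^test_{func_name}_(\w+)$")
    tested = {m.group(1) for name in test_names if (m := pattern.match(name))}
    covered = len(tested & set(expected_behaviors)) / len(expected_behaviors)
    if covered > 0.9:
        return ">90%"
    elif covered >= 0.5:
        return "50-89%"
    return "<50%"

# Hypothetical example: a parse() function that must handle ten cases,
# but only six of them currently have a test.
behaviors = [f"case{i}" for i in range(10)]
tests = [f"test_parse_case{i}" for i in range(6)]
print(coverage_band(behaviors, tests, "parse"))  # -> 50-89%
```

With a rule like this, two QAs looking at the same code get the same band, because the band is computed, not judged.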

Other things to be considered:

  • an important thing is safety. What happens if the score is 20% for unit test coverage? Is it safe to report it? Does the developer still have a job after this score?
  • the goal of the QA scorecard is to improve the software. How is the feedback from the QA scorecard used? Has enough time been allocated to the developer? Does the developer have time to learn about things like unit tests?
  • making agreements about the scoring of the criteria on QA scorecards costs time. Sticking to these agreements costs time. In the long term it can save time, so support from managers is needed.
  • the system can be gamed: people can misuse it to get a good score. A unit test coverage of 90% is bad if the unit tests for the highest product risk are missing, or if not enough error cases are tested.

It’s a little hard to understand what the expected use of the scorecard would be, which makes it hard to say how you add measurements to it. For instance, if the question is “Is this feature good enough to put into production?”, then you typically need something like an ROI, where the scorecard provides information on how likely you are to miss the expected value. In one instance we had a service running that generated x amount of money per user per hour, and any identified bugs that were not fixed would get a cost analysis on the level of “10% of the users affected and total loss of revenue for 10 minutes”, which lets you make a very informed decision on whether to release with bugs or delay the release and fix them.
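That kind of bug cost analysis reduces to a small formula. A sketch, with all the numbers invented purely for illustration:

```python
def bug_cost(revenue_per_user_hour, active_users, affected_fraction, outage_minutes):
    """Estimated revenue lost if the bug ships unfixed.

    Mirrors the "10% of users affected, total loss of revenue for
    10 minutes" style of analysis; every input here is an assumption.
    """
    affected_users = active_users * affected_fraction
    return revenue_per_user_hour * affected_users * (outage_minutes / 60)

# e.g. $0.50 revenue per user-hour, 100,000 users, 10% affected, 10 minutes
cost = bug_cost(0.50, 100_000, 0.10, 10)
print(f"${cost:,.2f}")  # -> $833.33
```

Putting a dollar figure like this next to each open bug on the scorecard is what turns "release or delay" into an informed trade-off rather than a gut call.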
On the other hand, if it is a checklist to verify that you have followed the process, then transform all measurements into yes/no questions. Like: is unit test coverage (a super crappy metric, by the way) above 40%? Has all documentation been approved? Are all business-critical areas covered by automated tests?
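Transformed into strict yes/no form, the checklist leaves no room for interpretation. A sketch; the three questions and the 40% threshold come from the suggestion above, the function shape is an assumption:

```python
def entry_checklist(coverage_pct, docs_approved, critical_areas_automated):
    """Every answer is strictly True or False; no subjective bands."""
    checks = {
        "Unit test coverage above 40%": coverage_pct > 40,
        "All documentation approved": docs_approved,
        "Business-critical areas covered by automated tests": critical_areas_automated,
    }
    return checks, all(checks.values())

checks, passed = entry_checklist(coverage_pct=55, docs_approved=True,
                                 critical_areas_automated=False)
for question, answer in checks.items():
    print(f"{question}: {'yes' if answer else 'no'}")
print("Entry criteria met:", passed)  # -> False (automation check failed)
```

Two different QAs running this on the same feature necessarily get the same result, which is exactly the property the original question is asking for.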
On the same note, if you want to find process elements for your test process, the best experience I have had comes from risk-based testing in the fighter aircraft industry. Basically, you first identify the criticality of the feature, then have a strategy for how you test each level. For example, highly critical features get domain testing, a reference oracle, static code analysis, random testing with a no-error-in-log oracle, and so on. Then you can have a corresponding checklist of “Have you done all the parts?”.
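The criticality-to-strategy mapping can be written down so the checklist is generated from the feature's criticality rather than chosen ad hoc. A minimal sketch; the activity names follow the examples above, but the mapping itself (which levels exist and what each requires) is entirely an assumption:

```python
# Required test activities per criticality level (hypothetical mapping)
STRATEGY = {
    "high": ["domain testing", "reference oracle", "static code analysis",
             "random testing with no-error-in-log oracle"],
    "medium": ["domain testing", "static code analysis"],
    "low": ["random testing with no-error-in-log oracle"],
}

def missing_activities(criticality, done):
    """List the required activities not yet performed for this feature."""
    return [a for a in STRATEGY[criticality] if a not in done]

done = {"domain testing", "static code analysis"}
print(missing_activities("high", done))
# -> ['reference oracle', 'random testing with no-error-in-log oracle']
```

The "Have you done all the parts?" checklist is then just an empty `missing_activities` result, and low-criticality features automatically get a lighter checklist than high-criticality ones.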

A common problem is that you might want to solve several problems with one tool, which typically leads to a solution that solves none of them really well. The other is one-size-fits-all: we will do this for every feature no matter how important it is, which has a similar problem in that you will either overdo it for some features or underdo it for others.