Live Blog Testbash Germany: Qualitative Risk-Based Test Reporting by Nancy Kelln

We are on the last talk! Thanks to everyone still reading! Iā€™m excited to see this one, because Nancy sounded like she had great stuff to say during the panel. At the same time, Iā€™m excited about stretching my legs and shoulders and neck!

Nancy is promising to talk about metrics, but also that theyā€™re not going to be boring. Many people have to deliver metrics ā€“ and those that donā€™t are lucky, Nancy says. My caps: WOULDNā€™T IT BE NICE IF WE DIDNā€™T HAVE TO PROVE OUR VALUE WITH NUMBERS!

Talking about metrics ā€“ the goal is to try to quantify quality. Thatā€™s what a metric tries to show: usually a number that tries to quantify quality.

As testers, we produce things. Test plans, test cases, test metrics. But do these outputs assure quality? (Iā€™m going for a no here). We start to count all the things. Number of test cases, number passed, number blocked. Hours spent testing, number of defects found. And more. How are we supposed to use these numbers to define quality?

Weā€™re doing a walkthrough on what some of our words really mean.

A test case is basically an expectation that someone who matters has. Even if the test case is written by someone else, the tester is the one making the decision on whether the expectation was met or not. So we could count how many expectations weā€™ve met and how many we havenā€™t. We could also count defects as unmet expectations. But is that helpful?

On a real-life project with big stakes (national image, lives potentially at stake), Nancy didnā€™t require any metrics. She was trusted on the team. Then she was asked to join the bigger team, which was having awful problems. She was the fourth test lead in a short space of time. (Nancy: if theyā€™ve fired the three previous test leads, itā€™s not a testing problem). When Nancy started, she was presented with a giant stack of printed test cases (Iā€™m typing slowly because I canā€™t even believe this). The team had three defect systems. One of them was Excel. The team was meeting the criteria of counting things, but they had no clue about the quality of their system.

If we looked at TripAdvisor in terms of test metrics, weā€™d end up with reports like ā€œ10 expectations, 8 metā€. What does that even mean without any more information? Itā€™s impossible to make a decision.

The ā€œsolutionā€ to this could be to add random numbers as entry and exit criteria. Even number of defects doesnā€™t help us.

TripAdvisor reviews capture stories. Why donā€™t we do this in testing? Numbers make us feel secure. And weā€™re not good at telling testing stories (oh yes!). Yet we like stories. We tell them naturally. And still we donā€™t use them in testing.

Itā€™s because itā€™s hard to do. We can take in graphs quickly (even if the information in them is bad). Itā€™s hard to put our testing into a story that someone will actually read.

Nancy suggests using metrics that drive conversation. Those would be qualitative metrics, as opposed to quantitative ones.

Sheā€™s showing us a lightweight dashboard. The rows are focus areas (these could be stories, epics or business functional areas). There are columns for test progress (the phase and how itā€™s going: you could, for example, be in progress and red, or in progress and green). The next columns are for planned testing intensity (low, medium, high) and achieved testing intensity (did we stop because itā€™s great, or did we do more than planned?); Nancy doesnā€™t count things like test cases any more.

The next column is test confidence (oh I love this metric!). She asks the question ā€œif this went live tomorrow, would you be thumbs up, down or in the middle?ā€. Finally, there is room for comments.
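To make that concrete, a single row might read something like this (my own made-up example, not from Nancyā€™s slides): focus area ā€œPaymentsā€; test progress ā€œin progress, redā€; planned intensity ā€œhighā€; achieved intensity ā€œmediumā€; confidence ā€œthumbs downā€; comment ā€œblocked on third-party test data, refunds not yet exercisedā€. One glance, and the next question is a conversation, not a count.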

I like this approach. It starts conversations and steers away from too much detail.

Nancy decided to try this approach out on that stressful project. She got hold of the project charter and approached the stakeholders. She found out how they felt about the release by suggesting that they could go live on Friday ā€“ and gauging their response. Based on the responses (all ā€œno way!!ā€), she filled in the comments field. With the real stakeholdersā€™ problems and what they needed in order to go live captured, she could have the tough conversation with the product manager about moving the release.

This approach has worked well for her on rescue projects. Yet on a green-field project, she found herself slipping back into counting test cases until she caught herself. To use this from the beginning, it takes a sales pitch to your stakeholders. Get their engagement and understanding. Because counting test cases on good projects probably wonā€™t hurt you. But it wonā€™t help you either.

Itā€™s a good idea to keep the documents so you can see trends (an area that was green for confidence goes red and stays there).

Nancy reminds us now that we canā€™t measure quality in 1s and 0s. Quality is a conversation, and it involves bringing stakeholders results that include stories from those testing the software. And if we have properly informed stakeholders, they can make educated decisions on what to release, when to release, risks, impacts and mitigation plans.

Finally, she encourages us that this takes practice. And that sometimes it works and sometimes it doesnā€™t. A great talk for someone like me who shivers every time someone asks me to count things!

Thatā€™s a wrap from me everyone! Thanks for reading and I hope it was useful! :slight_smile:


Thanks for all the blogging Alex!!
