Dealing with critical bugs that were missed during testing

Dear Agony Ant,

After a release, we often find critical bugs that were missed during testing.

What steps, practices or processes can we take to improve our testing process and catch these issues earlier?

2 Likes
  • Try to sort and organize your bugs by type (regression, production or use case) and by the area of the software in which they appear
  • Then look for trends or patterns: do you see anything that may be a constant source of problems? (There’s a small sketch of this idea after the list.)
  • Once you have that, you will see a strategy take shape on its own
  • Consider building your own checklist of the crucial things to check: the most commonly used user journeys and the most bug-prone areas
  • With those common patterns at hand, you can also discuss with your dev team whether anything in the codebase is overdue for a version upgrade or optimization
  • If there are any UI flows or APIs that don’t change much, write automation scripts for them and make a green run a prerequisite for handing anything over for testing
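
Here’s a minimal sketch of that sort-and-count idea, assuming your tracker can export escaped bugs as a CSV; the file name and the `area`/`type` column names are hypothetical and will need adjusting to whatever your tracker actually exports.

```python
import csv
from collections import Counter

def find_hotspots(path="escaped_bugs.csv", top_n=5):
    """Count escaped bugs per area and per (area, type) to surface trouble spots.

    Assumes a CSV export from the bug tracker with 'area' and 'type' columns
    (e.g. regression, production, use case); adjust the field names to match
    your own export.
    """
    by_area = Counter()
    by_area_and_type = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            area = row["area"].strip()
            bug_type = row["type"].strip()
            by_area[area] += 1
            by_area_and_type[(area, bug_type)] += 1

    print("Bug-prone areas:")
    for area, count in by_area.most_common(top_n):
        print(f"  {area}: {count}")

    print("Most common area/type combinations:")
    for (area, bug_type), count in by_area_and_type.most_common(top_n):
        print(f"  {area} / {bug_type}: {count}")

if __name__ == "__main__":
    find_hotspots()
```

Even a rough tally like this usually makes the bug-prone areas and the checklist items jump out.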

TL;DR Put your detective hat on! :male_detective:t2:

Hey wait a minute,
@rosie, is this another test Agony Ant post? Well, at least I got to practice my wisdom-sharing skills :sweat_smile:

2 Likes

I’m relatively new here and English is not my first language, so hopefully I covered everything well :smiley:

I would assess whether there are specific subject areas or domains where errors frequently occur, and whether those areas are sufficiently covered by the test strategy. This involves analyzing error data to identify common error areas and patterns. It is crucial to ensure that these areas are classified as critical in the test strategy and are adequately addressed. If discrepancies are found, it may be necessary to increase test coverage or expand the scope of the areas deemed critical.
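
As a rough illustration of that gap analysis, here is a small sketch with invented area names and a hypothetical set of areas the test strategy currently classifies as critical; it simply flags defect-heavy areas that the strategy does not yet treat as critical.

```python
from collections import Counter

# Invented example data: the area each escaped defect was found in, and the
# areas the current test strategy already classifies as critical.
escaped_defects = ["checkout", "checkout", "login", "reporting", "checkout", "reporting"]
critical_areas = {"login", "payments"}

def coverage_gaps(defects, critical, threshold=2):
    """Return areas with at least `threshold` escaped defects that the
    test strategy does not yet classify as critical."""
    counts = Counter(defects)
    return {area: n for area, n in counts.items()
            if n >= threshold and area not in critical}

print(coverage_gaps(escaped_defects, critical_areas))
# {'checkout': 3, 'reporting': 2} -> candidates for more regression coverage
```

Any area that shows up here is a candidate for being promoted to "critical" in the strategy or for extra regression coverage.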

Additionally, categorizing errors to determine whether they are bugs in new features or recurring errors that have already been resolved is essential. If errors are found in new features, it is important to analyze whether these features have been thoroughly tested. For recurring errors in previously tested areas, the test strategy should be adjusted to increase the frequency and coverage of regression tests. Reviewing the timing of the tests may also be necessary to ensure they are conducted at the optimal time.

Moreover, it is beneficial to delve deeper and examine whether the errors consistently occur under similar conditions or independently. This requires a root cause analysis to identify the underlying reasons for the errors. In many cases, the issue does not lie in missing tests but in other factors, such as the need for refactoring in specific areas of the software or unresolved conceptual issues. These ambiguities often lead to different approaches and testing methods among team members.

For instance, we encountered a problem where tickets did not clearly define how a new feature should be tested, and the concept was not adequately described. This led to recurring issues as the error rate increased with each handover during the processing of the new feature. Improved communication and a better framework for the conception and description of features and the resulting tickets resulted in significant improvements and a reduction in the error rate.

In another problematic area with recurring errors, it became evident that refactoring was necessary. After implementing the refactoring, the error rate in these and adjacent areas decreased significantly.

1 Like

Excellent insight, @sylviaczarnecki and @hananurrehman. Thanks for sharing! :bulb:

I think any process to review what happened with the intention to improve things should honour the Retrospective Prime Directive.

“Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.”

Source.

It is powerful to read this out at the start of any retrospective, process review discussion, post-mortem etc.

2 Likes

How do you find out about those bugs? Are you finding them first, or is it users, other colleagues in the company, bug hunters, crash statistics, production system logs, system analytics, or other sources?
How would you describe them? As others suggested, investigate their source, classify them, group them, and check their impact on users and the company (image, revenue).
Can you link them to the actions that led them to be there? Work backward and see who had the power and the knowledge over the product’s state at various points in time.
Can you find the different causes that led to them? Who would have to change, and what would have to be different, for the situation to improve?