Our company has been burned recently by an increase in ‘data-related bugs’ being deployed to live. By ‘data-related’, I mean bugs that only occur with a specific volume and complexity of data, e.g. the same functionality may work for one user but not another because of the amount and type of linked data.
We have a Selenium automation framework that covers the high-risk areas. Best practices are followed with regard to making the tests atomic, quick, stable and repeatable. The automation environment data is managed either by clearing down all the database tables before a test run, or by backing up and restoring the database to reset the data to its state just before the test run started.
This has been working well: false positives are low, and a multitude of different issues have been found whilst automation has been in use. However, an increasing number of data-specific bugs have slipped through. They have also slipped through manual testing, as the data in our test environments is often not as complex as what users ultimately do with the system.
How can we prevent more of this type of bug from going out to live?
These are some of the things I have thought of, time and resources permitting:
Find a client who experiences these issues more often than others, then back up their dataset and restore it into a test environment. Use this environment for manual testing and automation (although the automation will still create fresh data on top).
Evaluate where multiple layers of complexity can be added in the system, and attempt to set up equivalent data, both through the automation and manually, to test with.
Have a fast and reliable rollback procedure should a dodgy build go out.
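To make the second idea a bit more concrete, here is a minimal sketch of seeding users with different volumes of linked data so that the same test can be run against each one. Everything in it is an assumption for illustration: the `users` / `orders` / `order_items` tables, the `DataProfile` class and the profile sizes are hypothetical, and SQLite is used only so that the example is self-contained. Real profile sizes would ideally be derived from what the heaviest clients actually hold.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical description of how much linked data one user has."""
    name: str
    orders_per_user: int
    items_per_order: int


# Illustrative sizes only; real values would come from production metrics.
PROFILES = [
    DataProfile("minimal", orders_per_user=1, items_per_order=1),
    DataProfile("typical", orders_per_user=20, items_per_order=5),
    DataProfile("heavy", orders_per_user=500, items_per_order=40),
]


def create_schema(conn):
    """Create a made-up three-level schema: user -> orders -> order items."""
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id)
        );
        CREATE TABLE order_items (
            id INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL REFERENCES orders(id),
            sku TEXT NOT NULL
        );
    """)


def seed_user(conn, profile):
    """Insert one user plus linked rows sized by the profile; return the user id."""
    user_id = conn.execute(
        "INSERT INTO users (name) VALUES (?)", (f"user-{profile.name}",)
    ).lastrowid
    for _ in range(profile.orders_per_user):
        order_id = conn.execute(
            "INSERT INTO orders (user_id) VALUES (?)", (user_id,)
        ).lastrowid
        conn.executemany(
            "INSERT INTO order_items (order_id, sku) VALUES (?, ?)",
            [(order_id, f"sku-{i}") for i in range(profile.items_per_order)],
        )
    conn.commit()
    return user_id


def seed_all(conn):
    """Create the schema and seed one user per profile; return {profile name: user id}."""
    create_schema(conn)
    return {profile.name: seed_user(conn, profile) for profile in PROFILES}
```

The returned mapping could then drive a parametrised test, so each high-risk scenario runs once per profile rather than only against freshly created, minimal data.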
I should mention that, as a department, we are overworked and under-resourced, which is perhaps the biggest factor of all that needs to be addressed.