How to reduce the incidence of data-related bugs?

Hi There,

Our company has been burned quite recently by a growing number of ‘data-related bugs’ being deployed to live. By ‘data-related’, I mean bugs that only occur with a specific volume and complexity of data - e.g. the same functionality may work for one user but not another because of the amount and type of linked data.

We have a Selenium automation framework that covers the high-risk areas. Best practices are followed with regard to making the tests atomic, quick, stable and repeatable. The automation environment data is managed either by clearing down all the database tables before a test run, or by backing up and restoring to reset the data to the state just before the test run started.
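
For context, the reset step is roughly the sketch below (a simplified illustration only - the database, table names and file paths are placeholders, not our real configuration):

```python
# Simplified sketch of the two reset options described above.
# Assumes a PostgreSQL automation database; DB_NAME, DUMP_FILE and the
# table list are placeholders, not our real setup.
import subprocess

DB_NAME = "automation_env"
DUMP_FILE = "/backups/automation_env_baseline.dump"
TABLES_TO_CLEAR = ["customers", "addresses", "orders"]

def clear_tables():
    """Option 1: truncate known tables so every run starts from empty."""
    sql = "TRUNCATE {} RESTART IDENTITY CASCADE;".format(", ".join(TABLES_TO_CLEAR))
    subprocess.run(["psql", "-d", DB_NAME, "-c", sql], check=True)

def restore_baseline():
    """Option 2: restore a known-good dump taken just before the run."""
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", "-d", DB_NAME, DUMP_FILE],
        check=True,
    )

if __name__ == "__main__":
    restore_baseline()  # or clear_tables(), depending on the suite
```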

This has been working well - false positive reports are low, and a multitude of different issues have been found while the automation has been in use. However, an increasing number of data-specific bugs have slipped through. They have also slipped through manual testing, as our test environments are often not as complex as what users ultimately attempt to do with the system.

How can we prevent more of these types of bugs from going out to live?

These are some of the things I have thought of, time and resources permitting…

  • Find a client who experiences these issues more often than others and back up and restore their dataset to test against. Use this test environment for manual testing and automation (although the automation will still create fresh data on top)

  • Evaluate where multiple layers of complexity can be added in the system, and attempt to set up equivalent data, both with the automation and manually, to test with (see the sketch after this list)

  • Have a fast and reliable rollback procedure should a dodgy build go out.
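
For the second idea, what I have in mind is seeding ‘worst case’ linked data through the application’s API before the UI tests run - something like the sketch below, where the endpoints, fields and volumes are hypothetical placeholders for whatever the system actually exposes:

```python
# Hedged sketch: create one record with an unusually deep set of linked data
# via the API, so the UI tests can exercise it. Endpoints and fields are made up.
import requests

BASE_URL = "https://automation-env.example.com/api"  # placeholder

def create_complex_customer(session: requests.Session, linked_addresses: int = 50) -> str:
    """Create a customer record with many linked addresses."""
    resp = session.post(f"{BASE_URL}/customers", json={"name": "Bulk Test Customer"})
    resp.raise_for_status()
    customer_id = resp.json()["id"]

    for i in range(linked_addresses):
        session.post(
            f"{BASE_URL}/customers/{customer_id}/addresses",
            json={"line1": f"{i} Test Street", "postcode": "TE5 7ST"},
        ).raise_for_status()

    return customer_id

if __name__ == "__main__":
    with requests.Session() as s:
        print(create_complex_customer(s))
```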

I should mention that as a department we are overworked and under-resourced, which is perhaps the biggest factor of all that needs to be addressed.


Hello @konzy262!

Data-related bugs have a couple of root causes. One is data quality and another is data diversity. From your description, it seems the errors are rooted in data diversity. That is, for all of the great inspections made by your team, some combinations of data still cause the system to break.

I had this scenario a while back. We worked with a vendor to define a schema for the data, and worked with them for a few months to have them deliver the data according to that schema. While we started to have some confidence in the delivered data, we were unsure about how the system would behave in production.

We worked with the vendor to scrub production data of sensitive information and present the scrubbed data to a non-production version of the system. The system was designed to have this data delivered daily.
Every day, we received the scrubbed data early in the morning. We were able to assess the behavior of the system before the same data (that is, un-scrubbed) was presented to the production system. Some days, we found data issues, corrected them, and verified them before the data was presented to the production system. Over time, we identified a handful of data-related bugs while preventing production issues. This was not a long-term solution, but it helped reduce “known bugs” in production.
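
To make the scrubbing step concrete, it amounted to something like this sketch (the column names and masking rule are illustrative assumptions, not the vendor’s actual process):

```python
# Rough sketch of the scrubbing step: copy production records, replace
# sensitive fields with stable fake values, and keep the shape and volume
# intact. The deterministic mask means the same input always scrubs to the
# same output, so relationships between records survive scrubbing.
import csv
import hashlib

SENSITIVE_FIELDS = {"name", "email", "phone"}  # assumed column names

def mask(value: str) -> str:
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]
    return f"scrubbed-{digest}"

def scrub_file(src_path: str, dst_path: str) -> None:
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for field in SENSITIVE_FIELDS & set(row):
                row[field] = mask(row[field])
            writer.writerow(row)

if __name__ == "__main__":
    scrub_file("prod_extract.csv", "scrubbed_extract.csv")
```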

Production data is a great source of diversity, so this may be a solution for you. I believe your first thought of finding a client where the issues occur more often could help, especially if they can deliver scrubbed production data. I was unclear on how adding complexity (your second thought) would assist.

Joe


It’s difficult to say; I think many businesses struggle with data. Historically, data lineage has not been taken seriously enough. It sounds like this could be related to complex state in the business domain? Ideally you have checks well before the UI layer, but I think there is not enough context here to give anyone the confidence to give you good advice.

Thanks.

What I’m also curious about is whether most software should have constraints built in to prevent the data volume and complexity from spiraling out of control.

I’m not necessarily talking about obvious examples, like a file upload where you select a 25 GB file and the system inevitably grinds to a halt or crashes. I’m talking about any area where there is an unrestricted ability to add new data.

Take addresses, for example. 95% of test cases will have either 1 or 2 addresses against a record. Some user somewhere may decide to add 50, which then causes the record to load slowly and maybe causes bugs in other parts of the system, e.g. if there is a drop-down which needs to load all those addresses.
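
If it helps, this is roughly how I picture turning that scenario into an explicit check in our Selenium framework - a hypothetical sketch only, with the URL, locator and time budget made up:

```python
# Hypothetical Selenium sketch: open a record already seeded with 50 addresses
# and check the address drop-down renders them all within a rough time budget.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

RECORD_URL = "https://test-env.example.com/records/seeded-50-addresses"  # placeholder
LOAD_BUDGET_SECONDS = 3.0  # assumed budget

driver = webdriver.Chrome()
try:
    start = time.monotonic()
    driver.get(RECORD_URL)
    dropdown = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "address-select"))  # placeholder locator
    )
    elapsed = time.monotonic() - start

    options = dropdown.find_elements(By.TAG_NAME, "option")
    assert len(options) == 50, f"Expected 50 addresses, found {len(options)}"
    assert elapsed < LOAD_BUDGET_SECONDS, f"Page took {elapsed:.2f}s with 50 addresses"
finally:
    driver.quit()
```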

Should every area of a good software system that allows a user to add data have upper boundaries for the amount of allowed data clearly defined and checked for?

Hello @konzy262!

In my opinion, well designed software provides both the ability to create AND manage its data. When we (designers or testers) discover or determine that some data scenarios can impact system behavior or performance, we should both craft a solution to manage that data and be transparent with our business team members about the data management challenges.

In the case of a large file, typical design solutions are to convey small bits of the file with some validation model (e.g., checksums) until the whole file is conveyed. Enhancements to this design, as we have seen, are background execution or queuing (as used in video streaming).
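
A loose illustration of the “small bits with a validation model” idea (the chunk size and transport are placeholders; a real design would also handle retries and resume):

```python
# Sender splits the file into chunks, each paired with a checksum; the
# receiver verifies each chunk before accepting it.
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk (assumed)

def chunk_file(path: str):
    """Yield (index, data, sha256_hex) for each chunk of the file."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            yield index, data, hashlib.sha256(data).hexdigest()
            index += 1

def accept_chunk(data: bytes, expected_sha256: str) -> bool:
    """Receiver-side check: accept the chunk only if its checksum matches."""
    return hashlib.sha256(data).hexdigest() == expected_sha256
```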

In the case of addresses, I might wonder about a business case for 50 addresses, but it depends on the domain (I guess rental management might have a need for 50 or more addresses).
One might apply the large-file model and present X addresses per page (that is, a small amount at a time). I have seen this at the bottom of a forum page where I might navigate to other records or pages (something like << < 2 3 4 5 > >>). There is also an opportunity to manage the data conveyance: while the user is scanning the first page, the application could be retrieving more pages.
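
In code, that page-at-a-time idea could be sketched like this (the page size and fetch function are assumptions for illustration; a real application would fetch from its API or database):

```python
# Present X addresses per page, fetching the next page in the background
# while the user is still reading the current one.
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 10  # assumed

def fetch_page(addresses, page_number):
    """Stand-in for a paginated API/DB call: return one page of addresses."""
    start = page_number * PAGE_SIZE
    return addresses[start:start + PAGE_SIZE]

def browse(addresses):
    with ThreadPoolExecutor(max_workers=1) as pool:
        page_number = 0
        current = fetch_page(addresses, page_number)
        while current:
            # Kick off the next page before the user finishes this one.
            next_page = pool.submit(fetch_page, addresses, page_number + 1)
            yield current
            current = next_page.result()
            page_number += 1

if __name__ == "__main__":
    sample = [f"{i} Example Road" for i in range(50)]
    for page in browse(sample):
        print(page)
```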

It sounds like 50 addresses is not typical in your application but is possible. Since this has been discovered, might it be an opportunity to explore alternative presentations with your business team members? For example, when there are more than X addresses, the presentation is simplified.

Should every area of a good software system that allows a user to add data have upper boundaries for the amount of allowed data clearly defined and checked for?

Yes. In my opinion, that upper boundary should be a balance between user value and system capabilities. When the upper boundary is reached, you can provision more storage, or ask the user to pay for more addresses. There may be other alternatives.
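
As a minimal sketch of what a defined, checked boundary might look like (the limit and error type are assumptions; the point is that the boundary lives in one place and is enforced before data is written):

```python
# One agreed limit, enforced before new linked data is accepted.
MAX_ADDRESSES_PER_RECORD = 20  # hypothetical limit agreed with the business

class AddressLimitExceeded(Exception):
    pass

def add_address(record: dict, address: dict) -> None:
    addresses = record.setdefault("addresses", [])
    if len(addresses) >= MAX_ADDRESSES_PER_RECORD:
        raise AddressLimitExceeded(
            f"Record already has {len(addresses)} addresses; "
            f"the limit is {MAX_ADDRESSES_PER_RECORD}"
        )
    addresses.append(address)
```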

Joe

I think this is probably unsatisfying, but it depends on what a business is willing to pay and what its goals are. In a large enough enterprise there are going to be many systems hooked together in some manner. Developers would be responsible for building in that validation in every system, and business owners will prioritize new features over data validation across systems. I helped build a home-rolled data/information catalog that illustrates the length constraints of fields in different systems. It would take many hours of work to truly fix this. I think that at that point we have to take off our QA hats, put on a business-user or software-architect hat, and think about which risks we are trying to prevent across systems and how much we are willing to spend on that, either in monetary terms or in opportunity costs.
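
To give a flavour of the catalog idea, a toy version might look like this (the systems, field names and limits are made up for illustration):

```python
# Record each system's declared length limit for the "same" field, then flag
# pairs where data could be truncated as it flows from one system to another.
FIELD_LENGTH_CATALOG = {
    "customer_name": {"crm": 100, "billing": 60, "warehouse": 255},
    "address_line1": {"crm": 120, "billing": 120, "warehouse": 80},
}

def truncation_risks(catalog):
    """Yield (field, source, target) where the target's limit is smaller."""
    for field, limits in catalog.items():
        for source, source_len in limits.items():
            for target, target_len in limits.items():
                if source != target and target_len < source_len:
                    yield field, source, target

for field, source, target in truncation_risks(FIELD_LENGTH_CATALOG):
    print(f"{field}: data from {source} may be truncated in {target}")
```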

I’ve seen this on big enterprise projects where there are a lot of different systems communicating with each other. A lot of “bugs”, if not most, are caused by data-related issues, like delays in replication or miscommunication with API middleware; very rarely have I seen bugs in prod that were introduced by code changes. Some of these issues can’t be avoided, I guess, as the complexity of the project grows, but the approach is to prevent them by planning a good architecture for the future.

The first system I ever worked on was a huge data collection tool - in fact, at the outset, it wasn’t a tool at all, it was a paper form, and our QA role was about ensuring the quality of the 40,000 or so data items we were requiring our clients to provide us with. (This was in the field of utility regulation.) Only as the data collection systems became more complex did software testing become more important, as the focus moved from ensuring that collected data was soundly gathered according to detailed real-world requirements, to making certain that our systems recorded data correctly, properly allocated that data to the correct space in the master database, and performed any operations on that data (deriving percentages, manipulating data to provide completion rates) accurately and consistently.

Unlike the OP’s situation, we were not so concerned with data volume (that was a small-‘p’ political issue between our senior management and the regulated companies) as with data quality, compliance with the requirements for reporting that number, and accurate data handling so that the results of our specialist analysis and calculations were robust, repeatable and above all, defensible, both in court [including the ‘court of public opinion’] and ultimately in Parliament.

Where I think this is relevant to the OP’s situation is that the QA role involved a lot of work in defining the reporting requirements - what was to be collected, to what level of accuracy and under what degree of auditable scrutiny - and maintaining the dataset - a year-round process of allowing clients to query and challenge the numbers and our analysis of them. After a few years of this, I found that I had a pretty good grasp on what each of those 40,000 data items ought to look like, how they would change over time (year-on-year), and so what a “typical” dataset ought to consist of. I was then able to build, from scratch, a test dataset that encompassed each of these 40,000 items in such a way as to make them typical enough to test out the data collection and uploading tools without those numbers being readily identifiable as belonging to any one client (commercial confidentiality and all that).

That degree of knowledge meant that I reached a stage where I could be confident that any issues with the application weren’t caused by the dataset. And the extent of the work I did with clients meant that I was considered to be the ‘go-to’ person on data issues. There were only two problems with this.

First, I became aware that there was at least one client whose submitted numbers looked very similar to mine in terms of their magnitude and (in particular) the rate of change over successive years. That client turned out to have been falsifying their data returns, resulting in their being fined somewhere over £30 million and subject to a criminal investigation by the Serious Fraud Office.

Secondly, I was the only person in the organisation doing this sort of work, and I did it for fifteen years. By the end of that time, I was feeling pretty jaded, while colleagues both relied on my work and remained ignorant of it. Ultimately, I had to get out of the organisation before I burnt out completely.

So my answer to the question the OP posed is:

  • Understand your data
  • Understand your metadata
  • Understand the operations that data is subjected to; and
  • Don’t try to do it all yourself - otherwise you’ll wake up one morning
    in a very bad place.