Learning the data analysis and insights side of observability

I’m looking for resources around learning the data/analysis/insight part of observability. How do you know how to look for things that might be important, how to draw a story from data, as well as bias management, and cleaning up your data?

We’ve got the technical side sorted, but there are a couple of people on my team looking to learn more about using that observability data.


Here are my notes on TDM (Test Data Management), copied from reports/books/articles and edited here and there.
It is a brief summary, but it has great insights into test data management and cleaning up your data.
(all publicly available, so feel free to share & use it :smiley: )

Test Data Problems (challenges)

Low test coverage

53% of respondents in the 2018-19 World Quality Report stated that they lacked appropriate test data.

Data for testing new functionality

Production data is often drawn from the expected scenarios that users have exercised against a previously released system. It therefore lacks the data needed to test new functionality. Imagine that a new field has been added to an unreleased version of a web registration form: how can production data possibly already contain this data? Data has to be prepared for new functionality.

Outliers and edge cases

Production data is typically narrow and highly repetitious, focusing on expected user behavior. It lacks the outliers and edge cases required for sufficient test coverage.

Negative Scenarios

Test data copied from production is “happy path”. QA must mitigate the risk of unexpected behavior, for instance by testing error handling rigorously in advance of a release. A copy from production will therefore not work as test data for negative scenarios.

Poor Data Consistency

61% of respondents in the latest World Quality Report cite “maintaining test data consistency across different systems under test” as a test data challenge. End-to-end testing requires complete, interrelated data sets with which to test every combination of API and Database call, as well as any UI activity. The challenge is retaining the referential integrity of this complex, interrelated data when preparing data provisioning for test environments.

Slow and manual data refreshes

Copying complex data is itself a complex task, and manual data refreshes are therefore slow and error-prone.

Crude data sub-setting

Subsetting test data is valuable in lowering storage costs, as well as reducing data provisioning time and the time required to execute tests. However, simplistic subsetting techniques often neglect interrelationships in complex data.

Overly manual data masking

Masking is another valuable TDM process in that it helps mitigate against the risk of exposing sensitive Personally Identifiable Information (PII) to less secure test environments. However, masking must also respect the relationships that exist across tables: mask data in one row, and that change must be reflected consistently across tables. These relationships are often highly complex. Imagine you are testing an online banking system. Transaction logs for bank accounts contain temporal trends that reflect the time of withdrawals, transactions, and purchases. The data is also interrelated mathematically, including for example sum totals made up of numerous other variables. Reflecting these temporal and numerical relationships accurately while masking is highly complex; masking manually, or using certain commodity tooling, will rarely retain the relationships.

Test Data waiting times

36% of respondents to the 2019 Continuous Testing Report state that over half of testing time is spent searching for, managing, maintaining, and creating test data.

Test data bottlenecks can arise as testers wait for data to be provisioned. QA teams are often dependent on a central team responsible for moving production data for test environments. These Ops teams must perform several TDM tasks to move data, including subsetting, masking, and copying.

Hunting for the “right” data

When the data is provisioned to test environments, testers must spend further time finding the exact data combinations needed to execute their test cases. The challenge is that production copies are large, unwieldy, and repetitious. Finding exact combinations of interrelated data is therefore slow and frustrating, and 46% of respondents to the World Quality Report cite finding relevant test data in large data sets as a test data challenge. Finding particular combinations might be accelerated using database queries or a set of scripts. However, these must be tailored to individual tests, while the tests and data are subject to change. The queries or scripts must be updated or recreated each time test cases or test data change, eating up time within an iteration.

Manual data creation

The frustrating hunt for data combinations will often bear no fruit, as production data lacks the data needed for rigorous testing. Testers must then create the complex data needed to fulfill their test cases, particularly the outliers and unexpected results needed for sufficient test coverage. This is time-consuming and error-prone, particularly when performed by hand. Test failures arise from data inconsistencies, and because data is created for particular test cases, the time spent creating data must be repeated as the system under test changes.

Cross-Team constraints

The time and cost associated with moving production-size copies of data to test environments further means that there are never enough copies of data. This creates cross-team constraints, undermining parallel testing and development. With traditional TDM techniques, testers are often forced to wait for an upstream team to finish with the data set they need. Further delays mount when another tester uses or deletes data, or when useful data is lost during a data refresh. Testers must then repeat the time consuming and frustrating hunt for new data or must create new data by hand.

Data Storage Cost

Firstly, there is the infrastructure cost associated with storing several full-size copies of production data. The fast-decreasing cost of data storage helps, as do technologies like database virtualization. Nonetheless, the lack of discernible advantage to retaining unwieldy copies of low-variety, low coverage data means much of the storage cost is waste.

Test run requirements

More problematic for test teams is the resource-intensity of running the large data sets during test execution. Automated test execution with terabytes of data will be highly time consuming and costly, as will executing queries against the data. The test runs will furthermore produce unwieldy resultant data that then needs to be analysed. Test teams can waste a large chunk of their time assessing millions of rows of complex data, comparing it to expected results to produce run results, and 56% of respondents to the World Quality Report cite managing the size of test data sets as a test data challenge.

Increasing regulatory requirements, increasing risk

46% of organisations cite “complying with data security and data privacy regulations for test data” as a test data challenge.

TDM “best practices” at numerous organisations risk non-compliance with data protection legislation and increase the risk of costly data breaches. This is true in spite of the wealth of writing warning against using potentially sensitive production data in test environments, and the fact that global data protection legislation has been growing more stringent for over two decades.

Recent regulation includes the EU’s General Data Protection Regulation (GDPR), as well as the California Consumer Privacy Act of 2018. The GDPR carries staggering maximum fines of 20 million Euros or 4% of annual worldwide turnover, whichever is higher. Much has been written on the challenges of ensuring that test teams have consent to use sensitive data in test environments, while new data protection regulations present challenges that current storage techniques struggle to fulfill.

Throw in the fact that test environments are necessarily less secure than production environments, with a greater associated risk of data leaks, and the risk of non-compliance grows further.

Key Technologies for a Modern TDM Strategy

  1. Data modelling and “data crawling”, to retain referential integrity as high-speed TDM tasks are performed.
  2. Sub-setting with a view to reducing test data size while maintaining variety and the interrelationships in data.
  3. Reliable data masking to mitigate against the risk of costly non-compliance with data protection regulations.
  4. Data cloning to provision test data sets rapidly in parallel, while preserving rare or useful data sets.
  5. Synthetic data generation to supplement production data sources and create every combination of data needed for maximum test coverage.
  6. Automated “Find and Make” provisioning, where the exact combinations needed for given test cases are searched for automatically, with any missing data generated on demand.
  7. Dynamic data definition, generating test data at the same time as test cases.
  8. Automated and repeatable test data preparation, in which previously fulfilled test data requests are performed automatically.
  9. Data comparison, enabling test teams to compare expected results to the data produced during test execution, rapidly formulating accurate run results.
  10. Virtual data generation, creating every Request-Response pair needed for accurate service virtualization.

Why TDM is critical to a project’s success

  • Your test data determines the quality of testing
    • No matter how good your testing processes are, if the test data used is not right or of adequate quality, then the entire product’s quality will be affected.
  • Your test data should be highly secure
    • It is absolutely mandatory that your test data doesn’t contain unmasked production data. If the data is not secure enough, there is every chance that a data breach will happen, which can cost the organization dearly.
  • Test data needs to be as close to real time as possible
    • Not only does test data need to be of high quality, it should also be as close to real-time/production data as possible. Why? For the simple reason that we do not want to build a system/application/product for 6 months only to fail in production because there was no adequate, realistic data to test with.
  • Lowers test data creation time, which reduces overall test execution time
    • This is self-explanatory: faster data creation drastically reduces the overall test execution time.
  • Testers can focus on testing rather than test data creation
    • The main aim of automating the test data management process is to let testers focus on the actual testing, rather than worrying about how the data is created and the technicalities surrounding it. This keeps the team focused on the job at hand (the actual testing) so that it can be done more effectively.
  • Speeds up time to market of applications
    • Faster and more effective test data creation leads to faster and more effective testing, which in turn leads to faster time to market for the application. It is a cycle, and hence it has a compounding effect, release on release.
  • Increases efficiency of the process by reducing data related defects
    • Due to the accuracy of the test data, data related defects will reduce enormously, thereby increasing the efficiency of the process.
  • You can manage lower volumes of test data sets more efficiently
    • Managing lower data volumes is always easier and more cost-effective than managing higher volumes. The maintenance costs associated with higher volumes increase over time and affect operational costs.
  • Process remains the same even as team size increases
    • This is a critical point: you would not need to reinvent the wheel if the team is ramped up. The same process can be followed or extended as the team grows.

Test Data Life Cycle

Requirement Gathering & Analysis

This is pretty straightforward. In this phase, the test data requirements pertaining to the test requirements are gathered. They are grouped into several categories:

  • Pain Areas
  • Data Sources
  • Data Security/Masking
  • Data Volume requirements
  • Data Archival requirements
  • Test Data Refresh considerations
  • Gold Copy considerations

Planning & Design

As the name indicates, an appropriate solution is designed, based on the requirement analysis, to solve the various pain areas in the test data. After looking at the scale of the problem and the feasible solutions, a suitable test data process is suggested, choosing between an in-house solution, a commercial product, or a combination of both. An effort estimate for the entire project is also done in this phase, and a test data plan/strategy is developed that proposes the direction the project will take and the approaches that will be followed to solve the test data problems. That could be either in the form of process improvements or in the form of an automated solution.

Test Data Creation

In this phase, based on the Test Data Strategy, the solution is developed and test data is created through various techniques depending on the project test data requirements. It can be a combination of manual and automated techniques.

Test Data Validation

In this phase, the created test data is validated against the business requirements. This can be done by Business Analysts or using automated tools if the volumes are very high.

Test Data Maintenance

This is similar to a test maintenance phase, where there might be requests for changes in the test data according to the changes in the tests. Hence again the entire life cycle is followed for maintenance of the test data. This might include creation of Gold Copy for future use, Archives for size management, updating of Gold Copy, Restoration of older data for testing, etc.

Gold Copy Data

What is a Gold Copy Data?

In essence, a gold copy is a set of test data. Nothing more, nothing less. What sets it apart from other sets of test data is the way you use and guard it.

  • You only change a gold copy when you need to add or remove test cases.
  • You use a gold copy to set up the initial state of a test environment.
  • All automated and manual tests work on copies of the gold copy.

A gold copy also functions as the gold standard for all your tests and for everybody testing your application. It contains the data for all the test cases that you need to cover all the features of your product. It may not start out as comprehensive, but that’s the goal.

A Hybrid approach to Gold Copy Data

Sub-setting with Data-Masking & Modelling

Sub-setting production data reduces the time that must then be spent masking and copying it to test environments. It helps to quickly provision smaller but representative data sets, so that QA teams spend less time hunting for data. However, data subsets must be complete and coherent for testing, retaining all the relationships between tables in the production data. This can be complex when performed manually or using tools that require manual definition of all the subset criteria.

“Data crawling” refers to the automated technique by which you retain Primary and Foreign key relationships during sub-setting. You automatically “crawl” up and down Parent and Child tables, collecting all the data needed for a coherent data set. This is performed recursively until a complete data subset is made, producing smaller data sets that retain full referential integrity.
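
The crawling idea can be sketched in a few lines. Below is a minimal, tool-agnostic illustration (the tables, columns, and foreign keys are invented for the example): starting from a seed set of rows, it repeatedly follows foreign keys up to parents and down to children until no new rows are pulled in, yielding a referentially complete subset.

```python
# Toy schema held in memory: orders.customer_id -> customers.id.
# All table/column names here are hypothetical.
TABLES = {
    "customers": [{"id": 1}, {"id": 2}, {"id": 3}],
    "orders": [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 1},
        {"id": 12, "customer_id": 3},
    ],
}
# (child table, fk column, parent table, pk column)
FOREIGN_KEYS = [("orders", "customer_id", "customers", "id")]

def crawl(seed):
    """Expand a {table: set(primary keys)} seed until referential integrity holds."""
    subset = {t: set(ids) for t, ids in seed.items()}
    changed = True
    while changed:  # repeat until a fixed point: no new rows were pulled in
        changed = False
        for child, fk, parent, pk in FOREIGN_KEYS:
            for row in TABLES[child]:
                # Crawl "up": every selected child row needs its parent row.
                if row["id"] in subset.get(child, set()):
                    if row[fk] not in subset.setdefault(parent, set()):
                        subset[parent].add(row[fk])
                        changed = True
                # Crawl "down": every selected parent pulls in its child rows.
                if row[fk] in subset.get(parent, set()):
                    if row["id"] not in subset.setdefault(child, set()):
                        subset[child].add(row["id"])
                        changed = True
    return subset

# Seeding with one order pulls in its customer, then that customer's other order.
assert crawl({"orders": {10}}) == {"orders": {10, 11}, "customers": {1}}
```

Real tools read the foreign keys from the schema catalogue instead of a hand-written list, but the recursion to a fixed point is the same idea.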

Reliable Data Masking

The sub-setted test data must be anonymous to ensure regulatory compliance with data protection laws. This is where data masking comes in. Any effective test data masking must perform two interrelated tasks: first, it must scan the data for sensitive information; second, it must mask this information. For testing, this masking must furthermore retain the referential integrity of the data. Modelling the relationships at the same time as scanning data for PII is an effective way of achieving this.
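
One common way to retain referential integrity while masking is deterministic masking: the same input always produces the same masked output, so values that matched across tables before masking still match afterwards. A minimal sketch (the salt and naming scheme are illustrative, not a recommendation for any particular tool):

```python
import hashlib

def mask_email(value: str, secret: str = "per-project-secret") -> str:
    """Deterministically mask an email address: identical inputs always yield
    identical outputs, so joins across tables still line up after masking."""
    digest = hashlib.sha256((secret + value).encode()).hexdigest()[:12]
    return f"user_{digest}@example.com"

# Hypothetical tables sharing the email as a cross-table key.
customers = [{"id": 1, "email": "alice@real.com"}]
logins = [{"customer_email": "alice@real.com", "ts": "2023-01-01"}]

for c in customers:
    c["email"] = mask_email(c["email"])
for l in logins:
    l["customer_email"] = mask_email(l["customer_email"])

# Referential integrity holds: the masked values still match across tables.
assert customers[0]["email"] == logins[0]["customer_email"]
```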

Data Cloning

Data cloning facilitates parallel testing and development by rapidly copying isolated data sets to multiple environments. Like sub-setting, cloning reads data from a source database or schema and copies it to a target. Also like sub-setting, effective cloning copies a complete and coherent data set, retaining full referential integrity. Adopting cloning at the same time as sub-setting and masking is a natural step for organisations with distributed test teams. Provisioning numerous copies of isolated data avoids the delays caused by cross-team constraint, while sub-setting the data prior to cloning reduces the cost of maintaining the numerous isolated copies. It furthermore reduces the time required to mask data.

Working from isolated data sets removes the frustration of useful and rare data being cannibalized by another team, while useful data can be cloned and preserved during a data refresh. Alongside these logistical benefits of rapid and parallel provisioning, data cloning furthermore enhances testing quality. It can be used to multiply the data needed for particular test scenarios. This is particularly useful for automated testing that burns rapidly through data, as it ensures that new data is always readily available.

Synthetic Data Generation

Cloning, masking, and sub-setting alone are capable only of satisfying test scenarios for which data pre-exists in production data. The first section of this article highlighted the low coverage associated with low-variety production data, and how using production data alone therefore leaves a system exposed to costly bugs. Adopting synthetic test data generation is therefore a key step in enhancing testing rigor, supplementing pre-existing data sources.

This streamlined approach to data generation is capable of creating data for complex and distributed systems, providing a range of techniques for inputting data into test systems:

  • Direct to databases: You can generate data directly into numerous databases, including SQL Server, MySQL, Postgres, and Oracle.
  • Via the front-end: There is not always direct access to the back-end database, and VIP therefore also uses automation frameworks like Selenium to input data via the front-end. (Not preferred.)
  • Via the middle layer: Alternatively, you can leverage the API layer, inputting data via SOAP and REST calls.
  • Using files: VIP can generate data in flat files, XML, EDI, and CSV.
  • Mainframe emulation: Mainframe data can be particularly difficult to create manually, and Curiosity still find organisations inputting data laboriously via green screens. VIP instead provides accelerators for creating complex Mainframe data via emulators, for instance creating synthetic data for X3270.
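
The coverage argument for synthetic generation can be made concrete with a small, tool-agnostic sketch (the field names and equivalence classes below are invented for illustration): enumerate every combination of a few equivalence classes, deliberately including the boundary and negative values that production data rarely contains.

```python
import csv
import io
import itertools

# Hypothetical equivalence classes for a banking example, including
# boundaries and a negative value production data is unlikely to hold.
account_types = ["current", "savings", "ISA"]
balances = [-0.01, 0.0, 0.01, 999_999_999.99]
statuses = ["active", "frozen", "closed"]

# Every combination of the classes: 3 * 4 * 3 = 36 rows.
rows = [
    {"account_type": t, "balance": b, "status": s}
    for t, b, s in itertools.product(account_types, balances, statuses)
]
assert len(rows) == 36

# Emit as CSV, one of the file-based input routes mentioned above.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["account_type", "balance", "status"])
writer.writeheader()
writer.writerows(rows)
```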

Test Driven Provisioning: Automatic “Find & Make”

Test data provisioning within modern software delivery projects must accordingly be test driven: data must be found rapidly for every test case that needs executing within a short iteration, and any required data that cannot be found in existing data sources must be generated on demand.
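
A minimal sketch of the “Find & Make” idea (field names and the generation rule are illustrative): look for a row matching the test’s criteria; if nothing matches, synthesize a row satisfying those criteria on demand.

```python
import random

# Hypothetical pool of existing test data.
existing = [
    {"age": 34, "country": "GB"},
    {"age": 51, "country": "DE"},
]

def find_and_make(rows, **criteria):
    """Return a row matching the criteria, generating one if none exists."""
    for row in rows:
        if all(row.get(k) == v for k, v in criteria.items()):
            return row                        # "find": reuse existing data
    made = {"age": random.randint(18, 90), "country": "GB", **criteria}
    rows.append(made)                         # "make": fill the gap on demand
    return made

assert find_and_make(existing, country="DE")["age"] == 51   # found in the pool
teen = find_and_make(existing, age=17)                      # made on demand
assert teen in existing
```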

Test Driven Provisioning: Dynamic Data Definition

Creating test data at the same time as test cases is another approach to test-driven data provisioning. Model-Based Testing is an effective approach to generating test cases and test data in tandem.

Repeatable preparation

TDM tasks are necessarily rule-based, with logical rules dictated by the source data and the test cases that need to be executed. TDM tasks are therefore ripe for automation, and 77% of organisations in the World Quality Report state that they are using or considering bots for test data generation. If TDM tasks are automated and rendered repeatable, QA teams can invoke them directly. This significantly reduces the burden on central data provisioning teams, who can focus solely on fulfilling new test data requirements. These requests thereby become repeatable in future, and QA teams increasingly do not need to wait for data to be provisioned.

Data Comparison

Analyzing data to formulate test run results can be time-consuming and cumbersome, especially when feeding large data sets through automation frameworks. QA teams can waste time scanning high volumes of complex data and comparing it to expected results. This is not only laborious, but also subject to human error, undermining the reliability of testing. Robotic Process Automation excels at performing rule-based and repetitious processes, and data comparison is no exception.

Virtual Data Generation

Service Virtualization can deliver significant time and efficiency gains to QA teams, providing on demand access to realistic test environments. It is also a technology that depends on effective data management. With Service Virtualization, test teams no longer need to wait for constrained or unfinished components of a system and can instead work in parallel from readily available, production-like environments. However, accurate service virtualization requires realistic virtual data with which to simulate components. This data must furthermore be capable of satisfying the full range of requests made during testing, and effective service virtualization for testing therefore requires a rich set of Request-Response pairs. The full range of RR Pairs are rarely found in production data and are absent completely for unreleased components. Just like test data, virtual data is therefore prime for synthetic data generation.

Test Data Ageing

This is useful for Time based testing. Let’s assume you create a customer and it requires 48 hours for activation of that particular customer. What if you have to test the scenario that will occur after 48 hours? Will you wait till 48 hours for that scenario to happen for your testing? The answer is No. Then how will you handle this scenario?

There are basically 2 approaches by which we can do this:

  • Tamper with the system dates
    • Although it is possible in some cases to tamper with the system dates and continue with the testing, this method will fail if the date is generated by a database server or an application server instead of the client.
  • Tamper with the dates in the back-end
    • This is the most viable and practical solution for such scenarios. In this approach, we modify the date at the back-end so that it reflects the new date. Care should be taken to ensure that neither data integrity nor data semantics are lost.

This method of modifying the date according to the scenario needs is known as Test Data Ageing. Depending on the scenario that needs to be tested, we can either Back date or Front date the given date.
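
The back-end approach can be sketched as follows (the table and column names are illustrative): shift every date column of a row by the same offset, so the intervals between events, and hence the data semantics, are preserved while the scenario is aged into the testable window.

```python
from datetime import datetime, timedelta

# Hypothetical customer row: activation happens 48 hours after creation.
customers = [
    {"id": 1,
     "created_at": datetime(2023, 6, 3, 9, 0),
     "activates_at": datetime(2023, 6, 5, 9, 0)},
]

def age_rows(rows, date_columns, offset):
    """Back-date (or front-date, with a negative offset) every date column
    by the same amount, preserving the intervals between events."""
    for row in rows:
        for col in date_columns:
            row[col] -= offset

# Pretend the 48-hour activation window has already elapsed.
age_rows(customers, ["created_at", "activates_at"], timedelta(hours=48))

now = datetime(2023, 6, 3, 9, 30)
assert customers[0]["activates_at"] <= now  # the scenario is now testable
# The semantics survive: activation is still exactly 48h after creation.
assert customers[0]["activates_at"] - customers[0]["created_at"] == timedelta(hours=48)
```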

Data Archive in the Scope of Test Data Management

  • Maintenance of test data
    • Typically used in maintenance of test data over a period of releases.
  • Archival of older release data
    • You can always archive your older release test data so that it remains intact for future use
  • Archival of multiple environment’s test data
    • If there are multiple test environments, the test database size grows proportionate to the number of environments. In this case, archiving the data would save a lot of disk space.
  • Restore whenever necessary
    • An archive should be easily restorable.
  • Release/Build/Cycle wise snapshots for easy restore
    • Snapshots can be maintained as per the project release cycles. This is useful for production support, where we would need an older environment for testing the production support release.

Part 2:

Managing the Coupling between Tests and Data

When it comes to test data, it is important that each individual test in a test suite has some state on which it can depend.

Only when the starting state is known can you compare it against the state after the test has finished, and thus verify the behavior under test. This is simple for a single test, but requires some thought to achieve for suites of tests, particularly for tests that rely upon a database. Broadly, there are three approaches to managing state for tests.

  • Test isolation: Organize tests so that each test’s data is only visible to that test.
  • Adaptive tests: Each test is designed to evaluate its data environment and adapt its behavior to suit the data it sees.
  • Test sequencing: Tests are designed to run in a known sequence, each depending, for inputs, on the outputs of its predecessors.

Test isolation

Test isolation is a strategy for ensuring that each individual test is atomic. That is, it should not depend on the outcome of other tests to establish its state, and other tests should not affect its success or failure in any way. This level of isolation is relatively simple to achieve for commit tests, even those that test the persistence of data in a database. The simplest approach is to ensure that, at the conclusion of the test, you always return the data in the database to the state it was in before the test was run.

A second approach to test isolation is to perform some kind of functional partitioning of the data. This is an effective strategy for both commit and acceptance tests. For tests that need to modify the state of the system as an outcome, make the principal entities that you create in your tests follow some test-specific naming convention, so that each test will only look for and see data that was created specifically for it.

Setup and Tear Down

Whatever strategy is chosen, the establishment of a known-good starting position for the test before it is run, and its reestablishment at its conclusion, is vital to avoid cross-test dependencies. For well-isolated tests, a setup stage is usually needed to populate the database with relevant test data. This may involve creating a new transaction that will be rolled back at the conclusion of the test, or simply writing a few records of test-specific information. Adaptive tests will be evaluating the data environment in order to establish the known starting position at startup.
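
The transaction-rollback flavour of setup and teardown can be shown with an in-memory SQLite database (the schema is illustrative): the test populates its own data inside a transaction, and teardown rolls it back, reestablishing the known-good starting state for the next test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()

def run_isolated_test(conn, test):
    try:
        test(conn)       # the first DML statement implicitly opens a transaction
    finally:
        conn.rollback()  # teardown: the database returns to its pre-test state

def my_test(conn):
    # Setup: the test writes the records it depends on.
    conn.execute("INSERT INTO customers (name) VALUES ('temp-user')")
    count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    assert count == 1    # the test sees the data its own setup created

run_isolated_test(conn, my_test)
count_after = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
assert count_after == 0  # no trace is left behind for other tests
```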

Coherent Test Scenarios

There is often a temptation to create a coherent “story” that tests will follow. The intent of this approach is that the data created is coherent, so setting up and tearing down of test cases is minimized. This should mean that each test is, in itself, a little simpler, since it is no longer responsible for managing its own test data. This also means that the test suite as a whole will run faster because it doesn’t spend a lot of time creating and destroying test data.

The problem with this strategy is that in striving for a coherent story we tightly couple tests together. There are several important drawbacks to this tight coupling. Tests become more difficult to design as the size of the test suite grows. When one test fails, it can have a cascade effect on subsequent tests that depend on its outputs, making them fail too. Changes in the business scenario, or the technical implementation, can lead to painful reworking of the test suite.

More fundamentally though, this sequential, ordered view doesn’t really represent the reality of testing. In most cases, even where there is a clear sequence of steps that the application embodies, at each step we want to explore what happens for success, what happens for failures, what happens for boundary conditions, and so on. There is a range of different tests that we should be running with very similar startup conditions. Once we move to support this view, we will necessarily have to establish and reestablish the test data environment, so we are back in the realm of either creating adaptive tests or isolating tests from one another.

Data Management in the Deployment Pipeline

Creating and managing data to use with automated tests can be a significant overhead. Let us take a step back for a moment. What is the focus of our testing? We test our application to assert that it possesses a variety of behavioral characteristics that we desire. We run unit tests to protect ourselves from the effects of inadvertently making a change that breaks our application. We run acceptance tests to assert that the application delivers the expected value to users. We perform capacity testing to assert that the application meets our capacity requirements. Perhaps we run a suite of integration tests to confirm that our application communicates correctly with services it depends on. What is the test data that we need for each of these testing stages in the deployment pipeline, and how should we manage it?

Data in Commit Stage Tests

Commit testing is the first stage in the deployment pipeline. It is vital to the process that commit tests run quickly. The commit stage is the point at which developers are sitting waiting for a pass before moving on. Every 30 seconds added to this stage are costly.

In addition to the outright performance of commit stage testing, commit tests are the primary defense against inadvertent changes to the system. The more these tests are tied to the specifics of the implementation, the worse they are at performing that role. The problem is that when you need to refactor the implementation of some aspect of your system, you want the tests to protect you. If the tests are too tightly linked to the specifics of the implementation, you will find that making a small change in implementation results in a bigger change in the tests that surround it. Instead of defending the behavior of the system, and so facilitating necessary change, tests that are too tightly coupled to the specifics of the implementation will inhibit change. If you are forced to make significant changes to tests for relatively small changes in implementation, the tests are not effectively performing their role as executable specifications of behavior. This is one of those key points where the process of continuous integration delivers some seemingly unrelated positive behaviors.

Good commit tests avoid elaborate data setup. If you find yourself working hard to establish the data for a particular test, it is a sure indicator that your design needs to be better decomposed. You need to split the design into more components and test each independently, using test doubles to simulate dependencies.

The most effective tests are not really data-driven; they use the minimum of test data to assert that the unit under test exhibits the expected behavior. Those tests that do need more sophisticated data in place to demonstrate desired behavior should create it carefully and, as far as possible, reuse the test helpers or fixtures to create it, so that changes in the design of the data structures that the system supports do not represent a catastrophic blow to the system’s testability. Fundamentally, our objective is to minimize the data specific to each test to that which directly impacts the behavior the test is attempting to establish. This should be a goal for every test that you write.
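
One common shape for such a test helper is a builder with defaults (the entity and field names here are invented): the helper supplies every field the system needs, and each test overrides only the data that directly affects the behavior it asserts.

```python
def a_customer(**overrides):
    """Test data builder: defaults cover the 'supporting cast' fields, so a
    test names only the data relevant to the behavior under test."""
    defaults = {
        "name": "any-name",
        "country": "GB",
        "credit_limit": 100,
        "status": "active",
    }
    return {**defaults, **overrides}

# The test states only what it cares about: a frozen, zero-limit customer.
customer = a_customer(credit_limit=0, status="frozen")
assert customer["status"] == "frozen"
assert customer["country"] == "GB"  # irrelevant detail supplied by the helper
```

If the data structure changes, only the builder's defaults need updating, not every test, which is exactly the insulation the paragraph above describes.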

Data in Acceptance Tests

Acceptance tests, unlike commit tests, are system tests. This means that their test data is necessarily more complex and needs to be managed more carefully if you want to avoid the tests becoming unwieldy. Again, the goal is to minimize the dependence of our tests on large complex data structures as far as possible. We should be creating just enough data to test the expected behavior of the system. When considering how to set up the state of the application for an acceptance test, it is helpful to distinguish between three kinds of data.

  • Test-specific data: This is the data that drives the behavior under test. It represents the specifics of the case under test.
  • Test reference data: There is often a second class of data that is relevant for a test but actually has little bearing upon the behavior under test. It needs to be there, but it is part of the supporting cast, not the main player.
  • Application reference data: Often, there is data that is irrelevant to the behavior under test, but that needs to be there to allow the application to start up.

Test-specific data should be unique and use test isolation strategies to ensure that the test starts in a well-defined environment that is unaffected by the side effects of other tests.

Test reference data can be managed by using prepopulated seed data that is reused in a variety of tests to establish the general environment in which the tests run, but which remains unaffected by the operation of the tests. Application reference data can be any value at all, even null values, provided the values chosen continue to have no effect on the test outcome.

Application reference data and, if applicable, test reference data—whatever is needed for your application to start up—can be kept in the form of database dumps. Of course you will have to version these and ensure they are migrated as part of the application setup. This is a useful way to test your automated database migration strategy.

This categorization is not rigorous. Often, the boundaries between classes of data may be somewhat blurred in the context of a specific test. However, we have found it a useful tool to help us focus on the data that we need to actively manage to ensure that our test is reliable, as opposed to the data that simply needs to be there. Fundamentally, it is a mistake to make tests too dependent on the “universe” of data that represents the entire application. It is important to be able to consider each test with some degree of isolation, or the entire test suite becomes too brittle and will fail constantly with every small change in data.

However, unlike commit tests, we do not recommend using application code or database dumps to put the application into the correct initial state for the test. Instead, in keeping with the system-level nature of the tests, we recommend using the application's API to put it into the correct state.

This has several advantages:

  • Using the application code, or any other mechanism that bypasses the application’s business logic, can put the system into an inconsistent state.
  • Refactorings of the database or the application itself will have no effect on the acceptance tests since, by definition, refactorings do not alter the behavior of the application’s public API. This will make your acceptance tests significantly less brittle.
  • Your acceptance tests will also serve as tests of your application's API.
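A minimal sketch of this idea, with the application's public API modelled as an in-process facade (all names here are invented, not from the book): because setup goes through the API, the application's own business logic keeps the state consistent.

```python
# Hypothetical stand-in for the system under test's public API.
class TradingApp:
    def __init__(self):
        self._accounts = {}
        self._next_id = 1

    def register_account(self, currency="GBP"):
        account_id = self._next_id
        self._next_id += 1
        self._accounts[account_id] = {"currency": currency, "funds": 0}
        return account_id

    def deposit(self, account_id, amount):
        # Business rule enforced: a direct database insert would bypass this.
        if amount <= 0:
            raise ValueError("deposits must be positive")
        self._accounts[account_id]["funds"] += amount

    def position(self, account_id):
        return self._accounts[account_id]["funds"]

def setup_funded_account(app, funds):
    """Acceptance-test setup done through the public API, so the system
    cannot be left in a state its own logic would never have produced."""
    account_id = app.register_account()
    app.deposit(account_id, funds)
    return account_id
```

A refactoring of the underlying schema leaves `setup_funded_account` untouched, which is exactly the brittleness win the bullets above describe.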


Consider testing a financial trading application. If a specific test is focused on confirming that a user’s position is correctly updated when a trade is made, the starting position and finishing position are of prime importance to this test.

For a suite of stateful acceptance tests being run in an environment with a live database, this probably implies that the test will require a new user account with a known starting position. We consider the account and its position to be test-specific data, so for the purposes of an acceptance test, we may register a new account and provide it with some funds, to allow trading, as part of the test case setup.

The financial instrument or instruments used to establish the expected position during the course of the test are important contributors to the test, but could be treated as test reference data, in that having a collection of instruments that are reused by a succession of tests would not compromise the outcome of our “position test.” This data may well be prepopulated test reference data.

Finally, the details of the options needed to establish a new account are irrelevant to the position test, unless they directly affect the starting position or the calculation of a user’s position in some way. So for these items of application reference data, any default values will do.
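Putting the three categories side by side for the position test, a hypothetical sketch might look like this (all names and values are invented for illustration):

```python
# Application reference data: any defaults will do; these exist only so
# the application can start, and have no bearing on position calculations.
ACCOUNT_DEFAULTS = {"locale": "en_GB", "statement_frequency": "monthly"}

# Test reference data: a shared, prepopulated set of instruments that
# many tests reuse but none mutate, so reuse cannot skew this test.
SEED_INSTRUMENTS = {"ABC": 100.0, "XYZ": 250.0}  # symbol -> price

class PositionTestFixture:
    def __init__(self):
        self.accounts = {}
        self._next_id = 1

    def new_funded_account(self, funds):
        """Test-specific data: a fresh account with a known starting
        position, created per test to guarantee isolation."""
        account_id = self._next_id
        self._next_id += 1
        self.accounts[account_id] = {**ACCOUNT_DEFAULTS, "funds": funds}
        return account_id

    def buy(self, account_id, symbol, quantity):
        self.accounts[account_id]["funds"] -= SEED_INSTRUMENTS[symbol] * quantity

fixture = PositionTestFixture()
account = fixture.new_funded_account(10_000.0)
fixture.buy(account, "ABC", 10)
assert fixture.accounts[account]["funds"] == 9_000.0  # position updated
```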

Data in Capacity Tests

Capacity tests present a problem of scale in the data required by most applications. This problem shows up in two areas: delivering a sufficient volume of input data for the test, and providing suitable reference data to support many cases under test simultaneously. We see capacity testing as primarily an exercise in rerunning acceptance tests, but for many cases at the same time. If your application supports the concept of placing an order, we would expect to be placing many orders simultaneously when we are capacity-testing. Our preference is to automate the generation of these large volumes of data, both input and reference, using mechanisms like interaction templates.

This approach, in effect, allows us to amplify the data that we create and manage to support our acceptance tests. This strategy of data reuse is one that we tend to apply as widely as we can, our rationale being that the interactions that we encode as part of our acceptance test suite, and the data associated with those interactions, are primarily executable specifications of the behavior of the system. If our acceptance tests are effective in this role, they capture the important interactions that our application supports. Something is wrong if they don’t encode the important behaviors of the system that we will want to measure as part of our capacity test.

Further, if we have mechanisms and processes in place to keep these tests running in line with the application as it evolves over time, why dump all of that and start again when it comes to capacity testing, or indeed when it comes to any other postacceptance test stage? Our strategy, then, is to rely on our acceptance tests as a record of the interactions with our system that are of interest and then use that record as a starting point for subsequent test stages. For capacity testing, we use tools that will take the data associated with a selected acceptance test and scale it up to many different “cases” so that we can apply many interactions with the system based on that one test. This approach to test data generation allows us to concentrate our capacity test data management efforts on the core of the data that is, of necessity, unique to each individual interaction.
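A toy version of that amplification, assuming a recorded acceptance-test interaction shaped as a simple dict (the field names are invented): the fields that must be unique per case are rewritten, and everything else is reused from the template.

```python
import copy

# One recorded acceptance-test interaction: place an order.
TEMPLATE = {
    "action": "place_order",
    "account": "acct-TEMPLATE",
    "symbol": "ABC",
    "quantity": 10,
}

def amplify(template, n_cases):
    """Scale one interaction into n unique cases: only the data that must
    be unique per case (here the account) is rewritten per copy."""
    cases = []
    for i in range(n_cases):
        case = copy.deepcopy(template)
        case["account"] = f"acct-{i:06d}"   # unique per simulated user
        cases.append(case)
    return cases

# 10,000 orders ready to be replayed concurrently against the system.
load = amplify(TEMPLATE, 10_000)
```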

Data in Other Test Stages

At least at the level of design philosophy, if not specific technical approach, we apply the same approach to all postacceptance automated test stages. Our aim is to reuse the “specifications of behavior” that are our automated acceptance tests as the starting point for any testing whose focus is other than purely functional.

For manual testing stages, such as exploratory testing or user acceptance testing environments, there are a couple of approaches to test data. One is to run in a minimal set of test and application reference data to enable the application to start up in an empty initial state. Testers can then experiment with scenarios that occur when users initially start working with the application. Another approach is to load a much larger set of data so that testers can perform scenarios that assume the application has been in use for some time. It’s also useful to have a large dataset for doing integration testing.

While it's possible to take a dump of the production database for these scenarios, we do not recommend it in most cases, mainly because the dataset is so large as to be unwieldy: migrating a production dataset can sometimes take hours. Instead, we recommend creating a customized dataset to use for manual testing, based either on a subset of the production data, or on a dump of the database taken after a set of automated acceptance or capacity tests has been run. Nevertheless, there are cases where it is important to test with a dump of production, for example when testing the migration of the production database, or when determining at what point production data needs to be archived so it does not unduly slow down the application.
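The "subset of production" idea hinges on keeping referential integrity: if you sample parent rows, you must keep only the child rows that reference them. A minimal sketch, with invented table shapes:

```python
import random

def subset_dataset(accounts, trades, sample_size, seed=42):
    """Sample parent rows, then keep only child rows referencing them,
    so the smaller dataset stays referentially intact."""
    rng = random.Random(seed)  # fixed seed: the subset is reproducible
    keep = set(rng.sample(sorted(a["id"] for a in accounts), sample_size))
    small_accounts = [a for a in accounts if a["id"] in keep]
    small_trades = [t for t in trades if t["account_id"] in keep]
    return small_accounts, small_trades
```

Real subsetting tools follow foreign keys across many tables, but the principle is the same: sample at the root, then close over the references.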


Due to its lifecycle, the management of data presents a collection of problems different from those we have discussed in the context of the deployment pipeline. However, the fundamental principles that govern data management are the same. The key is to ensure that there is a fully automated process for creating and migrating databases. This process is used as part of the deployment process, ensuring it is repeatable and reliable. The same process should be used whether deploying the application to a development or acceptance testing environment with a minimal dataset, or whether migrating the production dataset as part of a deployment to production.

Even with an automated database migration process, it is still important to manage data used for testing purposes carefully. While a dump of the production database can be a tempting starting point, it is usually too large to be useful. Instead, have your tests create the state they need, and ensure they do this in such a way that each of your tests is independent of the others. Even for manual testing, there are few circumstances in which a dump of the production database is the best starting point. Testers should create and manage smaller datasets for their own purposes. Here are some of the more important principles and practices from this chapter:

  • Version your database and use a tool like DbDeploy to manage migrations automatically.
  • Strive to retain both forward and backward compatibility with schema changes so that you can separate data deployment and migration issues from application deployment issues.
  • Make sure tests create the data they rely on as part of the setup process, and that data is partitioned to ensure it does not affect other tests that might be running at the same time.
  • Reserve the sharing of setup between tests only for data required to have the application start, and perhaps some very general reference data.
  • Try to use the application’s public API to set up the correct state for tests wherever possible.
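The first bullet can be sketched with a DbDeploy-style changelog table (the schema and migration scripts here are invented; DbDeploy itself works with numbered SQL scripts, this is just the idea in miniature): each numbered script runs at most once, and the same process works on an empty development database and on production.

```python
import sqlite3

# Numbered migration scripts, applied strictly in order.
MIGRATIONS = {
    1: "CREATE TABLE account (id INTEGER PRIMARY KEY, funds REAL)",
    2: "ALTER TABLE account ADD COLUMN currency TEXT DEFAULT 'GBP'",
}

def migrate(conn):
    """Apply any migrations not yet recorded in the changelog table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (version INTEGER PRIMARY KEY)"
    )
    applied = {v for (v,) in conn.execute("SELECT version FROM changelog")}
    for version in sorted(MIGRATIONS):
        if version not in applied:           # skip what has already run
            conn.execute(MIGRATIONS[version])
            conn.execute("INSERT INTO changelog VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)   # safe to run again: nothing left to apply
```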

In most cases, don’t use dumps of the production dataset for testing purposes. Create custom datasets by carefully selecting a smaller subset of production data, or from acceptance or capacity test runs. Of course, these principles will need to be adapted to your situation. However, if they are used as the default approach, they will help any software project to minimize the effects of the most common problems and issues associated with data management in automated testing and production environments.


I’m afraid I’m not sure what kind of thing you’re after.

If it's "How do I get better at exploring data, e.g. observability data from a system?", I suggest that you see if a notebook is available for your data stores. By notebook I mean e.g. a .ipynb file, which allows you to combine human-readable text, queries, and the results of those queries (a bit like Excel, but better). You can write notebooks to keep track of queries, explain why the queries are relevant and important, and include graphs / tables / etc. that were produced by exactly those queries. If the data cleaning is ad hoc, you could do that as part of the notebook too (by including the relevant queries). Some blog waffle on notebooks and some analysis using one: Analysing the names of artists in the UK music charts using R – Random Tech Thoughts
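As a tiny flavour of what a notebook cell might contain (the events and field names are invented), here is a query-plus-summary step with an ad hoc cleaning pass first:

```python
from statistics import mean

# Raw observability events, as you might pull them from a data store.
events = [
    {"service": "checkout", "status": 500, "latency_ms": 900},
    {"service": "checkout", "status": 200, "latency_ms": 120},
    {"service": "search",   "status": 200, "latency_ms": 40},
    {"service": "checkout", "status": 500, "latency_ms": 1100},
]

# Cleaning: drop records with implausible latencies before analysing.
clean = [e for e in events if 0 < e["latency_ms"] < 60_000]

errors = [e for e in clean if e["status"] >= 500]
print(f"error rate: {len(errors) / len(clean):.0%}")
print(f"mean error latency: {mean(e['latency_ms'] for e in errors):.0f} ms")
```

In a notebook, the surrounding text would record *why* these queries matter, which is the part that turns data into a story.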

If it's "How do I test how good our system's observability is?", I suggest you imagine you're a support engineer (or talk to an actual one), and pretend that you've been dragged out of bed at 3 a.m. by error X. Does it give you enough context, or can you easily assemble that context yourself?

For instance, if the error says that a queue is full, do you also have things like min / max / average age of things on the queue? If the queue’s full of old things, that suggests that the consumer process is struggling / dead. If the queue’s full of new things, that suggests that the producer process has suddenly started producing much more quickly than usual. (This could be the producer acting correctly, in response to an increase in traffic upstream from the producer.)
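That queue-age heuristic could be sketched like this (the threshold and field names are invented):

```python
import time

def diagnose_full_queue(enqueue_times, now=None, old_threshold_s=300):
    """Summarise message ages on a full queue and suggest where to look."""
    now = time.time() if now is None else now
    ages = [now - t for t in enqueue_times]
    summary = {"min": min(ages), "max": max(ages),
               "avg": sum(ages) / len(ages)}
    if summary["min"] > old_threshold_s:
        # Even the youngest message is old: nothing is being consumed.
        summary["hint"] = "mostly old messages: consumer struggling or dead?"
    elif summary["max"] < old_threshold_s:
        # Even the oldest message is new: a sudden flood from the producer.
        summary["hint"] = "mostly new messages: producer burst or upstream traffic spike?"
    else:
        summary["hint"] = "mixed ages: look at consumer throughput over time"
    return summary
```

The point is less the code than the habit: pair each alert with the couple of extra measurements that let you tell its likely stories apart.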

If it’s “What stories can I extract from our current observability data?” I suggest that you go through people who might be interested in the system, and imagine what info about the system would make them particularly happy / sad / angry / surprised. Then pick one of those bits of info and see if you can find the data that supports it. If you can’t find that data, is that OK? If you can find that data, how could it be expressed in terms that the audience will already care about and understand? I.e. write a story in the audience’s language rather than in yours.

It might help to construct a user story for these people interested in the system. More blog waffle Visualisations and the stories behind them – Random Tech Thoughts
