How much time do you spend creating test data manually?

I read in the World Quality Report 2019-2020 that 59% of teams create test data manually with every test run.

How much time do you spend creating test data manually?
None, 1 hour a week, 1 day a week, 2 hours a day?


It depends on what is being tested. For most projects I’ve worked on, some basic test data is generated automatically when the test environment is set up, and it can be used for 90% of tests.

In cases where the data isn’t already there, I set up what I need in advance. I see test data setup as part of test planning, and I usually do most test planning while the development work is in progress. That way, I’m ready to start testing (with all relevant test data created) as soon as the work is ready for testing.

In my view, the time spent creating test data is not important. What is important is preventing a testing bottleneck. If the test data setup is done in advance, then it is not creating a bottleneck.


It varies from project to project. I do a lot of web and mobile app testing aimed at the general public.

These apps are, or at least should be, designed to be intuitive, so data prep is usually fairly minimal unless I am testing a specific risk that needs a lot of data. By recreating small amounts of data in every session, I get variation in the data, and that variation alone may turn up new things. Using the same data every time lessens the chances of finding new things of interest.
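To illustrate the point about variation, here is a minimal sketch of generating a small, different data set for each session while keeping the run repeatable. The function name `make_session_users` and the `name`/`age` fields are hypothetical, chosen only for the example; the idea is simply to record the seed so that a surprising result can be replayed.

```python
import random
import string
import time


def make_session_users(count, seed=None):
    """Return (seed, users); reuse the seed to recreate the same data later."""
    if seed is None:
        # A new seed per session gives different data each time.
        seed = int(time.time())
    rng = random.Random(seed)
    alphabet = string.ascii_letters + " '-"
    users = []
    for _ in range(count):
        name_length = rng.randint(1, 40)
        name = "".join(rng.choice(alphabet) for _ in range(name_length))
        users.append({"name": name, "age": rng.randint(0, 120)})
    return seed, users


seed, users = make_session_users(5)
print(f"seed={seed}")  # note this in the session notes
```

If a session finds something odd, calling `make_session_users(5, seed=<noted seed>)` should recreate the same five records.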

It’s very different on larger, more complex projects. I’ve worked on fund portfolio prediction applications with loads of variables, where any individual user often works with months’ worth of data. Here data is much more important: multiple full databases in play for different coverage, real data masked at times to get closer to actual user data, and multiple scripts to create data and variations aimed at discovering different things. There is no way I will create this every time, as I do not have six months to spare, so it’s sensible to use tools to assist.
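As a rough sketch of what masking real data can look like, the snippet below replaces identifying fields with keyed hashes while leaving the numeric values alone. It is one possible approach, not a description of any particular tool; the field names, `MASKING_KEY`, and both function names are assumptions made up for the example.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept out of source control.
MASKING_KEY = b"replace-with-a-real-secret"


def mask_value(value, prefix):
    """Map a real value to a stable token; the same input gives the same token."""
    digest = hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256)
    return f"{prefix}_{digest.hexdigest()[:12]}"


def mask_record(record):
    """Return a copy with identifying fields masked and other fields untouched."""
    masked = dict(record)
    masked["name"] = mask_value(record["name"], "user")
    masked["email"] = mask_value(record["email"], "mail") + "@example.test"
    return masked


real = {"name": "Jane Doe", "email": "jane@corp.example", "balance": 1250.75}
print(mask_record(real))
```

Because the tokens are stable, two rows that referred to the same person before masking still refer to the same token afterwards, which keeps the shape of the data close to what actual users have.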

Automated scripts will also create and draw on pre-generated data pools.

As Louise mentions, the last thing testing should be is a bottleneck.

It’s worth a quick mention of valid data. At times I’ve used very complex Excel sheets as an oracle to cross-check system calculations, and I often wondered who tested the spreadsheet and whether it was a valid oracle.
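One way to gain some confidence in a spreadsheet oracle is to check it against a few hand-verified values before comparing the system against it. The sketch below assumes the spreadsheet results have already been exported into a plain dictionary; the function name `compare_with_oracle`, the keys, and the numbers are all made up for the example.

```python
import math


def compare_with_oracle(actual, oracle, rel_tol=1e-9):
    """Return the keys where the two sides disagree or one side is missing."""
    mismatches = []
    for key in sorted(set(actual) | set(oracle)):
        if key not in actual or key not in oracle:
            mismatches.append(key)
        elif not math.isclose(actual[key], oracle[key], rel_tol=rel_tol):
            mismatches.append(key)
    return mismatches


# Hypothetical values: a hand-checked case, the exported oracle, the system output.
hand_checked = {"fund_a": 1102.5}
oracle_values = {"fund_a": 1102.5, "fund_b": 2205.0}
system_values = {"fund_a": 1102.5, "fund_b": 2205.01}

# Step 1: is the oracle right on the cases verified by hand?
oracle_subset = {key: oracle_values[key] for key in hand_checked}
print(compare_with_oracle(hand_checked, oracle_subset))   # []

# Step 2: only then compare the system against the oracle.
print(compare_with_oracle(system_values, oracle_values))  # ['fund_b']
```

A mismatch in the first step points at the spreadsheet rather than the system, which is exactly the case that is easy to miss when the oracle is trusted blindly.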