Generic in-house test data generator with a GUI. What might this look like?

We have a requirement to provide the ability to create realistic test data across our website and present this as some sort of customizable GUI.

Has anyone built or seen something like this where it worked reasonably well?

The reasons for wanting to do this are…

  • Creating data quickly for anyone wishing to use the site, e.g. a QA wanting to test a new feature that requires a lot of manual steps to set up the data
  • Black box automation tests that require a lot of data setup and teardown.
  • Performance testing against a live site
  • Load testing
  • Potentially others

The intention is that this could be used by both technical and non-technical people alike. So for those that are comfortable, they could call the backend of this test data generator app directly, or they could use some sort of GUI to trigger the data generation.

The challenge is making this customizable in terms of the data created, and I can’t quite visualise how this might look.

For example, we could create an easy way to create a realistic customer, but then a requirement comes in whereby we need to customize the customer address. We add that to the GUI, and then another requirement comes in wanting to customize the customer’s marketing preferences, and so on. You end up with a screen littered with customizable fields, and still there might be scenarios that are not catered for. You also potentially have nested data, so you may want to customise 3 different addresses for a customer.

It would need to be…

  • Simple - Just generate me this data. I don’t care about the details
  • Customizable - I need this data with these specific details
  • Safe - The data needs to be valid and robust
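
To make those three points a bit more concrete, one shape I can imagine is "sensible defaults plus optional overrides, validated before anything is created". Below is a minimal Python sketch of that idea, not a real design. The field names, `DEFAULT_CUSTOMER`, `deep_merge`, `validate_customer` and `make_customer` are all hypothetical, and the validation rules are made up for illustration.

```python
import copy

# Hypothetical defaults for a "customer" record; field names are illustrative only.
DEFAULT_CUSTOMER = {
    "name": "Alex Example",
    "email": "alex@example.com",
    "addresses": [{"line1": "1 Test Street", "city": "Testville"}],
    "marketing": {"email_opt_in": False, "sms_opt_in": False},
}


def deep_merge(base, overrides):
    """Return a copy of base with overrides applied.

    Nested dicts are merged key by key; any other value (including a list
    of addresses) replaces the default wholesale.
    """
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def validate_customer(customer):
    """Assumed 'safe' rules: reject records the site could not accept."""
    if "@" not in customer["email"]:
        raise ValueError("email must contain '@'")
    if not customer["addresses"]:
        raise ValueError("customer needs at least one address")
    for address in customer["addresses"]:
        if not address.get("line1") or not address.get("city"):
            raise ValueError("each address needs line1 and city")


def make_customer(overrides=None):
    """Simple: call with no arguments. Customizable: pass only what you care about."""
    customer = deep_merge(DEFAULT_CUSTOMER, overrides or {})
    validate_customer(customer)
    return customer
```

With something like this, `make_customer()` covers "just generate me this data", while `make_customer({"marketing": {"sms_opt_in": True}})` changes one nested detail and leaves the rest at its default. A GUI would then only need to collect the overrides, not every field.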

Any thoughts on how one might do this from a visual and technical perspective? Am I asking for too much?!


There are some that have created something that generates data with a GUI but all of those tools cost a lot of money :stuck_out_tongue:

I’m also interested in an open source one with a GUI for a few of our teams :wink:

Probably, and me too since nobody has built it yet for free :smiley:
Just kidding, I think a lot of companies don’t even think about generating data and that’s the struggle. A lot of companies don’t do performance testing, or create manual data and have no budget for test data generation due to ‘testing is at the end and it’s replaceable or skip-able’. I think when this attitude is over, then companies will realize the benefit of Test Data.


Yep. Definitely in uncharted territory!

It is probably difficult to create an open source or commercial tool that is all things to all people. The different types of data stores alone would make that difficult.

I’m wondering if some sort of json schema type form might work…
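
As a rough illustration of what I mean, here is a small Python sketch that turns a schema into a flat list of form fields a GUI could render. It only understands a tiny subset of JSON Schema keywords (`type`, `properties`, `default`, `enum`), and `CUSTOMER_SCHEMA`, `WIDGETS` and `form_fields` are hypothetical names I made up for the example.

```python
# Hypothetical schema using a small subset of JSON Schema keywords.
CUSTOMER_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "default": "Alex Example"},
        "marketing": {
            "type": "object",
            "properties": {
                "email_opt_in": {"type": "boolean", "default": False},
                "channel": {
                    "type": "string",
                    "enum": ["email", "sms"],
                    "default": "email",
                },
            },
        },
    },
}

# Assumed mapping from schema types to form widgets.
WIDGETS = {
    "string": "text",
    "boolean": "checkbox",
    "integer": "number",
    "number": "number",
}


def form_fields(schema, prefix=""):
    """Flatten nested object properties into form fields with dotted paths."""
    fields = []
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}.{name}" if prefix else name
        if sub.get("type") == "object":
            # Recurse so nested data shows up as e.g. "marketing.channel".
            fields.extend(form_fields(sub, path))
        else:
            widget = "select" if "enum" in sub else WIDGETS.get(sub.get("type"), "text")
            fields.append({
                "path": path,
                "widget": widget,
                "default": sub.get("default"),
                "choices": sub.get("enum"),
            })
    return fields


for field in form_fields(CUSTOMER_SCHEMA):
    print(field["path"], field["widget"], field["default"])
```

The appeal is that a new customization requirement becomes a schema change, and the screen follows from the schema.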


In another thread I pointed out some risks involving private data:


I don’t see the point of a gui when this gives you perfectly random data, not private at all, and is free

When I’m doing CI/CD testing, we create user accounts using a random character generator. The random characters get appended to each user name, which is based on the environment name. At the end of each test run we leave the account records behind in the system, so we have some “load” that builds up over time on the database. So far, with a 12-character random string (not the only thing that makes an account unique), I have not noticed any clashes.

I’ve never thought about randomizing all of the input data. It becomes hard to generate input values randomly that are valid but still look human-readable in logs. I cannot believe there is not a GitHub project that does this already. This is a US dump of usable data: GitHub - EthanRBrown/rrad: Real, Random Address Data (RRAD)
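
The user-name part is roughly this in Python. This is a sketch, not our actual code: the lowercase-plus-digits alphabet, the `-` separator and the names `random_suffix` and `make_username` are assumptions for the example.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits, 36 symbols in total.
ALPHABET = string.ascii_lowercase + string.digits


def random_suffix(length=12):
    """Return a random string of the given length drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def make_username(environment, length=12):
    """Build a user name from the environment name plus a random suffix."""
    return f"{environment}-{random_suffix(length)}"


print(make_username("staging"))
```

With 36 symbols and 12 characters there are 36^12 (about 4.7e18) possible suffixes, which fits with not seeing clashes in practice.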


I suppose it’s for non-technical people, I can imagine business people not knowing how this works :stuck_out_tongue:


My big LOL here was because I once worked at a place where the support desk needed to assign a unique tracking ID to tickets so they could correlate them in 2 separate systems.
So the guys wrote a little thing that copied a GUID to the clipboard when clicked, so it could be pasted into both systems.

I did find this to be a bizarre automation gap. I wanted to help, but sometimes it’s best not to help non-technical people when they don’t ask for help.

Pretty sure we could whip up a local Python script, though, and run it like a service that gives you a random “seeded” name, address, social security number, credit card number, etc., with some validation rules thrown in to make the data spicy. I’m thinking of not having a GUI, but creating a Python Flask web server instead: much less pain, and you can run it locally. I would volunteer, but Venkat has me tasked to write a JSON logging events parser/instrumentation demo app.
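
Something like this is what I have in mind for the generator part, using only the standard library. It is a sketch: the word lists, the leading `4` on the card number and the names `fake_person`, `luhn_check_digit` and `luhn_is_valid` are made up for illustration. The one "validation rule" shown is the Luhn checksum, so the card number passes basic format checks while remaining test data. A Flask route could wrap `fake_person`, but I have left that out here.

```python
import random

# Illustrative word lists; a real tool would want far more variety.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
LAST_NAMES = ["Example", "Sample", "Tester", "Placeholder"]
STREETS = ["Test Street", "Mock Avenue", "Fixture Road"]


def luhn_check_digit(partial):
    """Compute the Luhn check digit for a string of digits."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled first.
    for index, char in enumerate(reversed(partial)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def luhn_is_valid(number):
    """Return True if the full digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        # Here the check digit is at index 0, so every second digit after it is doubled.
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def fake_person(seed):
    """Generate a repeatable fake person: the same seed gives the same record."""
    rng = random.Random(seed)
    partial = "4" + "".join(str(rng.randrange(10)) for _ in range(14))
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "address": f"{rng.randint(1, 999)} {rng.choice(STREETS)}",
        "card_number": partial + luhn_check_digit(partial),
    }


print(fake_person(seed=42))
```

Seeding matters for testing: if a run fails, you can regenerate exactly the same record from the seed in the logs.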


We created an internal web site to collect parameters for the test data and submitted those parameters to another system that created the test data. As more people (not just Testers) used it, it grew in functionality and flexibility.

“Feeling old” side note: We started that in 2007. I moved on from that project to other projects and, very recently, into a new role. The other day, someone referenced that test data generator by its internal name and was surprised to hear it after all this time.



I love these, @devtotest. Did the tool have a cool name like JRG? (JoesRandomnessGenerator)

Good to hear.

Was it able to handle both the simple stuff like just ‘create me a record’ and more complex stuff like ‘create me a record with X,Y,Z’?

When you say ‘parameters’ what did that look like? Was it a bunch of fields on a page? A table? json data?



I don’t know about “cool” but as we were getting ready to launch, my manager called me aside and said we were going to call it POD. P is the name of the test data type (probably can’t divulge the actual name here) and the remaining is “On Demand”.
So, say P stood for Pizza, then it was Pizzas On Demand. Since it is accessible as an internal web site, it even got its own internal vanity URL.



Thanks for asking!

The tool created a record of the data type and it was placed into the non-production operational database where testers could use it.

Yes, parameters were collected through text fields on a web page. Set up the parameters, click Submit, and the tool provided a reference to the data just created.
For example, if I were creating a Pizza data type, then I would collect toppings and size.
We later added test data “recipes” where a user could select pre-defined data types such as “large pepperoni pizza”, or “veggie pizza”, and the like. Users liked this feature because they often wanted the same kind of data for their tests. Using a recipe made it quick and easy to create the data.
If you expanded the menu, then you could select other products (wings, calzones, etc.) and collect parameters about them.
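
To give a feel for how parameters and recipes fit together, here is a hedged Python sketch in pizza terms. It is not the actual tool: `RECIPES`, `ALLOWED_SIZES`, `REQUIRED` and `build_request` are hypothetical names, and the size and toppings I gave the veggie pizza are invented for the example.

```python
# Pre-defined parameter sets; the recipe names come from the description above,
# the contents of "veggie pizza" are assumed.
RECIPES = {
    "large pepperoni pizza": {"size": "large", "toppings": ["pepperoni"]},
    "veggie pizza": {"size": "medium", "toppings": ["peppers", "onions", "mushrooms"]},
}

ALLOWED_SIZES = {"small", "medium", "large"}
REQUIRED = {"size", "toppings"}


def build_request(recipe=None, **params):
    """Start from an optional recipe, apply explicit parameters, then validate."""
    if recipe is None:
        request = {}
    elif recipe in RECIPES:
        request = dict(RECIPES[recipe])
    else:
        raise KeyError(f"unknown recipe: {recipe!r}")

    # Explicit parameters win over the recipe's values.
    request.update(params)

    missing = REQUIRED - set(request)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    if request["size"] not in ALLOWED_SIZES:
        raise ValueError(f"unknown size: {request['size']!r}")

    # Copy the list so callers cannot modify the stored recipe by accident.
    request["toppings"] = list(request["toppings"])
    return request


print(build_request("large pepperoni pizza"))
print(build_request("veggie pizza", size="small"))
print(build_request(size="small", toppings=["ham"]))
```

The recipe is just a named set of parameters, so the same Submit path handles both the quick case and the fully custom case.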

