How will you validate or check AI-generated test cases in real projects?

Are you doing it through manual review, via heuristics, by cross-checking with existing tests, or by something else?

I’m experimenting with GenAI tools and have this interesting question: how do others ensure that the generated test cases are indeed correct, complete, and useful, beyond simply automating those test cases themselves?

Would love to hear about what you are doing in your space! :rocket:

#AIinTesting #TestAutomation #QualityAssurance #MinistryOfTesting


I’m assuming the test cases generated here are in automated script format; for me it would miss the point if it generated cases for manual execution. If for any reason it were the latter, I’d assume those executing them would validate them as they go.

I suspect it’s a good question, as we may be fast-tracking down a path where most automation is generated as the product code is built, i.e. nobody writing automated scripts any more.

Let me expand on that a bit. We already have Playwright and LLMs where a couple of prompts can update all your product code with test IDs, create a POM (page object model) for each view, then generate, run and fix test cases, and very quickly you have a basic UI health check.
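To make that concrete, the kind of generated artefact I have in mind might look roughly like the sketch below. The LoginPage name, the data-testid values and the /login route are assumptions for illustration, not the output of any specific tool.

```ts
// Illustrative sketch only: a generated page object plus a basic UI health check.
import { test, expect, type Page, type Locator } from '@playwright/test';

// A minimal page object model (POM) of the kind an LLM might generate per view.
class LoginPage {
  readonly usernameInput: Locator;
  readonly passwordInput: Locator;
  readonly submitButton: Locator;

  constructor(private readonly page: Page) {
    // Locators keyed off data-testid attributes the tool is said to inject.
    this.usernameInput = page.getByTestId('login-username');
    this.passwordInput = page.getByTestId('login-password');
    this.submitButton = page.getByTestId('login-submit');
  }

  async goto() {
    await this.page.goto('/login'); // assumes baseURL is configured in playwright.config
  }
}

test('login view passes a basic health check', async ({ page }) => {
  const login = new LoginPage(page);
  await login.goto();

  // "Health check" level assertions: the core controls exist and are usable.
  await expect(login.usernameInput).toBeVisible();
  await expect(login.passwordInput).toBeVisible();
  await expect(login.submitButton).toBeEnabled();
});
```

Nothing clever in there, which is rather the point: it’s the sort of shallow but broad coverage a tool can churn out quickly, and exactly the sort a human still needs to judge for usefulness.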

That, with some learning, may advance to no prompting being required, applied at all layers of the stack, so that all automated tests are covered with minimal intervention.

Have you got a lot of slop, though? Are the tests useful, and do they find the things they are intended to find? Will it scale is another decent question.

At this point, probably a lot of slop, but if it’s taught better that should hopefully reduce.

I’m of the view that having a critical-thinking, skeptical human at the helm of most AI use is not going to go away, and I doubt it will be an admin activity, so having someone take specific ownership of each layer makes sense; a developer or quality engineer could likely do that role.

I’m not sure this will be similar to the current validation process. Those skills will apply, but I suspect it will need more than that; it’s not just a validation activity, at least for now. It’s a throw-away, refine and expand activity, so it calls for decent automation and testing skills.

I’m also not sure about usefulness outside the automated coverage; what uses did you have in mind? Testing as a learning, discovery and investigation activity may become even more important as the machines take over the scripted coverage.

I remain skeptical about whether this path will occur, but I’m not ruling it out, as things are moving fast even if the learning of the tools remains slow or at times non-existent.


Mmm, I wonder if parts of my thoughts on this might make separate topics of their own?


I have evaluated multiple AI tools for web and API, and also had discussions about coverage. Most of the paid tools promise 80% test coverage, but when asked on what basis they claim 80%, I never got a solid answer.

Yes, a human touch is required to evaluate AI-generated test cases, and during this human evaluation I found:

  1. Duplicate test cases (a rough duplicate-check sketch follows this list)
  2. Asking the AI tool to generate test cases for the same requirement returns a different number of test cases and test steps every time
  3. Even when given a good requirements document, it failed to provide test cases for enterprise-level complex scenarios
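On the first point, a crude heuristic check can at least flag likely duplicates for a human to review before anything heavier like manual comparison. This is only a sketch of the idea; the TestCase shape, the tokenisation and the 0.8 threshold are assumptions, not something any of the tools provide.

```ts
// Rough heuristic sketch: flag probable duplicate generated test cases
// by comparing word overlap of their titles and steps.
type TestCase = { title: string; steps: string[] };

// Tokenise a test case into a lowercase word set for a crude similarity check.
function tokens(tc: TestCase): Set<string> {
  return new Set(
    [tc.title, ...tc.steps]
      .join(' ')
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter(Boolean),
  );
}

// Jaccard similarity: shared words divided by total distinct words.
function similarity(a: Set<string>, b: Set<string>): number {
  const shared = [...a].filter((t) => b.has(t)).length;
  return shared / (a.size + b.size - shared);
}

// Report index pairs above the threshold for a human to review.
export function flagDuplicates(cases: TestCase[], threshold = 0.8): [number, number][] {
  const flagged: [number, number][] = [];
  for (let i = 0; i < cases.length; i++) {
    for (let j = i + 1; j < cases.length; j++) {
      if (similarity(tokens(cases[i]), tokens(cases[j])) >= threshold) {
        flagged.push([i, j]);
      }
    }
  }
  return flagged;
}
```

It won’t catch semantic duplicates with completely different wording, but it cheaply surfaces the obvious repetition so reviewers can spend their time on the harder judgement calls.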

I’ve attended a couple of AI tool demos, conferences and webinars: everyone shows a simple use case, such as buying a mobile or laptop from an online shopping portal, or login functionality in the UI with different data sets.