How are you actually using AI in testing right now?

On our side, it started small. Generating draft test cases from requirements. Summarising long bug reports. Suggesting edge cases we might not think of immediately. Nothing magical, but it shaved time off the repetitive parts.

More recently, we’ve been experimenting with using AI to review requirement changes and flag which existing tests might be impacted. That’s been surprisingly useful, especially in larger suites where things quietly drift. Some tools have started baking this in directly, such as test management platforms that can analyse gaps or redundant cases and suggest updates instead of forcing manual audits.
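As a toy illustration of that impact-flagging idea, here is a minimal sketch that maps changed requirement text onto test descriptions using naive keyword overlap. The function and data are hypothetical; real tools use much richer traceability, but the principle is the same.

```python
# Hypothetical sketch: flag tests that may be impacted by a requirement
# change, using naive keyword overlap between the changed requirement text
# and each test's description.

def impacted_tests(changed_requirement: str, tests: dict[str, str],
                   min_overlap: int = 2) -> list[str]:
    """Return names of tests whose description shares at least
    `min_overlap` significant words with the changed requirement."""
    stop = {"the", "a", "an", "of", "to", "and", "is", "for", "when"}
    req_words = {w.lower() for w in changed_requirement.split()} - stop
    flagged = []
    for name, description in tests.items():
        desc_words = {w.lower() for w in description.split()} - stop
        if len(req_words & desc_words) >= min_overlap:
            flagged.append(name)
    return flagged

# Hypothetical suite: test name -> plain-language description.
tests = {
    "test_login_lockout": "account is locked after five failed login attempts",
    "test_password_reset": "user can reset password via email link",
    "test_checkout_total": "checkout total includes tax and shipping",
}

print(impacted_tests(
    "failed login attempts now lock the account after three tries", tests))
# -> ['test_login_lockout']
```

An LLM-backed version would replace the keyword overlap with semantic matching, but even this crude form shows why the approach scales better than manual audits.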

For teams leaning into AI in testing:
Are you using it for test generation, maintenance, flakiness detection, coverage analysis, something else entirely?
And has it genuinely reduced effort, or just shifted it around?

2 Likes

One of the things I’ve found is that a lot of people are using it for problems I don’t have. That risks a knock-on effect: if I had a manager seeing all these posts about big savings or 10x multipliers, they could be asking why I’m not getting the same.

The answer to that is that we’ve likely already been operating at 10x relative to some teams for the last decade.

I have not written test cases in well over a decade, I am not doing mundane, boring tasks that machines could do quicker, and I do not have flaky automation suites that need self-healing. Okay, claiming I am already 10x just by not doing those things could be a bit of a push, but it’s definitely a discussion.

So where do we use it, then?

General automation - the same way a developer would use Copilot. I’d say fairly significant gains; it also gives me a bit of a coding edge, since it’s better than me at coding.

New web UI automation - Playwright agent usage here. Often this was deemed too costly for the ROI; now a model can stand up a basic health check running in CI within a few hours. So not so much a saving as extra cover for low cost.

Vibe coding small tools, or general copilot work - data generation, building a mutated code base to test how good the automation tests actually are, adding an English translation to ease my testing. I’d add root cause analysis to this: tell it the issue you are seeing and it can often find the relevant piece of code and offer suggestions. Code access is key to the gains here.
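The "mutated code base" idea above is essentially mutation testing: flip something in the code under test and check whether the suite notices. A minimal sketch (the function and suite are made up; real tools like mutmut automate this across a whole codebase):

```python
# Hypothetical sketch of mutation testing: introduce a deliberate bug
# (a "mutant") and check whether the test suite fails on it. A suite that
# passes on the mutant has a coverage gap.

original = (
    "def apply_discount(price, pct):\n"
    "    return price - price * pct / 100\n"
)

def run_suite(source: str) -> bool:
    """Load the code under test from source, run the assertions against it,
    and report whether the suite passes."""
    ns = {}
    exec(source, ns)
    try:
        assert ns["apply_discount"](200, 10) == 180
        assert ns["apply_discount"](100, 0) == 100
        return True
    except AssertionError:
        return False

# Mutant: '-' becomes '+'. A good suite should "kill" (fail on) it.
mutant = original.replace("price - price", "price + price")

print("original passes:", run_suite(original))   # True
print("mutant killed:", not run_suite(mutant))   # True -> the suite caught it
```

If the mutant had survived, that would be evidence the assertions are too weak, which is exactly what this technique measures about "how good the automation tests actually are".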

Research and ideas - the questioning, going-deeper aspect really matches testing. Whilst this speeds up the activity, it tends to give me more ideas to dig into, so the result is increased coverage rather than savings. This is likely my biggest usage.

Dev buddy copilot - I build a lot of apps locally: lots of installations, dependencies, setup and configs, Windows versus Mac, iOS, Android. When I hit blockers I used to take up developers’ time; now AI is helping me work through them much quicker, with fewer interruptions for developers.

Testing agents - I’m spending a lot of time experimenting here. I’m getting some okay results on specific things like accessibility, where an agent does a lot of scans and some exploration via MCP. For now this takes time, but it’s interesting. I’m still not so clear on its capability for the basic test cycle, though:

“Risk hypotheses, investigate and experiment, interpret findings and review, revise hypothesis and loop”

If anyone has a tool that does this, please do a write-up. My buddy is experimenting with the security tool Shannon, and he’s been impressed to a degree with its exploration ability, but it’s still not the full loop and needs him hands-on.

Some will not have the challenges I have.

If you do not have code access, just getting code access will offer benefits before you even add AI in, and then, in my view, AI will offer more on top.

Note I often test four products a day; a 10x would mean going to 40 products and my brain would pop, so that’s not a goal. I’m fairly fast already.

1 Like

Yeah, I’ve been seeing a similar shift. I’ve been using AI-powered automated pentesting tools like ZeroThreat AI, and it’s genuinely changed how I approach testing. It takes care of a lot of the repetitive discovery and triage work, and I’m getting much higher-signal results compared to traditional scans.

It hasn’t removed effort, but it’s definitely moved it to where it matters more: business logic testing and validation instead of surface-level checks. To put it simply, I feel like we’re finally moving from just finding issues to actually understanding risk faster.

1 Like

Nothing special, just limiting it to helping automate little tasks 🙂

When you say “reduced effort” or “10x” time savings, how do you prove this assertion to your managers? And how do you answer the expected follow-up question: what did you do with the time you gained?

When the whole industry is measuring AI with this kind of metric, something is really, really wrong.