How an AI tool works is where you'd start. It's important to understand what the developer thinks the AI should be doing and what it should not be doing.
You then build on top of that and go deep.
This is a highly context-driven topic. Think of it like testing an actual human being.
I said yes a few times… it was once… I'm not even sure it was 100% AI, but let's go with it.
It was a no-code automation plugin for our test management tool, which was claimed at its release to be "AI". Now, I had my own opinion on no-code automation, like so many others, but I was prepared to take another look and not approach it with my biased view.
So I set out with a number of goals, the key ones being:
If it's AI, there should be some learning capability when it encounters bespoke controls
That one of our manual testers could use it and improve their productivity without major technical intervention
Keeping it brief, our evaluation established that it didn't learn and that it required intervention to make it work with certain controls. We also found that the testers - wanting to make it work - would spend longer trying to get the automation working through the tool than they would have spent writing the tests. So it actually had a negative impact on productivity.
So I think if we're evaluating AI tools, we define our own requirements of what we want to achieve with the tool and then test it against them.
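To make that concrete, here's a minimal sketch of what "define our own requirements and test the tool against them" could look like in practice. The criteria, weights, and results below are hypothetical, loosely inspired by the goals in the story above, not taken from any real evaluation.

```python
# Hypothetical sketch: scoring an AI tool against our own evaluation goals.
# Criteria names and weights are made up for illustration only.

EVALUATION_CRITERIA = {
    "learns_bespoke_controls": 3,   # does it improve when it meets our custom controls?
    "usable_by_manual_testers": 3,  # can a manual tester be productive without heavy help?
    "data_stays_private": 2,        # is our data kept out of vendor model training?
    "integrates_with_stack": 2,     # does it plug into our existing tooling?
}

def score_tool(results: dict) -> float:
    """Return a weighted score (0-1) from pass/fail results per criterion."""
    total = sum(EVALUATION_CRITERIA.values())
    achieved = sum(w for name, w in EVALUATION_CRITERIA.items() if results.get(name))
    return achieved / total

# Example: a tool that fails the learning and usability goals scores poorly,
# even if it handles data and integration well.
plugin_results = {
    "learns_bespoke_controls": False,
    "usable_by_manual_testers": False,
    "data_stays_private": True,
    "integrates_with_stack": True,
}
print(f"Score: {score_tool(plugin_results):.0%}")  # -> Score: 40%
```

The point isn't the numbers; it's writing the requirements down before the demo, so the vendor's pitch doesn't become the evaluation criteria.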
If by AI tool we mean an LLM-wrapped tool, then I have tested a test case generation tool in the past, and currently I'm also exploring another AI-based testing tool, testers.ai
Apart from that, I have also used Katalon previously, which is an AI-based testing tool for E2E testing.
It’s all about the Data
When looking at evaluations of these tools, the biggest thing I have to know before even continuing the conversation is what happens to my data. Is it retained for AI model testing? Is it potentially shared with a larger LLM? Or do I have the right/ability to just use it and not feed information back? Being in the industry I'm in, I have to be very picky about what data I do or don't share.
From there, it's really about the problem statement you're trying to solve for, and then evaluating it like any other tool. Does it make your life easier? Does it help automate some task? Does it give you confidence to do your job better? Does it integrate well with my current systems, or how much extra work are we going to go through to make it work?
Totally feel you on this; the data question is huge, and it's kind of wild how often it gets brushed aside in the excitement over what AI tools can do.
I always want to know:
Where is my data going?
Is it being used to train someone else’s model?
Can I just use the tool without feeding the machine behind it?
In fields where privacy and compliance matter, you have to be picky. And yes, beyond the hype, if it doesn't genuinely make your life easier or work nicely with your existing setup, then what's the point?
Just wanted to know: have you come across any tools that really nailed it on transparency and ease of use?
When it comes to quality or testing, not quite.
I was looking into meticulus.ai as an AI-driven UI testing tool. It does some really cool stuff, but it doesn't link into our current system or code base well, so we had to pass on it, for now anyway. It basically would have handled a lot of our UI-based testing so we could shift left/shift down to focus more on backend/integration testing.
Part of my organization uses UiPath, and they've implemented some AI into their workflows. I've heard it's pretty cool, but for my product area it just doesn't fit well with our work environment; we already use Selenium/Playwright and are pretty well established there, so it really doesn't make sense for us to rebuild everything in that system. But the one part of my org that uses it swears by it.
We do use GitHub Copilot for our dev teams to help refactor or identify bugs as we write code. It does a good enough job for us to help speed up some aspects of development.
Great share! Aman's guide definitely brings a mature lens to evaluating tools, especially when it comes to moving past gut feel and hype toward real, actionable insight. It's refreshing to see frameworks that prioritize structure over just vibes, especially with how fast AI tools are being pushed into workflows.
As AI tools get more advanced, the evaluation shouldn't stop at "does it work?" but extend to "is it evolving in a direction I trust?" Transparency today is one thing, but what about long-term alignment with your values, compliance needs, or even team workflows?
Do you think there's a need for some kind of standardized AI-readiness checklist for teams, something that balances utility and transparency, especially for PMs and testers making tooling decisions?
On the same page @squarerootuk.
In my view, AI will soon become, in some way, an integral part of every product, and that will need a different kind of testing than just checking that it functions well. Otherwise, we can't be certain of it.
There needs to be some kind of checklist to assess how well that AI stays aligned. Transparency will provide certainty to everyone, and utility will keep it aligned.