As a QA engineer, I only got to use AI (Claude) for a short time before I was let go. A teammate had developed a Claude skill that wrote “high level” test cases and also automated them. The automation seemed to work but was hit or miss. For example, element locator code was often repeated in tests instead of being put in the obvious page object, even though our code was well written (OO principles mostly followed, static waits rare, etc.). I could not inspect the skill to see what it did under the hood. Maybe the skill could be updated to organize the code well and produce better tests.
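To make the locator problem concrete, here is a minimal sketch, not our actual code. `FakeDriver`, `LoginPage`, the selectors, and both flow functions are hypothetical names invented for illustration; the fake driver only records actions so the example runs without a browser.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDriver:
    """Hypothetical stand-in for a browser driver; it only records actions."""
    actions: list = field(default_factory=list)

    def type(self, selector: str, text: str) -> None:
        self.actions.append(("type", selector, text))

    def click(self, selector: str) -> None:
        self.actions.append(("click", selector))


class LoginPage:
    """Page object: the only place that knows the login selectors."""
    USERNAME = "#username"
    PASSWORD = "#password"
    SUBMIT = "button[type=submit]"

    def __init__(self, driver: FakeDriver) -> None:
        self.driver = driver

    def log_in(self, user: str, password: str) -> None:
        self.driver.type(self.USERNAME, user)
        self.driver.type(self.PASSWORD, password)
        self.driver.click(self.SUBMIT)


def login_with_inline_locators(driver: FakeDriver) -> None:
    # The pattern described above: selectors repeated inside the test body.
    driver.type("#username", "alice")
    driver.type("#password", "secret")
    driver.click("button[type=submit]")


def login_with_page_object(driver: FakeDriver) -> None:
    # The same flow, with selectors kept inside the page object.
    LoginPage(driver).log_in("alice", "secret")
```

Both functions drive the same steps, but if a selector changes, only the second one is fixed by editing a single class.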
I now wonder what AI (Claude or not) is useful for in testing, and for which things it is mostly or completely useless. Since I cannot see it being used at a workplace, I don’t know how to figure out whether AI is useful for testing. Is there any reliable article on this, or any YouTuber who continuously evaluates AI for testing?
“Hit or miss” being the operative phrase here. I haven’t heard of an AI tool that does the assigned work without errors (apart from simple tasks like parsing). More complex tasks, like designing and coding test cases, need human oversight. The question is whether the human would be quicker doing the work directly than reviewing and correcting the AI’s output.
Not sure about the skills part; it depends on how specific and clear the instructions in the skill are. When prompting an AI agent, you have to be very specific about the objective and the task to get effective output. The narrower, more focused, and better defined the prompt instructions are, the better it does. This was my experience with software development and unit tests, and I assume it is similar for test automation.
I have heard of colleagues using it for debugging and bug analysis. If the agent has access to the relevant codebases being tested, and you prompt it well, it could help you (or developers) figure out where the problem lies and suggest potential fixes as well. So it’s not just a tool for test automation.
But this is still new territory, so we have to explore and figure out how it can work best for us. Hopefully, over time there will be more articles and posts with tips on using AI in testing that go beyond the test automation aspect.
This is a fair question to ask of any new tool or technology that may or may not have an impact on your work, positive or negative. I can tell you how I’m currently using it and how I approach the question.
AI as a Resource:
In the early days of ChatGPT I honestly didn’t think about it as a resource. However, after seeing some colleagues discussing their use of it in troubleshooting technical issues, I gave it a try. Through very detailed interaction and providing specific links to technical specifications, I found I was able to solve some problems faster. I subscribed shortly after to get the full power of ChatGPT and I’ve been using it ever since for coding and scripting ideas. I still have to do the bulk of the work, but I can get from A to Z faster and this frees up time for other tasks. I also use it as I previously used any search engine, approaching with a question and gaining knowledge about topics by letting ChatGPT do the heavy lifting of searching and cross-referencing, and allowing me to focus on the path to get to the answers I need.
AI as a Testing Tool:
We just started using Keysight’s Eggplant DAI tool and I’ve almost completed my first exploratory model for testing our website. Conceptually it makes sense to hand over testing of the user-centric functionality to an AI model, because neither manual nor automated testing truly mirrors the behavior of a user on the website. There are likely myriad defects that could be revealed through a more random and “human-like” approach to testing. My expectation is that, given a full set of functions that each define a small chunk of website usage, the AI model will over time begin to do things I wouldn’t do in testing but that may mirror, for example, the actions of impatient users. Additionally, at some point I expect to see each run behave differently and somewhat mirror different user types. I could test all day long in a random way, and I’ll still be me. AI has the potential to produce testing profiles that can “be different people” during each run.
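The idea of runs that “be different people” can be sketched as a weighted random walk over small actions. This is only an illustration of the concept and is not how Eggplant DAI works internally, which I can’t speak to. The action names, the two profiles, and their weights are all assumptions made up for the example.

```python
import random

# Hypothetical actions: each one stands for a small chunk of website usage.
ACTIONS = ["search", "open_product", "add_to_cart", "go_back", "checkout"]

# Hypothetical user profiles: relative weights for picking the next action.
PROFILES = {
    "careful": {"search": 3, "open_product": 3, "add_to_cart": 2,
                "go_back": 1, "checkout": 1},
    "impatient": {"search": 1, "open_product": 1, "add_to_cart": 2,
                  "go_back": 4, "checkout": 2},
}


def generate_session(profile: str, steps: int, seed: int) -> list[str]:
    """Return a sequence of action names drawn according to the profile.

    A dedicated Random instance keeps each run reproducible from its seed,
    so a session that exposes a defect can be replayed exactly.
    """
    rng = random.Random(seed)
    weights = [PROFILES[profile][action] for action in ACTIONS]
    return rng.choices(ACTIONS, weights=weights, k=steps)


if __name__ == "__main__":
    for name in PROFILES:
        print(name, generate_session(name, steps=8, seed=42))
```

Changing the seed gives a different session for the same profile, while reusing a seed replays the same one, which matters when a random run finds a defect you need to reproduce.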
That’s my limited experience thus far as a tester with AI tools, but I see the potential and I see the need for the human element in all of it. For me, AI is a very well-made wrench, but it still needs a human hand to move it in the right direction.
Where I find AI useful is not “let it own testing”, but using it to increase QA capacity.
For example, I use AI to help generate deterministic UI-driving code that QA can use for API regression testing. The UI flow exercises the real product workflow, while the test records and checks the API behavior behind it.
That gives QA something more useful than just a generated test case: a repeatable workflow, API evidence, and a basis for deciding whether a change is safe to release.
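As a rough sketch of the “API evidence” part, assuming the UI run has already captured the API calls it triggered: compare that recording against a known-good baseline. `ApiCall` and `diff_api_calls` are hypothetical names, and the capture step itself (for example, through a browser automation tool’s network hooks) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCall:
    """One API call observed while the UI flow ran (hypothetical shape)."""
    method: str
    path: str
    status: int


def diff_api_calls(baseline: list, recorded: list) -> list:
    """Return human-readable differences between two ordered call lists."""
    problems = []
    # Compare calls position by position for as long as both lists have one.
    for i, (expected, actual) in enumerate(zip(baseline, recorded)):
        if expected != actual:
            problems.append(f"call {i}: expected {expected}, got {actual}")
    # Report missing or extra calls separately.
    if len(recorded) != len(baseline):
        problems.append(
            f"expected {len(baseline)} calls, recorded {len(recorded)}"
        )
    return problems


if __name__ == "__main__":
    baseline = [ApiCall("POST", "/login", 200), ApiCall("GET", "/cart", 200)]
    recorded = [ApiCall("POST", "/login", 200), ApiCall("GET", "/cart", 500)]
    for problem in diff_api_calls(baseline, recorded):
        print(problem)
```

An empty result means the API behavior behind the workflow matched the baseline; whether a non-empty result blocks a release is still a human call.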
Where I would not trust AI is owning the test strategy or release decision. It can help create the test code and analyze results, but QA still needs to own what should be tested, what counts as meaningful evidence, and whether the signal is good enough.