Inspired by a post from @maaret on LinkedIn, where she raises the positives of GenAI Pair Testing:
’I’m manual tester, genAI tools aren’t at all helpful to me’ speaks of not understanding the core role of reflection/introspection to testing. We can both be surprised of things it ‘knows’ and hate-inspired to do better ourselves. Pair test with it. Works on your schedule.
- Review requirements and adjust acceptance criteria
- Create test cases & scenarios
- Think about edge cases
- Predict where bugs from developers will occur, based on:
  - previous data
  - often-forgotten things
- Create dev code
  - UI => IDs and locators
  - API => specs & code
- Create POM files
- Create test scripts (see the sketch after this list)
  - UI & API & unit
- Create performance tests based on flows from the API tests
- Create CI/CD
  - Create containers
  - Create pipeline files
- Reporting
  - Test case reviews
  - Pass/fail
  - Performance testing review
  - Make PowerPoints
  - Make graphs
  - Review multiple iterations
- Write documentation
- Monitoring & alerting
  - Tell where it's going to be needed
  - Review logs
  - Anomaly detection
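To make the POM and test-script items above concrete, this is the kind of output I would ask an agent to draft and then review myself. It is only an illustrative sketch, not code from the project: the login page, the locator IDs, and the URL are made-up placeholders, written as a Selenium Page Object with a pytest test.

```python
# Illustrative only: a minimal Page Object plus test of the kind an agent can draft
# for review. LoginPage, the element IDs and BASE_URL are placeholders.
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

BASE_URL = "https://example.test"  # placeholder URL


class LoginPage:
    """Page Object Model: keeps locators and actions for the login screen in one place."""

    def __init__(self, driver):
        self.driver = driver

    def open(self):
        self.driver.get(f"{BASE_URL}/login")

    def login(self, username, password):
        self.driver.find_element(By.ID, "username").send_keys(username)
        self.driver.find_element(By.ID, "password").send_keys(password)
        self.driver.find_element(By.ID, "login-button").click()


@pytest.fixture
def driver():
    drv = webdriver.Chrome()
    yield drv
    drv.quit()


def test_valid_login_shows_dashboard(driver):
    page = LoginPage(driver)
    page.open()
    page.login("qa.user", "secret")
    assert "/dashboard" in driver.current_url
```

Keeping the locators inside the page object also makes the agent-generated "UI => IDs and locators" output easy to review in one place.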
I wanted to do more but I left the company and now I have to restart XD
Was it perfect? Absolutely not in the beginning. I made a separate AI agent for EACH role and each job (e.g. performance testing, automation, analysis, etc.).
At the beginning the output was bad, but that was because my prompting and training were bad. I trained each agent more and more with online content and internal content (be careful that you don't share secrets).
After ~2 months the results were amazing. I felt like a dev lead who does 80% code review and writes 20% himself: I was the "AI reviewer" for 80% and added the other 20% myself.
The main problems are often:
You need mature content for your project, and almost nobody has that. You'll need to change a lot internally in how you structure your user stories and your way of working.
See it as the recipe for the smoothie.
You need a very good, specific AI agent that is trained on the things you specifically want for your project.
You want it to be fed with online YouTube videos, blogs, posts, and internal content.
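As a rough illustration of the "feed it internal content" step, here is a minimal sketch that only collects and chunks local docs so they can be pushed into whatever knowledge base your agent platform offers. The docs/ folder, the chunk size, and the upload step are assumptions for illustration, not part of the original setup.

```python
# Minimal sketch: gather internal content so it can be handed to an agent.
# The docs/ path and CHUNK_SIZE are arbitrary; the upload step is platform-specific.
from pathlib import Path

CHUNK_SIZE = 1500  # characters per chunk, an arbitrary choice


def load_chunks(docs_dir: str):
    """Read every .md/.txt file under docs_dir and split it into chunks
    tagged with their source file, ready to feed to an agent's knowledge base."""
    chunks = []
    for path in Path(docs_dir).rglob("*"):
        if path.suffix not in {".md", ".txt"}:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for i in range(0, len(text), CHUNK_SIZE):
            chunks.append({"source": str(path), "text": text[i:i + CHUNK_SIZE]})
    return chunks


if __name__ == "__main__":
    for chunk in load_chunks("docs"):
        print(chunk["source"], len(chunk["text"]))
        # ...upload each chunk to your agent's knowledge base here (platform-specific).
```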
You need to describe your agent, and I mean really describe your agent.
When I made the AI agents, I started off with maybe 3 sentences describing what my agent should be and how it should act. Eventually the descriptions were almost a full A4 page long, which made them so much more specific and so much better.
The upside is that this is basically the description of your AI agent and not really the "prompt" itself. Editing that description is a continuous journey.
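For illustration only, this is the kind of jump in specificity I mean. Neither description is one I actually used; they just contrast a three-sentence description with something closer to a full page of rules.

```python
# Illustrative agent descriptions (placeholders, not the real ones).

FIRST_ATTEMPT = (
    "You are a QA test automation agent. You write test cases and test scripts. "
    "You follow our coding standards."
)

LATER_VERSION = """\
You are the test-design agent for our web shop project.
Input: a user story with acceptance criteria from Jira.
Output: test cases in Given/When/Then form, one scenario per acceptance criterion,
plus at least two negative and two boundary scenarios per story.
Always list preconditions and required test data explicitly.
Flag any acceptance criterion that is ambiguous instead of guessing.
Never invent requirements that are not in the story.
(in practice this grew to roughly a full A4 page of rules and examples)
"""
```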
You need to prompt well.
Asking "can you write me a test case for X or Y" is not going to cut it. You need to add a lot of detail. (You can create custom commands for this so you don't always have to re-write it; that helped me a lot!)
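A sketch of what such a custom command could look like, assuming you keep it as a reusable template. The field names, the TC numbering convention, and the wording are placeholders I made up:

```python
# Sketch of a reusable "custom command": a prompt template you fill in per story
# instead of retyping the detail every time. All names below are placeholders.

TEST_CASE_COMMAND = """\
Write test cases for the user story below.
Context: {app_area}, tested via {interface} (UI/API).
For each acceptance criterion, produce one positive scenario with numbered steps
and an expected result per step, plus negative and boundary scenarios.
Use the naming convention TC-<story-id>-<nn> and note the required test data.

User story:
{story_text}
"""


def build_test_case_prompt(app_area: str, interface: str, story_text: str) -> str:
    """Fill the template so the same level of detail goes into every request."""
    return TEST_CASE_COMMAND.format(
        app_area=app_area, interface=interface, story_text=story_text
    )
```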
It's not going to work right from the start.
You need to train, train, train the model. It took me a long time to get decent outputs, but over time you can see it improve as you get better at prompting and describing what it needs to do.
Fun Part:
You can integrate AI agents via webhooks with JIRA or Azure, and then you can teach them that the requirements live inside your user stories, so you can create workflows between your AI agents.
Workflows?
When one agent reviews the user story, it can potentially add new acceptance criteria, and once it's done it can trigger another AI agent to write test cases, which could then trigger another agent to write automated test scripts, and so on. So many options.
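As a rough sketch of such a chain (not the actual integration), a small webhook receiver could look like this. Flask and the payload shape are illustrative, and run_agent() is a placeholder for whatever your agent platform's real API is:

```python
# Rough sketch: a Jira webhook kicks off a chain of agents.
# Flask and the payload shape are illustrative; run_agent() is a stand-in.
from flask import Flask, request

app = Flask(__name__)


def run_agent(name: str, payload: dict) -> dict:
    """Placeholder for calling one of your agents (story reviewer, test-case writer,
    automation writer). Replace with your agent platform's real API."""
    print(f"would run agent {name!r} with {list(payload)}")
    return {}


@app.route("/jira-webhook", methods=["POST"])
def on_story_updated():
    event = request.get_json(force=True)
    story = event.get("issue", {})

    # Agent 1 reviews the story and may propose extra acceptance criteria.
    review = run_agent("story-reviewer", {"story": story})

    # If the story is considered ready, hand it to the next agents in the chain.
    if review.get("ready"):
        cases = run_agent("test-case-writer", {"story": story, "review": review})
        run_agent("automation-writer", {"test_cases": cases})

    return {"status": "accepted"}, 202


if __name__ == "__main__":
    app.run(port=5000)
```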
In general, LLMs can help you with testing by providing fresh insights on test scenarios and on how to perform the testing. If nothing else, it might simply be a check that you did not miss any obvious scenarios.
The negative aspect is relying on them too much; use an LLM like a junior whose work you want to check, at least until you understand its strengths and weaknesses.
Regarding prompts, it's most important that you give it enough context and clear instructions. Being clear about its role and what you expect really works. 'You are an expert QA Tester that writes test cases with clear steps and an expected result for each step', for example. Or explain that you want BDD scenarios and the rules that apply to those. Then give it enough context.
Honestly, there are enough tools out there that are optimized for test case writing, which I would use rather than copy-pasting these things into ChatGPT or whatever. For example, this Jira add-on creates the test case for you, and then you can adjust it to your needs. No need to write instructions or copy-paste the context into ChatGPT.
How do you fill gaps in your knowledge? Now that you can connect LLMs to the internet, it's easy to gather more info with simple questions. Product-specific knowledge is a bit harder, since it needs to be a very well documented product with sufficient coverage on the internet for the model to have been trained on. For testing best practices and the like, it works quite well to ask questions and let the LLM browse the internet for the latest best practices.