Introduction:
In software testing, setting the right context is crucial for obtaining accurate and relevant responses from AI tools. By clearly defining the role and the specific task, you can guide the AI to provide more precise and useful outputs. This activity will help you practice crafting prompts that set the appropriate context for various testing tasks.
Purpose:
This activity aims to enable you to understand how to frame prompts effectively by setting the right context, which will improve the quality and relevance of the responses you receive from AI tools during testing.
Activity Steps:
Select a Testing Role: Choose one of the following roles to frame your prompt:
Tester
SDET (Software Development Engineer in Test)
Requirement Engineer
Coder / Programmer
Utility Developer
Test Lead
Test Manager
Define the Testing Task: Here are three sample testing tasks. Select one to use in your prompt:
Task 1: Identify potential edge cases for a new login feature.
Task 2: Create test scripts for the Amazon add-to-cart feature using Selenium with the language of your choice (a sketch of what such a script might look like follows this list).
Task 3: Review and share ideas to improve the test coverage for an e-commerce website.
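To give a sense of the kind of output Task 2 might produce, here is a minimal sketch in Python with Selenium 4. The element locators are assumptions based on Amazon’s current markup and will likely need adjusting against the live site; a real test suite would also have to handle bot checks and regional page variations.

```python
# Minimal sketch of an Amazon add-to-cart test, assuming Python + Selenium 4
# and Chrome. All locators are assumptions and may change without notice.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)

try:
    driver.get("https://www.amazon.com")

    # Search for a product
    search_box = wait.until(
        EC.element_to_be_clickable((By.ID, "twotabsearchtextbox")))
    search_box.send_keys("wireless mouse")
    search_box.submit()

    # Open the first search result (assumed selector; simplified handling)
    first_result = wait.until(EC.element_to_be_clickable(
        (By.CSS_SELECTOR, "div[data-component-type='s-search-result'] h2 a")))
    first_result.click()

    # Add the product to the cart
    add_to_cart = wait.until(
        EC.element_to_be_clickable((By.ID, "add-to-cart-button")))
    add_to_cart.click()

    # Verify the cart badge shows at least one item
    cart_count = wait.until(
        EC.visibility_of_element_located((By.ID, "nav-cart-count")))
    assert int(cart_count.text) >= 1, "Expected at least one item in the cart"
    print("Add-to-cart test passed")
finally:
    driver.quit()
```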
Craft Your Prompt: Combine the selected role and testing task to create a prompt that sets the right context. Use the following format:
“Act like a [Selected Role]. [Additional elaboration of the context].
Your task is to [Selected Task].”
Execute the Prompt: Input the crafted prompt into an AI tool (e.g., ChatGPT, Gemini, etc.) and observe the response.
Evaluate the Response: Assess the quality and relevance of the AI’s response. Consider whether the context was clear and if the output aligns with the expectations for the given role and task.
Share Your Observations: Post your input, the output, and your thoughts in a reply to this post.
For more information and tips on using Generative AI to enhance your testing, check out my Prompting for Testers course with MoT.
In my experience, selecting the right AI tool is also crucial. Many AI tools produce irrelevant results, especially for images. For example, the Canva and LinkedIn image generators often fall short of expectations, and even images generated by ChatGPT can look unrealistic.
I tested both Gemini and ChatGPT and found that ChatGPT provided significantly better results.
Another observation is that we need to manually review every word generated by these tools. For instance, while using ChatGPT, I noticed it sometimes used “with” instead of “without”, which changed the entire meaning of the result despite a correct prompt. When I pointed out the error, it admitted the mistake and regenerated the result with the revised text.
These tools are also very handy for making sense of errors that testers notice while testing software but may not understand at first.
For test cases, these AI tools generally focus on functional and security testing, so we need to mention UI/UX considerations explicitly. I believe there is considerable room for improvement as these tools continue to evolve.
I agree with your point about us being the real “masters” of these systems.
LLMs & Gen AI is essentially a probability-based word generator. If you set the temperature high, you may get more diverse (creative or totally irrelevant) responses.
The point about explicitly mentioning “UI/UX” is the essence of this post, i.e., setting the “right context”.
Act like an expert software tester who works on designing test scenarios for their team.
Your task is to identify potential edge cases for a new login feature.
Context: the login feature is for a document sharing platform. The documents must be stored securely and only be accessed by the owner (who created the document), the editors (who can edit, but not delete, the document) and the viewers (who have read-only access). All users must be logged in to be able to view the document, with an email address and a password, and, optionally, MFA. The login feature must be easy to use for non-technical users.
Normal, non-edge cases, such as the happy path and logging in with a wrong email or password or an invalid MFA code, are already covered. I am specifically asking for edge cases.
I tried ChatGPT and Mistral. Both came up with good test scenarios, with ChatGPT slightly better.
You can simply combine the responses to get even more test scenarios.
It’s a lot of test scenarios, so you will need to prioritize them and remove those that are out of scope. Best to do this during refinement already, I guess.
I imagine you would get fewer (relevant) test scenarios for a less common, less well-known feature.
Refining the prompt by adding more requirements of the feature and mentioning a few things that were out of scope helped a bit, but not much.