OK, maybe not the best example, but I think it shows something we should keep in mind when using one of these LLM tools. (I couldn't get ChatGPT to come up for me at all anymore, so I switched to Gemini, which I like a lot, as it provides cross-references to Google search results!)
I wanted to see if I could anchor Gemini to give weight to accessibility testing. I started out with:
The web application I am testing doesn't have good accessibility. How can I test accessibility for it?
Gemini responded saying that there are two ways to test accessibility, automated and manual. (Right there, I see Gemini is giving poor advice. I'm told by actual accessibility experts that if anyone tells you that you can automate more than 20% of your accessibility testing, they are lying.) It did list some useful tools.
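To illustrate why automated checks only cover a fraction of accessibility testing: they can catch mechanical rule violations, like an image with no alt text, but they can't judge whether the alt text is meaningful, whether the page is usable with a screen reader, or whether the reading order makes sense. Here's a toy sketch (not a real tool like the ones Gemini listed) of the kind of mechanical check automation can do, using only Python's standard library:

```python
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    """Toy automated accessibility check: flag <img> tags with no alt attribute.

    This only detects that alt is absent -- it cannot tell whether a present
    alt text is actually useful, which is the part that needs a human.
    """
    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            # getpos() gives (line, column) of the offending tag
            self.violations.append(self.getpos())

page = """
<html><body>
  <img src="logo.png" alt="Company logo">
  <img src="decoration.png">
</body></html>
"""

checker = MissingAltChecker()
checker.feed(page)
print(checker.violations)  # one violation: the second <img>
```

A check like this will happily pass `alt="image123.png"`, which is exactly the kind of gap that makes the "you can't automate more than 20%" point.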
Then I asked:
Please give me ideas for testing my web application.
Gemini gave me ideas for these types of testing: functionality, usability, compatibility, performance (it included load testing under performance, which I don't agree with), security, and then, as additional considerations, accessibility and internationalization. So far, I don't see anchoring bias.
Then I asked:
What is the most important thing to test on my web application?
The response included core functionality and usability. (Again, I don't find this great advice. Depending on your domain, other quality attributes may be paramount, like security, performance, and indeed, accessibility!)
Next, I asked:
What about accessibility? Is it not important?
Gemini backpedaled and told me I was absolutely right, and gave info about inclusivity, legal compliance, and improved usability. It even advised me to build accessibility in from the beginning. All good.
Next I asked:
So what should my testing priorities be for my web application?
This time I was advised to determine my app's core purpose and target audience. Based on that, it gave me three priorities to choose from: core functionality, usability, and accessibility. It went on to give other, lesser priorities, advice on different types of apps, and advice that testing is iterative.
The first time I asked about testing priorities, Gemini didn't mention accessibility, but after my subsequent questions it started prioritizing accessibility. It seems I had an influence. This might not be anchoring bias; it might just be how these tools work: they take our input and try to give us answers that fit our context. Still, it shows that our prompts heavily influence the responses we get, so we have to be smart about our prompts.