How are teams perceiving GenAI/LLM today?

A friend told me today that his teammates believe AI is a solved problem: that there is no risk in using AI to build their apps, or indeed in using AI in the app itself. Yet most people I know see many potential perils and pitfalls in GenAI/LLM tools and assistants, and the teams I’ve been working with over the last few years aren’t using any AI tools at all. I believe AI is here to stay and we all need to learn how to make it an effective part of our toolbox. But I’m concerned that there’s a perception out there that this is a done deal and we can safely use AI tools. What are other people seeing and thinking?

4 Likes

We use some AI tools and incorporate some into our product. There is always a risk (see Gemini CLI) and we still need to be aware that there are limitations. It does not solve all the problems, but it does help. We limit our generative AI to only use specific content to avoid hallucinations in the product, but it can still get things wrong.
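For anyone curious, the “specific content only” idea is roughly this shape. A rough sketch only, not our real code; `callModel` is a hypothetical stand-in for whatever LLM client you use:

```typescript
// Sketch of grounding the model in approved content only, not our real code.
// callModel is a hypothetical stand-in for whatever LLM client is in use.
type Doc = { id: string; text: string };

async function answerFromDocs(
  question: string,
  docs: Doc[],
  callModel: (prompt: string) => Promise<string>,
): Promise<string> {
  // Give the model only our approved content and forbid outside knowledge.
  const context = docs.map((d) => `[${d.id}] ${d.text}`).join('\n');
  const prompt =
    `Answer using ONLY the sources below. ` +
    `If the answer is not in the sources, reply "I don't know".\n\n` +
    `Sources:\n${context}\n\nQuestion: ${question}`;
  return callModel(prompt);
}
```

Even with that constraint, as I said, it can still get things wrong, which is why the human checks stay in place.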
We use GitHub Copilot as an extra layer of code review. We don’t take it as gospel, but it saves code reviewers some time if some of the basics have already been checked. It can also help clean up code.
A tool is only as good as its user.

5 Likes

I love this sensible and practical approach!

At the moment our approach is somewhat similar. All these inventions are gladly there to help us; however, we can’t trust them 100%, and there is always human input involved. We use GitHub Copilot for reviews plus some coding and automation test writing, but it’s more like ‘guiding an agent’ rather than ‘relying on it’. It definitely saves time and human effort!

It’s also up to us to keep confidential information safe and, importantly, to use licensed, reliable AI agents! But AI is here to stay for sure! :slight_smile:

Edit: I would also like to emphasize that it also depends on the needs of the product; sometimes the industry demands some kind of AI agent to showcase to wider audiences and to highlight impact in the market!

2 Likes

Maybe send them a link to this collection that I’ve been building up. :face_with_hand_over_mouth:

2 Likes

and maybe this too :smiley:

1 Like

@lisacrispin

I’ve seen folks running the gamut from one end of the spectrum to the other: some charging full speed ahead, thinking it’s an opportunity for everything, while others remain skeptical and wary of even trying it.

The truth is, AI is not yet a “solved” problem: it is a powerful tool with potentially detrimental consequences. Used thoughtfully, it can increase productivity; used carelessly, it can create errors, bias, and blind spots.

The key is in balance - experiment, but question. Adopt, but verify. Use, but understand.

Would love to know—how are your teams thinking about AI these days?

1 Like

It’s in fairly common usage where I am: designers, developers, and also building AI apps for others. We have active training on this front, and everyone is going through EU AI Act training, which contributes to that “safely” aspect.

I did the “Elements of AI” course back in 2020, but it’s only now that I’m skeptically embracing/experimenting with it on the testing front, and it still has a way to go before it’s a done deal.

What I am finding, which may not initially be intuitive, is that I am actively using a lot of critical analysis and thinking skills as I use it. I’m fairly confident others within my company do the same, and that for me is a key element of using these tools safely.

In the wrong hands (“it’s just magic” thinking), or with AI left to its own devices, it remains high risk.

LLMs not learning is understandable but restrictive in my view, and not just because of the whole idea that, for me, AI should be self-learning. Spend a day working with AI through trial and error to solve a problem, and it does not learn from that exercise, so the next 100 people with the same issue will also go through that trial-and-error flow. At least until the model is updated offline.

1 Like

I’ve been experimenting with some tools recently with MCP and LLMs, and what I noticed is that as a copilot for automation it seems reasonably okay for the basic things, as it’s similar to developer use, which has had a lot of attention.

“Go to this website, scan the site, generate some tests, include accessibility tests using axe, use a POM structure, reiterate and refactor until all tests pass, oh and while you’re at it write both a test plan and a test report summary.”

It quietly surprised me with that capability. Yep, some rework was required and some suggestions were made, but I’m a very light automated UI coverage advocate, so for the basics this showed some initial potential.
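For a flavour of what that kind of output looks like, here’s a minimal sketch along those lines. This is my illustration rather than the tool’s actual output, assuming Playwright with the @axe-core/playwright package; the LoginPage page object and the URL are hypothetical placeholders:

```typescript
// Illustrative sketch only, not the tool's actual output.
// Assumes Playwright plus the @axe-core/playwright package.
import { test, expect, type Page } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

// Minimal page object in the POM style the prompt asked for.
class LoginPage {
  constructor(private readonly page: Page) {}

  async goto() {
    await this.page.goto('https://example.com/login'); // placeholder URL
  }
}

test('login page has no detectable accessibility violations', async ({ page }) => {
  const loginPage = new LoginPage(page);
  await loginPage.goto();

  // Scan the rendered page with axe and assert no violations were found.
  const results = await new AxeBuilder({ page }).analyze();
  expect(results.violations).toEqual([]);
});
```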

The part I want to investigate further, though, is using the same tool not for test cases or scripts but for more investigative and experimental testing: “Explore the website using tab only and flag any points of interest or problems discovered”, or “Here’s a pre-made tapping risk list prompt; explore the site and experiment with taps and clicks”. The question is whether I can go further than this, where I just tell it the experiments I want it to run, whether it can do those fairly well, and whether it speeds up my own investigations.

I’d like to see more on the latter side of things, but it may struggle, as most models cannot learn in real time, and I’m uncertain how they deal with unknowns.

I’m jumping way ahead, of course, due to my quiet surprise from just a couple of days looking at it. Either way, it has piqued my interest, even if it still needs a critical-thinking user at the helm.

I’d be interested, though, in how others are using the tools in exploratory test sessions.

2 Likes

AI is there to help, not to solve problems for you.

From my observation, a worrying number of people see it as a problem solver. I’d go as far as to say the “novice” user thinks along those lines more than the smart user does.

2 Likes

I have observed rampant stupidity in the IT sector for more than 45 years, but it is undoubtedly getting worse. The idea that “AI is a solved problem” is so laughable it’s not going to be possible to have a sensible discussion with someone who believes it.

Just today, I learned that ISTQB has a Certified Tester AI Testing (CT-AI) certification. WTF do ISTQB know about AI testing? They don’t even know how to do regular testing properly. No one knows how to do AI testing well, although Bach and Bolton are at least researching the topic. If others are doing credible research, I’m unaware of it.

I’m not averse to the idea of using AI-based tools when they are credible, but I’m going to test the hell out of them before doing so because my current trust level is close to zero. I only do accessibility testing now, and I’m not aware of any credible AI-based tools for that.

Much hinges on how bad you are willing to allow your work products to become. Many people seem to be content to churn out mundane AI slop in the form of CVs, blogs and articles, YouTube videos etc. Some developers seem happy to churn out filthy AI-generated code. But as a tester, every single thing I do has to be right. We can’t test everything, but stakeholders must be able to trust what we tell them. I worry that AI in testing will conceal risk.

I don’t see any evidence that AI will help us do better work. It can do some things faster, which is certainly useful in some contexts. It can help low-skilled people do slightly better work than they would otherwise be able to do, although that comes at the cost of them never increasing their skill level and becoming eternally dependent on AI support. But does it help skilled people do anything better?

With most new technologies, you gain something and lose something. Everyone is focused on the gains, but we need to pay attention to what we lose.

I was looking for a collection on it, but obviously not in the right way! What’s the best way to find a collection on a given topic?

If you do a search, you can filter on content type on the left-hand side.

I’ve highlighted the “collections” filter in the image below.

1 Like

I had a lot of fun reading through them yesterday. Thanks Rosie :grin:

1 Like

To me it’s common sense: use GenAI as your assistant, not as your leader. You mark its work; it doesn’t mark yours. But common sense isn’t necessarily common practice.

But I have faith that common sense will win, as I have faith in people to make the call on how to get the best out of it, some quicker than others. I’ve been involved with developers trialling a new AI agent with the remit of “find out what it can do” and no limits. What happened was that the developers naturally reined it in, established where the agent was useful and where it wasn’t, and the agent became just another tool with strengths and weaknesses.

At the end of the day the one overriding factor is that people take responsibility for outcomes.