Agentic AI, misgivings and not being seen as a blocker to progress/speed

My company is adopting Agentic AI wholesale, via Claude Code.

I’m being asked to lead the charge on the quality/testing side.

However, I have strong misgivings about the role of gen AI within software quality!

From the ethical basis of gen AI in general, to the strong feeling that code isn’t the best way of checking code, the persistence of error problem (bad requirements = bad code + bad tests = bad deliverable), and the fundamental worry companies see AI as a way of reducing staffing costs and that by doing this, I increase the risk of putting my team out of work.

The problem is, I’m trying to be a positive and proactive member of my department. I want to support the business direction. I believe I am likely to lose MY job before anyone else, if I’m the one person pushing back and saying no. Also… this is my industry, and my discipline. It’s moving in this direction. I could change job, but sooner or later they’d be doing the same thing. I can’t be the 2026 version of an agile-naysayer, or I get left behind.

TL;DR: how do I get on board with AI in quality, what’s the “best in class” level of agentic AI support within quality, and really, what can you tell me which will help?

5 Likes

In the end AI is a tool. When adopting a new tool it’s important to understand what it can and more importantly what it can’t do. Learning how to work with it and for which tasks it is suited. Find the ways it can help you do your work and be aware of the risks.

As testers it can be harder to get into AI because we can be hard to flash with fancy tech. We tend to see the issues or find the limits. Many of us have the talent to make software break when we use it in ways it never does for others. The more I see AI code with less human involvement the more I’m convinced that you can’t get rid of the human in the loop.

Here is what I like to use AI for as a tester:

  • helping me understand projects/code
  • develop test automation code as a mostly manual tester
  • a second opinion or for brainstorming
  • test data creation
  • raising detailed bugs from just a couple info points
  • writing fancy work emails

I can work more independently with AI, finding answers quicker and reduce time spent on things I don’t like to do.

4 Likes

I have been - and to some extent still am - in the same position as you. 6 months ago our CTO decided the same, lets go with Agentic AI (we use Augment).

My role is to be sceptical and I’m proud of that, so my fundamental question was “Why?”, “What problem are you trying to solve?”. The response was “Its the way the industry is going?”….I couldn’t accept that as a valid reason. So I had many a row with my CTO about the fact it was directionless. We didn’t just have Devs playing with Augment, but product and in fact anyone who said they were willing. However, I kept pressing because I always prefaced my pushback is “My job is to focus on quality right? I’m saying I see a quality risk with where this is going. My job is to explain why I see that, but you don’t seem to want to hear it”.

Now I wasn’t against Agentic AI at all, I was against the motivation behind it, not the tooling. Ultimately, what are we trying to achieve by using Agentic AI and how will we know we’ve been successful? If we were talking about any other tool that involved a license cost, we would have had to put a case for buying it and what the benefit to the business would be. But the CTO just bought it and told people to use it. No cost/benefit case necessary!

So 6 months in this is what I established:

  • The developers had very different adoption. Some didn’t use it because it wasn’t great with C++, some used it but were cautious where and some were all in - almost vibe coding.
  • The sprints of the high adopting teams slowed down - for exactly my concerns around quality. They could change code quickly, but when we tested it we would find flaws quickly - some bugs were so obvious that our reaction would be “hang on, why didn’t you see this bug when you coded it?”. The problem was exaggerated by the turn around time to fix the issues, because the devs were trying to use Augment to do the fixes.
  • The devs on rare occasions when using augment could be guilty of scope creep. I.e. if they were working on a bit of code for a ticket and there was an underlying bit of tech debt they never got around to, they might ask augment to look at it. So there was an impact to quality at times there that again, slowed us down
  • Those outside engineering where using augment for analysis, prototyping and workarounds. Useful no doubt, but nothing directly improving the engineering throughput

So in summary, my job is to be sceptical and focus on quality. I’m accepting that the AI agents are here to stay but I’m highlighting the risk to quality in how we are adopting it. I’m showing clear metrics that we are not actually getting faster, we’re getting slower and I can prove that QA/QE are not the blocker in this. So I’m showing my CTO, that their implied goal of “getting faster” isn’t happening. Hope that helps :folded_hands:

2 Likes

A great quote I got from someone at my company (who likely took it from somewhere else) was “AI won’t take your job, a person who knows how to use AI to do it will.” It flipped the proverbial switch in my brain at the time, as I too was on the pessimistic side of things AI-y.

AI, at the end of the day, is a tool to help do a job. Sure it’s getting pretty good at doing those jobs, but at the end of the day there’s still much that needs the human-in-the-loop, be it code reviews, testing, whatever. Understanding where those limits are and helping teams to identify them and where the focus now needs to shift to is so so valuable. Learning how to get those quality outputs from the agent and championing the approach goes a long way (solid testing approaches like TDD, BDD and mutation testing are actually great for getting agents to do the things well).

For actual day-to-day testing help from AI, there’s plenty of opportunities. Some, but not all, of the things I use it for are:

  • Helping transcribe my session-based testing sessions so I can focus on the testing
  • Helping to understand and document codebases that I’ve not previously worked with
  • Ideas generation - i.e. giving me a starting point for exploratory testing
  • Writing automated tests - both functional and non-functional
  • Bug reports and test case generation
  • Creating live-like test data

It’s not a magical solution, but it helps to reduce the tedious or the admin-y side of the role and gives me time to focus on helping my team(s) understand what Quality is, and really champion it across the board.

3 Likes

Very commendable, you protecting your team and the quality of product!
I have no experience with Agentic AI products (yet) but a colleague visited a conference on this topic and informed us on it. According to these informations you are right, bad input means bad code, something AI can’t fix. In regards to quality you’ll have the “usual” AI problems, eg. a AI will test your product based on the tests it designed itself (this being the “agentic” part) but what can happen is that these test will run green beacause the AI anxious to please it’s master has changed the test cases previously designed to fit to the problem of the software. (“Master wants a test run that’s green, these tests run red, so I’ll change the test cases to run green”) So, much like a child you can not leave the AI to work autonomously. To me this is a principle established with human testers already: when a tester builds test cases usually you would have a four eye review (or six eyes or whatever). Discarding these principles in the mad belief that AIs are a kind of superhuman tester that somehow controls itself is childish or, if management already knows that, a cheap ploy to shift left to the detriment of quality.
To me, AI is a tool which can do “dumb” work and that could include test design. But it’s not a complete team that checks it’s actions. A (human) tester must be responsible for the output and has to constantly check it to avoid the usual AI drivel. And it can not replace bad input. Sorry management, you still have to work!