Demand for QA will soon explode. The flood of AI-generated code will require rigorous testing and validation.
And the author emphasised the following:
Increased emphasis on QA
Innovation in QA methodologies
Shifting QA roles and responsibilities
Software development blurring into specification writing
Emphasis on soft skills
I'm curious about all this and still trying to wrap my head around how the role of the tester may need to innovate and change with the growth of AI-generated code.
How about you? Where are you at? How do you think the role of the tester/QA person will adapt? Have you seen a change already when working with applications that have been supported with AI-generated code?
From first-hand experience, the AI's output is very unpredictable.
Large language models like ChatGPT are very, very big - and sometimes that leads to unpredictable behaviors.
For this reason people are creating more and more guardrails around generated content.
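As a toy illustration of what such a guardrail might look like, here is a minimal pre-acceptance check in Python. The rules, the `summary` field, and the limits are all invented for the example; they are not any real guardrail library's API:

```python
import json

MAX_LEN = 500
BANNED = ("DROP TABLE", "<script>")  # hypothetical deny-list for the example

def passes_guardrails(raw_reply: str) -> bool:
    """Reject generated replies that are too long, contain banned text,
    or are not the JSON shape we asked the model for."""
    if len(raw_reply) > MAX_LEN:
        return False
    if any(bad.lower() in raw_reply.lower() for bad in BANNED):
        return False
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and "summary" in data

print(passes_guardrails('{"summary": "all tests passed"}'))  # True
print(passes_guardrails('not json at all'))                  # False
```

Real guardrail stacks do much more (schema validation, toxicity scoring, grounding checks), but the shape is the same: generated content is treated as untrusted input until it passes explicit checks.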
TBH - AI is a game-changer for all of us in the tech world, and others too!
Unlike traditional coding, AI is all about learning from data and making predictions. It's opened doors to cool stuff like image recognition and language understanding, which was unthinkable before. But it's not all roses - AI can be unpredictable, and that brings a new set of challenges. So yeah, with AI, the "impossible" keeps getting redefined… and that's unpredictable!
And when thereās unpredictability, thereās QA!
I agree and disagree in equal measure with the points made by this fellow on LinkedIn; I've probably been building software for too long.
But at least today's weather prediction was spot on: it's raining in normally dry Cambridge this morning. When anyone makes predictions and the prediction is that "things will change", I get bored; but the points I do side with are:
The responsibility of QA shifting, being more present in the "go-live" decision process. If only as a way of feeding back production defects more quickly, getting your QA team up to the coalface can only be good for everyone.
Emphasis on people skills is, for me, key. Maybe that's because I'm getting old, but age has only taught me that people write the code together. No 10x developers, coding standards, frameworks, nor clever processes can guarantee victory; cohesive teams do. Especially in today's isolated remote-working mode.
But I generally see AI as a thing of scale, not as a fresh enemy; rather as growing up and coming of age. QA are going to have to pick up the baton; if they don't, then someone else will. And I'm talking about using large-dataset systems, learning to work with tonnes of statistics and turning them into product knowledge or insight. Gone are the days when users would report bugs; today they just swap platform, and finding new ways to detect the "unreported" bugs that cause users to leave may be just one of the "large-dataset" tool areas QA needs to grasp. For example: use our test workflow analysis mindset to help orgs use analytics to uncover workflow issues in a product. I don't think QA people will suddenly find they have more jobs in the marketplace; I rather believe they will have to work alongside a different set of people in the org.
/edit My bias comes from how ChatGPT-4 is much like the iPod was to MP3. Nobody outside of tech circles knew what music compression would mean until someone commercialised it. MP3 as a format changed how artists distributed their art, and more recently how they built it. Some of the impacts will lag a lot.
My relevant experience is not with AI, but with testing a complex system we've designed and built with lots of internal rules that affect the result. The similarity to AI is that I test it by trying it out on examples and assess the results via high-level acceptance criteria and my own judgement rather than specific requirements specifying the expected result. However, I also have the option to inspect the inner workings (to an extent) and manually configure certain variables and see how they affect results.
In addition to greater visibility during testing, if we don't like the results for a specific input, we can identify which rules are causing it and design changes to improve the behaviour. I don't see this being possible with AI, though as I said I don't have experience with it.
One of the big problems I'm facing with testing this system, which would apply to AI too, is getting a good overview of behaviour and how it changes. Ideally I want to see results from many different inputs and an analysis/overview of key features, and the ability to compare today's results to yesterday's with any differences highlighted. I guess this big picture, as well as the details of the inner workings, are the things that may present difficulties when testing AI.
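The run-to-run comparison described here can be sketched in a few lines, assuming each run's results are captured as a mapping from input to outcome. The order IDs and outcomes below are made up for the illustration:

```python
def diff_runs(yesterday: dict, today: dict) -> dict:
    """Return only the inputs whose result changed between two runs,
    including inputs that appeared or disappeared."""
    changed = {}
    for key in yesterday.keys() | today.keys():
        old, new = yesterday.get(key), today.get(key)
        if old != new:
            changed[key] = (old, new)
    return changed

yesterday = {"order-1": "approved", "order-2": "rejected", "order-3": "approved"}
today     = {"order-1": "approved", "order-2": "approved", "order-4": "rejected"}

# Highlight only the differences, not the thousands of unchanged results.
for key, (old, new) in sorted(diff_runs(yesterday, today).items()):
    print(f"{key}: {old!r} -> {new!r}")
```

The hard part in practice is not the diff itself but deciding which changes are acceptable drift and which are regressions - which is exactly the judgement call the post describes.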
My fear is that, as was already the case before LLMs, people will underestimate the necessity of testing (and also overestimate automation).
Especially as LLM output sounds quite reasonable most of the time, which is dangerous when the content is just made up.
People should make more use of testing, but will they?
At the very least, they have to acknowledge that LLMs are not a magic wand.
So far I have had no interaction with any AI, nor the necessity to.
With AI, the subject of testing changes once more, but the basic principles will stay the same: finding problems which matter to humans.
It is always hard to predict what will happen, but it is still important to imagine what might happen. That's why I'm enjoying reading this thread.
We can influence what happens with AI in QA if we think creatively about what might happen and what could happen, and use those ideas as we make decisions about where to spend our time now.
As everybody here already knows, we can influence how AI itself develops if we interact with it. Several months ago, I tried to get ChatGPT to write a poem in iambic pentameter and realized it couldn't. So I tried to teach it. I started with "Write me a sentence with ten syllables in it." It couldn't, so I tried some more prompts like writing words with certain amounts of syllables, etc., and got an idea of how well it could deal with counting syllables (very poorly at the time).
A couple of months later, I asked it to write me a poem in iambic pentameter, and it got pretty close! So it looks like lots of people were teaching it to do similar things.
How does that translate into how QA can use AI to improve quality? I'm not sure yet, but I'm not going to wait passively to find out. I'm going to go out and explore and use my imagination!
From what I've seen so far, AI is already useful for giving an outline framework for test cases (our Tech Team are already using it for producing various scripts) and can have a pretty good stab at producing some automated tests (I've seen a few examples using Cypress), but what is produced is only the first cut and needs a thorough review and usually some tweaking.
Obviously AI is improving all the time (presumably the more it's used for testing, the faster it will improve), but for now it's only as good as the information it's provided with (and sometimes not that good). I doubt it's going to vastly increase the demand for QA but is more likely (in the medium future) to change the role of the tester to improve the level of precision given in test case definitions (AI is only as good as the instructions it's provided with at present).
I'm not sure the premises in "the flood of AI-generated code will require rigorous testing and validation" are well-founded, even if they sound logical. Are LLMs concretely leading to a significantly higher rate of customer-facing digital products being launched? Are LLM tools leading to significantly higher release cadences for existing products? Sounds plausible to me, but is it true? Will it be true? And if a lot of new products are being launched, do they require rigorous testing, or are they tentative prototypes to test the waters for a large number of possible business ideas?
Anyway, if we assume we work for a company that's found a way to produce a "flood of AI-generated code" that meets the need for new features, change requests, and new products, will demand for QA increase? Thinking out loud:
If it's a developer driving the code generation, I expect the tools that help them be more efficient (in scaffolding code, debugging issues, rubber ducking technical decisions, and so on) to be joined by tools that also make testing more efficient (in identifying issues, generating test scripts, running only relevant tests, and so on). Perhaps that arms race will keep everything in tune.
If it's a business person driving a fully LLM-produced product, well, I will echo that "the need for testing will increase, but the demand probably won't." If that demand does increase, because the business person realises they need a better result than they're getting from the LLM, then it won't just be a need for human testing, but also for human development. Again the demand for testing matches the demand for development.
I'm inclined to agree with Anna (@sles12) that the need for testing might increase but the demand probably won't.
About the only thing I can see truly increasing the demand for testing is C-suite folks starting to see testing more as essential insurance than as a cost that gets tacked on at the end so they can say they've done due diligence.
It will probably take a fair few more bankrupted companies and dead patients before that idea starts to percolate, too.
As QA/testers, when we use AI there are pros and cons, but how we are going to use it is the more challenging question that needs to be discussed.
For example:
Say you want to create a billion test data records for your test activities and need to generate a script for that. Yes, you can go ahead, but is that script the more optimised one to use? If your own script takes 3 minutes for 1 billion records but the AI-generated script takes 1 minute, then AI is the more effective option for that QA activity.
As mentioned, how we are going to apply AI to QA is the challenging decision.
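A scaled-down sketch of that comparison: time two ways of generating the same records and keep whichever is faster. It is scaled to 100,000 records so it runs quickly, and both generators are hypothetical stand-ins (one "handwritten", one an AI-suggested rewrite), not real AI output:

```python
import time

def handwritten_generator(n):
    """Baseline: build records one append at a time."""
    records = []
    for i in range(n):
        records.append(f"user-{i},user{i}@example.com")
    return records

def optimized_generator(n):
    """Hypothetical AI-suggested rewrite using a list comprehension."""
    return [f"user-{i},user{i}@example.com" for i in range(n)]

def timed(fn, n):
    """Return how long fn(n) takes in seconds."""
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start

n = 100_000
t_hand = timed(handwritten_generator, n)
t_opt = timed(optimized_generator, n)
print(f"handwritten: {t_hand:.3f}s, optimized: {t_opt:.3f}s")
```

The important check is the one timing alone can't do: asserting that both generators produce the same data, so the "faster" script is also a correct one.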
Not had to test any AI-generated code yet (to my knowledge, at least).
I think the sentiment that testers will be needed to test AI-generated code is right, but to describe it as a "flood" - nah. Not yet. Plus, if the code's getting generated, what are the developers doing? They'll test it too, right? RIGHT??
It's an interesting time with AI. We've got writers and artists literally crying plagiarism whenever writing or artwork is generated by AI.
So why are developers not crying out when AI code is being generated?
I'm perplexed.
"Use AI to generate test data"… um, that makes no sense at all to me. Surely it's only going to generate cases that are popular in its training set, and so it likely won't include any of the cases that are "unknowns".
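A toy illustration of that worry: a generator that only samples "popular" values never exercises the edges, while a small hand-picked set of boundary values does. The ages, the sampling, and the `parse_age` rule below are all invented for the example:

```python
import random

random.seed(42)

popular_ages = [25, 30, 35, 41, 28]        # what "the model has seen"
boundary_ages = [-1, 0, 150, 2**31 - 1]    # deliberately chosen edges

def parse_age(age: int) -> int:
    """Hypothetical code under test: rejects implausible ages."""
    if not 0 <= age <= 130:
        raise ValueError(f"implausible age: {age}")
    return age

def count_failures(ages):
    """Count inputs that the code under test rejects."""
    failures = 0
    for a in ages:
        try:
            parse_age(a)
        except ValueError:
            failures += 1
    return failures

generated = [random.choice(popular_ages) for _ in range(1000)]
print(count_failures(generated))      # 0 - "popular" data never hits the edges
print(count_failures(boundary_ages))  # 3 - the hand-picked edges do
```

Which suggests generated test data is a complement to, not a replacement for, deliberate boundary and negative cases chosen by a tester.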