This is what OpenAI says and advises about agentic testing.
What are your thoughts?
What is good about it?
What is not so good?
What would you wish it said? ![]()
4. Test
Developers often struggle to ensure adequate test coverage because writing and maintaining comprehensive tests takes time, requires context switching, and deep understanding of edge cases. Teams frequently face trade-offs between moving fast and writing thorough tests. When deadlines loom, test coverage is often the first thing to suffer.
Even when tests are written, keeping them updated as code evolves introduces ongoing friction. Tests can become brittle, fail for unclear reasons, and can require their own major refactors as the underlying product changes. High quality tests let teams ship faster with more confidence.
How coding agents help
AI coding tools can help developers author better tests in several powerful ways. First, they can suggest test cases based on reading a requirements document and the logic of the feature code. Models can be surprisingly good at suggesting edge cases and failure modes that may be easy for a developer to overlook, especially when they have been deeply focused on the feature and need a second opinion.
In addition, models can help tests up to date as code evolves, reducing the friction of refactoring and avoiding stale tests that become flaky. By handling the basic implementation details of test writing and surfacing edge cases, coding agents accelerate the process of developing tests.
What engineers do instead
Writing tests with AI tools doesn’t remove the need for developers to think about testing. In fact, as agents remove barriers to generating code, tests serve a more and more important function as a source of truth for application functionality. Since agents can run the test suite and iterate based on the output, defining high quality tests is often the first step to allowing an agent to build a feature.
Instead, developers focus more on seeing the high level patterns in test coverage, building on and challenging the model’s identification of test cases. Making test writing faster allows developers to ship features more quickly and also take on more ambitious features.
Delegate Review Own Engineers will delegate the initial pass at generating test cases based on feature specifications. They’ll also use the model to take a first pass at generating tests. It can be helpful to have the model generate tests in a separate session from the feature implementation. Engineers must still thoroughly review model-generated tests to ensure that the model did not take shortcuts or implement stubbed tests. Engineers also ensure that tests are runnable by their agents; that the agent has the appropriate permissions to run, and that the agent has context awareness of the different test suites it can run. Engineers own aligning test coverage with feature specifications and user experience expectations. Adversarial thinking, creativity in mapping edge cases, and focus on intent of the tests remain critical skills.
Getting started checklist
Guide the model to implement tests as a separate step, and validate that new tests fail before moving to feature implementation.
Set guidelines for test coverage in your AGENTS.md file
Give the agent specific examples of code coverage tools it can call to understand test coverage