The term “flaky tests” gets used a lot when we’re referring to automation in testing.
What’s the most common cause that you’ve seen for flaky tests?
I like this question a lot! I’ll give my answer by asking another.
Is it really flaky tests, or is it flaky automators?
Why would you even implement a test that only works 95% of the time? Refactor & re-write it.
Quite often it’s the system under test that’s the issue.
I’ve recently felt pain trying to write an end-to-end test using some AWS APIs. A lot of their requests are “fire and forget”, meaning you don’t know when a process has completed, as the API returns a 200 OK just to say the request has been sent. This led to us considering putting hard sleeps/waits into our code.
Instead we opted to mock/stub responses, but had we gone with the end-to-end test above, it would’ve ended up as a flaky test through no fault of our own code.
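As a rough sketch of that stubbing approach (assuming Python and boto3, which may not match the actual stack; the queue URL, message body and IDs are made-up placeholders), botocore’s Stubber can return a canned response so the test never calls the real API:

```python
# Rough sketch: stub the AWS response so the test never depends on the
# real "fire and forget" API. All values here are placeholders.
import boto3
from botocore.stub import Stubber


def test_send_message_without_real_aws():
    # Dummy region/credentials so no real AWS account is needed.
    client = boto3.client(
        "sqs",
        region_name="eu-west-1",
        aws_access_key_id="testing",
        aws_secret_access_key="testing",
    )
    stubber = Stubber(client)

    # The canned 200 OK the real API would return, making the test deterministic.
    stubber.add_response(
        "send_message",
        {
            "MessageId": "stub-message-id",
            "MD5OfMessageBody": "5d41402abc4b2a76b9719d911017c592",  # MD5 of "hello"
        },
        {"QueueUrl": "https://example.com/stub-queue", "MessageBody": "hello"},
    )

    with stubber:
        response = client.send_message(
            QueueUrl="https://example.com/stub-queue",
            MessageBody="hello",
        )

    assert response["MessageId"] == "stub-message-id"
```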
‘Automation in testing’, or otherwise, is nothing more than writing code or implementing a solution to a problem.
Any solution can have bugs, especially the ‘automation product’, which is built with a direct dependency on a changing product.
So the flakiness is caused by humans who either:
Other than the sub-par standards of automating tests…
There are two other aspects I can think of (answering based on my past experiences). Flaky tests are quite common in UI-layer automation, due to the performance of the application under test. Some objects/elements in the UI may be dynamic, so the exact time they take to load into the DOM is unpredictable for the test developers; tests might pass on most runs and fail on some occasions (depending on the visibility of the elements in the UI). These tests need to be refactored with customised wait logic.
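For example (a minimal sketch using Selenium’s Python bindings; the URL and locator are made-up placeholders), an explicit wait polls the DOM instead of sleeping for a fixed time:

```python
# Rough sketch: replace a hard sleep with an explicit wait so the test
# tolerates variable load times. The URL and locator are placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/dashboard")

    # Poll for up to 10 seconds for the element to become visible,
    # rather than sleeping a fixed amount on every run.
    element = WebDriverWait(driver, 10).until(
        EC.visibility_of_element_located((By.ID, "orders-table"))
    )
    element.click()
finally:
    driver.quit()
```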
In other instances, when the test environment configuration is not even close to the production environment, we may see test flakiness while attempting parallel testing (grouping & running different test cases at the same time in order to speed up the test execution time).
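One mitigation for the parallel case (a sketch, assuming pytest-xdist; the fixture and table naming below are hypothetical) is to namespace shared resources per worker so tests don’t collide:

```python
# Rough sketch: keep parallel workers out of each other's way by namespacing
# shared state. "worker_id" is a fixture provided by pytest-xdist
# ("gw0", "gw1", ..., or "master" when tests are not run in parallel).
import pytest


@pytest.fixture
def scratch_table_name(worker_id):
    # Hypothetical naming scheme so two workers never write to the same table.
    return f"orders_test_{worker_id}"


def test_insert_order(scratch_table_name):
    # Placeholder assertion standing in for real setup against the table.
    assert scratch_table_name.startswith("orders_test_")
```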
From my experience there have been different definitions of flaky - in UI cases:
Tests that don’t work properly
Tests that are redundant and are no longer relevant
In some other cases though, test environments can be an utter dumpster fire, and tests can fail because the environments are too volatile. They’re often dumping grounds for different versions used by different teams, and resemble nothing like production!