Flaky Tests - What Causes Them?

The term “flaky tests” gets used a lot when we’re referring to automation in testing.

What’s the most common cause that you’ve seen for flaky tests?

1 Like

I like this question a lot! I’ll give my answer by asking another.
Are they really flaky tests, or are there flaky automators? :stuck_out_tongue:

Why would you even keep a test that only works 95% of the time? Refactor and rewrite it.

1 Like

Quite often it’s the system under test that’s the issue.

I’ve recently felt pain trying to write an end-to-end test using some AWS APIs. A lot of their requests are “fire and forget”, meaning you don’t know when a process has completed: the API returns a 200 OK to say the request has been sent. This led us to consider putting hard sleeps/waits into our code.

Instead we opted to mock/stub responses; had we gone with the end-to-end test above, it would’ve ended up as a flaky test through no fault of our own code.
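To show what the mock/stub approach can look like, here’s a minimal sketch in Python using the standard library’s unittest.mock. The function under test (submit_job) and the client method (start_job) are hypothetical stand-ins for an AWS-style “fire and forget” API, not real boto3 calls:

```python
from unittest import mock

# Hypothetical function under test: submits work via an AWS-style
# client whose API returns 200 OK as soon as the request is accepted,
# regardless of whether the downstream process ever completes.
def submit_job(client, payload):
    response = client.start_job(payload=payload)
    return response["ResponseMetadata"]["HTTPStatusCode"] == 200

# Stub the client so the outcome is deterministic: no network,
# no waiting for an unobservable process, no timing flakiness.
fake_client = mock.Mock()
fake_client.start_job.return_value = {
    "ResponseMetadata": {"HTTPStatusCode": 200}
}

assert submit_job(fake_client, {"id": 1}) is True
fake_client.start_job.assert_called_once_with(payload={"id": 1})
```

The trade-off, of course, is that a stubbed test no longer exercises the real integration, so it verifies your code’s behaviour rather than AWS’s.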

1 Like

‘Automation in testing’, or otherwise, is nothing more than writing code or implementing a solution to a problem.
Any solution can have bugs, especially the ‘automation product’, which is built with a direct dependency on a changing product.
So the flakiness is caused by humans who either:

  • implement the wrong solution or address the wrong problem;
  • write buggy code;
  • create an additional, untested product;
  • build a program on moving parts (another product that’s still under development);
  • do not maintain or stabilize (bug-fix, refactor, etc.) the initial version that was built;
  • plus lots of the other causes that make any product buggy.
1 Like

Other than the sub-par standards of automating tests…
There are two other aspects I can think of (answering from my past experience) behind flaky tests. These are quite common in UI-layer automation, due to the performance of the application under test. Some objects/elements in the UI are dynamic, so the exact time they load into the DOM is unpredictable for test developers; tests might pass on most runs and fail on some occasions (depending on when the elements become visible in the UI). These tests need to be refactored with customised wait logic.
On other occasions, when the test environment configuration is not even close to the production environment, we may see test flakiness when attempting parallel testing (grouping & running different test cases at the same time in order to speed up test execution).

1 Like

From my experience there have been different definitions of flaky - in UI cases:

  • Tests that don’t work properly

    • Failing on timeouts because proper waits haven’t been established (or worse, gratuitous use of Thread.sleep())
    • Selectors that are long and overly specific, e.g. div > div > div > .some-css-with:nth-child(99) > div (if the page format changes, so will the selector!)
  • Tests that are redundant and are no longer relevant
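The “proper waits” point above can be sketched generically. This is a minimal Python stand-in for the explicit-wait pattern (what Selenium’s WebDriverWait does): poll a condition until it holds or a deadline passes, rather than sleeping a fixed amount. The wait_until name and the sample condition are illustrative, not from any real framework:

```python
import time

def wait_until(condition, timeout=10.0, poll=0.25):
    """Poll `condition` until it returns a truthy value or `timeout`
    seconds elapse. Returns the truthy result, or raises TimeoutError.
    This replaces fixed sleeps: the test proceeds as soon as the
    condition holds, and fails loudly if it never does."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(poll)

# Usage sketch: the lambda stands in for a real driver call such as
# "find this element and check it is displayed".
ready_at = time.monotonic() + 0.5
wait_until(lambda: time.monotonic() > ready_at, timeout=2.0)
```

A fixed Thread.sleep() of 5 seconds is both too slow when the element appears in 1 second and too fragile when it takes 6; polling with a deadline avoids both failure modes.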

In some other cases though, test environments can be an utter dumpster fire, and tests can fail because the environments are too volatile. They’re often dumping grounds for different versions used by different teams, resembling nothing like production!