On Tuesday, @alyssaruth will be joining us for a masterclass focusing on what we can learn from flakey automation and what we can do about it.
If we don't get to your questions on the night, we'll add them to the thread below for Alex to answer later. If you'd like to continue the conversation from the webinar, this thread is an excellent place to do that. Share resources and follow-up success stories from your learnings here!
Of course, the recording of the masterclass will be available to everyone in the Masterclass section of our website afterwards.
Is there ever a time (exception) where a sleep would be a good way to handle a flaky test? (you wrote generally it's a bad idea, but when is it a good idea?)
What was the alternative to Cypress?
ES and Kibana look very helpful - any tips on how to start integrating your automation with these tools?
What is the average single test execution time?
My team has considered moving off of our current UI testing framework and moving to another. You mentioned your team moved to Cypress a few years ago. Which tool were you using before and what made you make the switch? What was your process in considering making the switch?
How much trust should we have in those flakey tests?
Not a question from me. Just wanted to say I enjoyed your session @alyssaruth. Lots of similar issues and ideas I've encountered before, and some that I think would really help me. Thanks for sharing your story. And a great host as usual, @gwendiagram!
Is there ever a time (exception) where a sleep would be a good way to handle a flaky test? (you wrote generally it's a bad idea, but when is it a good idea?)
Yes, I think the reality is that there will be some circumstances where a sleep is unavoidable - which is fine, provided they're the exception and you've exhausted other options first! In general, it's always better to wait for an explicit condition to be met (e.g. wait for the page to have loaded by checking for the presence/absence of a particular element) than to wait for an arbitrary length of time.
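To make that concrete, here's a minimal sketch of the two approaches in Cypress (the selectors are invented for illustration):

```ts
// Arbitrary sleep: flaky, because the page might take more or less
// time than this to load.
cy.wait(5000);
cy.get('[data-test="results-table"]').should('be.visible');

// Explicit condition: Cypress retries until the spinner is gone (or a
// timeout elapses), so the test proceeds as soon as the page is ready.
cy.get('[data-test="loading-spinner"]').should('not.exist');
cy.get('[data-test="results-table"]').should('be.visible');
```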
So, the scenarios where a sleep is a good idea are exactly those where you have been left no other choice. For us, so far this has only applied to a certain third party integration around processing payment, which pops up a window (an iframe) to interact with. Cypress doesn't deal with iframes particularly well in general, and this particular iframe also doesn't expose as much as we'd like in terms of CSS classes to signal state changes. This means we currently don't have a way to concretely tell the difference between when the iframe is in a "loading" state vs when the form within it is ready for us to type into - and so we're stuck with a wait that's long enough to "practically guarantee" that it's ready. We've pulled this wait out into its own support method, with a descriptive name so the intention behind the wait is clear.
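For a rough idea of what that looks like, here's a sketch of such a support method (the helper name and duration are hypothetical, not our exact code):

```ts
// Hypothetical helper - the descriptive name documents why the sleep exists.
// The payment iframe exposes no reliable "ready" signal we can wait on, so we
// wait long enough to practically guarantee the form inside it is interactive.
export function waitForPaymentIframeToBeReady(): void {
  // Revisit if a third party/Cypress upgrade ever exposes a real loading signal.
  cy.wait(10_000);
}
```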
But if that were to change - maybe after upgrading the third party version or Cypress - then we'd rip those waits out right away!
My team has considered moving off of our current UI testing framework and moving to another. You mentioned your team moved to Cypress a few years ago. Which tool were you using before and what made you make the switch? What was your process in considering making the switch?
Hi both, thanks for the questions! I'm going to tackle these two together.
The project that we're using Cypress in was actually greenfield, so Cypress was our first choice (we didn't switch to it from another framework). We had a few devs on our team who'd had bad experiences with Selenium WebDriver in the past, so we decided to give something else a try for comparison. We found it to be a really good developer experience - things we liked were:
Writing tests was straightforward, for example Cypress has built-in waiting for your assertions so you don't have to think about that bit much (there's a small sketch of this after the list).
The documentation is pretty extensive and useful.
Tests run quickly and easily locally.
The UI when running the tests is really clean/nice.
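As a tiny example of that built-in waiting (selector and message invented for illustration):

```ts
// No sleep or manual polling needed - Cypress retries the assertion
// until it passes or the default timeout elapses.
cy.get('[data-test="save-button"]').click();
cy.get('[data-test="toast"]').should('contain.text', 'Saved');
```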
It wasn't all smooth sailing by any means - we did have to maintain a fork of it for a little while due to a Service Worker specific issue that took a while to get fixed. But in general they've been responsive to issues and it's been a good fit for us - we're back on the main trunk now!
I guess if we were to look into a replacement for some reason, the sorts of things that would be top of my shopping list would be:
Built-in videos/screenshots for when tests fail.
Ability to run tests locally easily.
Ability to write the tests in your desired language (we write in TypeScript, which is nice as it's the same as what we use for the regular frontend code).
Support for multiple browsers / the ones you care about.
An active community, regular updates, issues being responded to, etc.
Back when our E2E flakes were really bad, we thought a couple of times about spiking TestCafe because we'd heard good things about it. At this point I'm kinda glad we didn't, though, as I think it would have been a bit of a distraction - it turned out most of our problems were our own doing and just needed better investigation. I think it's easy to fall into a "grass is always greener" mentality when it comes to tooling - you feel the pain points of what you're currently using, but don't necessarily know what new ones you might inherit if you switch...
Sorry - a bit of a long answer! I guess the TL;DR is, there are a bunch of tools out there and probably multiple that will do the job you want well. Before considering switching tools (potentially a long and painful process) you want to be really sure that the problems you're facing are because of your current tool and not a symptom of something else.
What is the average single test execution time?
This question comes at a good time as I've just finished going over all our Cypress tests and pruning/speeding them up so I have the answer to hand!
As things stand, we have 139 Cypress tests across our two webapps, and adding up the runtimes of each (across a few builds) gives a total execution time of around 17 minutes. So that would mean approximately 7.5s per test on average.
We parallelise the tests per build too, though - those 139 tests are effectively divided into four roughly even chunks taking around 4 minutes each. So, allowing for a bit of variation here and there (waiting for build agents and suchlike) we're currently in a position where the "end to end testing" stage of our CI pipeline takes around 5 minutes per branch.
ES and Kibana look very helpful - any tips on how to start integrating your automation with these tools?
They are pretty powerful tools, although I'm sure others would do the job just fine as well. The important thing is that you have the logs somewhere so you can diagnose what's going on when your tests fail.
In terms of getting started with ES/Kibana, there are various hosted solutions (be it in AWS, or native Elastic Cloud) that get you up and running quickly - and they usually include Kibana out of the box too. Once you've got that up and running, you just need a way to post logs to your ES instance - since our services run in Kubernetes we're using a gadget called Filebeat for that purpose, but again there are other things we could have gone with.
As I showed in the presentation, getting data on which tests are failing comes down to some kind of "on failure" hook which does an HTTP POST to your Elasticsearch instance with the relevant info.
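As a rough sketch of what that might look like in a Cypress support file (the index name, URL and exact fields here are illustrative, not our real setup):

```ts
// After each failed test, POST the details to an Elasticsearch index
// so they can be explored in Kibana.
afterEach(function () {
  const test = this.currentTest;
  if (test?.state !== 'failed') {
    return;
  }

  cy.request({
    method: 'POST',
    url: 'https://your-es-instance:9200/test-failures/_doc', // illustrative
    failOnStatusCode: false, // a logging hiccup shouldn't fail the build
    body: {
      testName: test.fullTitle(),
      error: test.err?.message,
      timestamp: new Date().toISOString(),
    },
  });
});
```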
How much trust should we have in those flakey tests?
Flakey tests definitely erode trust, which is why it's so important to fix them before there are too many. With any flakey test, though, it's also worth remembering that:
The original intent behind it was good - it's attempting to add coverage to some flow in your system.
At the time it was written, it presumably worked well enough to make it into main - so it should be salvageable now it's (slightly) broken.
Tempting as it might be, just removing the flakey tests is not the way forward!