Nightly builds for UI automation

Hey all.

I'm currently working in an environment that is trying to introduce a nightly build to run our UI automation against. Currently we run unit tests on merging to master, but not UI automation. The aim is to run the automation each night (it takes 3-4 hours, about the same time as our unit tests :frowning: ).

Automated UI and manual testing is run in sprint, targeted to the areas impacted. The aim, I'm told, is to find things that have been caused by merge issues and/or things that are missed by squad-targeted testing and full automation runs.

I'm of the view that a nightly build plus automation is an anti-pattern - a common path, but a better path exists.

My view is: fix the underlying issues - e.g. why squads aren't able to target effectively, why we find issues after merge, etc.

Doing it nightly means you have multiple teams' changes at once, you need a team or teams to rotate reviewing the results, and you have to find the team to fix issues that usually aren't obviously linked to their changes.
+ve
Nightly would find things faster than waiting for a decision to release master
Reduces the time people spend building on top of broken software compared to running automation only on release
Failures from a flaky environment/flaky tests become more obvious

-ve
Always need a resource to review failures - if no one looks, they don't provide value
Teams fighting over whether it was them or another team
Lower priority to review failures
Flaky tests will waste a lot more review time, or else results/failures start to be trusted less - aka 'that one always fails'
Teams are made up of humans; if they know a nightly build is being run, they are less likely to run wider testing prior to merge

Personally I think we should invest the time in solving the issues around why targeted testing is so hard (e.g. tagging in the automation), and also invest in getting automation to run on merge against tests that are assigned to areas of code - e.g. change the method around sign-up and, automatically on merge, it kicks off the sign-up tests, something like the sketch below.
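To make that concrete, here's a rough sketch of the kind of thing I mean - a script the merge pipeline could run. It assumes the UI tests are tagged with pytest markers per area; the paths, marker names and AREA_MAP are made-up examples, not our real setup.

```python
# select_and_run_ui_tests.py - hypothetical sketch: on merge, run only the UI
# tests tagged for the areas of code that actually changed.
import subprocess
import sys

# Illustrative mapping of source areas to pytest markers (names are invented).
AREA_MAP = {
    "src/signup/": "signup",
    "src/checkout/": "checkout",
    "src/profile/": "profile",
}

def changed_files(base: str) -> list[str]:
    """Files changed between a base commit and HEAD."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def markers_for(files: list[str]) -> set[str]:
    """Map changed paths to the test markers assigned to those areas."""
    return {
        marker
        for path in files
        for prefix, marker in AREA_MAP.items()
        if path.startswith(prefix)
    }

if __name__ == "__main__":
    # The base to diff against depends on the pipeline; HEAD~1 is a placeholder.
    base = sys.argv[1] if len(sys.argv) > 1 else "HEAD~1"
    markers = markers_for(changed_files(base))
    if not markers:
        print("No mapped areas changed - skipping targeted UI checks.")
        sys.exit(0)
    # Run only the matching subset; a non-zero exit code fails the pipeline,
    # which is what makes the checks a blocker for the merging squad.
    result = subprocess.run(["pytest", "-m", " or ".join(sorted(markers)), "tests/ui"])
    sys.exit(result.returncode)
```

So changing something under src/signup/ would kick off just the sign-up UI tests on merge, and the full suite could still run less often.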

What are the community's thoughts?

Moving to nightly builds is usually a True North adventure for companies looking to reduce their cycle time. Automatic check suites are designed to pseudoverify a bunch of facts. When this happens is not necessarily important, but having the facts to hand more frequently is good for testing and may force improvement of your autocheck suite codebase. Fixing why your squads aren't targeting or why you find post-merge issues would be helped by more visibility. More frequent UI check runs give you more visibility (unless they're awful). So your solution is their solution. Problem solved.

Always need a resource to review failures - if no one looks, they don't provide value

You do always need a resource to review failures, but that's checks for you. What you're saying here is that you value access to the information less than the time it takes for a review. That'll depend on the quality of your UI checks. If you need to have that information at a certain time, to inform the people that matter of things they want to know, then you have two choices: review a check suite or attempt the checks yourself.

Teams fighting over whether it was them or another team

If your teams are fighting then you have a whole other problem that management needs to get on top of.

Lower priority to review failures

Why? If the build can't complete then that's a big blocker. It has to be someone's job to review these things.

Flaky tests will waste a lot more review time, or else results/failures start to be trusted less - aka 'that one always fails'

Good. This pain will force something that few companies do enough of: maintaining the development project your company launched (you may call it your UI automation). When you feel that pain you'll start to make it more efficient, and do some much-needed pruning.

Teams are made up of humans; if they know a nightly build is being run, they are less likely to run wider testing prior to merge

I think I might know what you're saying. I may go into it below.

Now if you were to ask me for arguments against nightly builds that are not continuous deployment (i.e. autocheck builds that run overnight but do not actually deploy) I'd probably bring up the stop/start issue. Testers may feel that testing an old build is pointless, so they begin a new build each day. They therefore may put their sessions into one day, or not start any testing at the end of the day. Frequent builds also mean that regression bugs are ever-present. There's much less sense of certainty in what we know if we keep changing things. We may also need more systems to control semi-finished development and that can introduce its own problems. If you are turning off code in the build for in-progress development then surely you'll find the vast majority of merge issues when you turn all the code on. Run UI checks for that, instead, and put it in the lap of the developing squad.
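To be clear about what I mean by "turning off code in the build": something like a feature toggle. A minimal sketch, assuming a made-up SIGNUP_V2 flag read from the environment:

```python
# feature_toggles.py - minimal sketch of hiding in-progress development behind a
# toggle so half-finished code can be merged but stays off in the nightly build.
# The flag name and both code paths are hypothetical.
import os

def is_enabled(flag: str) -> bool:
    """A toggle is on only when its environment variable is set to '1'."""
    return os.environ.get(flag, "0") == "1"

def sign_up(email: str) -> str:
    if is_enabled("SIGNUP_V2"):
        return f"registered {email} via the new flow"      # in-progress path
    return f"registered {email} via the current flow"      # released path
```

The catch is that most of the merge issues only appear when the toggle is flipped on, which is why I'd run the UI checks at that point and hand the results straight to the squad that owns the toggle.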

As for your closing ideas, I think that nightly builds could help the process improvement projects you're talking about. I don't see why you can't do both. A lot depends on your context, so it's hard to say exactly, but in general if there's more useful and reliable data to be had easily then I want it. If not, then we have to ask questions about cost-efficiency and pragmatism.


Thanks for your answer Chris, sorry it took me a while to get back to you. I think we're on the same page, kinda.

My post was listing the above as common issues with nightly testing of a build. I didn't really explain my view on a better pattern; I just hinted at it in the closing statement.

I liken nightly builds to doing only UI checks and no unit/integration checks - this is better than doing no tests/checks, so it's good, but it's an anti-pattern for automation versus doing checks at the unit, integration and UI levels - the ice cream cone.

My issue is that this pattern of nightly checks on a build, rather than checks at the right place at the right time, is an anti-pattern: it's a common approach, but a better solution exists, as you mentioned at the end and I hinted at. Test in the pipeline prior to merge, make the team doing the work responsible, and make failures a blocker.

Our devs want to have the information as quickly as possible, so we do not have nightly builds for UI ("nightlies").

Nightlies are usually done by the Ops team for testing deploys to production (aka preproduction) and testing various pipeline tasks, such as whether an instance can be compiled, built, etc.

We were indeed toying with the idea of having nightlies for UI automation as well, but then we realised the same thing as you - in a day there can be a lot of commits going through the pipeline, and when something fails it would be really hard to pinpoint the exact commit that caused it.

So we run UI automation on each commit to the main branch, and it runs alongside the other frontend and backend unit tests.

At the previous company I worked for, we began introducing more UI / end-to-end automation into the development workflow. For some projects, these builds significantly lengthened the feedback loop and the team wasn't happy with it.

The compromise made was to split up the test suite into different priorities and test types and only run a portion of the test suite when there were code updates. When new code was merged in, we'd run a subset of the tests that the QA team deemed as higher priority / higher risk. At night, we'd run the entire test suite to make sure other areas weren't affected by the day's updates. I feel like it worked well enough - it kept the feedback loop tight during normal working hours, and we'd get alerted to potential issues in other areas when we got back to work the next day.
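As a rough illustration of that split (the test names and markers here are invented, and it assumes pytest markers registered in pytest.ini):

```python
# test_checkout_ui.py - illustrative only: tag checks by priority so the pipeline
# can run a fast high-priority subset on merge and the whole suite overnight.
import pytest

@pytest.mark.p1
def test_guest_can_complete_checkout():
    ...  # high-risk flow: runs on every merge and in the nightly full run

@pytest.mark.p2
def test_saved_addresses_are_listed_alphabetically():
    ...  # lower-risk check: nightly full run only
```

The merge job then runs something like `pytest -m p1`, while the nightly job just runs `pytest` across everything.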

Ideally, I would have preferred a solution like you mentioned, where code changes would trigger a build for tests covering the areas of potential impact. That would work well for unit and API tests since those are typically more isolated, but it's incredibly hard to do right for UI / end-to-end testing since those tests cover a lot more ground.


I can understand that, Dennis. If your end-to-end UI automation suite was taking more than an hour to run, then it would roadblock things. It's very relative: sometimes anything more than 5 minutes per build is too expensive, but if the cloud deploy of your app takes 15 minutes anyway, that changes the answer significantly. You will always be running out of time even if you have an effective smoke test. Very often, less is more. My experience is that if you do create a very good smoke test that takes 1 hour and gives high confidence, everyone will want it triggered every night on every branch they own, even if running it clogs things up until 9am the next day. Optimize often.

You have to decide whether building a fake environment, and verifying just enough to know that the software can deploy successfully and log a user in, is enough to know with 80% certainty that the build is not toast - if it is, then that's a smoke test. Sometimes faking things and removing low-risk concerns in a quick and nasty check is a more valuable way to get confidence in a build.
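As a sketch of what "can deploy successfully and log a user in" might look like as a quick and nasty check - the URL, endpoints and credentials below are placeholders, and it assumes the requests library:

```python
# smoke_check.py - quick and nasty smoke check: is the deployed build alive, and
# can a known test user log in? Every URL and credential here is a placeholder.
import sys
import requests

BASE_URL = "https://staging.example.com"

def build_is_up() -> bool:
    """The app responds at all - enough to say the deploy isn't toast."""
    return requests.get(f"{BASE_URL}/health", timeout=10).status_code == 200

def user_can_log_in() -> bool:
    """One representative end-to-end action, not a regression run."""
    resp = requests.post(
        f"{BASE_URL}/api/login",
        json={"email": "smoke.user@example.com", "password": "not-a-real-secret"},
        timeout=10,
    )
    return resp.status_code == 200

if __name__ == "__main__":
    ok = build_is_up() and user_can_log_in()
    print("smoke check passed" if ok else "smoke check failed")
    sys.exit(0 if ok else 1)
```

If something that small gives you roughly 80% confidence the build is not toast, it has earned its place in the pipeline.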

Like, try shrinking the deployed environment somehow to run standalone on one VM, a bit like a system-in-a-teacup. By deploying all of the parts to one machine, you can speed up a deployment at the risk of adding some complexity, but that might be some integration knowledge you need to gain as a tester too. See post higher up by Kinofrost about this move to True North.