Adding visual testing to an existing functional UI test suite. Will the tests get too bulky?

I’m looking into adding some visual checks, with something like Applitools, to my functional UI tests so that the tests validate more and pick up any visual issues along the way.

I can’t decide whether to slot them into existing functional tests or maintain a separate ‘visual’ test suite…

  • On the one hand, once you’ve set up the visual testing tool with your existing UI test framework, it’s quite easy to add visual assertions to your functional tests. For lengthy end-to-end tests you could get good ‘value for money’ in terms of how many screens you could check in a single test.

  • On the other hand, your test results become harder to interpret. You could for example have a login test that fails visually but passes functionally, and all you would get is a single test failure.

Thoughts?

At the moment I’m leaning towards separate test cases that re-use the same code that drives the app for the functional tests.

N.B. I am using BDD, so I would like to avoid polluting my scenarios with stuff like ‘Then the user validates the login screen looks correct’ for every visual check.

Many thanks,

We use a BDD style format and have a separate directory for visual checks within the same test pack.
We use Cucumber and tag our visual regression tests with a different tag from our functional regression tests, so we run them separately but they can share the same code.

We’ve just changed visual test provider, so it’s given me a chance to refactor tests I’ve never had a chance to go near. We have a “then” step to take a screenshot and give it a label, but it’s written individually for each test, which is terrible really as it’s basically the same every time. I’m working on parameterising this so it can be a shared step, and then hopefully re-using the majority of the step definitions from our functional regression tests to avoid duplication.

I hope that helps. I’m happy to try and answer any questions if you have any.

Good luck with your refactoring.

Thanks marissa. Interesting.

So you do have some duplication of tests in the sense that a visual test and a functional test may cover the same ground via the automation tool?

Also what does your actual visual validation step read like in BDD? How do you make it unique to that particular screen you want to check?

Do you also repeat visual tests for similar tests? For example say a test navigates from screen A to B to C. You then have a test that navigates from A to B to C to D to E. Do you still do those same validations on the longer test even though they are covering the same ground between A and C?

Many thanks,

Hi @konzy262 - long post, sorry!

Yes, our visual and functional automated tests definitely cover the same ground. We don’t run the visual tests every time; depending on the changes made, we might choose to run just the functional ones. The duplication we’re trying to avoid is in the Cucumber step definitions and the methods we use, not in the features.

I don’t know that this is the correct way to do it, but this is what we currently have and how I’m working on changing it.

Previously a very simple visual testing feature we had looked like this:
(We use Cucumber/Gherkin/Ruby.)

Scenario: Take screenshot of homepage
Given a user visits the homepage
Then a screenshot is taken of the homepage

Given(/^a user visits the homepage$/) do
  method_to_visit_homepage
end

Then(/^a screenshot is taken of the homepage$/) do
  method_that_takes_screenshot_and_gives_it_a_label('homepage_label')
end

I’m aiming to remove a lot of code by replacing the ‘then’ with something more common like this:

Scenario: Take screenshot of homepage
Given a user visits the homepage
Then a screenshot is taken of the "homepage"

So the ‘Then’ will now be defined somewhere separate from the other steps, and “homepage” is the value of the label parameter in this case, but it could be a different value for another test. The ‘then’ may become an ‘and’, I’m not sure yet.

Then(/^a screenshot is taken of the "([^"]*)"$/) do |label|
  method_that_takes_screenshot_and_gives_it_a_label(label)
end

We also have a LOT of step definitions that are only used in the visual tests but are the same as the functional ones. Using the example above, we might have ‘Given a user visits the homepage’ for the visual test and ‘Given the user visits the homepage’ for the functional tests, even though they share the same code. I’m hoping to fix that too where I can because it’s totally unnecessary!
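One way to collapse two near-identical wordings into a single step definition is a regex that accepts both. This is only a sketch: the constant name and helper below are made up for illustration, and the plain-Ruby check just shows that both Gherkin lines would match the same pattern.

```ruby
# Hypothetical shared pattern (not from the original step files): one regex
# that accepts both wordings, so a single step definition can serve both suites.
VISIT_HOMEPAGE = /^(?:a|the) user visits the homepage$/

# In a Cucumber step file this pattern would be registered once, e.g.:
#   Given(VISIT_HOMEPAGE) { method_to_visit_homepage }

# Plain-Ruby helper to confirm which step texts the shared pattern accepts.
def same_step?(text)
  VISIT_HOMEPAGE.match?(text)
end
```

With this in place, both ‘a user visits the homepage’ and ‘the user visits the homepage’ resolve to the same definition, so one of the two duplicates can be deleted.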

Using your example, we do repeat steps in tests. For instance, we have a test to check page C, and it needs to go through pages A and B first to get there. We will just have a single visual test feature to check C, but within the steps it will go through A and B without doing a visual check. We will have separate tests that will do a visual check of A and another that will do a visual check of B.
Currently they don’t always use the shared steps, but I’d like them to. They don’t take any screenshots for the visual check until the final step.
I don’t know which is better in this case and it’s not something I’ve looked into. Perhaps we should do A, B and C all in a single feature and do visual checks at each stage. :woman_shrugging:

I hope all of that made sense!

Yeah, that sounds really good, thanks for that.

I think it’s probably better to have separate tests for each visual check. You could combine a bunch in the same test, but an early fail may hide an error later on. For example…

Test 1 - A → B (Check fails) → C → D - Are C and D okay?

This is assuming your test runner stops when hitting the failure. I guess you could configure it in a way where all the data is collected and asserted on at the end so it at least navigates through the entire test. Seems like it would be harder to report on.
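The “collect everything and assert at the end” idea could look roughly like the sketch below. The class and method names are hypothetical, and the block passed to `check` stands in for whatever visual comparison the tool performs.

```ruby
# Hypothetical helper: records visual check failures instead of stopping the
# scenario, so the test still navigates through every screen.
class VisualCheckCollector
  attr_reader :failures

  def initialize
    @failures = []
  end

  # Runs the given check; if it raises, stores the error under the screen
  # label and lets the scenario carry on.
  def check(label)
    yield
  rescue StandardError => e
    @failures << "#{label}: #{e.message}"
  end

  # Call once at the end of the scenario to fail it if anything was recorded.
  def assert_all_passed!
    return if @failures.empty?

    raise "Visual checks failed:\n#{@failures.join("\n")}"
  end
end
```

Because each failure is stored with its screen label, the final error message can at least say which screens differed, which goes some way towards the reporting concern.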

I was thinking something like…

Test 1 - A - Check A
Test 2 - A → B - Check B
Test 3 - A → B → C - Check C
Test 4 - A → B → C → D - Check D
etc…
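The list above follows a simple pattern: each test walks a prefix of the journey and checks only the last screen. A small sketch of generating that plan, with the screen names taken from the example and the method name made up for illustration:

```ruby
# Hypothetical journey; the screen names come from the example above.
SCREENS = %w[A B C D].freeze

# Builds one test per screen: navigate through every screen up to and
# including it, then run the visual check on that last screen only.
def visual_test_plan(screens)
  screens.each_index.map do |i|
    { navigate: screens[0..i], check: screens[i] }
  end
end

plan = visual_test_plan(SCREENS)
plan.each { |t| puts "#{t[:navigate].join(' -> ')} - Check #{t[:check]}" }

# Total screens visited across all tests, compared with one end-to-end run.
total_visits = plan.sum { |t| t[:navigate].size }
puts "#{total_visits} screen visits in total, versus #{SCREENS.size} for a single run"
```

The trade-off is the extra navigation: the number of screen visits grows as 1 + 2 + … + n rather than n, which is the price of isolating each check.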

So for the visual checks there aren’t actually any expectations in the test itself. It just takes a screenshot, which is then added to our visual testing tool (Applitools, Percy, etc.) to check against a baseline image. The comparison is reviewed separately.
From what I can tell, you can have multiple screenshots in the same test sequence as long as they’re labelled differently.
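If labels do need to be distinct within a test sequence, a thin wrapper can catch accidental re-use early. This is a sketch under that assumption: the class is hypothetical, and the block passed to `new` stands in for the real screenshot call of whichever tool is in use.

```ruby
require 'set'

# Hypothetical wrapper; the block given to `new` stands in for whatever
# screenshot call the visual testing tool exposes.
class LabelledScreenshots
  def initialize(&take)
    @take = take
    @seen = Set.new
  end

  # Takes a screenshot under the given label, refusing labels already used.
  def capture(label)
    # Set#add? returns nil when the label was already present.
    raise ArgumentError, "duplicate screenshot label: #{label}" unless @seen.add?(label)

    @take.call(label)
  end
end
```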

I don’t know if it’s applicable or possible, but could you add conditional logic to your tests based on tagging? That way the visual test steps can be part of the original test cases but only get invoked when a specific tag is referenced during test execution. This reduces duplication and still allows you to choose when to execute the visual tests.

In terms of BDD, hopefully there is a way to chain the THEN clause with ANDs, where the AND is optional and only gets “inserted” when the visual test tag is passed?
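A rough sketch of the conditional idea, swapping the tag for an environment variable because that is easy to show in plain Ruby. The variable name `VISUAL_CHECKS` and both method names are assumptions, and the block stands in for the real screenshot call.

```ruby
# Hypothetical switch: visual steps only do work when VISUAL_CHECKS=true.
# `env` defaults to the process environment but can be any hash-like object.
def visual_checks_enabled?(env = ENV)
  env.fetch('VISUAL_CHECKS', 'false') == 'true'
end

# Takes the screenshot via the given block when visual checks are switched on;
# otherwise does nothing, so the step can stay in the functional scenario.
def maybe_screenshot(label, env: ENV, &take)
  return :skipped unless visual_checks_enabled?(env)

  take.call(label)
  :taken
end
```

A shared screenshot step could call `maybe_screenshot(label)` so that the same scenario runs as a purely functional test by default and as a functional plus visual test when the switch is set.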

Over the last week I’ve been thinking about the best way to use visual testing. Here are some thoughts, which might still need polishing.

Assuming the functional test has passed, a visual test must then be executed. As I see it, there are basically two ways to do visual testing:

  • check one screen per scenario.
    If there are 4 screens involved, then there are 4 scenarios.
  • check all screens in 1 scenario.
    If there are 4 screens involved, then there is 1 scenario.

In the following sections I will look at an interaction of a user with the system using 4 screens.

Check one screen per scenario
Advantages:

  • because only 1 screen is tested, the Given When Then construction is clear.
  • if a difference is spotted by the tool, then the tester can determine the screen in question right away.

Disadvantages:

  • the execution time can double. The navigation through the system is fast, but extra time might be needed for setup and clean up.
  • the maintenance of four scenarios is relatively high compared to the other option. More scenarios also mean more places for errors to creep in.

Check all screens in 1 scenario
Advantages:

  • 4 screens are tested on visuals in 1 run.
  • there is only 1 scenario, so there is low maintenance.
  • the execution time is minimal.

Disadvantages:

  • if a difference is spotted, then the tester has to sift through the results. A tool like Applitools provides a portal with an overview of all the visual comparisons per scenario.
  • the Given When Then construction is used for 4 screens. This can look awkward to read. From a business point of view the only concern is to assure that all screens still look the same.