Applitools Eyes - Accept or Reject Env-Specific Problems

When using Applitools Eyes (or other visual testing), if you get an unresolved batch because of a test-environment-only problem, do you Accept the change and create a new baseline with said problem or Reject the change and see it flagged again each day until the problem is addressed?

The visual testing paradigm is stretching my intuitions. But I’m thinking…if said problem is not expected to be addressed, mark that problem as a hotspot to ignore in the future and Accept the new baseline. If said problem is expected to be addressed, Reject the checkpoint. Continue rejecting said checkpoint until it is addressed.
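To make the rule of thumb above concrete, here is a minimal sketch in plain Python. It does not call the Applitools SDK; `EnvProblem`, `Action`, and `triage` are hypothetical names invented for illustration, and the only input is whether the problem is expected to be fixed.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the two outcomes described in the post.
    ACCEPT_WITH_IGNORE_REGION = "accept the checkpoint and ignore the affected region"
    REJECT = "reject the checkpoint until the problem is fixed"


@dataclass(frozen=True)
class EnvProblem:
    description: str
    expected_to_be_fixed: bool


def triage(problem: EnvProblem) -> Action:
    """Apply the rule of thumb: reject fixable problems, ignore permanent ones."""
    if problem.expected_to_be_fixed:
        return Action.REJECT
    return Action.ACCEPT_WITH_IGNORE_REGION


print(triage(EnvProblem("staging-only banner", expected_to_be_fixed=False)).value)
print(triage(EnvProblem("broken test data feed", expected_to_be_fixed=True)).value)
```

This only captures the single-difference case; it says nothing yet about what happens when a second, unrelated difference shows up in the same checkpoint.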

But I can see how my approach may not work once additional differences are detected. I’ll be held hostage by the env-problem, if I want to update my baseline b/c of an expected change in the checkpoint.


Feels like there are more than 2 variables here?

  • One of the problem causes will continue to have an impact as long as test code and data are not kept in the same branch and repo. Whenever the app code for a feature branch gets ingested, the test tool needs to keep its learnings/AI separate for that branch, and know when to prune. I’m assuming you have feature teams.
  • Likewise for environments, which might mean tests run in a different location or tests run on a different platform sometimes. If you have variation, you will always have “oracle” trust issues. Either only use consistent environments, or don’t run some test suites on the outlier hardware. This is always hard to do since we often have to support multiple platforms that customers use. It might make sense to only use AI on one of the platforms/environments that you decide is most suited for the AI/eyes/tool to execute in.
  • The “hostage to Reject, or Accept and re-baseline” question may be a gatekeeping issue, created by the tool or by the development process. Basically, if you or the tool is acting as a gatekeeper, it will eventually be ignored. If the developers don’t look at the results each morning/night, engage with them on why they don’t trust the results, and also on whether they can come up with a plan to solve the “reference” image checkpoint problem.
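As a rough illustration of the first bullet (keeping what the tool has learned separate per branch, and knowing when to prune), here is a toy in-memory model. It is not any real tool's API; `BaselineStore` and all of its methods are hypothetical, and the branch and image names are made up. The assumption is that a feature branch inherits baselines from its parent until it accepts its own, and that merging copies the accepted baselines into the target branch.

```python
from typing import Dict, Optional, Tuple


class BaselineStore:
    """Toy in-memory model of per-branch visual baselines (not a real tool's API)."""

    def __init__(self) -> None:
        self._images: Dict[Tuple[str, str], str] = {}  # (branch, page) -> image id
        self._parents: Dict[str, str] = {}             # branch -> parent branch

    def create_branch(self, branch: str, parent: str) -> None:
        self._parents[branch] = parent

    def accept(self, branch: str, page: str, image_id: str) -> None:
        # Accepting a checkpoint only changes the baseline of that one branch.
        self._images[(branch, page)] = image_id

    def lookup(self, branch: str, page: str) -> Optional[str]:
        # Walk up the parent chain until some branch has a baseline for the page.
        current: Optional[str] = branch
        while current is not None:
            if (current, page) in self._images:
                return self._images[(current, page)]
            current = self._parents.get(current)
        return None

    def merge(self, source: str, target: str) -> None:
        # Baselines accepted on the source branch follow the code into the target.
        for (branch, page), image_id in list(self._images.items()):
            if branch == source:
                self._images[(target, page)] = image_id
        self.prune(source)

    def prune(self, branch: str) -> None:
        # Drop everything stored for a branch that no longer exists.
        self._images = {k: v for k, v in self._images.items() if k[0] != branch}
        self._parents.pop(branch, None)


store = BaselineStore()
store.accept("main", "home", "home-v1")
store.create_branch("feature/new-header", parent="main")
print(store.lookup("feature/new-header", "home"))  # home-v1 (inherited from main)
store.accept("feature/new-header", "home", "home-v2")
print(store.lookup("main", "home"))                # home-v1 (main is unaffected)
store.merge("feature/new-header", "main")
print(store.lookup("main", "home"))                # home-v2
```

The point of the sketch is only that approvals are keyed by branch, so a change accepted on a feature branch does not disturb main until the merge.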

At the end of the day, the developers or the feature-owners need to actually approve a checkpoint and tie it to the branch the feature team is working on. The pain of how you make sure that the checkpoint follows the code when the feature merges into the main/master branch is a curious one. Otherwise, the moment it merges, it will cause a test failure in the target branch. I’m keen to know the answer there, since I’d like someday to use such a tool, but this problem frightens me because it does not scale well. I don’t think the test role should “own” this problem, but one thing we can own is trust in the tool. And my biggest suggestion there is to limit the tool to run only in consistent environments in your test farm, and use it as a visual detection tool, not as a tool for detecting edge cases and form factor issues, by removing those concerns from the suite run setup. Good luck Eric.


@conrad.connected, this is my first sprint with Applitools Eyes and my first visual testing experience. We are trying to start small. We are visually testing 4 web pages, daily, in one env (call it “staging”). The staging env is the most integrated and gets a nightly deploy of the latest version of the product-under-test from an integration branch shared by 7 dev teams. Your 2nd bullet resonates with this…visually test in one env.

The problem I attempted to articulate in the 3rd paragraph of my original post is likely a very pedestrian problem for visual testing. The “baseline” is the last accepted page appearance. The “checkpoint” is the latest page appearance. Said problem arises if the checkpoint includes two distinct differences from the baseline, and one of the differences is a bug and the other difference is an expected change that needs to be approved as a new baseline. I think one can only approve or disapprove a checkpoint in aggregate.
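Here is a small sketch of that two-difference situation, again in plain Python rather than the Applitools SDK. `Difference` and `can_accept` are hypothetical names, and the region labels are invented. It assumes approval really is all-or-nothing per checkpoint, and it shows one possible way out: ignoring the region that contains the known bug, as suggested in the original post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    region: str        # hypothetical label for the part of the page that changed
    is_expected: bool  # True for an intended change, False for a bug


def can_accept(differences, ignored_regions=frozenset()):
    """Aggregate decision: acceptable only if every non-ignored difference is expected."""
    remaining = [d for d in differences if d.region not in ignored_regions]
    return all(d.is_expected for d in remaining)


checkpoint = [
    Difference("header banner", is_expected=False),  # env-only bug
    Difference("pricing table", is_expected=True),   # intended change
]

print(can_accept(checkpoint))                     # False: the bug blocks the whole checkpoint
print(can_accept(checkpoint, {"header banner"}))  # True: the bug's region is ignored
```

The trade-off in this model is that an ignored region also hides any future real regression in that same area until the ignore is removed.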


Yeah, I used to maintain a GUI-based suite many years ago, and it was a one-man-band thing because it required specialized knowledge. Ultimately, because I was the only person driving the tool (the previous person left), it fell to me to fix all the tests. When branches or features broke a test, I had to deal with the pain, not the developers, mainly because the tool was too hard for them to fix their own tests and approve changes to the “models” we used. So that tool eventually got canned, mainly because the devs made rapid feature changes faster than I could adapt and update tests to deal with new pages - we had almost a hundred pages. I had a lot of stress about deciding which “branch” was correct at any one time, so I decided to only ever test against the main/trunk branch of the product.

So I think it’s a great experiment, see what it tells you, but always limit the tool to a small area of the web app, and make sure you can keep on top of things easily. You may in fact want to spend more of your time doing exploratory testing, which you won’t be able to do if you are busy doing maintenance, busy expanding automation and busy repairing automation.

I did some web app testing 2 years ago, and I wanted to use Eyes as well, but I found I got more bugs raised by actually exploring, and then spent ages in meetings getting the design and CSS agreed on and then fixed. Those are tasks that the Eyes tools and their like are not able to do. Just getting things fixed takes huge effort from a tester, effort that the tool does not help you with much. So asking it to test too deeply may really yield diminishing returns. But it looks like you have worked that out already.

ha ha, I wish. But this is the first test tool I have been excited about in a long time. Applitools uniquely solves one problem for us. At least it appears to so far.

It lets us run Selenium tests optimized for desktop Chrome, on desktop Chrome, export the HTML and CSS to Applitools, and then let Applitools render the UI on any platform we want and perform visual testing on it.

We are currently running one Chrome desktop Selenium script that merely walks through our website-under-test and sends each page’s resources to Applitools, where Applitools renders the pages on iOS and Android devices and takes new checkpoints. It actually works.

That said, to your point, no important bugs have been found yet. :(

it’s still early!


Oooh I love ambitious projects that are unique and push the skills.
Best of luck, but know always that nothing is easy.