How do you tell how good an Automation implementation is?

Question:

How do you tell a good #Automation implementation from a bad one? What if it’s about #AutomationInTesting?

Any suggestions?

1 Like

I can think of a lot of ways to tell that an automation implementation is bad, but very few, if any, to confirm that it’s good, except when the team decides that it’s good.

For example, a bad automation has flaky tests.
A bad automation has poor code coverage.

But the converse is not necessarily true. Having no flaky tests does not mean that the tests are good.
Good coverage doesn’t normally mean that the tests are good either. Case in point: “great” code coverage, in my experience, usually means the opposite, that the team has sacrificed good testing for good coverage numbers.

Most metrics that you can throw at a project can be gamed.
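For instance, here is a contrived sketch (pytest-style, with made-up names) of how coverage gets gamed: the check below runs plenty of code, so the coverage number climbs, but it asserts nothing, so it can never fail.

    def apply_discount(total, code):
        # Stand-in for real application logic.
        return round(total * 0.9, 2) if code == "WELCOME10" else total

    def test_discount_is_applied():
        # Runs the code, so the coverage report looks great...
        apply_discount(100.0, "WELCOME10")
        # ...but with no assertion (e.g. assert result == 90.0),
        # this "test" passes no matter how broken apply_discount is.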

So the only real indication that I have is this: if the team is satisfied with the automation, then it’s probably good. It’s at least good enough.

(Note: if the team is invested in the product rather than in gaming the system, a lot of metrics can be valuable in improving the automation. But in my experience it more often comes down to a people problem.)

4 Likes

Thank you, Brian, for sharing your thoughts. Yes, it is easy to tell a bad one, but it’s not as easy as we’d hope to recognize a good one.

With that said, it can be really helpful, when working towards good automation, to list the bad things to avoid.

I’m sure in this community we could come up with, or borrow from others, some good principles to follow, e.g.:

DRY (Don’t Repeat Yourself)
KISS (Keep It Simple, Stupid)
Lean approach (value-focused)
MVP (Minimum Viable Product)

At the fundamental level (so thinking in terms of different measures from the ones above)… Is it repeatable? Is it adaptable? Is it easy to extend? Is the code that underpins the framework maintainable? In my current area of interest, Selenium and Cucumber (tied together with existing shell-script test harnesses), I arrived at the conclusion that it should be faster, for example, to add a new web screen to the framework than it is to write good Gherkin test cases for it.

Edit: I hit the first three of those. When I looked at changing how the underlying framework maintains state, a couple of days back, I found forty-odd places where I needed to change code, so I bombed on the fourth :-/.
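A rough sketch of what that “quick to add a new screen” goal can look like with a page-object layer (Python Selenium bindings assumed; the names are hypothetical). The idea is that shared plumbing, including state handling, lives in one base class, so the forty-odd-places problem becomes a one-place problem.

    from selenium.webdriver.common.by import By

    class BasePage:
        # Shared plumbing (driver handling, waits, state) lives here,
        # so changing it does not mean touching every screen.
        def __init__(self, driver):
            self.driver = driver

        def type_into(self, locator, text):
            self.driver.find_element(*locator).send_keys(text)

        def click(self, locator):
            self.driver.find_element(*locator).click()

    class CheckoutPage(BasePage):
        # Adding a new screen is mostly declaring its locators and actions.
        COUPON_FIELD = (By.ID, "coupon")
        APPLY_BUTTON = (By.CSS_SELECTOR, "[data-test-id='apply-coupon']")

        def apply_coupon(self, code):
            self.type_into(self.COUPON_FIELD, code)
            self.click(self.APPLY_BUTTON)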

Time spent maintaining tests is an important factor. Are we spending too long investigating failures? Is it down to our test code? Or has the feature we are testing not been built with testability in mind?

Does a passing test actually mean something to us? Is it validating that nothing has changed in important customer functionality? Conversely, does a failure mean something? Is it valuable information?

Are we getting feedback quickly enough? If there are issues, are we or developers jumping on them right away? Or do our tests take too long to run and we’ve jumped onto something else by the time they’ve finished?

1 Like

If you want to look at it from an implementation point of view, you can basically evaluate it as you would any product. How well does it meet the customer’s expectations? (customer satisfaction). How fast is the turnaround on adding new value? (time to market). How much does it cost you? (return on investment). As a general note, never forget that those are your main parameters. Maintainability, flaky tests, false positives and false negatives are just potential indicators towards those areas. They may not be the most important ones or the correct ones, and if you focus too much on them you may forget the big picture.

But then I find it is common that you don’t identify what the customer’s needs are, or that you have really bad strategies for deciding what should and shouldn’t be automated, and that those are bigger problems than the actual implementation of your code. The “is it right?” vs. “is it the right it?” question.

Good luck.

1 Like

Given the time, budget and resource constraints, is the automation approach giving me a reasonable advantage over:

  • not doing any automation and using a different strategy to come to the same results;
  • not doing anything at all and accepting the possibility, risk and impact of something going wrong;
  • a different automation approach (different level, tools, angle, implementation)?

You can just play this game in your head and with the PM/PO/Devs/Stakeholders, each time you automate something with higher cost/impact:

  • Worst-case scenario of doing it, e.g. you take resources away from development or testing of other features that could bring in money or avoid losing it;
  • Worst-case scenario of not doing it, e.g. a bug appears in prod, is found by other systems or people, and has an estimated impact cost of about 10k; how often could that bug, or similar bugs with higher impact, appear? Compare that with a cost of, say, 50k to automate something that would have caught the bug sooner (see the sketch after this list);
  • Worst-case scenario of automating in X way vs. Y way;
  • Worst-case scenario of automating vs. a different strategy that arrives at a similar result.
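A back-of-the-envelope version of that 10k-vs-50k comparison, with all figures purely illustrative:

    # All numbers are hypothetical, matching the rough figures above.
    bug_impact_cost = 10_000     # estimated cost of one such bug reaching prod
    bugs_per_year = 2            # rough guess at how often similar bugs appear
    years_of_useful_life = 3     # how long the checks stay relevant
    automation_cost = 50_000     # cost to build and maintain the checks

    expected_loss_avoided = bug_impact_cost * bugs_per_year * years_of_useful_life
    print("Expected loss avoided:", expected_loss_avoided)   # 60000
    print("Cost of automating:   ", automation_cost)         # 50000
    # A close call like this is exactly the kind of thing worth a short
    # discussion with the PM/PO/devs rather than automating by default.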

In my last team we were doing these exercises often.
We don’t just implement automation because it could someday find something. We weigh it against the other factors and briefly discuss it; sometimes that takes a minute, sometimes it could take hours…
And everyone in the team automates something, for their own benefit or for product monitoring: logging system optimizations, DB work, graphs, e-mail notifications for errors, API schema checks, API data checkers, integration or unit checks, etc.

If the automation ‘thing’ was already implemented, you have a few options:

  • is it used, or could it be used? If not, delete it or archive it; it’s useless now.
  • if it is still used, what results does it bring: are they relevant?
  • can it be better? How expensive would it be to modify? What’s the worst-case scenario of not doing anything?
  • is anyone still working on it? Re-evaluate and estimate the risk of not doing so, compare that with the cost of doing it, and decide whether it needs to be cancelled, stopped or taken in a different direction.
2 Likes

There are a lot of reasons why automated testing is beneficial, and by following these best practices you can ensure that your testing is successful and you get the maximum return on investment (ROI):

  1. Decide what Test Cases to Automate
  2. Test Early and Test Often
  3. Select the Right Automated Testing Tool
  4. Divide your Automated Testing Efforts
  5. Create Good, Quality Test Data
  6. Create Automated Tests that are Resistant to Changes in the UI
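For point 6, one common tactic (a sketch; it assumes the application exposes dedicated data-test-id hooks) is to prefer stable test hooks over layout-based locators and to keep them in one place, so a UI change means one edit rather than dozens.

    from selenium.webdriver.common.by import By

    # Brittle: breaks whenever layout or styling changes.
    SUBMIT_BRITTLE = (By.XPATH, "/html/body/div[2]/div/form/div[3]/button[1]")

    # More change-resistant: a dedicated hook that survives restyling,
    # kept in one module so a UI change means one edit.
    SUBMIT_STABLE = (By.CSS_SELECTOR, "[data-test-id='submit-order']")

    def submit_order(driver):
        driver.find_element(*SUBMIT_STABLE).click()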

You answer 4 questions:

  1. What?
  2. How?
  3. Why?
  4. How difficult was it to answer the first three questions?

“What:” What exactly are all the checks checking for or, to put it another way, telling us? Not approximately; exactly.

“How:” How is each check manipulating the application; how is it achieving its goal? Also, how is the solution implemented? Does it have any code smells? Does it follow good development practices?

“Why:” Why does it exist? What is the problem it’s trying to solve? How does that problem relate to the real testing needs, independent of the solution?

If it’s really difficult to answer these questions, that’s a bad sign. If it’s very easy to ascertain the answers, it probably means the solution was written and maintained well and has kept its focus.

Source: my own journey of answering these questions on a very large Selenium suite, 100,000 lines of code with thousands of Gherkin Given/When/Then lines, which, as I learned, do not at all guarantee that the implementation of the Gherkin matches what the Gherkin says.
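A contrived sketch of the kind of mismatch meant here, written as a behave step in Python (the page helper is hypothetical): the step text promises an exact verification, but the implementation quietly checks much less.

    from behave import then

    @then("the order total is {expected:d}")
    def step_order_total(context, expected):
        # The Gherkin line reads like an exact check of the total...
        total_text = context.page.order_total_text()  # hypothetical page helper
        # ...but this only verifies that *something* is displayed.
        # A reader of the feature file would assume:
        #     assert int(total_text) == expected
        assert total_text != ""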

2 Likes

When you change important stuff in the SUT, it will scream at you exactly where the change was, very fast.
And if you want to change the oracle for this important thing, you can do it with ease.

The more precise it is, the faster it is, the more important the thing, and the easier it is to modify the oracle, the better.
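A minimal sketch of what that can mean in code (names made up): keep the oracle for the important thing in one obvious place, so a deliberate change is a one-line edit and an unexpected change is a precise, fast failure.

    # The oracle for an important business rule, kept in one obvious place.
    EXPECTED_VAT_RATE = 0.20

    def check_vat_rate(sut_client):
        actual = sut_client.get_vat_rate()  # hypothetical SUT accessor
        assert actual == EXPECTED_VAT_RATE, (
            f"VAT rate changed: expected {EXPECTED_VAT_RATE}, got {actual}"
        )

    # When the rate legitimately changes, updating the oracle is one edit.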

1 Like

This sounds like an interview question. I guess it would be mostly the same as “good” software.

1 - Easy to understand the code.
2 - Clearly shows you the business logic.
3 - Easy to maintain/update.
4 - Modular.
5 - Clearly tells you what the problem is & where it might be.
6 - Can handle parallel execution.
7 - Provides the right kind of abstractions.
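Point 5 is often the cheapest to get right. A small sketch (hypothetical names): the failure message names the expectation, the actual value and where to start looking.

    def assert_order_status(order_id, actual, expected):
        # When this fails, the report says what broke, for which order,
        # and where to look first, with no log spelunking needed.
        assert actual == expected, (
            f"Order {order_id}: expected status '{expected}' but got '{actual}' "
            f"(check the /orders/{order_id} API response)"
        )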

1 Like

I see good comments here. Another way to look at it, from a test-case-writing, test-framework-design and overall test-maintenance perspective, is whether the following can be achieved:

  • can you change the underlying automation tooling without affecting the tests? (i.e. changing tooling should not require rewriting tests)
  • can you write tests without going into low-level details of clicking this, typing that, mousing over this, etc.? Rather, you write higher-level function/method calls, with the actual low-level logic abstracted away and handled by the framework.
  • are tests written such that you can easily tweak things without major changes (like having to find a new element locator value for the tweak)? As an example, what if I want to click the color blue instead of red, or select 5 items instead of 3? Can you just change “red” to “blue” or 3 to 5 in the test and have the framework handle the logic change, or do you need to update locators or add steps? For a simple tweak like this, a good framework is designed with templated, iterative logic and locators, where you can substitute in values like colors and numbers instead of relying on statically defined locators and fixed iterations.
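A sketch of that last point (hypothetical names): the test only supplies “blue” or 5, and the framework turns the value into locators and iterations.

    from selenium.webdriver.common.by import By

    # Locator template: the test supplies the value, the framework builds the locator.
    COLOR_SWATCH = "[data-test-id='color-{color}']"
    ADD_ITEM = (By.CSS_SELECTOR, "[data-test-id='add-item']")

    def select_color(driver, color):
        driver.find_element(By.CSS_SELECTOR, COLOR_SWATCH.format(color=color)).click()

    def add_items(driver, count):
        for _ in range(count):
            driver.find_element(*ADD_ITEM).click()

    # In the test, changing "red" to "blue" or 3 to 5 is the whole change:
    #     select_color(driver, "blue")
    #     add_items(driver, 5)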

Not to forget: whether the business reads your automatically generated reports and uses them. You may have the best automation setup in the world, but if the business doesn’t read the reports, it’s useless.

1 Like

A bit late to the party.

But I have a good answer for this:
Calculate the ROI.

That will give you all the information you need.

If it took you 7 months to build an internal testing framework and it’s not very reliable or user-friendly, it definitely has a terrible ROI.

Could you illustrate, with an example, what helps one calculate the ROI, please?

I can see time is part of the formula from your high-level description.

How does it look with “not very reliable or user-friendly”? Is it going to be a rate? A score? How do you measure one unreliable implementation against another?