What steps do you take to reproduce automation failures?

Do you take snapshots, add breakpoints to code? We all have our go to when it comes to this situation, what’s yours?

1 Like

Automation failure or bugs appearing? :wink:

When an automated test fails, or when a flaky person wrote a test… and I have to debug it, I do it step by step. If it’s a UI test, I run it a couple of times first, just to see whether it’s flaky, and then I set breakpoints and see where it goes wrong (if it isn’t too obvious).

Snapshots can help, I suppose, but I haven’t used them too often.

2 Likes

I was initially thinking the former which could, of course, be caused by the latter :wink:

1 Like

Is this another question from a user on the MoT Slack? Did they give an overview of what they were trying to debug and which problems they faced? I think the answer would depend on what was being debugged. Since the question is generic, here is a generic answer.

1 - Repeat the test on one or more systems.
2 - Use breakpoints and debug step by step.
3 - Check your test data.
4 - Look for any timing issues like not waiting enough for something to happen.
5 - Check whether your systems and networks are slow today and are thus causing tests to fail.
6 - If it’s a new test with few runs, check it for logical errors.
7 - Are there any dependencies between tests i.e. order of execution matters?
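Steps 1 and 4 above often come down to the same question: does the check fail every time, or only sometimes? A minimal sketch of that idea (the helper and the deliberately flaky check are illustrative, not from any framework):

```python
import random

def rerun(test_fn, attempts=5):
    """Run one check repeatedly and tally pass/fail counts."""
    results = {"passed": 0, "failed": 0}
    for _ in range(attempts):
        try:
            test_fn()
            results["passed"] += 1
        except AssertionError:
            results["failed"] += 1
    return results

# A deliberately flaky check, just for illustration.
def flaky_check():
    assert random.random() > 0.3

print(rerun(flaky_check, attempts=10))
```

A mixed tally points at timing or environment issues; a 100% failure rate points at the test or the product.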

That’s all I can think of at the moment. I am not sure if my answer is helpful or actionable though.

1 Like

I think it’s often not possible to use snapshots in increasingly complex systems where services are deployed and it’s not feasible to roll back; in those cases detailed logging is the better option.

For small, easy-to-deploy systems, I love the snapshot technique: if you can fully install an old snapshot in under 15 minutes, do it. But normally all you are going to have is the ability to run the same check a few times over, with a variation. That’s why I like having the ability to swap tools; manual test execution is one of the more expensive tools, but ideally you want variations on the tool.

I’m a huge fan of writing the same test case twice in my system, each time from a different perspective. It’s most helpful to have a second version of the failing test that uses mocks or simulations instead of the real components. If both fail, you can narrow down the fault area a little more.
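A minimal sketch of that two-variant idea. `StubPriceClient` and the price check are hypothetical names invented for illustration; the point is that both variants share the same assertion, so a double failure implicates the shared logic rather than the real dependency:

```python
class StubPriceClient:
    """Stand-in that returns canned data instead of calling a real service."""
    def latest_price(self, symbol):
        return {"symbol": symbol, "price": 101.5}

def check_price_is_positive(client):
    # Shared assertion used by both variants of the test.
    quote = client.latest_price("ACME")
    assert quote["price"] > 0, f"non-positive price: {quote}"

def test_price_with_stub():
    check_price_is_positive(StubPriceClient())

# The second variant would pass the real client instead, e.g.:
# def test_price_with_real_service():
#     check_price_is_positive(RealPriceClient(base_url="..."))  # hypothetical

test_price_with_stub()
print("stub variant passed")
```

If only the real-service variant fails, suspicion shifts to the environment, the network, or the dependency itself.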

2 Likes

Exactly Conrad! It isn’t always possible, but when it is I like to:

  1. Check logs: is there a smoking gun, or at least a clue?
  2. If possible, set a higher logging level and run the test again; failing that, add more logs.
  3. Run the automated test locally (I’m lucky I can do this in many cases thanks to Docker).
  4. Re-create the test manually: can I test around the issue and learn more?

I don’t do this just to scrutinise the test, although that is important; if we have found a bug, I want to learn as much as I can to help identify the root cause and a fix!

I usually stop short of debugging the product code myself, but I will bring a developer in to do that part with me.

3 Likes

Ah, the re-run with verbose logging enabled trick.

I use a home-brewed test framework, and one of the devs has now asked us for a single “global” commandline flag to allow running with detailed logging and artifact gathering across all components. Sometimes framework design makes digging for a cause easier. We could write the book on this I’ll bet @bencf1
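For a home-grown framework driven from the command line, that single “global” flag can be a one-argument affair. A sketch assuming the runner is a plain CLI script (the flag name and functions are hypothetical):

```python
import argparse
import logging

def build_parser():
    parser = argparse.ArgumentParser(description="test runner")
    parser.add_argument(
        "--debug-run",
        action="store_true",
        help="enable detailed logging and artifact gathering in all components",
    )
    return parser

def configure(args):
    """Map the single flag onto a global logging level."""
    level = logging.DEBUG if args.debug_run else logging.WARNING
    logging.basicConfig(level=level)
    return level

args = build_parser().parse_args(["--debug-run"])
configure(args)
```

Every component that uses the standard logger hierarchy then inherits the verbose setting with no per-test changes.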

1 Like

Nope, this is one that popped up in a call I had with someone a few weeks ago. I get inspiration for questions from all over. If I have the foresight at the time to note down the context, I will, but that’s not always possible in a conversation flow :grin:

I ask these types of questions to see what people do in their own personal settings as often that gives inspiration for people who are in other contexts or situations.

1 Like