🤖 Day 20: Learn about AI self-healing tests and evaluate how effective they are

Only ten more days to go on this challenge, and today we are going to examine the claims about Self-Healing Testings. The idea of Self-Healing Tests was one of the early claims for the use of AI in Testing, but there are three key questions we want to answer:

  1. What does Self-Healing Tests really mean?
  2. What are the risks of Self-Healing Tests?
  3. Is this a useful feature?

We know that not everyone is interested in learning new automation tools, so similar to yesterday’s task, there are two options and you are free to pick one (or both).

Task Steps

Option 1: This option is for you if you currently use a tool that claims Self Healing Tests or are interested in a deep dive into Self Healing Tests and have time to learn a new tool. For this option, the steps are:

  • What types of problems does your tool claim to heal? Read the documentation for your selected tool to understand what the tool means by Self-Healing Tests. Try to understand the types of test issues that the tool claims to heal and how this mechanism works.
  • Test one of their claims: Design a 20 minute time-boxed test of one of the Self-Healing capabilities of the tool. Run the test and evaluate how well you think the Self-Healing mechanism performed. Some ideas are:
    • If the tool claims to detect changed element locators, you could change the locator in a test so that the test wil fail, then run the tool and check how well the tool heals the failing test.
    • If the tool claims to correct the sequencing of actions, swap two parts of a test so that it fails, then run the tool and check how well the tool heals the failing test.
  • How might this feature fail? Based on the claim you were testing and assuming the self-healing was successful, how might the self-healing fail? Can you construct a scenario where it does fail?

Option 2: This option is for you if you are interested in finding out more about Self Healing Tests but don’t have time to learn a new tool. For this option, the steps are:

  • Find and read an article or paper that discusses Self-Healing Tests: This could be a research paper, blog post or vendor documentation that deals with the specifics of Self-Healing Tests.
    • Try to understand the types of issues with tests that the tool claims to heal.
    • If possible, uncover how the issues are detected and resolved.
  • How valuable is a feature like this to your team? Consider the challenges your team faces and whether Self-Healing Tests are valuable to your team.
  • How might this fail? Based on your reading, how might this Self-Healing fail in a way that matters? For example, could it heal a test in a way that results in the purpose of the test changing from what you originally intended?

Share Your Insights: Regardless of which investigative option you choose, respond to this post with your insights and share:

  • Which option you chose.
  • What your perceptions are of Self-Healing Tests (what problem does it solve and how).
  • The ways in which Self-Healing Tests might benefit or fail your team.
  • How likely you are to use (or continue to use) tools with this feature.

Why Take Part

  • Deepen your understanding of Self-Healing Tests: Maintaining tests through new iterations of a product can be challenging, so tooling that can reduce this is valuable. By taking part in this test task, you are developing a sense of what Self-Healing Tests actually means and how they can help your team.
  • Improve your critical thinking about vendor claims: When selecting tools to support testing, we are often faced with many fantastic sounding claims. This task allows you to think critically about the claims being made, their limitations, and how they might impact your team.

:rocket: Level up your learning experience. Go Pro!

6 Likes

Hey all,

I’m going with the Option 2.

I’ve read some articles over Self Healing Tests & they seem to be pretty helpful. The pain point that it claims to resolve is maintaining the reliability of automated tests. It achieves this through:

  • Identification of broken locators
  • Alternative locator strategies
  • Suggesting alternative locators/Automatic locator updates (self-healing)

Few months back there is an update happened to our AUT and most of the locators have been changed. For our existing regression suites, to fix all these locators manually took so much time. Now, while reading these articles I’ve felt like, it would’ve been helpful if we’ve known this at that time.

But everything comes with a potential downsides, here the main concerns are:

  • False Positives
  • Integration Challenges
  • Limited Scope

I’d be open to using/exploring these tools to improve our team’s testing efficiency. Also to handle the downsides, we’d have a tester review the changes, combine with manual testing as they might not handle complex logic & get proper knowledge, implement it at smaller scale before the actual usage.

10 Likes

Hi my fellow testers,

For today’s challenge I am reflecting on my previous experience of a self healing feature within Ranorex Studio.

What types of problems does your tool claim to heal?

In their release notes they describe their self-healing feature as being able to do the following “We’ve introduced an intelligent self-healing feature to slash your maintenance effort. Now, you can automatically rerun a failed test with a more robust object path. This feature is off by default, simply enable it and watch your failed tests heal themselves!”

Test one of their claims

I created a simple test in an older version of our software and then tried to run it in the latest version where some UI elements had changed. It failed immediately to click on a changed button, I tried adjusting every setting I could that was related to the self-healing feature and re-running the test but nothing worked.

How might this feature fail?

This feature failed straight away for me and I couldn’t get it to work, maybe it was just not ready for my use case

If you are interested in learning more of my experience then I recommend you check out my post in day 5 of this challenge 🤖 Day 5: Identify a case study on AI in testing and share your findings - #3 by adrianjr

9 Likes

That’s really interesting - almost all the literature I’ve seen about self-healing tests indicate this is the type of issue they “heal”.

4 Likes

Hello Everyone :pray:t5:

Day 20: Exploring Self-Healing Tests

Today, with only ten days remaining in the challenge, I delved into the realm of Self-Healing Tests. These tests, leveraging AI, purportedly offer automatic identification and resolution of issues encountered during test execution. To gain insights, I pursued Option 2, immersing myself in various articles and papers on Self-Healing Tests.

Insights on Self-Healing Tests:

Aspect Insights
Understanding Self-Healing Tests respond to the complexity of software development, employing AI to automatically address issues during testing, aiming to reduce manual intervention¹.
Issues Healed They address common issues like failure to interact with web elements due to locator changes, boasting adaptability for seamless test execution¹.
Detection and Resolution Utilising machine learning, Self-Healing Tests monitor system behaviour, analyse patterns, and execute predefined recovery actions, learning and adapting to changes².
Value to a Team Valuable for teams dealing with frequent updates, Self-Healing Tests can enhance reliability, reduce downtime, and save time and costs¹.
Potential Failures Despite adaptability, there’s a risk of misinterpreting changes, potentially altering test intents².
Likelihood of Usage Considering benefits and reduced manual intervention, I’m inclined to use tools with this feature, provided potential failures are managed³.

(1) testRigor AI-Based Automated Testing Tool
(2) Navigating Self-Healing Test Automation
(3) Self Healing Test Automation: A Key Enabler for Agile and DevOps Teams
(4) What is Self Healing Test Automation | BrowserStack
(5) How to Use Self-Healing in Test Automation to Reduce Flaky Tests

6 Likes

We don’t use any self-healing tools, so I went with option 2.

We all know that self-healing usually refers to UI automation. These tools usually look into similarities in page structure (there’s no element with the same attributes any longer, but maybe there’s an element with mostly the same attributes, or with slightly changed attributes), or try to identify element visually (structure might have changed, but element is supposed to look the same as it used to). So I was wondering if there’s anything available that would apply self-healing to API tests.

I was not able to find much. One tool vendor published a blog post about self-healing tests that mentions API. Quoting entire paragraph:

During API testing, an API endpoint changes, moving from /v1/user to /v2/user. A self-healing test suite could detect the HTTP 404 errors resulting from calls to the old endpoint, recognize the new /v2/user endpoint in the updated API documentation or response headers, and update its tests to use the new endpoint.

The lack of detail makes me think this is not something that tool is capable of and author of article has no actual experience in that area. There’s no discussion how tool is supposed to figure out that endpoint changed (is it going to randomly try various endpoints until one responds?). There’s also no acknowledgement that these v1 and v2 are there for a reason, and maybe blindly changing one to another is not the best idea. In fact, if entire API endpoint disappeared from the application, that’s very significant contract violation. I would expect a tool to fail the test until explicitly told “yes, we removed deprecated API, this is fine”.

I also found someone’s article about their tool for API testing. This is not really AI. As far as I gather, the tool contains a collection of tests and runs them all against endpoints described in OpenAPI spec. So if OpenAPI says that field is integer, then tool will try sending various integer and non-integer values; if OpenAPI says field is mandatory, tool will try sending request without it; you get the idea. It’s actually pretty cool. The “self-healing” here means that as your API changes, you give new OpenAPI spec to the tool, and it will generate new tests based on new spec.

I tried searching arXiv for self-healing API, but that didn’t return any relevant results.

3 Likes

Hello, @billmatthews and fellow learners!

I chose option # 2 and referred to four sources to learn about AI for self-healing tests.

My sources were:

Here is a mindmap with my detailed learnings from today:

I have also done a video blog on this topic with my detailed thoughts. Check it out here:

Thanks,
Rahul

5 Likes

Awesome job @parwalrahul
Great mind-map and a fantastic walkthrough video

2 Likes

I read through the other guy’s testaments and decided to ask my new friend Bing :smiley:
“As an automation engineer can you give me an evalution of Self-Healing Tests. Please look at benefits, issues, risks and any future changes”

Self-Healing Tests: An Evaluation

1. Benefits:

2. Issues and Challenges:

3. Risks:

  • Overreliance: Blindly relying on self-healing without understanding its limitations can lead to false confidence. It’s essential to strike a balance between automation and manual testing.
  • Complexity: Implementing self-healing mechanisms involves additional complexity, especially when dealing with dynamic UI changes.
  • False Positives: Self-healing may occasionally misinterpret changes as defects, resulting in false positives.
  • Learning Curve: Teams need to learn how to configure and fine-tune self-healing algorithms effectively4.

4. Future Changes:

  • Machine Learning Advancements: As machine learning techniques evolve, self-healing algorithms will likely become more sophisticated and accurate.
  • Integration with AI/ML Tools: Integration with other AI/ML tools can enhance self-healing capabilities.
  • Industry Adoption: Wider adoption of self-healing practices will drive further research and improvements.

In summary, self-healing tests offer significant advantages, but they require careful implementation and ongoing monitoring. Balancing automation with human judgment remains crucial for successful test automation in the DevOps model.

Being upfront, I am bit sceptical of the idea of ‘self-healing’ tests, but that is maybe because I focus on data testing, and self-healing tests seem to geared towards UI(that dark art of Automation!)

I am an advocate of the fail-first approach, find it, fix it, retest. This is mainly because by practically ignoring an issue(there are no bugs in software, only issues) then we might inadvertently change the behaviour and thus mask another issue(s).

I do use Selenium to drive screens, so in this respect of driving screens and adapting to unforeseen changes, I can see massive benefits, as long as the actual tests themselves are not compromised.

I would tread very carefully with something that purports to be self-healing. It might just mask a more serious underlying issue.
And we have all probably been there where shortcuts prove more costly than fixing the issue(s) there and then.

3 Likes

I worked at a vendor whose tool included a feature of self-healing tests a few years back, so my knowledge might be out of date. However, even then, the self-healing was useful. Though the tool was advertised as using machine learning, it only used ML for the visual checking. The self-healing was just good heuristics. The tool took in the entire DOM. If some element the test was wanting to use was not where it was expected, there were alrgorithms to search for it elsewhere in the DOM, or find something similarly named or sharing characteristics that might be the right one. It would use it and attempt to continue the test. Whether it worked or not, it made clear in the test report that another element was chosen.

I’m curious if advances in AI and ML tools mean that these products are actually using them for self-healing, does anyone know?

While this can help with tests failing due to changes in the UI or the code that aren’t actually “bugs” - it doesn’t mean you are going to avoid flaky tests. IME the most common cause of flakiness is poor design of the test code. Not enough abstraction, too declarative, not using good design patterns. And while I am a fan of low code, even if you aren’t writing much actual code, you still need to know good design practices and patterns, whether you’re writing production or test code.

4 Likes

Here is the quick summary of what I learned:
Definition of self-healing tests: Self-healing tests are automated software tests that can adapt to changes in the software interface using AI and ML algorithms.

How self-healing tests work: Self-healing tests can identify elements based on various attributes, not just specific identifiers. If an element is not found using the original method, the test will try to find a similar element using other attributes. Some self-healing tests can also learn from past executions and improve their strategies.

Benefits of self-healing tests: Self-healing tests can reduce maintenance, improve reliability, and speed up testing. They are especially useful for software development teams that use continuous integration and continuous delivery (CI/CD) methods.

Risks of using Self-Healing AI tools : There are several risks associated with using AI for self-healing tests, especially in the critical medical/hospital setting. Some of the key concerns include inaccurate AI decisions, lack of explainability and transparency, data security and privacy, regulatory hurdles, and over-reliance on automation.

3 Likes

Day 20

Find and read an article or paper that discusses Self-Healing Tests

Self healing architectures have been around for a while:

Perhaps where a circuit breaker disconnects an erroring dependency and then tests periodically to see if it is still problematic and when its rights itself, reconnect it.

Self healing tests seem a little shakier use case to me. A failing test is an invitation to explore (or so the cliche goes), so to heal it seems a little of a waste.

Self healing tests seem to focus on changing identifiers. This article on Browserstack focuses quite a lot on that problem:

There seems to be a couple of angles:

  • Using multiple identifiers for an element (xpath, id, css selectors) to allow greater chance of healing the test.
  • Issue identification and analysis. In this article it infers that a tool would tell you what has changed and then heal the test. So it wouldn’t be too opaque.

I suppose it would be a shame to fail your millions of tests on your overnight run because a developer renamed an element id. A team that works together probably has less need of this type of technology but maybe better for massive honking test suites for massive honking companies who can’t help but build huge integration suites for multiple teams?

How valuable is a feature like this to your team?

We have a separate class for identifiers so we can see any changes without having to dig into UI code. Its not perfect but I think it helps here. Also, we are a small team who try our best to work closely together.

How might this fail? Based on your reading, how might this Self-Healing fail in a way that matters?

  • After multiple changes would the test retain its original intent. Maybe one change wouldn’t be so bad, more of a culmulative thing?
  • It could keep dead tests that don’t really do anything alive for longer.

I think my biggest fear is that everyone becomes sloppy when looking at automation like this, self healing ourselves into a false sense of self satisfaction. I don’t see a high functioning development team needing this too much.

1 Like

Self-healing automation may be difficult for integration testing because the AI may not understand the full intend of the test. The same may apply for complex tests where the change of one parameter may falsify the test results.
TThere is also a high risk that the self-healing tool fixes something that is not broken (for example in a negative test), leading to invalid tests.

Self-healing testing could be useful for BDD automation with Cucumber / Gherkin Framework, where the test cases are written in plain text format following a specific keyword structure.
It could also be used if the tests are close to the code of the application, e.g. in unit testing.

To reduce risk, all “corrections” made by a self-healing automation tool should be reviewed.

2 Likes

Hey there :raised_hands:

For today’s task, I choose option 2.

I think I already wrote about self-healing in another day, but for this one I read this article: What is self-healing automation.

And what I extracted from this article is that the main action of self-healing feature is finding and changing locators.

If the system does not have specific locators, has dynamic ones, or there is no collaboration between the dev team with QA team the locators can be a really serious problem, there are ways for us to mitigate the problem with locators path, but it is not magic, so self-healing seems to be a very useful feature for those situations.

But if it is not the case, or the changes are more drastic as changing a screen from main view to an iframe, the AI cannot do anything from you as the changes in the project code won’t be limited to locators.

And the locator changes can be tricky, so this doesn’t change the obligation of someone to review the changes to see that if the test is still working as it should be.

Maybe, when cypress implements a tool to support it I’ll give a try, but for now, I don’t see any valuable use for me.

That’s it for today :wink:

2 Likes

For me this has often been the most significant challenge - if a test is “healed” how do i know this hasn’t changed the intent of my test…if i have to go and check then i might as well have fixed the test myself :slight_smile:

When we get to the point where the tool understands the intent of the test then i might feel more confident about self-healing tests. However, it’s also worth noting that I personally don’t encounter many situations where such minor changes to an app impacts the automation that often.

For self healing tests with A.I., I see value in this feature as we are suffering in automation a lot with regression test cases that fail every now and then. The so called flaky tests. Sometimes you don’t have grip on this and it is good that A.I. assists you in recognizing these flaky tests and come up with solutions. These A.I. proposed solutions are indicating where in the code this flakiness appears. These proposed solutions from A.I. are for you to judge if it solves the problem or not.

Sometimes the A.I. proposed solution solves the problem because you have to deal with e.g. timing issues. But that is not always the case. The problem can also be in the application code. With a tool like Replay.io, it helps to bridge the communication gap between the testers and the developers by offering an exact replication of bugs and their execution steps. See this video “Replay.io: The Future of Test Automation” (URL: https://www.youtube.com/watch?v=XxYZhS9Tkzw)

If a test fails due to a modification in the software, A.I. with the self healing test feature assists you also with detecting these issues and come up with suggestions. In this situation proposal can be more reliable, because the problems which need to be solved are locator issues, buttons no longer there or renamed or link to next page does not exist anymore.

For sure I would delegate this self healing tests feature to an A.I., as I see the A.I. as an assistant. It helps me to quickly guide me to a possible outcome that can beneficial. As I don’t know how the A.I. comes with this possible outcome, it is for me to judge if it is correct and useful.

I mentioned in Day 6 a tool called Functionize (You build great products, we take care of the tests). I checked out their web site and I’m interested in looking how to integrate this in my daily work. The tool uses all the test runs executed in the past and check it with the current run. Based on this it can assist in the flakiness of the tests and help put with the regression issues.

3 Likes

I greatly enjoyed your video and mind map. One thing that is being not explored is visual testing. If we automatically look at the entire screen could there be a way that AI could figure out whether there has been a meaningful change or not? I remember looking at bitmaps early on in my automated testing career and they often gave me false positives. Could AI solve this problem in visual testing? I would add this to the Mind Map if it was my Mind Map.

As far as looking at objects, it seems to me obvious that the way to have less flaky tests is to write less flaky code. The way Cypress wants you to write less flaky code is to add your own attribute to each object in the code - a unique attribute that is just used by test. If the other attributes change, the test attribute will stay the same and will stay unique. This is supposed to solve the problem of flakiness and would make some of these self-healing algorithms unnecessary. It would also pass the user case when you change the size attribute of an object - the test would still pass (although it might cause a visual error).

I think the reason people are not using self-healing AI products in real life is because they sound like snake oil that may result in false positives - but even worse - false negatives without any transparency to see what is truly going on. I am of course making assumptions and I could be wrong.

1 Like

Yes, it seems like a bit of a non (or at least limited) problem.

If the tool were capable of changing more than an element id, like healing path/url changes perhaps then it might get a bit spicier.

Hey, there. Scroll down to " How Self Healing Test Works in Testsigma" in this artcle: Self Healing Test Automation | Tests that Not Require Maintenance

I’m still not sold on using something like this tool though… as it would still require human intervention to validate the change. It may even take longer to use this tool than to just fix the test yourself.

2 Likes

Hi, everyone,

For this day challenge I chose the second option so that I could learn more about self healing testing, its advantages, benefits for the team and possible risks.

Find and read an article or paper that discusses Self-Healing Tests:

  • Try to understand the types of issues with tests that the tool claims to heal.
  • If possible, uncover how the issues are detected and resolved.

What is Self-healing testing?
Self-healing tests identifies changes made in the application and make automatic adjustments to ensure that the test doesn’t lose its functionality. For excample, the email column in the users table is renamed to user_email. A self-healing test suite could read the new schema, identify the difference, and update its tests to use user_email instead of email.

Improper test maintenance can result in problems like false positives and reduced reliability, which can negatively impact production. AI and machine learning algorithms help solve this issues.

How valuable is a feature like this to your team?

:ballot_box_with_check: Adopt the software changes and fix broken tests.

:ballot_box_with_check: Ensure higher test coverage and accuracy.

:ballot_box_with_check: Reduces false positives caused by software modifications.

:ballot_box_with_check: Improves the accuracy and reliability of the testing process.

:ballot_box_with_check: Speeds up the testing process and reduces test maintenance time.

:ballot_box_with_check: CI/CD integration.

How might this fail?

I asked this question Copilot :robot: and got this suggestion below:

:one: False Positives and Negatives

Self-healing mechanisms can sometimes misidentify issues or falsely trigger corrective actions. For example, if the self-healing system incorrectly identifies a test failure as a genuine issue, it might attempt to fix something that isn’t broken.

:two: Over-Reliance on Automation

The automated system doesn’t catch everything. Human intuition and reasoning are still essential for understanding complex scenarios and making judgment calls.

:three: Inadequate Test Coverage

Self-healing mechanisms operate based on predefined rules or patterns. If the test suite lacks comprehensive coverage, certain issues may remain undetected. If the self-healing logic doesn’t account for specific edge cases, those cases won’t be addressed.

:four: Dependency on Historical Data

Some self-healing systems learn from historical data. If the system encounters a novel issue or a sudden change, it may struggle to adapt. For example, if the application behavior significantly shifts due to an update, the self-healing logic might not recognize it.

:five: Security Risks

Self-healing mechanisms can inadvertently introduce security vulnerabilities. If they automatically modify configurations or code, there’s a risk of unintended consequences. For instance, an automated fix might inadvertently weaken security settings.

Recourses:

Many of commercial tools like Accelo, Mabl, Katalon, Tosca, Functionize, Applitools, Mabl, Testim, Test.ai produce AI generated tools with integrated self-healing capabilities, therefore there are a enough information on this topic.

2 Likes