These regression suites grow because it’s very cheap to manufacture and execute an extremely specific check, at the expense of long-term costs, and people want to profit from that saving without ever dealing with the debt. If nobody can figure out what the benefits are, then I don’t see a problem with binning the whole thing.
These kinds of issues are also symptoms of other sicknesses. How do you know that the tests that are consistently passing are even doing anything? After all, by ignoring the failing tests you’re essentially using natural selection to end up with a suite that contains only passing tests. There has to be more to testing than superstition, or you might as well not run the suite and say you did.
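To make that concrete: a consistently green test can be verifying nothing at all. A minimal sketch (Python assumed; the function and test names are invented):

```python
def apply_discount(price, rate):
    """Apply a percentage discount to a price."""
    return price * (1 - rate)

def test_apply_discount_vacuous():
    # Calls the function but asserts nothing, so it passes
    # no matter what apply_discount actually returns.
    apply_discount(100, 0.2)

def test_apply_discount_real():
    # Would actually fail if the implementation regressed:
    # 20% off 100 should be 80.
    assert apply_discount(100, 0.2) == 80
```

Both tests show up green; only one of them earns it, and nothing in the pass/fail report tells you which is which.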
I’d say this is a context-sensitive decision based on your level of understanding of the software project, the other software project (the automated suite), how the team(s) are organised and how much money you want to hurl at it. There are lots of questions to ask here: which tests are failing, why, and who wrote them? What do we think these tests do: provide coverage, make managers sleep better, appease the customers, follow policy, abide by laws, etc.? How are they serving the testing mission?
Marking tests as ignored will make everyone do just that: ignore the problem entirely. If those checks were providing less value than the cost of dealing with them, that’s a fine idea, but you need to understand their actual purpose (not just the abstraction loss in their stated purpose). Perhaps even delete them. Nobody’s fixing them, nobody’s running them, get the hard drive space back. This has an added emotional impact: it’s more meaningful to say you’re deleting code. Imagine the difference between putting your possessions in a box in storage, even though you know you’re never going to look at them again, and choosing what to put immediately in the bin. Ooh, y’know, having to throw this stuff out, maybe I will take up watercolours again.
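Part of why ignoring is so seductive is that it costs one line, after which the test vanishes from everyone’s attention. A sketch using Python’s unittest (the test name and skip reason are invented):

```python
import unittest

class LegacyExportTests(unittest.TestCase):
    @unittest.skip("flaky since the big refactor; nobody has investigated")
    def test_legacy_export(self):
        # This body never runs, so nobody notices when it rots.
        self.fail("invisible while the skip decorator is in place")

# Running the suite reports a skip, not a failure, so the build stays green.
suite = unittest.TestLoader().loadTestsFromTestCase(LegacyExportTests)
result = unittest.TestResult()
suite.run(result)
# result.wasSuccessful() -> True; len(result.skipped) -> 1
```

The skip reason is at least a breadcrumb for whoever eventually investigates; a bare ignore with no reason is a breadcrumb to nowhere.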
Leaving everything as is might be a way to provide enough pain to cause people to do something to stop it hurting, provided the pain is sufficient and the right people are feeling it. You could also delete all the problematic tests and get those who wrote them to write them again. You could mandate that each team is responsible for its own coverage and breakages, and that they investigate any problems each time. The tiger team is a short-term solution, but it shifts the consequences of the problem away from the people creating it, and you can only clean up after people for so long before they have to learn to do it themselves.
One option might be to shred the whole endeavour. I think that helps to frame the ideas nicely - well, what are we actually destroying? Doesn’t it have value over making me feel all comfy inside? What problems are getting into production, out through the users and back to us? What actually is our coverage? Do we even need to run these any more? Who’s paid to make this work, and can we ask them why it’s not working? What would it cost to fix? What would it cost to replace? Really? Ooh, maybe I will take up watercolours again.
Edit: To be more solution-oriented, I’ll say that looking at purpose can be a great way to facilitate change. If a test is just there because it’s cheap to run, in a “hey, who knows?” kinda way, then any cost is important. If it’s there because our testers know what they’re doing (including if those testers aren’t called testers), then that person holds the purpose. Sometimes tests go in because “eh”, sometimes because “ooh, better check”, sometimes because “if this fails again we lose our biggest customer”, sometimes because “if this goes wrong people die”. Purpose gives you a sort of nexus to make business decisions about projects like automation suites.