How does a feature ever get off the release train?

Whatever shape your Software Development Lifecycle takes, it’s a cycle that keeps on going round. Features, bug fixes, paying down tech debt, whatever the change to the system is, there is always some new code to introduce.

As a product grows, be it a monolith or a collection of interconnected systems, changes typically continue to be introduced, and the list of expected behaviours grows.

All these behaviours need to be developed, tested, monitored, maintained, and no doubt tested again and again. Over time, this means the capacity of your teams can be reduced, as the burden of carrying the ever more complex system builds on top of what came before.

So, how do you handle making sure features become dependable enough to be left out there in production, minding their own business, without needing to keep adding more and more capacity to look after them? How do you ever find capacity to add new features without neglecting the careful maintenance of existing high-value features? After all, those are the ones making you money!

While I’ve got some ideas, and I’ve seen a few different approaches, I really want to hear about your experiences on this one. Not theory, not what a book says, but what you’ve personally experienced.

Has regression testing or monitoring become overwhelming?

Maybe you have a novel method for removing cruft from your test suite as you go, or even mercilessly remove underused features?

Whatever your answer, let me know! Let’s talk more about software after it’s first put into prod.

The processes I use:

  • Have a process for removing unused or underutilized features. I’ve lobbied to get this tacked on to retro or sprint planning if the app is very old and crufty. Sometimes I’ve been successful, sometimes not.

  • Aggressive parallelization of all tests which aren’t run against deployed infrastructure (i.e. isolated E2E tests, integration tests, unit tests).

  • Do run tests against deployed infrastructure and real APIs, databases, etc. - but do it rarely and out of cadence (e.g. nightly, not necessarily after every pull request is merged) and try to have an equivalent that mocks those “outside things” (APIs, database).

  • Shift-left testing - I will generally default to building isolated E2E tests, but sometimes it makes sense to push the test down the stack. Highly complex algorithmic or decision-making code that is sensibly decoupled from its dependencies is generally better unit tested, for instance.

(An isolated E2E test is an end-to-end test of an app, using Playwright for instance, that can run without connecting to anything in the outside world: databases, APIs, etc. It does this by running its own version of the databases, mocking APIs via mitmproxy, and so on.)
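To make that concrete, here is a minimal sketch of an isolated E2E test using Playwright's built-in network interception; the page, route and payload are invented for illustration, and in practice the mocking could just as well sit at the proxy level with mitmproxy as described above:

```ts
import { test, expect } from '@playwright/test';

// playwright.config.ts would set { fullyParallel: true } so specs like this
// fan out across workers, since they share no real infrastructure.

test('shows the user dashboard without a real backend', async ({ page }) => {
  // Intercept the API call the app would normally make and answer with a
  // canned payload instead (hypothetical endpoint and response shape).
  await page.route('**/api/users/42', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ id: 42, name: 'Ada', plan: 'pro' }),
    })
  );

  await page.goto('http://localhost:3000/dashboard/42');

  // Assert on what the user actually sees, not on implementation details.
  await expect(page.getByRole('heading', { name: 'Ada' })).toBeVisible();
  await expect(page.getByText('pro')).toBeVisible();
});
```

This is also what makes the aggressive parallelization above cheap, while the rarer runs against real deployed infrastructure can live in a separate Playwright project or behind a tag selected with --grep on a nightly schedule.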

I have more experience in how not to do it than how to do it. But here are a few of the nuggets I have gotten.

Contract-based testing (What is Contract Testing & How is it Used? | Pactflow) is definitely a good way to help with the process of getting rid of outdated behaviours. For that to work you need a good architecture for your system, but if you have that it is very nice, since it shifts the ownership of what is actually used in a service onto the consumers of the service rather than its maintainer.
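For anyone who has not used it, a consumer-side contract test looks roughly like the sketch below (pact-js with a Jest-style runner; the service names and endpoint are made up). The key point for retiring behaviour is that only the interactions a consumer declares here have to keep working:

```ts
import { PactV3, MatchersV3 } from '@pact-foundation/pact';

const { like } = MatchersV3;

// The consumer declares the interactions it depends on; anything not
// declared by any consumer is a candidate for removal on the provider side.
const provider = new PactV3({ consumer: 'checkout-web', provider: 'pricing-service' });

describe('pricing-service contract', () => {
  it('returns a price for a known product', async () => {
    provider
      .given('product 123 exists')
      .uponReceiving('a request for the price of product 123')
      .withRequest({ method: 'GET', path: '/products/123/price' })
      .willRespondWith({
        status: 200,
        headers: { 'Content-Type': 'application/json' },
        body: like({ productId: '123', amount: 19.99, currency: 'EUR' }),
      });

    await provider.executeTest(async (mockServer) => {
      // Exercise the consumer's client code against the Pact mock server.
      const res = await fetch(`${mockServer.url}/products/123/price`);
      const price = await res.json();
      expect(price.currency).toBe('EUR');
    });
  });
});
```

The generated pact file is then verified against the real provider, so when consumers drop an interaction they no longer need, the provider team gets an explicit signal that the behaviour behind it may be removable.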

As for monitoring, the best experience I have had is when the development team is also responsible for the service-level agreement, in line with DevOps ideas. Basically, if you are the one getting an alarm in the night and you have the power to make the service more reliable, you tend to do so. If you do not feel the pain of your choices, you might optimise for something other than reliability. As with everything, this setup has pros and cons, but for the value of monitoring and reliability it is the strongest pattern I have seen.

But I guess for most places the first and most important step is to acknowledge that this is a thing and to actively work on it across all aspects of the lifecycle. If your product owners push too hard for adding new features and do not allow the space to build them properly, the team cannot take ownership of their product. Or if you reward the firefighting behaviour of “staying up all night to fix the problem” instead of rewarding the group that does not get problems in the first place, you will have an organisation that drowns in maintenance.

We had monitors in the office for each team showing a dashboard of that team’s part of the system, so this was top of mind for everyone. The tester in the team developed monitoring agents for the dashboard, since they knew where the weak points of the system were. And if the team said “this service has been causing us problems”, the product owner said “then we fix that instead of pushing this new feature right now”, which was very refreshing.
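For illustration, an agent like that can be very small; the sketch below assumes a hypothetical health endpoint and a dashboard that accepts simple status posts:

```ts
// A tiny monitoring agent: probe a known weak point of the service on an
// interval and push a red/green status to the team dashboard.
// Both URLs are placeholders for illustration.
const SERVICE_HEALTH_URL = 'http://orders-service.internal/health';
const DASHBOARD_URL = 'http://team-dashboard.internal/api/status/orders-service';

async function probeOnce(): Promise<void> {
  const startedAt = Date.now();
  let healthy = false;
  try {
    const res = await fetch(SERVICE_HEALTH_URL, { signal: AbortSignal.timeout(5_000) });
    healthy = res.ok;
  } catch {
    healthy = false; // timeouts and connection errors count as unhealthy
  }

  await fetch(DASHBOARD_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      healthy,
      latencyMs: Date.now() - startedAt,
      checkedAt: new Date().toISOString(),
    }),
  });
}

// Probe every 30 seconds; the dashboard can turn the tile red after missed
// or unhealthy reports.
setInterval(() => probeOnce().catch(console.error), 30_000);
```

The value is less in the code than in who writes it: the people who know the weak points decide what gets probed.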

We also had all tests for a service co-located with the service code, which also increases the relevance of the tests; that included both unit tests and performance tests in addition to the contract tests.