I’m working at a company that continuously deploys various microservices to Kubernetes throughout the day, and I’m trying to work out where best to slot in some UI E2E tests.
As part of any successful PR we first deploy to a staging environment, then, if that succeeds, it automatically goes out to production. Unit and integration tests are performed before deploying to staging.
Because of the complexity and size of the system, it’s not possible to spin up a Kubernetes cluster and run the system as a whole.
Some UI tests could be run after deploying to staging, but every pipeline would have to be updated (we have over 100) to accommodate these tests. It’s also possible the teams won’t be happy with the extra time that would add to the build and deploy process.
Or you could have a separate pipeline that just runs UI tests periodically. You wouldn’t get that fast feedback about which specific change might’ve caused an issue, but you also wouldn’t get the interference that UI tests can bring.
Are there already pipelines that run automated checks that are more expensive than normal? For instance, performance checks. I imagine that, while each kind of check will have its own infrastructure requirements etc., they will also have stuff in common.
For instance, run overnight / at the weekend rather than with every commit or PR, and give feedback in a different way - not tied to a particular release’s pipeline failing, but just to a date, so several releases might be relevant.
Sorry to not give direct answers, but I thought this might help to reframe the problem.
A couple of things aren’t clear to me but feel important. I’ve tried to make some assumptions and provide some ideas in case any of them are true.
1. Is there a single non-prod environment where all of your microservices run and interact with each other?
If so, you have a few options:
a) run the UI tests continuously; if they find a problem, you’ll have a fairly small window in which to discover which deployment caused it
b) have your deployment system trigger the UI tests, so that a run is started after each deployment. You could have a running service, outside of your pipelines, that watches for changes (there’s a rough sketch of this below).
If not, you could do the same but in production. It’s quite late feedback, but synthetics in prod aren’t a bad thing, as they’ll also help pinpoint issues that are out of your control, such as 3rd party services which you don’t have integrated in your non-prod env.
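For option (b), here’s a minimal sketch of what that watcher could look like, assuming the official Python Kubernetes client, a `staging` namespace and a hypothetical CI webhook URL - the same approach would work against a prod namespace for synthetics:

```python
# Sketch: a standalone service that triggers a UI test run whenever a
# Deployment in the staging namespace finishes rolling out.
# The namespace and webhook URL are placeholders, not from the original post.
from kubernetes import client, config, watch
import requests

TEST_TRIGGER_URL = "https://ci.example.com/hooks/ui-tests"  # hypothetical CI webhook

def main():
    config.load_incluster_config()  # or config.load_kube_config() when run outside the cluster
    apps_v1 = client.AppsV1Api()
    for event in watch.Watch().stream(apps_v1.list_namespaced_deployment, namespace="staging"):
        dep = event["object"]
        status = dep.status
        # Treat a rollout as "done" when updated and available replicas match the desired count.
        # In practice you'd de-duplicate per rollout (e.g. track observed_generation)
        # so unrelated MODIFIED events don't re-trigger the suite.
        if (
            event["type"] == "MODIFIED"
            and status.updated_replicas == dep.spec.replicas
            and status.available_replicas == dep.spec.replicas
        ):
            # Kick off the UI suite, passing along which service just changed
            requests.post(TEST_TRIGGER_URL, json={"service": dep.metadata.name})

if __name__ == "__main__":
    main()
```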
2. Why aren’t your UI pipelines owning this?
If they should be, then accept that these pipelines would need to change, building in the necessary checks to determine whether any previously valid assumptions made about integrating with the underlying services are still valid.
3. Do you have any contract tests?
If the UI breaks following API changes, there’s a good chance that there’s a misunderstanding between how the API teams think their API is being used and how the UI teams actually use it. You could manage this risk a bit with contract tests, built together with the UI teams.
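As a rough illustration of that kind of contract test, here’s a consumer-side sketch using pact-python; the "web-ui" consumer, "UserService" provider, endpoint and payload are all made up for the example:

```python
# Sketch: a consumer-driven contract test the UI team could own, which the
# API team then verifies against their service. Names are illustrative.
import atexit
import requests
from pact import Consumer, Provider

pact = Consumer("web-ui").has_pact_with(Provider("UserService"))
pact.start_service()
atexit.register(pact.stop_service)

def test_get_user_contract():
    expected = {"id": 1, "name": "Alice"}

    (pact
     .given("user 1 exists")
     .upon_receiving("a request for user 1 from the UI")
     .with_request("get", "/users/1")
     .will_respond_with(200, body=expected))

    with pact:
        # The UI's real HTTP client would be exercised here; requests stands in
        result = requests.get(f"{pact.uri}/users/1")

    assert result.json() == expected
```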
4. Don’t the API teams take responsibility for near neighbours?
In an ideal world, we’d never need to worry about our near neighbours (consumers and dependencies), but the world isn’t ideal. I’d still expect API teams to have an understanding of how the UI uses their API (see 3) and to consider this whilst testing their own changes. If there are any identified breaking changes, I would expect the team to look into these themselves or proactively communicate the risk to the UI teams.
I’m going to assume that the UI only talks to a few services, or a small group of them - working that out is a static analysis task to separate out anyway. Often it’s very hard to get “smart” and only test when needed. If your UI is talking “directly” to all of the services for basic tasks, then I’m unsure how to advise, because that sounds like a separate exercise.
But if it were me, I would add a small “smoke” suite of UI tests to the pipeline job or any “deployment” which actually re-deploys every service. At most 4 tests, and not taking longer than 5 minutes. I am assuming building and deploying everything (services plus the site), plus all of the API/contract tests, takes about 10 or more minutes anyway. So, if it takes longer than 1 minute to spin up Chrome and log in to your website, that’s not going to be a productive pipeline anyway. This would, for you, need to run against a specific staging/integration layer though.
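To give a feel for the size of check I mean, here’s a minimal smoke test sketch with Playwright; the URL, credentials and selectors are placeholders, not your actual site:

```python
# Sketch: one of a handful of "smoke" UI checks - load the site, log in,
# confirm the landing page renders. Everything named here is illustrative.
from playwright.sync_api import sync_playwright

BASE_URL = "https://staging.example.com"  # hypothetical staging URL

def test_login_smoke():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(BASE_URL)
        page.fill("#username", "smoke-test-user")
        page.fill("#password", "not-a-real-password")
        page.click("button[type=submit]")
        # Fail fast if the landing page never appears
        page.wait_for_selector("text=Dashboard", timeout=30_000)
        browser.close()
```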
I think your hassle is that you have a continuously deploying environment, so unless you can “trigger” or filter based on a small, specific group of services, it’s just too expensive. What if you used an exclusion list (blacklist) of services that don’t trigger UI tests? You could learn from this blacklist over time and use it to talk to your teams about where the UI is fragile. I have used such “dynamic” blacklists in the past, for other test triggers. But be sure to make the list obvious and maintain it in a way that is entirely transparent to developers.
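One way that exclusion list could look, as a sketch: a version-controlled YAML file that a pipeline step consults before deciding whether to kick off the UI suite. The file name and structure here are made up for illustration:

```python
# Sketch: gate the UI-test trigger on a shared, version-controlled exclusion list.
# Exit code drives the pipeline step: 0 = run UI tests, 1 = skip them.
import sys
import yaml

def should_trigger_ui_tests(service_name: str,
                            exclusions_path: str = "ui-test-exclusions.yaml") -> bool:
    with open(exclusions_path) as f:
        exclusions = yaml.safe_load(f) or {}
    # Example file contents: {"excluded": ["billing-exporter", "audit-log-archiver"]}
    return service_name not in exclusions.get("excluded", [])

if __name__ == "__main__":
    service = sys.argv[1]
    sys.exit(0 if should_trigger_ui_tests(service) else 1)
```

Because the list lives in version control, it stays obvious to developers and its history becomes the conversation starter about where the UI is fragile.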
So we have a full E2E suite which runs on a daily basis.
Besides that, in a microservice landscape we use tags or labels in order to run parts of the E2E tests.
So if you have 26 microservices from A to Z and you make a change to X, we’ll run all the tests that are connected to X, so we run tests x1, x2, x3, … in the pipeline itself. These are triggered by the tags used on the PR/build (after the integration tests).
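If your E2E suite is in Python, this tagging could be as simple as pytest markers; the marker and test names below are illustrative:

```python
# Sketch: tag E2E tests per microservice with pytest markers, so a change to
# service X runs only the connected tests, e.g.:  pytest -m service_x
# Markers would be registered in pytest.ini (or pyproject.toml) to avoid warnings.
import pytest

@pytest.mark.service_x
def test_x1_checkout_flow():
    ...

@pytest.mark.service_x
@pytest.mark.service_y
def test_x2_order_history_spans_x_and_y():
    ...
```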
If you don’t like your pipeline taking too long, you can make a separate pipeline which does this exact same thing, so you’ll only need one pipeline that can run all the tests, from set A1 to Z100. Whenever your pipeline has finished the new build and the integration tests are green, you can send a trigger to this pipeline with the tags/labels provided, and run it separately from the original pipeline. (The downside is that this can be triggered multiple times at once due to changes in different microservices, which makes it a very busy pipeline to run. But that depends on how many build pipelines you have for your microservices.)
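As a sketch of that hand-off, assuming the shared E2E suite lives in its own repo and runs on GitHub Actions (the repo, workflow file and input name are placeholders), a build pipeline could trigger it via the workflow_dispatch API once integration tests are green:

```python
# Sketch: a build pipeline handing off to the shared E2E pipeline, passing the
# tags/labels so only the matching subset of tests runs. Names are illustrative.
import os
import requests

def trigger_e2e(labels: list[str]) -> None:
    url = "https://api.github.com/repos/acme/e2e-tests/actions/workflows/ui-e2e.yml/dispatches"
    resp = requests.post(
        url,
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        # The E2E workflow reads `tags` and selects the matching test subset
        json={"ref": "main", "inputs": {"tags": ",".join(labels)}},
    )
    resp.raise_for_status()

# e.g. after service X's integration tests pass:
# trigger_e2e(["service_x"])
```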