Eric, I get your point about how long an e2e test should take. I’m coming from a world where the app is a thing with a GUI or a web page that humans actually use, and where they buy the product. If the application is, for example, a web service with fewer dependencies, I would expect it to be a lot quicker, for instance if there is no purchasing stage in the system. Where there is one, skipping the purchasing stage entirely is very risky to the business.
In CI/CD you want as short an E2E smoke test as possible. James Bach has written a lot about E2E; it’s about testing end-to-end. It’s a bit like how buying a thing on Amazon includes creating an Amazon account, adding a credit card, adding an address, adding an item to the basket, hitting the buy button and then logging out. In the Amazon test environments they do all of these steps in mere seconds. That’s why Amazon can and does deploy thousands of times per day.
In my first job where we did e2e properly, I wrote a post-build smoke test that took an average of 1 hour to run. It was for a server-based app, no microservices, but that hour included the time taken to deploy the app and all the services it needed. Back then, the setup alone took 40 minutes, leaving 20 minutes of testing time. The smoke test can be used as a gate: if it passes, it triggers regression testing and other E2E testing.
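The gating idea can be sketched roughly like this. It is a minimal illustration, not anyone's real pipeline: `run_gated` and the suite names are hypothetical, and each suite is assumed to be a callable that returns True on success.

```python
from typing import Callable, Dict


def run_gated(smoke: Callable[[], bool],
              downstream: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run the smoke test first; run the later suites only if it passes."""
    results: Dict[str, str] = {}

    if not smoke():
        results["smoke"] = "failed"
        # Gate closed: record the later suites as skipped instead of running them.
        for name in downstream:
            results[name] = "skipped"
        return results

    results["smoke"] = "passed"
    # Gate open: the slower suites are now worth the time.
    for name, suite in downstream.items():
        results[name] = "passed" if suite() else "failed"
    return results
```

For example, `run_gated(lambda: False, {"regression": lambda: True})` reports the regression suite as skipped, so a broken build never spends time on the slow suites.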
I’m actually not directly involved in the services where I now work, but I know that deploys into an AWS environment take about an hour. During that time each component runs loads of unit tests, often in parallel, and any microservices must run their contract tests too. Even so, this leaves us very little time to run integration tests if we want to be truly non-blocking for the delivery pipeline. Luckily we have “functional” testing time down to about 10 minutes. This includes:
- account creation (completely new user onboarding and provisioning) - about 1 minute
- web app client - about 30 seconds to spin up a Selenium grid and open a browser
- server-end app deployment - about 30 seconds
- login, add a resource, find the resource, edit/connect to it, close it - about 10 seconds
- logout - about 1 second
- repeat the above steps for each headline feature and in slightly different paths
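As a rough back-of-envelope check of how the timings above fit into a 10-minute budget, here is a small sketch. It assumes (my assumption, not stated above) that account creation, the grid spin-up and the server deployment are paid once, and that each extra feature path repeats only the 10-second journey plus the 1-second logout. The names `SETUP_SECONDS`, `PATH_SECONDS` and `paths_within_budget` are made up for illustration.

```python
# One-off setup costs in seconds, taken from the list above.
SETUP_SECONDS = {
    "account creation": 60,
    "selenium grid and browser": 30,
    "server-end app deployment": 30,
}

# One pass through a user path: the journey (about 10 s) plus logout (about 1 s).
PATH_SECONDS = 10 + 1


def paths_within_budget(budget_seconds: int) -> int:
    """Return how many feature paths fit once the one-off setup is paid for."""
    remaining = budget_seconds - sum(SETUP_SECONDS.values())
    return max(0, remaining // PATH_SECONDS)


print(paths_within_budget(10 * 60))  # 43
```

Under those assumptions the setup takes 120 seconds, leaving 480 seconds, which is room for 43 paths of 11 seconds each. Real runs will have overhead between steps, so treat this as an upper bound.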
There was a thread here not long ago defining E2E; I’ll try to find it in the meantime. But for me it means covering the entire user journey: account creation, login, do any 1 thing that 80% of users do, logout, and finally delete all resources and accounts (CRUD: create, read, update, delete). That last bit about deleting the account takes only 5 seconds, but uncovers loads of bugs early. I try to pack as many quick but totally vital things into the smoke test as possible. Never ever add a test for something optional into the smoke test. The smoke test must never fail due to a check that is not critical. It must not raise false alarms for things that can easily be caught later on in a regression suite. Basically, test as much as possible as early as possible, without having noisy tests that fail and block the coders.
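Here is a minimal sketch of that journey as a single smoke check. `FakeService` is a hypothetical in-memory stand-in so the example runs on its own; in a real suite those calls would go to the actual app or its API, and none of these method names come from a real library.

```python
class FakeService:
    """Hypothetical in-memory stand-in for the system under test."""

    def __init__(self):
        self.accounts = {}     # user -> list of resource names
        self.sessions = set()  # users currently logged in

    def create_account(self, user):
        self.accounts[user] = []

    def login(self, user):
        if user not in self.accounts:
            raise KeyError(user)
        self.sessions.add(user)

    def add_resource(self, user, name):
        if user not in self.sessions:
            raise PermissionError(user)
        self.accounts[user].append(name)

    def find_resource(self, user, name):
        return name in self.accounts[user]

    def logout(self, user):
        self.sessions.discard(user)

    def delete_account(self, user):
        self.sessions.discard(user)
        self.accounts.pop(user, None)


def smoke_journey(service, user="smoke-user"):
    """Create, log in, do one common thing, log out, then delete everything."""
    service.create_account(user)
    try:
        service.login(user)
        service.add_resource(user, "demo")
        assert service.find_resource(user, "demo"), "resource not found after add"
        service.logout(user)
    finally:
        # Deletion runs even if a step above failed, so no test account leaks.
        service.delete_account(user)
    # Deletion is part of the check, not just cleanup.
    assert user not in service.accounts, "account still present after delete"
```

Every assertion here is on something critical to the journey, which keeps the check in line with the "nothing optional in the smoke test" rule.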
I have had to vary my “definition of done” in every single place I have worked. “Done” must include testing, but if you work in a place where, for example, language translation is needed, you cannot let your team get held up because translations take 5 working days. So for some teams Done means the team owns all external dependencies, but in some places that fails to hold true. Done should also include written documentation, though user documentation is another matter. Because we have staggered deployment stages, Done means deployed to the “dev” environment and then promoted from there into a 2nd environment called integration. The deploy into the “live” environment is not part of “Done” where I work, because that is still far too infrequent, sadly.