I observe that some folks don't bother, as it's more effort than the value it returns. I also see that there are teams who are super strict on how they estimate due to commercial client contracts.
I've used estimation by time and estimation by effort, and tried my best not to estimate by complexity.
In your context, what sort of estimation techniques do you use or have you used? How have they helped and how have they hindered? How do your estimation efforts fit in with the bigger picture?
When we estimate tickets on the backlog (t-shirt sizes), it's an estimate for everything in our Definition of Done, which includes testing, merging, deploying, and any post-deploy tasks or checks. So I don't really estimate testing effort on its own.
We size by voting to agreement: we all submit what we think the size is, and if we didn't all vote around the same value, we talk about it until we agree. In practice, this means I might vote a relatively small dev task a 3-pointer while the devs think it's a 1-pointer, because it will involve regression of that horrible piece of complex functionality we haven't got good automated coverage for at the moment.
If asked something like "hey Bruce, dependabot wants to update the versions of these 10 dependencies, how big a task would it be for you to have a poke around and see if it breaks something?" then I will usually give a time estimate rather than a t-shirt size.
My experience is quite similar to Bruce's. For nearly a decade now I've almost always been on Scrum teams (different ones).
Other times I wasn't asked at all; management instead set the test estimate by rule of thumb and experience, e.g. at 50% of the developers' estimate.
Sometimes that was still too little, and releases had to be postponed, and perhaps other projects along with them.
I like the way it's done in my Scrum teams.
Sadly, it is still regularly the sprint end date that ends testing, rather than an agreed quality level being reached.
Mostly experience-based. For larger projects I usually start at about 15% of the dev estimate, then adjust if there are any special risks, for example unusual 3rd-party integrations, known deeper security risks, etc.
Rarely have I found it worth estimating testing in isolation; at the end of the day I can adjust testing to meet any estimate by varying the breadth and depth of testing.
For smaller projects I have done estimates based on risk charters and test sessions, roughly 3 to 4 a day; so if I've drafted 20 charters, it's 5 to 6 days, plus 20% for newly discovered charters and retests.
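To put numbers on those two rules of thumb, here's a quick sketch; the 15% ratio, the 3 to 4 sessions a day, and the 20% buffer are the ones above, while the function names and defaults are just for illustration:

```python
# Sketch of the two rules of thumb above. The 15% ratio, 3-4 sessions
# a day, and 20% buffer come from this post; names and defaults are
# illustrative only.

def estimate_from_dev_effort(dev_days: float, risk_multiplier: float = 1.0) -> float:
    """Start at ~15% of the dev estimate, then scale up for special
    risks (unusual 3rd-party integrations, deeper security risks, etc.)."""
    return dev_days * 0.15 * risk_multiplier

def estimate_from_charters(charters: int,
                           sessions_per_day: float = 3.5,
                           discovery_buffer: float = 0.20) -> float:
    """Session-based estimate: roughly 3-4 charters a day, plus ~20%
    for newly discovered charters and retests."""
    return (charters / sessions_per_day) * (1 + discovery_buffer)

print(estimate_from_dev_effort(100))  # 100 dev-days -> ~15 test-days
print(estimate_from_charters(20))     # 20 charters  -> ~6.9 days
```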
Both approaches are still very much finger in the air.
One of the biggest factors is the experience of the development team; the difference between testing a buggy product and a non-buggy one can easily be tenfold, and that's what makes the estimations very rough at best for me.
For manual testing, a lot of experience-based estimation à la Planning Poker, mixed with estimation by effort, works very well.
For automated testing, I prefer to use a model based on complexity (like Function Point Analysis, but modified for the Test Cases) still mixed with Planning Poker and an expected productivity index.
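As a rough sketch of the idea (the complexity weights and the hours-per-point productivity index below are placeholder numbers, not real calibration):

```python
# Illustrative only: one way a complexity-based model, in the spirit of
# Function Point Analysis adapted to test cases, might hang together.
# The weights and productivity index are invented placeholders.

COMPLEXITY_WEIGHTS = {"simple": 1, "average": 2, "complex": 4}

def test_points(cases: dict[str, int]) -> int:
    """Sum weighted test points over counts of cases per complexity band."""
    return sum(COMPLEXITY_WEIGHTS[band] * count for band, count in cases.items())

def automation_effort_hours(cases: dict[str, int],
                            hours_per_point: float = 1.5) -> float:
    """Convert points to hours via an expected productivity index
    (hours per test point), calibrated from past sprints."""
    return test_points(cases) * hours_per_point

# e.g. 10 simple, 5 average, 2 complex cases to automate:
print(automation_effort_hours({"simple": 10, "average": 5, "complex": 2}))
# -> (10*1 + 5*2 + 2*4) * 1.5 = 42.0 hours
```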
I don't have much of a problem with under-estimating or over-estimating this way.
If it's smaller work like stories or bugs then a t-shirt swag is useful, but only from testers who'd do the work or are very familiar with it. Product, scrum masters, and devs don't get to estimate testing work.
For larger things like release regression testing, I'll often bounce the question back to the product owner/business: "How long would you like me to spend testing?" Because that helps frame a conversation around what they worry about most, sort of the Always/Never things. Asking them initially for a timebox helps drive out that risk coverage, and we can adjust the timebox as needed.
Depending on the team, we have two methods. One side of the department uses story points for refinement estimates, so testers give an outline estimate in points in parallel with the devs; the other uses t-shirt sizing, and the estimate is discussed and given to include both the dev and test effort (historical reasons for the difference). Within the test team we then estimate in hours when planning the work, and also log time taken.

Some of our projects task testing alongside dev work within the sprint; others task testing outside of the sprint on a separate kanban board. We found that sprint burndown was being impacted too much because testing on the project often wasn't possible until close to the end of the sprint, so we took testing outside of the sprint structure. This has given us better quality software, because testers don't feel pressured to fit their testing into the time available rather than test thoroughly.
The estimates themselves are usually based on prior knowledge (a similar dialog took 2 hours to test, so this one with an extra whatever will probably take about 2½, for example) but will include time for test case and data design as well as the actual tests.
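As a rough sketch of that arithmetic (the scaling factor and the design overhead below are illustrative assumptions, not our actual figures):

```python
# Analogy-based estimate: scale a known prior, then add test case and
# data design time on top of the actual testing. All factors below are
# illustrative assumptions.

def analogous_estimate(similar_item_hours: float,
                       scale: float = 1.25,
                       design_overhead: float = 0.30) -> float:
    execution = similar_item_hours * scale    # "a similar dialog took 2h,
                                              #  this one is a bit bigger"
    return execution * (1 + design_overhead)  # plus case and data design

print(analogous_estimate(2.0))  # 2h prior -> ~2.5h testing -> 3.25h in total
```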
I like estimating in an easy and helpful way. You shouldn't spend more time estimating than the actual effort will be. The estimate should be meaningful to the team. Because we have a lot of bespoke development we need to know effort in hours. Our sizes are usually 0.5 or 1 day of test effort. We estimate based on experience and it's easier than you might think. After a couple of wrong estimates you get the hang of it, and we found out that the numbers usually balance each other out in the end.
Wish I could stop estimating in durations and switch to sizes. It's problematic because my sizes will thus rarely translate well to a different person picking up the task. That said, we often under-estimate the automation test effort on a ticket. We are finding unknowns too often.
However, since I do push some tickets to the devs to automate, adding the test points to a dev ticket makes more sense, although it means that a 5 point ticket is often really a 6, which is not going to make sense anymore. All pointing to a need to either include test scripting in the definition of done implicitly, or split it into another ticket. I'm loath to do that splitting of tickets, but I'm also keen to ring-fence exploratory testing time (by a tester, not by a dev) on every single product change. This might sound like I'm saying devs cannot exploratory test, but I'm really saying devs don't have access to as many environments and kit. Not everyone has their act together; my team don't… yet.
As discussed via our direct message conversation, I thought it would be useful for folks who aren't familiar with t-shirt sizing to have a link to read up on it, given you mentioned it.