When we estimate tickets on the backlog (t-shirt sizes), it’s an estimation for everything in our Definition of Done, which includes testing, merging, deploying and any post-deploy tasks or checks. So I don’t really estimate testing effort on its own.
We size by vote into agreement. So we all submit what we think the size is, and then if not everyone voted around the same, we talk about it until we agree. In practice, this means I might vote a relatively small dev task to be a 3 pointer, while devs think it’s a 1 pointer, because it will involve regression of that horrible piece of complex functionality we haven’t got good automated coverage for at the moment.
If asked something like “hey Bruce dependabot wants to update the versions of these 10 dependencies, how big a task would it be for you to have a poke around and see if it breaks something?” then I will usually give a time estimate rather than a t-shirt size.
My experience is quite similar to Bruce’s. For nearly a decade now I’ve almost always been in Scrum teams (different ones).
Before that, I often wasn’t asked at all; management instead applied a rule of thumb based on experience, e.g. 50% of the developers’ estimate.
Sometimes that was still too little, and releases had to be postponed, sometimes along with other projects.
I like the way it’s done in my Scrum teams.
Sadly, the end of testing is still regularly defined by the sprint end date rather than by an agreed quality level.
Mostly experience based. For larger projects I usually start at about 15% of the dev estimate, then adjust for any special risks, for example unusual 3rd party integrations, known deeper security risks etc.
Rarely have I found it worth estimating testing in isolation; at the end of the day I can adjust testing to meet any estimate by varying its breadth and depth.
For smaller projects I have done estimates based on risk charters and test sessions, roughly 3 to 4 a day, so if I’ve drafted 20 charters it’s 5 to 6 days, plus 20% for new charters discovered and retests.
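That charter-based arithmetic can be sketched as a quick back-of-the-envelope calculation. This is just an illustration of the figures above (3–4 sessions a day, 20% buffer); the function name and defaults are my own, not anyone’s formal model:

```python
import math

def charter_estimate(charters: int, sessions_per_day: float = 3.5,
                     buffer: float = 0.20) -> int:
    """Rough test-effort estimate in days from drafted risk charters.

    Assumes each charter maps to one test session, that roughly 3 to 4
    sessions fit in a day, and adds a percentage buffer for newly
    discovered charters and retests.
    """
    base_days = charters / sessions_per_day
    return math.ceil(base_days * (1 + buffer))

# 20 drafted charters -> 5 to 6 base days, about 7 with the 20% buffer
print(charter_estimate(20))  # 7
```

Of course, as noted below, the whole thing is still finger in the air; the buffer just makes the guesswork explicit.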
Both approaches are still very much finger in the air.
One of the biggest factors is the experience of the development team; the difference between testing a buggy product and a non-buggy one can easily be tenfold, and that’s what makes my estimations very rough at best.
If it’s smaller work like stories or bugs then a t-shirt swag is useful—but only from testers who’d do the work or are very familiar with it. Product, scrum masters, and devs don’t get to estimate testing work.
For larger things like release regression testing, I’ll often bounce the question back to the product owner/business: “How long would you like me to spend testing?” Because that helps frame a conversation around what they worry about most—sort of the Always/Never things. Asking them initially for a timebox helps drive out that risk coverage, and we can adjust the timebox as needed.
Depending on the team we have two methods. One side of the department uses story points for refinement estimates, so testers give an outline estimate in points in parallel with the devs; the other uses t-shirt sizing, where the estimate is discussed and given to include both dev and test effort (the difference is historical). Within the test team we then estimate in hours when planning the work, and also log time taken. Some of our projects task testing alongside dev work within the sprint; others task testing outside the sprint on a separate kanban board. We found that sprint burndown was being impacted too much because testing often wasn’t possible until close to the end of the sprint, so we took testing outside the sprint structure. That has given us better quality software, because testers don’t feel pressured to fit their testing into the time available rather than test thoroughly.
The estimates themselves are usually based on prior knowledge (a similar dialog took 2 hours to test, so this one with an extra whatever will probably take about 2½, for example) but will include time for test case and data design as well as the actual tests.
I like estimating in an easy and helpful way. You shouldn’t spend more time estimating than the actual effort will be. The estimate should be meaningful to the team. Because we have a lot of bespoke development we need to know effort in hours. Our sizes are usually 0.5 or 1 day test effort. We estimate based on experience and it’s easier than you might think. After a couple of wrong estimates you get the hang of it and we found out that the numbers usually balance each other out in the end.
Wish I could stop estimating in durations and switch to sizes. Durations are problematic because my estimate rarely translates well to a different person picking up the task. That said, we often under-estimate the automation test effort on a ticket; we’re finding unknowns too often.
However, since I do push some tickets to the devs to automate, adding the test points to a dev ticket makes more sense, although it means that a 5 point ticket is often really a 6, which stops making sense. All of this points to a need to either include test scripting implicitly in the definition of done, or split it into another ticket. I’m loath to do that splitting of tickets, but I’m also keen to ring-fence exploratory testing time (by a tester, not by a dev) on every single product change. This might sound like I’m saying devs cannot do exploratory testing, but I’m really saying devs don’t have access to as many environments and as much kit. Not everyone has their act together; my team don’t… yet.