Why would someone use Cucumber?

A great conversation starter came up on MoT Slack recently:


I actually would like someone who uses Cucumber or JBehave to convince me, because it’s so prevalent, for some reason

I wanted to reply on Slack but since this is a topic that comes up frequently, I felt a Club post was more suitable:

I think one of the issues is that Outside-in-Development (ATDD and TDD combined) gets conflated with BDD, when really Outside-in-Dev is just one part of BDD. Typically when we discuss Cucumber we are referring to Outside-in-Development and not the whole of BDD. With that in mind, if you want to define the value of Cucumber, you have to go back to risk, specifically the risk of building something the business doesn’t want. The point of Outside-in-Dev is to guide developers with automated examples: you automate the examples first so that they fail, then design production code to make them pass. Cucumber is used as a scaffold within which the developers can build a product. An analogy I like to use is: automated examples are like bumpers on a bowling alley. They stop the developer from bowling a gutter ball but leave space for them to design code that is a strike or a single pin.
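To make the "fail first, then pass" loop concrete, here is a minimal sketch in plain Python. The shipping rule, the function names and the amounts are all made up for illustration; they stand in for whatever example the business and the team actually agreed on.

```python
from decimal import Decimal

# Hypothetical example agreed in conversation with the business:
# "Orders of 50.00 or more ship free; otherwise shipping costs 4.99."


def shipping_cost_not_yet_written(order_total: Decimal) -> Decimal:
    """Step 1: no production code exists yet, so the example has to fail."""
    raise NotImplementedError("shipping rules not implemented")


def shipping_cost(order_total: Decimal) -> Decimal:
    """Step 2: the simplest production code that makes the example pass."""
    if order_total >= Decimal("50.00"):
        return Decimal("0.00")
    return Decimal("4.99")


def example_passes(implementation) -> bool:
    """The automated example: True only if the agreed behaviour holds."""
    try:
        return (
            implementation(Decimal("50.00")) == Decimal("0.00")
            and implementation(Decimal("49.99")) == Decimal("4.99")
        )
    except NotImplementedError:
        return False


if __name__ == "__main__":
    print(example_passes(shipping_cost_not_yet_written))  # False
    print(example_passes(shipping_cost))  # True
```

The example only pins down what the business asked for. How `shipping_cost` is designed internally is left to the developer, which is the "bumpers" idea.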

The risk of incorrect delivery and its mitigation are very different from automated regression checking, though. Automated regression checking deals with the risks of changes in a product and how we handle those changes. Cucumber, or more specifically Gherkin, isn’t a great tool to use for detecting change across a complex system because Given-When-Then (GWT) is designed to focus on the product as a whole. This typically results in the usual pattern of testers trying to automate numerous flows through the system from a high-level system perspective, which has inherent problems around brittleness, maintenance, feedback, etc.

Sure, you get readable tests, but if they are designed to exhaustively test an application (or at least attempt to), then they cease to be documentation and are no better than test cases, which in my opinion offer better readability than Gherkin (caveat: in the context of testing!!!).

So, in summary: Cucumber, or any other Gherkin-based tool, is used for mitigating risks around developers delivering the wrong thing, and when you have a mature team that can develop around that approach, it’s great! If that is the risk you are mitigating, then go nuts! However, if you are focused on automated regression checking, then I would encourage a team to go a bit deeper into their product, understand it better, leverage the interfaces nearest to the risks they are trying to test and basically avoid using Gherkin tools to automate everything.

- Mark

P.S. Yes I know you can still use Cucumber for testing regardless and yes it may work for you. But I would challenge the naysayers to find a better way to automate. Just because a tool isn’t the right tool for a job doesn’t mean it won’t work. It simply means the work is much harder!


I agree completely. BDD is an approach and a set of tools to drive development from behaviour. It’s a set of reminders to develop in a particular way. Using Gherkin to communicate beyond those that fear and respect its limits is dangerous, to me.

Are the tests “readable”, though? They have very heavy abstractions that their flavour text betrays. As far as I’m concerned we should never trust English language representations of computer scripts, and the human-readable part is just our attempt to write scripts that support the behaviour we’re trying to check. We don’t get readable tests; we get checks based on something readable. I don’t think it fully goes both ways.


Having had a play, I think Gherkin potentially encourages some good practices if used well. These include tightly focused, readable test cases* and framework design that more closely models the user experience**. So I don’t necessarily view Gherkin as just a thing for developers.

* And, more importantly, get to the point of what is being tested. I’ve seen too many test cases in my own environment where I’ve thought “Sure, I can do that, but what are you actually testing?”.

** State awareness for example, just like a human being is state aware when using a bit of software.

Hi Mark,

I agree that BDD is not testing, but I am not sure I follow. My teams have used Cucumber and/or JBehave for their BDD/ATDD scenarios very proficiently. If you are interested in seeing how they did it, you can have a look at


You mention that “Cucumber, or more specifically Gherkin, isn’t a great tool to use for detecting change across a complex system because GWT is designed to focus on the product as a whole”, but this might be down to the approach you used. You can abstract any simple check, even at the unit level, into a GWT. Why couldn’t you?

Also, the fact that you use Cucumber or JBehave doesn’t stop you from using other tools for your regression testing; personally, we very rarely found it necessary.

For context, most of the teams that used the approach I describe release multiple times a day and can count the production defects escaping every year on one hand.

Hi @gus

Yeah, that final P.S. was poorly worded. I used the term testing in there in a bad way, hence the confusion. Specifically what I am saying is that:

  1. Outside in Dev (ATDD / TDD) is for the risks around delivery. Devs use it as a guide when they implement production code to make the failing tests pass. After that, if the tests fail when they are run, they are telling us our product is drifting away from business expectations. But that only gives us partial feedback on a product (although it is a part that is a priority).
  2. Automated regression checking (or whatever you want to call it, basically anything other than ATDD/TDD activities) is focusing on risks around the application that are not explicitly business focused.

So as an example:

I totally agree that a captured automated GWT can focus on a specific unit of a product. When it is run and it ‘passes’, that tells us we have delivered what the business wants (I will leave aside, for now, the debate over whether a user would interact directly with the unit or with an abstraction above it). So that runs and it tells us we have mitigated the risk of not delivering the right thing. But there are other risks we need to be mindful of.

Now as a tester, I might be interested in things such as the boundaries of that unit, how it handles failure, whether it is able to integrate with other units, or its security or its performance, etc. These aren’t behaviours or risks explicitly stated by the business, but I may want automated regression checks/tests for them. So I will use tools that give me the fastest feedback and are the most deterministic, and that typically means removing (or never using) abstractions like Gherkin.
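As a rough illustration of checks aimed straight at a unit with no Gherkin layer in between, here is a small Python sketch. `parse_quantity` and its 1 to 99 range are hypothetical, invented only to give the boundary and failure checks something to run against.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical unit under test: accepts whole quantities from 1 to 99."""
    value = int(raw.strip())  # int() raises ValueError on non-numeric input
    if not 1 <= value <= 99:
        raise ValueError(f"quantity out of range: {value}")
    return value


def rejects(raw: str) -> bool:
    """True when the unit refuses the input by raising ValueError."""
    try:
        parse_quantity(raw)
    except ValueError:
        return True
    return False


def check_boundaries() -> None:
    # Edges of the valid range
    assert parse_quantity("1") == 1
    assert parse_quantity("99") == 99
    # Just outside the valid range
    assert rejects("0")
    assert rejects("100")
    # Failure handling and input tolerance
    assert rejects("")
    assert rejects("ten")
    assert parse_quantity(" 7 ") == 7


if __name__ == "__main__":
    check_boundaries()
    print("boundary checks passed")
```

None of these cases came from a business conversation, so wrapping each one in Given/When/Then would add a layer of wording and step matching without adding any shared understanding.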

To bring it back to my poorly worded original sentence: doing all your change detection activities with GWT is an inefficient and ineffective approach. Base it on risk. Use Outside-in-Dev (ATDD/TDD) for specific risks around business delivery; any other risks don’t really need GWT. The two work side by side, focusing on different risks.

Thanks for your answer Mark, much appreciated. And thanks for the respectful conversation!

You say "Outside in Dev (ATDD / TDD) is for the risks around delivery." I would partially agree. Yes, it is for the risks, but more importantly it is for discovery. The conversations are what drive what the team will deliver; the risk side is a very small, pleasant side effect of test-first approaches. Test first is way more than tests; automated tests are its pleasant side effect, IMHO.

You say "Automated regression checking (or whatever you want to call it, basically anything other than ATDD/TDD activities) is focusing on risks around the application that are not explicitly business focused."

I agree with you, and nothing stops you from using GWT to describe the automated tests that are not explicitly business focused. I am not saying it is the best choice, but it is a choice and it does work in some contexts.

You say “Doing all your change detection activities with GWT is an inefficient and ineffective approach.”
We’ll have to agree to disagree on this :smiley:. I am sure your way works very well for you and I won’t try to change your mind. Mine works too :speak_no_evil:

Again, totally agree. It’s worthless without the conversation. In my original post, I am making the assumption that teams that do Outside-in-Development have nailed the conversation part first.

What I am attempting to do is frame the automation tooling part of BDD and how it compares to other automation activities. I’m trying to highlight the difference in a way that doesn’t retread the typical reply of ‘BDD is not about testing, it’s about conversations’, which I feel, as a phrase, doesn’t contextualise what the tools are actually for and is overly dismissive of both activities.

Fair enough, we will have to disagree on that. I personally think that the troubles teams have with automation arise because they are trying to use GWT, which is designed to make you think abstractly, to test implicit ideals of a system. But yeah, it’s about preference and context. It tends to be my experience that this conversation isn’t even had when strategising automation, so it’s worth discussing the merits and drawbacks of both approaches (ATDD without GWT tooling creates a barrier for the business). Automation strategies tend to be agreed based on tooling rather than risk, so I want to raise awareness of that.

However, allow me to turn my argument on its head and offer an alternative. Imagine you have some well-formed examples that have come out of an excellent discussion, and the team uses them in an ATDD approach to deliver what the business wants. Throwing another 20 or 30 examples in there that document all the other testing that was done would devalue the examples captured at the start. They cease to be living documentation and become test cases, which doesn’t help the shared understanding we are trying to achieve with conversations.

I’ve had experiences where we have fallen into this trap and it has bled into the conversations, where we focus too much on capturing how we are going to test a feature rather than on what value we want to deliver to the business/user.


Before discussing the use of Cucumber, we should understand what the tool is. Cucumber is a testing tool that supports BDD (behavior-driven development). BDD is an approach that consists of defining the behavior of a feature through examples written in plain text.

Cucumber does the same thing: it reads specifications written in plain text by the user and validates the application’s response against them. The main benefit of using Cucumber is that it is an open-source tool, and many QA testing service providers use it because it helps the user write test cases that anyone can easily understand regardless of their technical knowledge.

Cucumber has a specific set of syntax rules called ‘Gherkin’. It is a collection of grammar rules that structure the plain text in such a way that Cucumber can understand it.

Below is an example of a scenario written in the Gherkin language:

Scenario: AddTwoNumbers
  Given I have two numbers five and ten
  When I add the two numbers
  Then their sum should be equal to fifteen
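A scenario like this does nothing on its own; each step has to be matched to a piece of code. The sketch below is not Cucumber’s API. It is a tiny stand-in written with Python’s standard `re` module to show the general idea of matching step text to handlers. The `step` decorator, `run_scenario`, the `WORDS` lookup and the exact step wording are all assumptions made for illustration.

```python
import re

SCENARIO = """
Scenario: AddTwoNumbers
  Given I have two numbers five and ten
  When I add the two numbers
  Then their sum should be equal to fifteen
"""

WORDS = {"five": 5, "ten": 10, "fifteen": 15}  # number words used in the example
STEPS = []  # registry of (compiled pattern, handler) pairs


def step(pattern):
    """Register a handler for step text that fully matches the pattern."""
    def register(handler):
        STEPS.append((re.compile(pattern), handler))
        return handler
    return register


@step(r"I have two numbers (\w+) and (\w+)")
def have_numbers(ctx, first, second):
    ctx["numbers"] = (WORDS[first], WORDS[second])


@step(r"I add the two numbers")
def add_numbers(ctx):
    ctx["sum"] = sum(ctx["numbers"])


@step(r"their sum should be equal to (\w+)")
def check_sum(ctx, expected):
    assert ctx["sum"] == WORDS[expected]


def run_scenario(text):
    """Run each Given/When/Then line through the first matching handler."""
    ctx = {}
    for line in text.strip().splitlines():
        keyword, _, body = line.strip().partition(" ")
        if keyword not in ("Given", "When", "Then", "And", "But"):
            continue  # skip the "Scenario:" header and blank lines
        for pattern, handler in STEPS:
            match = pattern.fullmatch(body)
            if match:
                handler(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step definition for: {body}")
    return ctx


if __name__ == "__main__":
    print(run_scenario(SCENARIO)["sum"])  # 15
```

The plain-text steps stay readable, but someone still has to write and maintain the handlers behind them.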

Many software testing companies use Cucumber for the following reasons:

  1. QA folks can write test scripts without in-depth knowledge of programming/coding, which is a huge plus.
  2. Cucumber is also pretty simple and easy to set up compared to other testing tools. It supports multiple programming languages and works with different tools and platforms such as Selenium, Ruby on Rails, Watir, etc.

Hope this information is helpful for you!