Cucumber

What is BDD, and How Do We Practice It? (Webinar + Q&A)

On March 18, 2019, I gave a webinar entitled, “What is Behavior-Driven Development, and How Do We Practice It?” in collaboration with Paul Merrill and his company, Beaufort Fairmont. It was both a pleasure and an honor to do this webinar with them. Paul is a top-notch test automation expert, and Beaufort Fairmont is doing really exciting things. Check out their two-day BDD training offering, as well as their blog and other webinars.

To see my webinar recording, register here.

During the webinar, attendees asked more questions than we could answer. I’m excited that so many people asked questions. My answers are below.

Questions about Process

How is BDD different from TDD (Test-Driven Development)?

BDD is an evolution of TDD. In TDD, developers (1) write unit tests and watch them fail, (2) develop the feature to make the tests pass, (3) refactor the code to make it stronger, and (4) repeat the cycle. In BDD, teams do this same loop with feature tests (a.k.a “acceptance” or “black-box” tests) as well as unit tests. Furthermore, BDD adds shift left practices like Example Mapping and Specification by Example so that teams know what they are doing and focus on developing the right things.

Check out Dan North’s article, Introducing BDD, for a more thorough answer.

Can BDD be used with manual testing?

Yes! BDD is not merely an automation tool – it is a set of pragmatic practices to help teams develop better software. Gherkin scenarios are first and foremost behavior specs that help a team’s collaboration and accountability. They function secondarily as test cases that can be executed either manually or with automation.

Can we use BDD with technical stories or backend features?

Yes! If you can describe it, then you can do it.

How many Gherkin scenarios should one story have?

There’s no hard rule, but I recommend no more than a handful of rules per story, and no more than a handful of examples per rule. If you do Example Mapping and feel overwhelmed by the number of cards for a story, then the story should probably be broken into smaller stories.

Should we do Example Mapping for every story? Spending 20-30 minutes for each story would take a long time.

Try doing Example Mapping on one or two stories to start. The first time is always rough, but as you iterate on it, you’ll get better as a team. Even though Example Mapping has an upfront time cost, it will save a lot of time later in the sprint because (a) acceptance criteria is clear, (b) tests are already written, and (c) everyone has a mutual understanding of the story. The team won’t suffer through the inefficiencies of miscommunication and poor planning. You may even want to replace planning meeting with Example Mapping meetings.

What metrics should we use with BDD?

All metrics are flawed, but some metrics are useful. All the standard testing and Agile metrics still apply: code coverage, story velocity, etc. Here are some additional metrics you may consider for BDD:

the percentage of stories that undergo Example Mapping before the sprint
the number of rules and examples that get “missed” during Example Mapping and need to be added later
the percentage of Gherkin scenarios that get automated in the sprint

If you choose to track metrics, make sure their feedback is used to improve team practices. For more info on metrics, please read my Quality Metrics 101 series.

What were the resources you recommended at the end of the webinar?

Questions about Tools

What test management tools should we use with BDD?

I’m sure there are BDD plugins for test management tools, but I don’t have any that I can personally recommend. To be honest, I try to stay away from large test management tools like HP ALM, qTest, VersionOne. When doing BDD, the Gherkin feature files themselves should be the single source of truth for feature-level tests, and they should be version-controlled in a repository. Don’t fall into the trap of slapping “Given-When-Then” keywords onto existing functional tests – that’s not BDD.

Does Jira support Example Mapping?

I have not personally used any Jira plugin for Example Mapping. It looks like there is an Easy Agile User Story Maps plugin that is similar to but slightly different from Example Mapping.

Are there other good tools for BDD and Example Mapping?

Cycle Automation provides a nice app with Gherkin steps out of the box, so you can automate tests without needing a programming language.
TeamUp Labs provides an online Example Mapping tool.
IDEs from JetBrains and Eclipse provide BDD plugins
Gherkin Syntax Highlighting in Notepad++
Gherkin Syntax Highlighting in Visual Studio Code
Gherkin Syntax Highlighting in Atom
Gherkin Syntax Highlighting in Chrome

What’s the difference between Gherkin, Cucumber, and SpecFlow?

Gherkin is the Given-When-Then spec language.
Cucumber is a company and its eponymous test framework that uses Gherkin.
SpecFlow is Cucumber for .NET.

Questions about Testing

Can BDD test frameworks be used for unit testing?

Yes, but I don’t recommend it. BDD frameworks shine for black-box feature testing. They’re a bit too verbose for code-level unit tests. Read BDD 101: Unit, Integration, and End-to-End Tests for more info.

Can BDD test frameworks be used for integration testing?

Yes! See BDD 101: Unit, Integration, and End-to-End Tests.

How long should Gherkin scenarios be?

Scenarios should be bite-sized. Each scenario should focus on one individual behavior. There’s no hard rule, but I recommend single-digit step counts. Read BDD 101: Writing Good Gherkin for more info.

What are “step definitions” in Cucumber?

Step definitions are the methods in the automation code that execute the steps. When a BDD framework runs a Gherkin scenario as a test, it “glues” each step to a step definition based on some sort of string matching.

How can we minimize duplicate code within a BDD test framework?

Know your steps. Always search for existing steps before writing new steps. Refactor existing steps whenever appropriate. Reuse steps when writing new scenarios. Do pair programming or mob programming when writing scenarios. Put scenarios through code reviews. Apply good coding practices – remember, test automation is software.

I write Gherkin scenarios, but I don’t write test automation code. What’s the best way to write Gherkin scenarios so that they can be automated?

Do pair programming with the automation engineers to write Gherkin scenarios together. Become familiar with existing steps by reading and searching feature files. Otherwise, the Gherkin steps you write in isolation might not be usable. Remember, BDD is a team effort!

The examples in the webinar were all fairly basic. Do you have any examples with more complex systems?

I have some example projects on GitHub in Python and Java with some basic unit, integration, and end-to-end tests, but I don’t have any large-scale examples that I can share publicly.

We wrote hundreds of SpecFlow tests without the other Amigos. Now, there are large test gaps, and many steps aren’t reusable. What should we do?

I’m sorry to hear that. It’s not an uncommon story. There are two paths: (1) refactoring or (2) starting over. Without really knowing the situation, I don’t think it’s my place to say which way is better. Here are some questions to help guide your decision:

What are your goals for testing and automation?
What’s your overall quality and testing strategy?
What parts of the code base are salvageable?
What parts of the code base should be removed?
If you started again from scratch, what would you do differently to make sure the same problems don’t reoccur?

I strongly recommend taking the Setting a Foundation for Successful Test Automation course from Test Automation University. (It’s free.) I also gave a talk about this very problem, Egad! How Do We Start Writing (Better) Tests?, at a few Python conferences.

We have a large BDD test suite with heavy coupling and slow execution times. The business amigos have also left the company. Should we try to fix what we have or just start over?

Sorry to hear that; same answer as before.

Final Questions

Why do you call yourself the “Automation Panda”?

Pandas are awesome. Everybody loves them. And nobody forgets my moniker.

Where can I get team training in BDD?

Beaufort Fairmont provides a one- or two-day course in BDD and writing Gherkin. Sign up for more information here.

Behavior-Driven Blasphemy

This is my 100th post on Automation Panda! I’m thrilled to see how much this blog has grown and how many people it has helped. For such a monumental occasion, I have chosen to voice a rather controversial opinion about test automation.

Behavior-driven development seems to be the software testing buzzword of the decade. What started as a refinement of test-driven development by developers in Europe and the UK quickly became the big process fad of the 2010’s. The Cucumber project (now 10 years old) developed or inspired Gherkin-based test automation frameworks in all the major programming languages. Companies started requiring Given-When-Then format for acceptance criteria and test scenarios. Three Amigos meetings became standard calendar fixtures during sprints. Organizations that once undertook “Agile transformations” now have similar initiatives for BDD. For better or worse, BDD exists and cannot be ignored.

The dogmatic benefits of BDD are better collaboration and automation. However, leaders frequently insist that Gherkin-style test frameworks add value only when paired with practices like Example Mapping. “BDD is a process, not a tool,” is a common mantra. “Otherwise, the Gherkin just gets in the way.” Although I wholeheartedly agree that behavior-driven practices add significant value to the development process, I nevertheless espouse a rather blasphemous opinion:

BDD test automation frameworks are better than traditional frameworks for black box functional testing even when BDD processes are not followed.

What Exactly Are You Saying?

My claim is that behavior-driven test frameworks like Cucumber, SpecFlow, and behave are significantly better than traditional xUnit-style frameworks for testing live features. For example, I would rather use SpecFlow than NUnit for testing a Web app with Selenium WebDriver, whether or not the other two Amigos are with me. The resulting automation code has better structure, readability, and reusability.

I’m not saying that teams shouldn’t do BDD practices, and I’m not saying that the Three Amigos should be separated. Collaboration is key to success, and BDD really helps. Example Mapping is one of the most useful practices a development team can do. I’m also not saying that BDD frameworks should be used for all testing purposes – they are poorly suited for unit testing and for performance testing.

Objection!

I find myself very lonely in this opinion. BDD leaders repeatedly insist that BDD is not about testing and automation:

BDD is not about Testing (slides by Dan North)
The world’s most misunderstood collaboration tool by Aslak Hellesøy
BDD is – BDD is not by Augusto Evangelisti
BDD Tool Cucumber is Not a Testing Tool by Jan Stenberg
3 misconceptions about BDD from ThoughtWorks

The most outspoken BDDers (mostly coalescing around the Cucumber community) have largely moved their focus to the collaboration benefits, almost forsaking the automation benefits. (This may not necessarily be true, but it appears that way based on the literature and materials floating on the Web.) That outlook is somewhat disingenuous because the main tools supporting BDD are, in fact, test frameworks.

BDD also has outspoken opponents – it’s love or hate. I’ve personally spoken with several engineers who despise Gherkin-based frameworks. “I can see how it would be valuable when a whole team embraces behavior-driven practices,” many have told me, “but otherwise, the Gherkin layer just gets in the way of automation.” I’ve heard it called “plaster” and “garbage.” Engineers just want to code their tests. And code should always be readable, right?

Testing is an inherently opinionated space. People can never seem to agree on things.

The Bigger Picture

Test automation must be developed regardless of any specific development practices, and its architecture must stand firmly in its own right. Unfortunately, both sides miss the bigger picture:

The best solution for test automation is a domain-specific language.

A domain-specific language (DSL) is a programming language with a purpose. It is designed to handle very specific needs, rather than general-purpose programming. For example:

SQL is a DSL for database queries.
XPath is a DSL for finding elements in an XML document.
YAML is a DSL for object serialization.

Gherkin is also a DSL – for behavior specification.

Domain-specific languages naturally suit test automation due to the clear difference between test cases and test code. Test cases are procedures that exercise product behavior. Anyone can write a test case. They are dictated or explained in plain language. Test code, however, is the software implementation of test cases. Test code handles function calls, logging, exceptions, and all those other little programming details that help run tests. A test automation DSL separates those concerns: test cases are written in a special language, and the interpreter handles repetitive, low-level details. Some type of extensions can handle product-specific interactions. The purpose of a language is to effectively express intention – and the intention is to test the product.

To truly achieve an optimal solution, however, the DSL and its interpreter must be treated as part of the automation software, just like the test cases and extensions. Remember, a language’s interpreter is just another piece of software. The interpreter is part of the separation of concerns and the single responsibility principle. Concerns that would typically be handled by classes and functions in traditional test code should be moved to the interpreter. For example, the interpreter should automatically log every test case step, rather that forcing the author to write explicit logging statements.

When I worked at NetApp years ago, I implemented a DSL to test platform-level features of our operating system. I called it DS – short for “Design Steps” (from HP ALM) (but also not without an affinity for the Nintendo DS). NetApp’s entire test automation code was developed in Perl at the time, so I implemented the DS interpreter in Perl to reuse existing libraries. DS test cases were typically only a dozen lines long each, and DS expressions could call specially-written Perl modules directly for complete extendability. During the first big release using DS, my team saved countless hours of automation development time as compared to the previous release while delivering a higher number of tests. I also did this before I had ever heard of BDD.

Unfortunately, most teams have neither the time to develop their own testing DSL nor the understanding of compiler theory to build it right. And if they were given such a language, they typically limit themselves to the provided implementation instead of taking ownership to extend the language for their needs.

The original Nintendo DS. Fun times!

Who Truly Misunderstands Gherkin?

Enter Gherkin: the world’s first major general-purpose, off-the-shelf language for test automation. It is general enough to cover any case through its plain language steps, yet specific enough to standardize tests. Users don’t need to be compiler theory experts – they just make up their own step names and provide the definition code to execute them. Early BDD projects like JBehave and Cucumber packaged an interpreter as a test framework and delivered it to a testing world still stuck on JUnit. The need for a testing DSL was there, whether or not the BDD folks meant to serve it.

Cucumber-ites frequently bemoan that their framework is misunderstood by the masses. They shudder to see teams using their framework purely for test automation. However, Cucumber effectively lowered the entry barrier for teams to make their own testing DSLs. Kodak did the same thing for film: they made it cheap and standard so anyone could be a photographer. Not everyone who uses a BDD framework misunderstands its purpose: some (like me) just see an alternative value proposition than what is preached by orthodox BDD practitioners. Gherkin fills a need that nobody knew. Its popularity validates that claim.

Benefits Apart from Process

Using a BDD framework adds much value to testing and development even without BDD processes. Below are just a handful of benefits:

Focus first on good scenarios. Gherkin forces authors to think before they code.
Faster automation development. Gherkin steps are reusable and parametrizable.
Stronger structure. Engineers know where to put things in the framework.
Test understandability. Anyone can read scenarios because they are written in plain language. Business people can help. New people can pick it up fast.
Test sharing. Feature files can be shared apart from test code, which can be helpful for business partners.
Test similarity. Tests all look the same. Team members can more easily help each other.
Clearer failures. When a scenario fails, reports show exactly what step failed.
Simpler bug reports. Use scenario steps as instructions to reproduce the failure.
2-phase test reviews. Review Gherkin first and then test code second to make sure the test cases are good before implementing the wrong things.
BDD enablement. Using a BDD framework opens the door for a team to embrace better behavioral practices in the future.

I wrote about these advantages before:

Case Studies

I’m also not the only one who finds value in BDD test frameworks outside of the full BDD process. Below are five case studies.

radish

radish is a Python test framework inspired by Cucumber. Its DSL syntax is a superset of Gherkin that adds preconditions, loops, variables, and expressions. These language additions indicate a bias towards automation because they enable engineers to write tests more programmatically, albeit in a Gherkin-ese way.

Karate

Karate is a test framework with a full DSL based on Gherkin with steps specifically tailored to Web service calls. Although it is implemented in Java, testers do not need to do any Java programming to write complete tests cases from day one. Peter Thomas, the creator of Karate, unabashedly declares that Karate does not truly adhere to BDD but nevertheless uses Cucumber for its automation benefits. (Note: Karate is working to move completely off of Cucumber. See GitHub issue #444.)

REST Assured

REST Assured is a Java package for testing REST APIs. Unlike Karate, REST Assured provides a fluent syntax (and not a DSL) for writing service calls directly in Java code. The fluent syntax is based on Gherkin: given() a request spec is created, when() the call is made, then() verify the response. Although REST Assured is not a full testing framework, it nevertheless pulls inspiration from BDD frameworks for order and structure.

Cycle

Cycle is a BDD-focused test automation platform from Cycle Labs for testing Web, terminal, and desktop apps. Cycle is unique because it provides out-of-the-box steps for all types of supported testing so that no programming experience is required. Testers write feature files using Cycle 2.0’s slick new Electron app. Scenarios are written in CycleScript, a Gherkin-ese language with additions like variables and sub-scenario calls. Steps tend to be imperative, but that’s the tradeoff for not requiring lower-level programming.

Hexawise

Hexawise is a combinatorial testing tool designed to maximize coverage with minimal test counts by smartly joining feature variations. It helps testers write better tests with less redundancy and fewer gaps. Although Hexawise has historically assisted manual testers, it also can generate Gherkin feature files for test variations.

Not all cucumbers are the same. Above is a sea cucumber.

Good Enough?

Gherkin-based test frameworks are not perfect, but they do provide good structure. They gained popularity outside of the pure BDD movement because they genuinely added value to testing and automation. Like any other tool, teams will use them in both good and bad ways. (Trust me, I’ve seen scary Gherkin.)

It’s interesting to see how groups outside the Cucumber diaspora are attempting to solve the limitations of pure Gherkin. Each case study above showed a unique path. Clearly, the test automation problem has not yet been completely solved, but current BDD frameworks are the best off-the-shelf solutions we have until a new software testing movement comes along.

A Cucumber Kimchi Recipe

My wife and I love to eat kimchi. Recently, I started making it at home instead of buying the big (expensive) jars at our local Asian market. I just made oi-sobagi (오이소박이), a cucumber kimchi, for the first time. Given that “Cucumber” is a test automation buzzword, I figured I’d share the recipe here. I hope you enjoy it as much as my Chinese cucumber recipe!

The recipe I used came from Maangchi, “YouTube’s Korean Julia Child.” She has many other delicious Korean recipes, too. Her oi-sobagi recipe is here:

5 Things I Love About SpecFlow

SpecFlow, a.k.a. “Cucumber for .NET,” is a leading BDD test automation framework for .NET. Created by Gáspár Nagy and maintained as a free, open source project on GitHub by TechTalk, SpecFlow presently has almost 3 million total NuGet downloads. I’ve used it myself at a few companies, and, I must say as an automationeer, it’s awesome! SpecFlow shares a lot in common with other Cucumber frameworks like Cucumber-JVM, but it is not a knockoff – it excels in many ways. Below are five features I love about SpecFlow.

#1: Declarative Specification by Example

SpecFlow is a behavior-driven test framework. Test cases are written as Given-When-Then scenarios in Gherkin “.feature” files. For example, imagine testing a cucumber basket:

Feature: Cucumber Basket
  As a gardener,
  I want to carry many cucumbers in a basket,
  So that I don’t drop them all.
  
  @cucumber-basket
  Scenario: Fill an empty basket with cucumbers
    Given the basket is empty
    When "10" cucumbers are added to the basket
    Then the basket is full

Notice a few things:

It is declarative in that steps indicate what should be done at a high level.
It is concise in that a full test case is only a few lines long.
It is meaningful in that the coverage and purpose of the test are intuitively obvious.
It is focused in that the scenario covers only one main behavior.

Gherkin makes it easy to specify behaviors by example. That way, everybody can understand what is happening. C# code will implement each step in lower layers. Even if your team doesn’t do the full-blown BDD process, using a BDD framework like SpecFlow is still great for test automation. Test code naturally abstracts into separate layers, and steps are reusable, too!

#2: Context is King

Safely sharing data (e.g., “context”) between steps is a big challenge in BDD test frameworks. Using static variables is a simple yet terrible solution – any class can access them, but they create collisions for parallel test runs. SpecFlow provides much better patterns for sharing context.

Context injection is SpecFlow’s simple yet powerful mechanism for inversion of control (using BoDi). Any POCOs can be injected into any step definition class, either using default values or using a specific initialization, by declaring the POCO as a step def constructor argument. Those instances will also be shared instances, meaning steps across different classes can share the same objects! For example, steps for Web tests will all need a reference to the scenario’s one WebDriver instance. The context-injected objects are also created fresh for each scenario to protect test case independence.

Another powerful context mechanism is ScenarioContext. Every scenario has a unique context: title, tags, feature, and errors. Arbitrary objects can also be stored in the context object like a Dictionary, which is a simple way to pass data between steps without constructor-level context injection. Step definition classes can access the current scenario context using the static ScenarioContext.Current variable, but a better, thread-safe pattern is to make all step def classes extend the Steps class and simply reference the ScenarioContext instance variable.

#3: Hooks for Any Occasion

Hooks are special methods that insert extra logic at critical points of execution. For example, WebDriver cleanup should happen after a Web test scenario completes, no matter the result. If the cleanup routine were put into a Then step, then it would not be executed if the scenario had a failure in a When step. Hooks are reminiscent of Aspect-Oriented Programming.

Most BDD frameworks have some sort of hooks, but SpecFlow stands out for its hook richness. Hooks can be applied before and after steps, scenario blocks, scenarios, features, and even around the whole test run. (Cucumber-JVM, by contrast, does not support global hooks.) Hooks can be selectively applied using tags, and they can be assigned an order if a project has multiple hooks of the same type. Hook methods will also be picked up from any step definition class. SpecFlow hooks are just awesome!

#4: Thorough Outline Templating

Scenario Outlines are a standard part of Gherkin syntax. They’re very useful for templating scenarios with multiple input combinations. Consider the cucumber basket again:

Feature: Cucumber Basket
  
  Scenario Outline: Add cucumbers to the basket
    Given the basket has "<initial>" cucumbers
    When "<some>" cucumbers are added to the basket
    Then the basket has "<total>" cucumbers

    Examples: Counts
      | initial | some | total |
      | 1       | 2    | 3     |
      | 5       | 3    | 8     |

All BDD frameworks can parametrize step inputs (shown in double quotes). However, SpecFlow can also parametrize the non-input parts of a step!

Feature: Cucumber Basket
  
  Scenario Outline: Use the cucumber basket
    Given the basket has "<initial>" cucumbers
    When "<some>" cucumbers are <handled-with> the basket
    Then the basket has "<total>" cucumbers

    Examples: Counts
      | initial | some | handled-with | total |
      | 1       | 2    | added to     | 3     |
      | 5       | 3    | removed from | 2     |

The step definitions for the add and remove steps are separate. The step text for the action is parametrized, even though it is not a step input:

[When(@"""(\d+)"" cucumbers are added to the basket")]
public void WhenCucumbersAreAddedToTheBasket(int count) { /* */ }

[When(@"""(\d+)"" cucumbers are removed from the basket")]
public void WhenCucumbersAreRemovedFromTheBasket(int count) { /* */ }

That’s cool!

#5: Test Thread Affinity

SpecFlow can use any unit test runner (like MsTest, NUnit, and xUnit.net), but TechTalk provides the official SpecFlow+ Runner for a licensed fee. I’m not associated with TechTalk in any way, but the SpecFlow+ Runner is worth the cost for enterprise-level projects. It has a friendly command line, a profile file to customize execution, parallel execution, and nice integrations.

The major differentiator, in my opinion, is its test thread affinity feature. When running tests in parallel, the major challenge is avoiding collisions. Test thread affinity is a simple yet powerful way to control which tests run on which threads. For example, consider testing a website with user accounts. No two tests should use the same user at the same time, for fear of collision. Scenarios can be tagged for different users, and each thread can have the affinity to run scenarios for a unique user. Some sort of parallel isolation management like test thread affinity is absolutely necessary for test automation at scale. Given that the SpecFlow+ Runner can handle up to 64 threads (according to TechTalk), massive scale-up is possible.

But Wait, There’s More!

SpecFlow is an all-around great test automation framework, whether or not your team is doing full BDD. Feel free to add comments below about other features you love (or *gasp* hate) about SpecFlow!

BDD Example Mapping

The two major goals of Behavior-Driven Development are better collaboration and automation. Even when the Three Amigos actually get together, collaboration can be tough. Where do we start? What scenarios should we write? What examples should be included?

Well, the Cucumber folks have a practice called “Example Mapping” to make it easier. All you need is a pack of index cards and a big table!

Write the story under discussion on a yellow card at the top of the table.
Write a rule for each known acceptance criteria on a blue card under the story.
Write each example for a rule on a green card.
Write each open question on a red card on the side to discuss later.

Keep writing cards until the team is satisfied with the story. This process provides clear, fast feedback for stories. It should take about 25 minutes per story. A team can quickly see if a story is too big or needs further refinement. Engineers can easily turn example cards into Gherkin scenarios. Remember to assign questions to owners to get answers.

Rather than duplicate documentation here, please read Matt Wynne’s seminal post on the practice, Introducing Example Mapping.

Also, watch this webinar recording from Cucumber about Example Mapping:

Pipe Character Escape for Gherkin Tables

For the first time today, I had to write a Gherkin behavior scenario in which table text needed to use the pipe character “|”. I wanted a generic step that would find and click web page links by name, and one of the link names had the pipe in it! The first version of the step I wrote looked like this:

When the user follows the links:
  | link              |
  | Category          |
  | Sub-Category      |
  | Index|Description |

Naturally, this step didn’t parse – the “|” was parsed as a table delimiter instead of the intended link text. I could have rewritten the step to search for partial link text, or I could have done a key-value lookup, but I wanted to keep the step simple and direct.

The solution was simple: escape the pipe character “|” with a backslash character “\”. Easy! Thanks, StackOverflow! The updated table looks like this:

When the user follows the links:
 | link               |
 | Category           |
 | Sub-Category       |
 | Index\|Description |

“\|” works for both step tables and scenario outline example tables. It looks like it is fairly standard for test frameworks that use Gherkin. I verified that Cucumber-JVM and SpecFlow support it, and it looks like Cucumber for Ruby does as well. It looks like behave will support it in 1.2.6.

After learning this trick, I updated the BDD 101: The Gherkin Language page.

Note that backslash escape sequences won’t work for quotes in Gherkin steps. Quotes in steps are merely conventions and not part of the Gherkin language standard.

Cucumber-JVM for Java

This post is a concise-yet-comprehensive overview of Cucumber-JVM for Java. It is an introduction, a primer, a guide, and a reference. If you are new to BDD, please learn about it before using Cucumber-JVM.

Introduction

Cucumber is an open-source software test automation framework for behavior-driven development. It uses a business-readable, domain-specific language called Gherkin for specifying feature behaviors that become tests. The Cucumber project started in 2008 when Aslak Hellesøy released the first version of the Cucumber framework for Ruby.

Cucumber-JVM is the official port for Java. Every Gherkin step is “glued” to a step definition method that executes the step. The English text of a step is glued using annotations and regular expressions. Cucumber-JVM integrates nicely with other testing packages. Anything that can be done with Java can be handled by Cucumber-JVM. Cucumber-JVM is ideal for black-box, above-unit, functional tests.

[Update on 7/29/2018: As of version 3.0.0, Cucumber-JVM no longer supports JVM languages other than Java – namely Groovy, Scala, Clojure, and Gosu. Please read Cucumber-JVM is dropping support of JVM Languages on the official Cucumber blog. The Gherkin parser is also now written in Go instead of in Java.]

Example Projects

Github contains two Cucumber-JVM example projects for this guide:

cucumber-jvm-java-example for traditional annotation-style step defs
cucumber-jvm-java8-example for Java 8 lambda-style step defs

The projects use Java, Apache Maven, Selenium WebDriver, and AssertJ. The README files include practice exercises as well.

Prerequisite Skills

To be successful with Cucumber-JVM for Java, the following skills are required:

BDD and Gherkin (read BDD 101 to get up-to-speed)
Intermediate-level Java programming
Test automation (like JUnit and TestNG)
Build process (like ANT, Maven, and Gradle)

Prerequisite Tools

Test machines must have the Java Development Kit (JDK) installed to build and run Cucumber-JVM tests. They should also have the desired build tool installed (such as Apache Maven). The build tool should automatically install Cucumber-JVM packages through dependency management.

An IDE such as JetBrains IntelliJ IDEA (with the Cucumber for Java plugin) or Eclipse (with the Cucumber JVM Eclipse Plugin) is recommended for Cucumber-JVM test automation development. Software configuration management (SCM) with a tool like Git is also strongly recommended.

Versions

Cucumber-JVM 2.0 was released in August 2017 and should be used for new Cucumber-JVM projects. Releases may be found under Maven Group ID io.cucumber. Older Cucumber-JVM 1.x versions may be found under Maven Group ID info.cukes.

Build Management

Apache Maven is the preferred build management tool for Cucumber-JVM projects. All Cucumber-JVM packages are available from the Maven Central Repository. Maven can automatically run Cucumber-JVM tests as part of the build process. Projects using Cucumber-JVM should follow Maven’s Standard Directory Layout. The examples use Maven. Gradle may also be used, but it requires extra setup.

Every Maven project has a POM file for configuration. The POM should contain appropriate Cucumber-JVM dependencies. There is a separate package for each JVM language, dependency injection framework, and underlying unit test runner. Since Cucumber-JVM is a test framework, its dependencies should use test scope. Check io.cucumber on the Maven site for the latest packages and versions.

Project Structure

Cucumber-JVM test automation has the same layered approach as other BDD frameworks:

The higher layers focus more on specification, while the lower layers focus more on implementation. Gherkin feature files and step definition classes are BDD-specific.

Cucumber-JVM tests may be included in the same project as product code or in a separate project. Either way, projects using Cucumber-JVM should follow Maven’s Standard Directory Layout: test code should be located under src/test.

Screenshot of the example project from IntelliJ IDEA’s Project view.

Gherkin Feature Files

Gherkin feature files are text files that contain Gherkin behavior scenarios. They use the “.feature” extension. In a Maven project, they belong under src/test/resources, since they are not Java source files. They should also be organized into a sensible package hierarchy. Refer to other BDD pages for writing good Gherkin.

A feature file from the example projects, opened in IntelliJ IDEA.

Step Definition Classes

Step definition classes are Java classes containing methods that implement Gherkin steps. Step def classes are like regular Java classes: they have variables, constructors, and methods. Steps are “glued” to methods using regular expressions. Feature file scenarios can use steps from any step definition class in the project. In a Maven project, step defs belong in packages under src/test/java, and their class names should end in “Steps”.

The Basics

Below is a step definition class from the cucumber-jvm-java-example project, which uses the traditional method annotation style for step defs as part of the cucumber-java package. Each method should throw Throwable so that exceptions are raised up to the Cucumber-JVM framework.

package com.automationpanda.example.stepdefs;

import com.automationpanda.example.pages.GooglePage;
import cucumber.api.java.After;
import cucumber.api.java.Before;
import cucumber.api.java.en.Given;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.assertj.core.api.Assertions.assertThat;

public class GoogleSearchSteps {

  private WebDriver driver;
  private GooglePage googlePage;

  @Before(value = "@web", order = 1)
  public void initWebDriver() throws Throwable {
    driver = new ChromeDriver();
  }

  @Before(value = "@google", order = 10)
  public void initGooglePage() throws Throwable {
    googlePage = new GooglePage(driver);
  }

  @Given("^a web browser is on the Google page$")
  public void aWebBrowserIsOnTheGooglePage() throws Throwable {
    googlePage.navigateToHomePage();
  }

  @When("^the search phrase \"([^\"]*)\" is entered$")
  public void theSearchPhraseIsEntered(String phrase) throws Throwable {
    googlePage.enterSearchPhrase(phrase);
  }

  @Then("^results for \"([^\"]*)\" are shown$")
  public void resultsForAreShown(String phrase) throws Throwable {
    assertThat(googlePage.pageTitleContains(phrase)).isTrue();
  }

  @After(value = "@web")
  public void disposeWebDriver() throws Throwable {
    driver.quit();
  }
}

Alternatively, in Java 8, step definitions may be written using lambda expressions. As shown in the cucumber-jvm-java8-example project, lambda-style step defs are more concise and may be defined dynamically. The cucumber-java8 package is required:

package com.automationpanda.example.stepdefs;

import com.automationpanda.example.pages.GooglePage;
import cucumber.api.Scenario;
import cucumber.api.java8.En;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.assertj.core.api.Assertions.assertThat;

public class GoogleSearchSteps implements En {

  private WebDriver driver;
  private GooglePage googlePage;

  // Warning: Make sure the timeouts for hooks using a web driver are zero

  public GoogleSearchSteps() {
    Before(new String[]{"@web"}, 0, 1, (Scenario scenario) -> {
      driver = new ChromeDriver();
    });
    Before(new String[]{"@google"}, 0, 10, (Scenario scenario) -> {
      googlePage = new GooglePage(driver);
    });
    Given("^a web browser is on the Google page$", () -> {
      googlePage.navigateToHomePage();
    });
    When("^the search phrase \"([^\"]*)\" is entered$", (String phrase) -> {
      googlePage.enterSearchPhrase(phrase);
    });
    Then("^results for \"([^\"]*)\" are shown$", (String phrase) -> {
      assertThat(googlePage.pageTitleContains(phrase)).isTrue();
    });
    After(new String[]{"@web"}, (Scenario scenario) -> {
      driver.quit();
    });
  }
}

Either way, steps from any feature file are glued to step definition methods/lambdas from any class at runtime:

Gluing a Gherkin step to its Java definition using regular expressions. IDEs have features to automatically generate definition stubs for steps.

For best practice, class inheritance should also be avoided – step bindings in superclasses will trigger DuplicateStepDefinitionException exceptions at runtime, and any step definition concern handled by inheritance can be handled better with other design patterns. Class constructors should be used primarily for dependency injection, while setup operations should instead be handled in Before hooks.

Hooks

Scenarios sometimes need automation-centric setup and cleanup routines that should not be specified in Gherkin. For example, web tests must first initialize a Selenium WebDriver instance. Step definition classes can have Before and After hooks that run before and after a scenario. They are analogous to setup and teardown methods from other test frameworks like JUnit. Hooks may optionally specify tags for the scenarios to which they apply, as well as an order number. They are similar to Aspect-Oriented Programming. After hooks will run even if a scenario has an exception or abortive assertion – use them for cleanup routines instead of Gherkin steps to guarantee cleanup runs.

The code snippet below shows Before and After hooks from the traditional-style example project. The order given to the Before hooks guarantees the web driver is initialized before the page object is created.

  @Before(value = "@web", order = 1)
  public void initWebDriver() throws Throwable {
    driver = new ChromeDriver();
  }

  @Before(value = "@google", order = 10)
  public void initGooglePage() throws Throwable {
    googlePage = new GooglePage(driver);
  }

  @After(value = "@web")
  public void disposeWebDriver() throws Throwable {
    driver.quit();
  }

Before and After hooks surround scenarios only. Cucumber-JVM does not provide hooks to surround the whole test suite. This protects test case independence but makes global setup and cleanup challenging. The best workaround is to use the singleton pattern with lazy initialization. The solution is documented in Cucumber-JVM Global Hook Workarounds.

Dependency Injection

Cucumber-JVM supports dependency injection (DI) as a way to share objects between step definition classes. For example, steps in different classes may need to share the same web driver instance. Cucumber-JVM supports many DI modules, and each has its own dependency package. As a warning, do not use static variables for sharing objects between step definition classes – static variables can break test independence and parallelization.

PicoContainer is the simplest DI framework and is recommended for most needs. Dependency injection hinges upon step definition class constructors. Without DI, step def constructors must not have parameters. With DI, PicoContainer will automatically construct each object in a step def constructor signature and pass them in when the step def object is constructed. Furthermore, the same object is injected into all step def classes that have its type as a constructor parameter. Objects that require constructor parameters should use a holder or caching class to provide the necessary arguments. Note that dependency-injected objects are created fresh for each scenario.

Below is a trivial example for how to apply dependency injection using PicoContainer to initialize the web driver in the example projects. (A more advanced example would read browser type from a config file and set the web driver accordingly.)

public class WebDriverHolder {
  private WebDriver driver;
  public WebDriver getDriver() {
    return driver;
  }
  public void initWebDriver() {
    driver = new ChromeDriver();
  }
}

public class GoogleSearchSteps {
  private WebDriverHolder holder;
  public GoogleSearchSteps(WebDriverHolder holder) {
    this.holder = holder;
  }
  @Before
  public void initWebDriver() throws Throwable {
    if (holder.getDriver() == null)
      holder.initWebDriver();
  }
}

Automation Support Classes

Automation support classes are extra classes outside of the Cucumber-JVM framework itself that are needed for test automation. They could come from the same test project, a separate but proprietary package, or an open-source package. Regardless of the source, they should fold into build management. They can integrate seamlessly with Cucumber-JVM. Step definitions should be very short because the bulk of automation work should be handled by support classes for maximum code reusability.

Popular open-source Java packages for test automation support are:

Selenium WebDriver for web browser interactions
REST Assured for REST API calls
AssertJ for assertions
SLF4J, Logback, and log4j2 for logging
Extent Reports for test reporting

Page objects, file readers, and data processors also count as support classes.

Configuration Files

Configuration files are extra files outside of the Cucumber-JVM framework that provide environment-specific data to the tests, such as URLs, usernames, passwords, logging/reporting settings, and database connections. They should be saved in standard formats like CSV, XML, JSON, or Java Properties, and they should be read into memory once at the start of the test suite using global hook workarounds. The automation code should look for files at predetermined locations or using paths passed in as environment variables or properties.

Not all test automation projects need config files, but many do. Never hard-code config data into the automation code. Avoid non-text-based formats like Microsoft Excel so that version control can easily do diffs, and avoid non-standard formats that require custom parsers because they require extra development and maintenance time.

Running Tests

Cucumber-JVM tests may be run in a number of ways.

Using JUnit or TestNG

The cucumber-junit and cucumber-testng packages enable JUnit and TestNG respectively to run Cucumber-JVM tests. They require test runner classes that provide CucumberOptions for how to run the tests. A project may have more than one runner class. The example projects use the JUnit runner like this:

package com.automationpanda.example.runners;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;
import org.junit.runner.RunWith;

@RunWith(Cucumber.class)
@CucumberOptions(
  plugin = {"pretty", "html:target/cucumber", "junit:target/cucumber.xml"},
  features = "src/test/resources/com/automationpanda/example/features",
  glue = {"com.automationpanda.example.stepdefs"})
public class PandaCucumberTest {
}

JUnit and TestNG runners can also be picked up by build management tools. For example, Maven will automatically run any runner classes named *Test.java during the test phase and *IT.java during the verify phase. Be sure to include the clean option to delete old test results. Avoid duplicate test runs by making sure runner classes do not cover the same tests – use tags to avoid duplicate coverage.

Using the Command Line Runner

Cucumber-JVM provides a CLI runner that can run feature files directly from the command line. To use it, invoke:

java cucumber.api.cli.Main

Run with “–help” to see all available options.

Using IDEs

Both JetBrains IntelliJ IDEA (with the Cucumber for Java plugin) and Eclipse (with the Cucumber JVM Eclipse Plugin) are great IDEs for Cucumber-JVM test development. They provide features for linking steps to definitions, generating definition stubs, and running tests with various options.

Cucumber Options

Cucumber options may be specified either in a runner class or from the command line as a Java system property. Set options from the command line using “-Dcucumber.options” – it will work for any java or mvn command. To see all available options, set the options to “–help”, or check the official Cucumber-JVM doc page.

The most useful option is probably the tags option. Selecting tags to run dynamically at runtime, rather than statically in runner classes, is very useful. In Cucumber-JVM 2.0, tag expressions use a basic English Boolean language:

@automated and @web
@web or @service
not @manual
(@web or @service) and (not @wip)

Older version of Cucumber-JVM used a more complicated syntax with tildes and commas.

Parallel Execution

Parallel test execution greatly reduces total start-to-end testing time. However, it requires additional machines to run tests, tools and config to handle parallel runs, and tests to be written to avoid collisions. As of version 4.0.0, Cucumber-JVM supports parallel execution out of the box. Previously, the most common way to do parallel test runs in Cucumber-JVM was to use a Maven plugin like the Cucumber-JVM Parallel Plugin or the Cucable plugin from Trivago.

[Update on 9/24/2018: Mentioned that Cucumber-JVM supports parallel execution starting with version 4.0.0.]

References

The Cucumber website
The Cucumber-JVM project on GitHub
The Cucumber Book
The Cucumber for Java Book
Cucumber Recipes
Running Cucumber tests in parallel
Cucable Maven Plugin
The Automation Panda blog
- The BDD page
- BDD 101

BDD 101: Manual Testing

Behavior-driven development takes an automation-first philosophy: behavior specs should become automated tests. However, BDD can also accommodate manual testing. Manual testing has a place and a purpose, even in BDD. Remember, behavior scenarios are first and foremost behavior specifications, and they provide value beyond testing and automation. Any behavior scenario could be run as a manual test. The main questions, then, are (1) when is manual testing appropriate and (2) how should it be handled.

(Check the Automation Panda BDD page for the full BDD 101 table of contents.)

When is Manual Testing Appropriate?

Automation is not a silver bullet – it doesn’t satisfy all testing needs. Scenarios should be written for all behaviors, but they likely shouldn’t be automated under the following circumstances:

The return-on-investment to automate the scenarios is too low.
The scenarios won’t be included in regression or continuous integration.
The behaviors are temporary (ex: hotfixes).
The automation itself would be too complex or too fragile.
The nature of the feature is non-functional (ex: performance, UX, etc.).
The team is still learning BDD and is not yet ready to automate all scenarios.

Manual testing is also appropriate for exploratory testing, in which engineers rely upon experience rather than explicit test procedures to “explore” the product under test for bugs and quality concerns. It complements automation because both testing styles serve different purposes. However, behavior scenarios themselves are incompatible with exploratory testing. The point of exploring is for engineers to go “unscripted” – without formal test plans – to find problems only a user would catch. Rather than writing scenarios, the appropriate way to approach behavior-driven exploratory testing is more holistic: testers should assume the role of a user and exercise the product under test as a collection of interacting behaviors. If exploring uncovers any glaring behavior gaps, then new behavior scenarios should be added to the catalog.

How Should Manual Testing Be Handled?

Manual testing fits into BDD in much the same way as automated testing because both formats share the same process for behavior specification. Where the two ways diverge is in how the tests are run. There are a few special considerations to make when writing scenarios that won’t be automated.

Repository

Both manual and automated behavior scenarios should be stored in the same repository. The natural way to organize behaviors is by feature, regardless of how the tests will be run. All scenarios should also be managed by some form of version control.

Furthermore, all scenarios should be co-located for document-generation tools like Pickles. Doc tools make it easy to expose behavior specs and steps to everyone. They make it easier for the Three Amigos to collaborate. Non-technical people are not likely to dig into programming projects.

Extra Comments

The conciseness of behavior scenarios is problematic for manual testing because steps don’t provide all the information a tester may need. For example, test data may not be written explicitly in the spec. The best way to add extra information to a scenario is to add comments. Gherkin allows any number of lines for comments and description. Comments provide extra information to the reader but are ignored by the automation.

It may be tempting to simply write new Gherkin steps to handle the extra information for manual testing. However, this is not a good approach. Principles of good Gherkin should be used for all scenarios, regardless of whether or not the scenarios will be automated. High-quality specification should be maintained for consistency, for documentation tools, and for potential future automation.

An Example

Below is a feature that shows how to write behavior scenarios for manual tests:

Feature: Google Searching

  @automated
  Scenario: Search from the search bar
    Given a web browser is at the Google home page
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page

  @manual
  Scenario: Image search
    # The Google home page URL is: http://www.google.com/
    # Make sure the images shown include pandas eating bamboo
    Given Google search results for "panda" are shown
    When the user clicks on the "Images" link at the top of the results page
    Then images related to "panda" are shown on the results page

It’s not really different from any other behavior scenarios.

As stated in the beginning, BDD should be automation-first. Don’t use the content of this article to justify avoiding automation. Rather, use the techniques outlined here for manual testing only as needed.

A Cucumber Recipe

Scenario: Dinnertime
Given I have cucumbers
When I prepare my favorite cucumber dish
Then everyone at the dinner table is happy

We all know “Cucumber” is the buzzword for BDD, but the vast majority of the population knows it only as a vegetable. My wife and I prepare a special Chinese cucumber dish about once a week that’s fresh, tasty, and nutritious. It is very easy to prepare, since it requires no cooking. It may be served chilled or at room temperature, which makes it great for warm weather. As a change of pace for my blog, I’d like to share my cucumber recipe. C’mon, a panda’s gotta eat!

Ingredients

I strongly recommend against substitutions. This recipe is best with authentic ingredients. Specifically, do not substitute the cucumber variety – “regular” cucumbers won’t work. If you cannot find the required ingredients at a regular grocery store, Asian supermarkets will carry them.

About 8 Persian cucumbers (or about 4 Japanese cucumbers)
1 tbsp white rice vinegar
1 tbsp soy sauce (light soy sauce preferred)
1/2 tbsp sesame oil
1 tbsp dumpling sauce (or substitute additional vinegar, soy sauce, and oil)
1 tsp sugar
1/4 tsp salt
2 cloves minced garlic (or a healthy pinch of garlic powder)
Sesame seeds (optional garnish)

These are Persian cucumbers. They are about 6 inches long and have a 1-1.5 in diameter.

These are all the ingredients that you need!

Instructions

Prep time: 15 minutes, tops.

The best way to prepare this dish is to use a Chinese cleaver (also called the “Chinese chef knife”). After washing the cucumbers, chop off the ends. Then, one at a time, smash the cucumber with the flat side of the knife. Literally swing your arm as if hammering a nail. The weight of the blade will smash the cucumber into about four strips lengthwise. Beware that seeds and guts may fly out. Then, chop the pieces into one-inch segments. (If you don’t have a Chinese cleaver, you can smash the cucumbers with a rolling pin. Other knives simply don’t have the weight to break up the cucumbers effectively.)

A cucumber, smashed and chopped with the Chinese cleaver.

Combine the cucumber pieces and all of the other ingredients (except the sesame seeds) into a flat-bottomed dish or wide bowl. Stir everything together and let rest. A flat bottom allows the cucumbers to soak up the sauce; regular bowls will leave the cucumbers on top somewhat flavorless. Top with sesame seeds for garnish. This dish may be served after a few minutes of resting, or it may be chilled and served later. (Please note that the texture of the cucumbers is not as good after more than a day in the refrigerator.)

The finished dish, ready to eat! Sesame seeds and garlic powder are sprinkled on top. Also, notice how the glass container spreads the cucumbers out so that they can soak in the sauce.

Fun Facts

The Mandarin word for “cucumber” is “黄瓜” (“huángguā” in Pinyin, pronounced “hwang-gwa”), which directly translates to “yellow melon” or “yellow gourd” in English. Asian varieties of cucumbers have a much yellower hue than their Western counterparts, as is apparent when making this dish.
China grows about three-quarters of the world’s cucumber crop.
Cucumbers originated in India.
Cucumbers are about 96% water.

Bon appetit!

Cucumber-JVM Global Hook Workarounds

Almost all BDD automation frameworks have some sort of hooks that run before and after scenarios. However, not all frameworks have global hooks that run once at the beginning or end of a suite of scenarios – and Cucumber-JVM is one of these unlucky few. Cucumber-JVM GitHub Issue #515, which seeks to add @BeforeAll and @AfterAll hooks, has been open and active since 2013, but it looks unclear if the issue will ever be resolved. Thankfully, there are some workarounds to effect the same behavior as global hooks.

Workaround #1: Don’t Do It

From a purist’s perspective, each scenario (or test) should be completely independent, meaning it should not share parts with any other tests. Independence provides the following benefits:

Safety between tests
Consistency across tests
The ability to run any tests individually, in any order, or in parallel
More sensible, understandable tests

If not handled properly, global hooks can be dangerous because they make tests interdependent. Changes or failures in one test may cascade into others. Global test data would waste memory for tests that don’t use it. Furthermore, the fact that Issue #515 has been open for years indicates the difficulty of properly implementing global hooks.

However, the main cost of independence is runtime. Independent tests often repeat similar setup and cleanup routines. Even a few extra seconds per test can add up tremendously. Google Guava, for example, has over 286,000 tests – adding one second to each test would amount to nearly 80 hours! Performance becomes especially critical for continuous integration, in which wasted time means either delivery delays or coverage gaps. Certain operations like preparing a database or fetching authentication tokens may be pragmatic candidates for global hooks.

The best strategy is to use global hooks only when necessary for time-intensive setup that can be shared safely. Any shared test data should be immutable. Always question the need for global hooks. Most tests probably won’t need them.

Workaround #2: Static Variables

A basic hack for global hooks is actually provided in Issue #515. A static Boolean flag can indicate when the @Before hook has run more than once because it isn’t “reset” when a new scenario re-instantiates the step definition classes. The runtime shutdown hook will be called once all tests are done and the program exits. (Note that a static flag cannot be used in an @After hook due to the halting problem.) The example from the issue is shamelessly copied below:

public class GlobalHooks {
    private static boolean dunit = false;

    @Before
    public void beforeAll() {
        if(!dunit) {
            Runtime.getRuntime().addShutdownHook(afterAllThread);
            // do the beforeAll stuff...
            dunit = true;
        }
    }
}

Workaround #3: Singleton Caching

The basic hack is useful for simple setup and cleanup routines, but it becomes inelegant when objects must be shared by scenarios. Rather than polluting the class with static members, a singleton can cache test data between scenarios, and global setup logic may be put into the singleton’s constructor. Furthermore, if the singleton uses lazy initialization, then @Before hooks may not be needed at all. A “lazy” singleton will not be instantiated until the first time its getInstance method is called, meaning it will be skipped if the scenarios do not need them. This is a huge advantage when selectively running scenarios by name, tag, or feature. (Please refer to the previous post, Static or Singleton, for a deeper explanation of the singleton pattern.)

Consider scenarios that must generate authentication tokens (like OAuth) for API testing. A singleton “token holder” could cache tokens for usernames, rather than doing the authorization dance for every scenario. The snippet below shows how such a singleton could be called within a @When step definition with no @Before method.

public class ExampleSteps {
    ...
    @When("^some API is called$")
    public void whenSomeApiIsCalled() {
        // Get the token from the singleton cache lazily
        String token = TokenHolder.getInstance().getToken("user", "pass");
        // Use the token to call some API (method not shown)
        callSomeApi(token);
    }
    ...
}

And the singleton class could be defined like this:

public class TokenHolder {
    private static volatile TokenHolder instance = null;
    private HashMap<String, String> tokens;

    private TokenHolder() {
        tokens = new HashMap<String, String>();
    }

    public static TokenHolder getInstance() {
        // Lazy and thread-safe
        if (instance == null) {
            synchronized(TokenHolder.class) {
                if (instance == null) {
                    instance = new TokenHolder();
                }
            }
        }

        return instance;
    }
    
    public String getToken(String username, String password) {
        // This check could be extended to handle token expiration
        if (!tokens.containsKey(username)) {
            // Request a fresh authentication token (method not shown)
            String token = requestToken(username, password);
            // Cache the token for later
            tokens.put(username, token);
        }
        
        return tokens.get(username);
    }
    
    ...
}

Workaround #4: JUnit Class Annotations

Another workaround mentioned in Issue #515 and elsewhere is to use JUnit‘s @BeforeClass and @AfterClass annotations in the runner class, like this:

@RunWith(Cucumber.class)
@Cucumber.Options(format = {
    "html:target/cucumber-html-report",
    "json-pretty:target/cucumber-json-report.json"})
public class RunCukesTest {

    @BeforeClass
    public static void setup() {
        System.out.println("Ran the before");
    }

    @AfterClass
    public static void teardown() {
        System.out.println("Ran the after");
    }
}

While @BeforeClass and @AfterClass may look like the cleanest solution at first, they are not very practical to use. They work only when Cucumber-JVM is set to use the JUnit runner. Other runners, like TestNG, the command line runner, and special IDE runners, won’t pick up these hooks. Their methods must also be are static and would need static variables or singletons to share data anyway. Therefore, I personally discourage using these annotations in Cucumber-JVM.

What About Dependency Injection?

Dependency injection is a marvelous technique. As defined by Wikipedia:

In software engineering, dependency injection is a technique whereby one object supplies the dependencies of another object. A dependency is an object that can be used (a service). An injection is the passing of a dependency to a dependent object (a client) that would use it. The service is made part of the client’s state. Passing the service to the client, rather than allowing a client to build or find the service, is the fundamental requirement of the pattern.

Dependency injection can be a powerful alternative to singletons because DI provides finer control over the scope of objects. However, Cucumber-JVM’s dependency injection cannot be applied with global hooks because dependency objects, like step definition objects, are constructed and destroyed for each scenario.

Comparison Table

Ultimately, the best approach for global hooks in Cucumber-JVM is the one that best fits the tests’ needs. Below is a table to make workaround comparisons easier.

Workaround	Pros	Cons
Don’t Do It	Scenarios are completely independent. No complicated or risky workarounds.	Repeated setup and cleanup procedures may add significant execution time.
Static Variables	Simple yet effective implementation.	May need many static variables to share test data.
Singleton Caching	Abstracts test data and setup procedures. Easily handles lazy initialization and evaluation. May not need a @Before hook.	More complicated design.
JUnit Class Annotations	Clean look for basic setup and cleanup routines.	May be used only with the JUnit runner. Requires static variables or singletons to share test data anyway.