Java

Running tests in a Java Maven project

Java continues to be one of the most popular languages for test automation, and Maven continues to be its most popular build tool. Adding tests in the right place to a Java project with Maven can be a bit tricky, however. Let’s briefly learn how to do it. These steps work for both JUnit and TestNG.

Test code location

Maven project follow the Standard Directory Layout. The main code should go under src/main, while all test code should go under src/test:

src/test/java should hold Java test classes
src/test/resources should hold resource files that tests use

Your project directory should look something like this:

src/
+-- main/
|   +-- java/
|   \-- resources/
|-- test/
|   +-- java/
|   \-- resources/
\-- pom.xml

Don’t put test code in the main source folder. You don’t want to include it with the final build artifact. The project might have other files as well, like a README.

Unit tests

The Maven Surefire Plugin runs unit tests during Maven’s test phase. To run unit tests:

Add maven-surefire-plugin to the plugins section of your pom.xml
Name your unit tests *Tests.java
Put them under src/test/java
Mirror the package structure from the main code always
Run tests with mvn test

There are a bunch of options for configuring the Maven Surefire Plugin. If you don’t want to configure anything special, you actually don’t need to add the plugin to the POM file. Nevertheless, it’s good practice to add it to the POM file anyway. Here’s what that would look like:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>3.2.5</version>
            </plugin>
        </plugins>
    </build>

Integration tests

The Maven Failsafe Plugin runs integration tests during Maven’s integration-test phase. Integration tests are distinct from unit tests due to their external dependencies and should be treated differently. To run integration tests:

Add maven-failsafe-plugin to the plugins section of your pom.xml
Name your unit tests *IT.java
Put them under src/test/java
Mirror the package structure from the main code as appropriate
Run tests with mvn verify

Maven actually has multiple integration test phases: pre-integration-test, integration-test, and post-integration-test to handle appropriate setup, testing, and cleanup. However, none of these phases will cause the build to fail. Instead, use the verify goal to make the build fail when ITs fail.

Like the Maven Surefire Plugin, the Maven Failsafe Plugin has a bunch of options. However, to run integration tests, you must configure the Failsafe plugin in the POM file. Here’s what it looks like:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>3.2.5</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>integration-test</goal>
                            <goal>verify</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

Maven phases

Phases in the Maven Build Lifecycle are cumulative:

Running mvn test includes the compile phase
Running mvn verify includes the compile and test phases

It is also a good practice to run mvn clean before other phases to delete the build output (target/) directory. That way, old class files and test reports are removed before generating new ones. You may also include clean with commands to run tests, like this: mvn clean test or mvn clean verify.

Customizations

You can customize how tests run. For example, you can create a separate directory for integration tests (like src/it instead of src/test). However, I recommend avoiding customizations like this. They require complicated settings in the POM file that are difficult to get right and confusing to maintain. Others who join the project later will expect Maven standards.

Using Multiple Test Frameworks Simultaneously

Someone recently asked me the following question, which I’ve paraphrased for better context:

Is it good practice to use multiple test frameworks simultaneously? For example, I’m working on a Python project. I want to do BDD with behave for feature testing, but pytest would be better for unit testing. Can I use both? If so, how should I structure my project(s)?

The short answer: Yes, you should use the right frameworks for the right needs. Using more than one test framework is typically not difficult to set up. Let’s dig into this.

The F-word

I despise the F-word – “framework.” Within the test automation space, people use the word “framework” to refer to different things. Is the “framework” only the test package like pytest or JUnit? Does it include the tests themselves? Does it refer to automation tools like Selenium WebDriver?

For clarity, I prefer to use two different terms: “framework” and “solution.” A test framework is software package that lets programmers write tests as methods or functions, run the tests, and report the results. A test solution is a software implementation for a testing problem. It typically includes frameworks, tools, and test cases. To me, a framework is narrow, but a solution is comprehensive.

The original question used the word “framework,” but I think it should be answered in terms of solutions. There are two potential solutions at hand: one for unit tests written in pytest, while another for feature tests written in behave.

One Size Does Not Fit All

Always use the right tools or frameworks for the right needs. Unit tests and feature tests are fundamentally different. Unit tests directly access internal functions and methods in product code, whereas feature tests interact with live versions of the product as an external user or caller. Thus, they need different kinds of testing solutions, which most likely will require different tools and frameworks.

For example, behave is a BDD framework for Python. Programmers write test cases in plain-language Gherkin with step definitions as Python functions. Gherkin test cases are intuitively readable and understandable, which makes them great for testing high-level behaviors like interacting with a Web page. However, BDD frameworks add complexity that hampers unit test development. Unit tests are inherently “code-y” and low-level because they directly call product code. The pytest framework would be a better choice for unit testing. Conversely, feature tests could be written using raw pytest, but behave provides a more natural structure for describing features. Hence, separate solutions for different test types would be ideal.

Same or Separate Repositories?

If more than one test solution is appropriate for a given software project, the next question is where to put the test code. Should all test code go into the same repository as the product code, or should they go into separate repositories? Unfortunately, there is no universally correct answer. Here are some factors to consider.

Unit tests should always be located in the same repository as the product code they test. Unit tests directly depend upon the product code. They mus be written in the same language. Any time the product code is refactored, unit tests must be updated.

Feature tests can be placed in the same repository or a separate repository. I recommend putting feature tests in the same repository as product code if feature tests are written in the same language as the product code and if all the product code under test is located in the same repository. That way, tests are version-controlled together with the product under test. Otherwise, I recommend putting feature tests in their own separate repository. Mixed language repositories can be confusing to maintain, and version control must be handled differently with multi-repository products.

Same Repository Structure

One test solution in one repository is easy to set up, but multiple test solutions in one repository can be tricky. Thankfully, it’s not impossible. Project structure ultimately depends upon the language. Regardless of language, I recommend separating concerns. A repository should have clearly separate spaces (e.g., subdirectories) for product code and test code. Test code should be further divided by test types and then coverage areas. Testers should be able to run specific tests using convenient filters.

Here are ways to handle multiple test solutions in a few different languages:

In Python, project structure is fairly flexible. Conventionally, all tests belong under a top-level directory named “tests.” Subdirectories may be added thereunder, such as “unit” and “feature”. Frameworks like pytest and behave can take search paths so they run the proper tests. Furthermore, if using pytest-bdd instead of behave, pytest can use the markings/tags instead of search paths for filtering tests.
In .NET (like C#), the terms “project” and “solution” have special meanings. A .NET project is a collection of code that is built into one artifact. A .NET solution is a collection of projects that interrelate. Typically, the best practice in .NET would be to create a separate project for each test type/suite within the same .NET solution. I have personally set up a .NET solution that included separate projects for NUnit unit tests and SpecFlow feature tests.
In Java, project structure depends upon the project’s build automation tool. Most Java projects seem to use Maven or Gradle. In Maven’s Standard Directory Layout, tests belong under “src/test”. Different test types can be placed under separate packages there. The POM might need some extra configuration to run tests at different build phases.
In JavaScript, test placement depends heavily upon the project type. For example, Angular creates separate directories for unit tests using Jasmine and end-to-end tests using Protractor when initializing a new project.

Do What’s Best

Different test tools and frameworks meet different needs. No single one can solve all problems. Make sure to use the right tools for the problems at hand. Don’t force yourself to use the wrong thing simply because it is already used elsewhere.

Gherkin Syntax Highlighting in Chrome

Google Chrome is one of the most popular web browsers around. Recently, I discovered that Chrome can edit and display Gherkin feature files. The Chrome Web Store has two useful extensions for Gherkin: Tidy Gherkin and Pretty Gherkin, both developed by Martin Roddam. Together, these two extensions provide a convenient, lightweight way to handle feature files.

Tidy Gherkin

Tidy Gherkin is a Chrome app for editing and formatting feature files. Once it is installed, it can be reached from the Chrome Apps page (chrome://apps/). The editor appears in a separate window. Gherkin text is automatically colored as it is typed. The bottom preview pane automatically formats each line, and clicking the “TIDY!” button in the upper-left corner will format the user-entered text area as well. Feature files can be saved and opened like a regular text editor. Templates for Feature, Scenario, and Scenario Outline sections may be inserted, as well as tables, rows, and columns.

Another really nice feature of Tidy Gherkin is that the preview pane automatically generates step definition stubs for Java, Ruby, and JavaScript! The step def code is compatible with the Cucumber test frameworks. (The Java code uses the traditional step def format, not the Java 8 lambdas.) This feature is useful if you aren’t already using an IDE for automation development.

Tidy Gherkin has pros and cons when compared to other editors like Notepad++ and Atom. The main advantages are automatic formatting and step definition generation – features typically seen only in IDEs. It’s also convenient for users who already use Chrome, and it’s cross-platform. However, it lacks richer text editing features offered by other editors, it’s not extendable, and the step def gen feature may not be useful to all users. It also requires a bit of navigation to open files, whereas other editors may be a simple right-click away. Overall, Tidy Gherkin is nevertheless a nifty, niche editor.

This slideshow requires JavaScript.

Pretty Gherkin

Pretty Gherkin is a Chrome extension for viewing Gherkin feature files through the browser with syntax highlighting. After installing it, make sure to enable the “Allow access to the file URLs” option on the Chrome Extensions page (chrome://extensions/). Then, whenever Chrome opens a feature file, it should display pretty text. For example, try the GoogleSearch.feature file from my Cucumber-JVM example project, cucumber-jvm-java-example. Unfortunately, though, I could not get Chrome to display local feature files – every time I would try to open one, Chrome would simply download it. Nevertheless, Pretty Gherkin seems to work for online SCM sites like GitHub and BitBucket.

Since Pretty Gherkin is simply a display tool, it can’t really be compared to other editors. I’d recommend Pretty Gherkin to Chrome users who often read feature files from online code repositories.

This slideshow requires JavaScript.

Be sure to check out other Gherkin editors, too!

Gherkin Syntax Highlighting in Notepad++
Gherkin Syntax Highlighting in Atom
The Gherkin plugin for JetBrains IntelliJ IDEA

Cucumber-JVM for Java

This post is a concise-yet-comprehensive overview of Cucumber-JVM for Java. It is an introduction, a primer, a guide, and a reference. If you are new to BDD, please learn about it before using Cucumber-JVM.

Introduction

Cucumber is an open-source software test automation framework for behavior-driven development. It uses a business-readable, domain-specific language called Gherkin for specifying feature behaviors that become tests. The Cucumber project started in 2008 when Aslak Hellesøy released the first version of the Cucumber framework for Ruby.

Cucumber-JVM is the official port for Java. Every Gherkin step is “glued” to a step definition method that executes the step. The English text of a step is glued using annotations and regular expressions. Cucumber-JVM integrates nicely with other testing packages. Anything that can be done with Java can be handled by Cucumber-JVM. Cucumber-JVM is ideal for black-box, above-unit, functional tests.

[Update on 7/29/2018: As of version 3.0.0, Cucumber-JVM no longer supports JVM languages other than Java – namely Groovy, Scala, Clojure, and Gosu. Please read Cucumber-JVM is dropping support of JVM Languages on the official Cucumber blog. The Gherkin parser is also now written in Go instead of in Java.]

Example Projects

Github contains two Cucumber-JVM example projects for this guide:

cucumber-jvm-java-example for traditional annotation-style step defs
cucumber-jvm-java8-example for Java 8 lambda-style step defs

The projects use Java, Apache Maven, Selenium WebDriver, and AssertJ. The README files include practice exercises as well.

Prerequisite Skills

To be successful with Cucumber-JVM for Java, the following skills are required:

BDD and Gherkin (read BDD 101 to get up-to-speed)
Intermediate-level Java programming
Test automation (like JUnit and TestNG)
Build process (like ANT, Maven, and Gradle)

Prerequisite Tools

Test machines must have the Java Development Kit (JDK) installed to build and run Cucumber-JVM tests. They should also have the desired build tool installed (such as Apache Maven). The build tool should automatically install Cucumber-JVM packages through dependency management.

An IDE such as JetBrains IntelliJ IDEA (with the Cucumber for Java plugin) or Eclipse (with the Cucumber JVM Eclipse Plugin) is recommended for Cucumber-JVM test automation development. Software configuration management (SCM) with a tool like Git is also strongly recommended.

Versions

Cucumber-JVM 2.0 was released in August 2017 and should be used for new Cucumber-JVM projects. Releases may be found under Maven Group ID io.cucumber. Older Cucumber-JVM 1.x versions may be found under Maven Group ID info.cukes.

Build Management

Apache Maven is the preferred build management tool for Cucumber-JVM projects. All Cucumber-JVM packages are available from the Maven Central Repository. Maven can automatically run Cucumber-JVM tests as part of the build process. Projects using Cucumber-JVM should follow Maven’s Standard Directory Layout. The examples use Maven. Gradle may also be used, but it requires extra setup.

Every Maven project has a POM file for configuration. The POM should contain appropriate Cucumber-JVM dependencies. There is a separate package for each JVM language, dependency injection framework, and underlying unit test runner. Since Cucumber-JVM is a test framework, its dependencies should use test scope. Check io.cucumber on the Maven site for the latest packages and versions.

Project Structure

Cucumber-JVM test automation has the same layered approach as other BDD frameworks:

The higher layers focus more on specification, while the lower layers focus more on implementation. Gherkin feature files and step definition classes are BDD-specific.

Cucumber-JVM tests may be included in the same project as product code or in a separate project. Either way, projects using Cucumber-JVM should follow Maven’s Standard Directory Layout: test code should be located under src/test.

Screenshot of the example project from IntelliJ IDEA’s Project view.

Gherkin Feature Files

Gherkin feature files are text files that contain Gherkin behavior scenarios. They use the “.feature” extension. In a Maven project, they belong under src/test/resources, since they are not Java source files. They should also be organized into a sensible package hierarchy. Refer to other BDD pages for writing good Gherkin.

A feature file from the example projects, opened in IntelliJ IDEA.

Step Definition Classes

Step definition classes are Java classes containing methods that implement Gherkin steps. Step def classes are like regular Java classes: they have variables, constructors, and methods. Steps are “glued” to methods using regular expressions. Feature file scenarios can use steps from any step definition class in the project. In a Maven project, step defs belong in packages under src/test/java, and their class names should end in “Steps”.

The Basics

Below is a step definition class from the cucumber-jvm-java-example project, which uses the traditional method annotation style for step defs as part of the cucumber-java package. Each method should throw Throwable so that exceptions are raised up to the Cucumber-JVM framework.

package com.automationpanda.example.stepdefs;

import com.automationpanda.example.pages.GooglePage;
import cucumber.api.java.After;
import cucumber.api.java.Before;
import cucumber.api.java.en.Given;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.assertj.core.api.Assertions.assertThat;

public class GoogleSearchSteps {

  private WebDriver driver;
  private GooglePage googlePage;

  @Before(value = "@web", order = 1)
  public void initWebDriver() throws Throwable {
    driver = new ChromeDriver();
  }

  @Before(value = "@google", order = 10)
  public void initGooglePage() throws Throwable {
    googlePage = new GooglePage(driver);
  }

  @Given("^a web browser is on the Google page$")
  public void aWebBrowserIsOnTheGooglePage() throws Throwable {
    googlePage.navigateToHomePage();
  }

  @When("^the search phrase \"([^\"]*)\" is entered$")
  public void theSearchPhraseIsEntered(String phrase) throws Throwable {
    googlePage.enterSearchPhrase(phrase);
  }

  @Then("^results for \"([^\"]*)\" are shown$")
  public void resultsForAreShown(String phrase) throws Throwable {
    assertThat(googlePage.pageTitleContains(phrase)).isTrue();
  }

  @After(value = "@web")
  public void disposeWebDriver() throws Throwable {
    driver.quit();
  }
}

Alternatively, in Java 8, step definitions may be written using lambda expressions. As shown in the cucumber-jvm-java8-example project, lambda-style step defs are more concise and may be defined dynamically. The cucumber-java8 package is required:

package com.automationpanda.example.stepdefs;

import com.automationpanda.example.pages.GooglePage;
import cucumber.api.Scenario;
import cucumber.api.java8.En;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.assertj.core.api.Assertions.assertThat;

public class GoogleSearchSteps implements En {

  private WebDriver driver;
  private GooglePage googlePage;

  // Warning: Make sure the timeouts for hooks using a web driver are zero

  public GoogleSearchSteps() {
    Before(new String[]{"@web"}, 0, 1, (Scenario scenario) -> {
      driver = new ChromeDriver();
    });
    Before(new String[]{"@google"}, 0, 10, (Scenario scenario) -> {
      googlePage = new GooglePage(driver);
    });
    Given("^a web browser is on the Google page$", () -> {
      googlePage.navigateToHomePage();
    });
    When("^the search phrase \"([^\"]*)\" is entered$", (String phrase) -> {
      googlePage.enterSearchPhrase(phrase);
    });
    Then("^results for \"([^\"]*)\" are shown$", (String phrase) -> {
      assertThat(googlePage.pageTitleContains(phrase)).isTrue();
    });
    After(new String[]{"@web"}, (Scenario scenario) -> {
      driver.quit();
    });
  }
}

Either way, steps from any feature file are glued to step definition methods/lambdas from any class at runtime:

Gluing a Gherkin step to its Java definition using regular expressions. IDEs have features to automatically generate definition stubs for steps.

For best practice, class inheritance should also be avoided – step bindings in superclasses will trigger DuplicateStepDefinitionException exceptions at runtime, and any step definition concern handled by inheritance can be handled better with other design patterns. Class constructors should be used primarily for dependency injection, while setup operations should instead be handled in Before hooks.

Hooks

Scenarios sometimes need automation-centric setup and cleanup routines that should not be specified in Gherkin. For example, web tests must first initialize a Selenium WebDriver instance. Step definition classes can have Before and After hooks that run before and after a scenario. They are analogous to setup and teardown methods from other test frameworks like JUnit. Hooks may optionally specify tags for the scenarios to which they apply, as well as an order number. They are similar to Aspect-Oriented Programming. After hooks will run even if a scenario has an exception or abortive assertion – use them for cleanup routines instead of Gherkin steps to guarantee cleanup runs.

The code snippet below shows Before and After hooks from the traditional-style example project. The order given to the Before hooks guarantees the web driver is initialized before the page object is created.

  @Before(value = "@web", order = 1)
  public void initWebDriver() throws Throwable {
    driver = new ChromeDriver();
  }

  @Before(value = "@google", order = 10)
  public void initGooglePage() throws Throwable {
    googlePage = new GooglePage(driver);
  }

  @After(value = "@web")
  public void disposeWebDriver() throws Throwable {
    driver.quit();
  }

Before and After hooks surround scenarios only. Cucumber-JVM does not provide hooks to surround the whole test suite. This protects test case independence but makes global setup and cleanup challenging. The best workaround is to use the singleton pattern with lazy initialization. The solution is documented in Cucumber-JVM Global Hook Workarounds.

Dependency Injection

Cucumber-JVM supports dependency injection (DI) as a way to share objects between step definition classes. For example, steps in different classes may need to share the same web driver instance. Cucumber-JVM supports many DI modules, and each has its own dependency package. As a warning, do not use static variables for sharing objects between step definition classes – static variables can break test independence and parallelization.

PicoContainer is the simplest DI framework and is recommended for most needs. Dependency injection hinges upon step definition class constructors. Without DI, step def constructors must not have parameters. With DI, PicoContainer will automatically construct each object in a step def constructor signature and pass them in when the step def object is constructed. Furthermore, the same object is injected into all step def classes that have its type as a constructor parameter. Objects that require constructor parameters should use a holder or caching class to provide the necessary arguments. Note that dependency-injected objects are created fresh for each scenario.

Below is a trivial example for how to apply dependency injection using PicoContainer to initialize the web driver in the example projects. (A more advanced example would read browser type from a config file and set the web driver accordingly.)

public class WebDriverHolder {
  private WebDriver driver;
  public WebDriver getDriver() {
    return driver;
  }
  public void initWebDriver() {
    driver = new ChromeDriver();
  }
}

public class GoogleSearchSteps {
  private WebDriverHolder holder;
  public GoogleSearchSteps(WebDriverHolder holder) {
    this.holder = holder;
  }
  @Before
  public void initWebDriver() throws Throwable {
    if (holder.getDriver() == null)
      holder.initWebDriver();
  }
}

Automation Support Classes

Automation support classes are extra classes outside of the Cucumber-JVM framework itself that are needed for test automation. They could come from the same test project, a separate but proprietary package, or an open-source package. Regardless of the source, they should fold into build management. They can integrate seamlessly with Cucumber-JVM. Step definitions should be very short because the bulk of automation work should be handled by support classes for maximum code reusability.

Popular open-source Java packages for test automation support are:

Selenium WebDriver for web browser interactions
REST Assured for REST API calls
AssertJ for assertions
SLF4J, Logback, and log4j2 for logging
Extent Reports for test reporting

Page objects, file readers, and data processors also count as support classes.

Configuration Files

Configuration files are extra files outside of the Cucumber-JVM framework that provide environment-specific data to the tests, such as URLs, usernames, passwords, logging/reporting settings, and database connections. They should be saved in standard formats like CSV, XML, JSON, or Java Properties, and they should be read into memory once at the start of the test suite using global hook workarounds. The automation code should look for files at predetermined locations or using paths passed in as environment variables or properties.

Not all test automation projects need config files, but many do. Never hard-code config data into the automation code. Avoid non-text-based formats like Microsoft Excel so that version control can easily do diffs, and avoid non-standard formats that require custom parsers because they require extra development and maintenance time.

Running Tests

Cucumber-JVM tests may be run in a number of ways.

Using JUnit or TestNG

The cucumber-junit and cucumber-testng packages enable JUnit and TestNG respectively to run Cucumber-JVM tests. They require test runner classes that provide CucumberOptions for how to run the tests. A project may have more than one runner class. The example projects use the JUnit runner like this:

package com.automationpanda.example.runners;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;
import org.junit.runner.RunWith;

@RunWith(Cucumber.class)
@CucumberOptions(
  plugin = {"pretty", "html:target/cucumber", "junit:target/cucumber.xml"},
  features = "src/test/resources/com/automationpanda/example/features",
  glue = {"com.automationpanda.example.stepdefs"})
public class PandaCucumberTest {
}

JUnit and TestNG runners can also be picked up by build management tools. For example, Maven will automatically run any runner classes named *Test.java during the test phase and *IT.java during the verify phase. Be sure to include the clean option to delete old test results. Avoid duplicate test runs by making sure runner classes do not cover the same tests – use tags to avoid duplicate coverage.

Using the Command Line Runner

Cucumber-JVM provides a CLI runner that can run feature files directly from the command line. To use it, invoke:

java cucumber.api.cli.Main

Run with “–help” to see all available options.

Using IDEs

Both JetBrains IntelliJ IDEA (with the Cucumber for Java plugin) and Eclipse (with the Cucumber JVM Eclipse Plugin) are great IDEs for Cucumber-JVM test development. They provide features for linking steps to definitions, generating definition stubs, and running tests with various options.

Cucumber Options

Cucumber options may be specified either in a runner class or from the command line as a Java system property. Set options from the command line using “-Dcucumber.options” – it will work for any java or mvn command. To see all available options, set the options to “–help”, or check the official Cucumber-JVM doc page.

The most useful option is probably the tags option. Selecting tags to run dynamically at runtime, rather than statically in runner classes, is very useful. In Cucumber-JVM 2.0, tag expressions use a basic English Boolean language:

@automated and @web
@web or @service
not @manual
(@web or @service) and (not @wip)

Older version of Cucumber-JVM used a more complicated syntax with tildes and commas.

Parallel Execution

Parallel test execution greatly reduces total start-to-end testing time. However, it requires additional machines to run tests, tools and config to handle parallel runs, and tests to be written to avoid collisions. As of version 4.0.0, Cucumber-JVM supports parallel execution out of the box. Previously, the most common way to do parallel test runs in Cucumber-JVM was to use a Maven plugin like the Cucumber-JVM Parallel Plugin or the Cucable plugin from Trivago.

[Update on 9/24/2018: Mentioned that Cucumber-JVM supports parallel execution starting with version 4.0.0.]

References

The Cucumber website
The Cucumber-JVM project on GitHub
The Cucumber Book
The Cucumber for Java Book
Cucumber Recipes
Running Cucumber tests in parallel
Cucable Maven Plugin
The Automation Panda blog
- The BDD page
- BDD 101

Cucumber-JVM Global Hook Workarounds

Almost all BDD automation frameworks have some sort of hooks that run before and after scenarios. However, not all frameworks have global hooks that run once at the beginning or end of a suite of scenarios – and Cucumber-JVM is one of these unlucky few. Cucumber-JVM GitHub Issue #515, which seeks to add @BeforeAll and @AfterAll hooks, has been open and active since 2013, but it looks unclear if the issue will ever be resolved. Thankfully, there are some workarounds to effect the same behavior as global hooks.

Workaround #1: Don’t Do It

From a purist’s perspective, each scenario (or test) should be completely independent, meaning it should not share parts with any other tests. Independence provides the following benefits:

Safety between tests
Consistency across tests
The ability to run any tests individually, in any order, or in parallel
More sensible, understandable tests

If not handled properly, global hooks can be dangerous because they make tests interdependent. Changes or failures in one test may cascade into others. Global test data would waste memory for tests that don’t use it. Furthermore, the fact that Issue #515 has been open for years indicates the difficulty of properly implementing global hooks.

However, the main cost of independence is runtime. Independent tests often repeat similar setup and cleanup routines. Even a few extra seconds per test can add up tremendously. Google Guava, for example, has over 286,000 tests – adding one second to each test would amount to nearly 80 hours! Performance becomes especially critical for continuous integration, in which wasted time means either delivery delays or coverage gaps. Certain operations like preparing a database or fetching authentication tokens may be pragmatic candidates for global hooks.

The best strategy is to use global hooks only when necessary for time-intensive setup that can be shared safely. Any shared test data should be immutable. Always question the need for global hooks. Most tests probably won’t need them.

Workaround #2: Static Variables

A basic hack for global hooks is actually provided in Issue #515. A static Boolean flag can indicate when the @Before hook has run more than once because it isn’t “reset” when a new scenario re-instantiates the step definition classes. The runtime shutdown hook will be called once all tests are done and the program exits. (Note that a static flag cannot be used in an @After hook due to the halting problem.) The example from the issue is shamelessly copied below:

public class GlobalHooks {
    private static boolean dunit = false;

    @Before
    public void beforeAll() {
        if(!dunit) {
            Runtime.getRuntime().addShutdownHook(afterAllThread);
            // do the beforeAll stuff...
            dunit = true;
        }
    }
}

Workaround #3: Singleton Caching

The basic hack is useful for simple setup and cleanup routines, but it becomes inelegant when objects must be shared by scenarios. Rather than polluting the class with static members, a singleton can cache test data between scenarios, and global setup logic may be put into the singleton’s constructor. Furthermore, if the singleton uses lazy initialization, then @Before hooks may not be needed at all. A “lazy” singleton will not be instantiated until the first time its getInstance method is called, meaning it will be skipped if the scenarios do not need them. This is a huge advantage when selectively running scenarios by name, tag, or feature. (Please refer to the previous post, Static or Singleton, for a deeper explanation of the singleton pattern.)

Consider scenarios that must generate authentication tokens (like OAuth) for API testing. A singleton “token holder” could cache tokens for usernames, rather than doing the authorization dance for every scenario. The snippet below shows how such a singleton could be called within a @When step definition with no @Before method.

public class ExampleSteps {
    ...
    @When("^some API is called$")
    public void whenSomeApiIsCalled() {
        // Get the token from the singleton cache lazily
        String token = TokenHolder.getInstance().getToken("user", "pass");
        // Use the token to call some API (method not shown)
        callSomeApi(token);
    }
    ...
}

And the singleton class could be defined like this:

public class TokenHolder {
    private static volatile TokenHolder instance = null;
    private HashMap<String, String> tokens;

    private TokenHolder() {
        tokens = new HashMap<String, String>();
    }

    public static TokenHolder getInstance() {
        // Lazy and thread-safe
        if (instance == null) {
            synchronized(TokenHolder.class) {
                if (instance == null) {
                    instance = new TokenHolder();
                }
            }
        }

        return instance;
    }
    
    public String getToken(String username, String password) {
        // This check could be extended to handle token expiration
        if (!tokens.containsKey(username)) {
            // Request a fresh authentication token (method not shown)
            String token = requestToken(username, password);
            // Cache the token for later
            tokens.put(username, token);
        }
        
        return tokens.get(username);
    }
    
    ...
}

Workaround #4: JUnit Class Annotations

Another workaround mentioned in Issue #515 and elsewhere is to use JUnit‘s @BeforeClass and @AfterClass annotations in the runner class, like this:

@RunWith(Cucumber.class)
@Cucumber.Options(format = {
    "html:target/cucumber-html-report",
    "json-pretty:target/cucumber-json-report.json"})
public class RunCukesTest {

    @BeforeClass
    public static void setup() {
        System.out.println("Ran the before");
    }

    @AfterClass
    public static void teardown() {
        System.out.println("Ran the after");
    }
}

While @BeforeClass and @AfterClass may look like the cleanest solution at first, they are not very practical to use. They work only when Cucumber-JVM is set to use the JUnit runner. Other runners, like TestNG, the command line runner, and special IDE runners, won’t pick up these hooks. Their methods must also be are static and would need static variables or singletons to share data anyway. Therefore, I personally discourage using these annotations in Cucumber-JVM.

What About Dependency Injection?

Dependency injection is a marvelous technique. As defined by Wikipedia:

In software engineering, dependency injection is a technique whereby one object supplies the dependencies of another object. A dependency is an object that can be used (a service). An injection is the passing of a dependency to a dependent object (a client) that would use it. The service is made part of the client’s state. Passing the service to the client, rather than allowing a client to build or find the service, is the fundamental requirement of the pattern.

Dependency injection can be a powerful alternative to singletons because DI provides finer control over the scope of objects. However, Cucumber-JVM’s dependency injection cannot be applied with global hooks because dependency objects, like step definition objects, are constructed and destroyed for each scenario.

Comparison Table

Ultimately, the best approach for global hooks in Cucumber-JVM is the one that best fits the tests’ needs. Below is a table to make workaround comparisons easier.

Workaround	Pros	Cons
Don’t Do It	Scenarios are completely independent. No complicated or risky workarounds.	Repeated setup and cleanup procedures may add significant execution time.
Static Variables	Simple yet effective implementation.	May need many static variables to share test data.
Singleton Caching	Abstracts test data and setup procedures. Easily handles lazy initialization and evaluation. May not need a @Before hook.	More complicated design.
JUnit Class Annotations	Clean look for basic setup and cleanup routines.	May be used only with the JUnit runner. Requires static variables or singletons to share test data anyway.

BDD 101: Frameworks

Every major programming language has a BDD automation framework. Some even have multiple choices. Building upon the structural basics from the previous post, this post provides a survey of the major frameworks available today. Since I cannot possibly cover every BDD framework in depth in this 101 series, my goal is to empower you, the reader, to pick the best framework for your needs. Each framework has support documentation online justifying its unique goodness and detailing how to use it, and I would prefer not to duplicate documentation. Use this post primarily as a reference. (Check the Automation Panda BDD page for the full table of contents.)

Major Frameworks

Most BDD frameworks are Cucumber versions, JBehave derivatives inspired by Dan North, or non-Gherkin spec runners. Some put behavior scenarios into separate files, while others put them directly into the source code.

C# and Microsoft .NET

SpecFlow, created by Gáspár Nagy, is arguably the most popular BDD framework for Microsoft .NET languages. Its tagline is “Cucumber for .NET” – thus fully compliant with Gherkin. SpecFlow also has polished, well-designed hooks, context injection, and parallel execution (especially with test thread affinity). The basic package is free and open source, but SpecFlow also sells licenses for SpecFlow+ extensions. The free version requires a unit test runner like MsTest, NUnit, or xUnit.net in order to run scenarios. This makes SpecFlow flexible but also feels jury-rigged and inelegant. The licensed version provides a slick runner named SpecFlow+ Runner (which is BDD-friendly) and a Microsoft Excel integration tool named SpecFlow+ Excel. Microsoft Visual Studio has extensions for SpecFlow to make development easier.

There are plenty of other BDD frameworks for C# and .NET, too. xBehave.net is an alternative that pairs nicely with xUnit.net. A major difference of xBehave.net is that scenario steps are written directly in the code, instead of in separate text (feature) files. LightBDD bills itself as being more lightweight than other frameworks and basically does some tricks with partial classes to make the code more readable. NSpec is similar to RSpec and Mocha and uses lambda expressions heavily. Concordion offers some interesting ways to write specs, too. NBehave is a JBehave descendant, but the project appears to be dead without any updates since 2014.

Java and JVM Languages

The main Java rivalry is between Cucumber-JVM and JBehave. Cucumber-JVM is the official Cucumber version for Java and other JVM languages (Groovy, Scala, Clojure, etc.). It is fully compliant with Gherkin and generates beautiful reports. The Cucumber-JVM driver can be customized, as well. JBehave is one of the first and foremost BDD frameworks available. It was originally developed by Dan North, the “father of BDD.” However, JBehave is missing key Gherkin features like backgrounds, doc strings, and tags. It was also a pure-Java implementation before Cucumber-JVM existed. Both frameworks are widely used, have plugins for major IDEs, and distribute Maven packages. This popular but older article compares the two in slight favor of JBehave, but I think Cucumber-JVM is better, given its features and support.

The Automation panda article Cucumber-JVM for Java is a thorough guide for the Cucumber-JVM framework.

Java also has a number of other BDD frameworks. JGiven uses a fluent API to spell out scenarios, and pretty HTML reports print the scenarios with the results. It is fairly clean and concise. Spock and JDave are spec frameworks, but JDave has been inactive for years. Scalatest for Scala also has spec-oriented features. Concordion also provides a Java implementation.

JavaScript

Almost all JavaScript BDD frameworks run on Node.js. Jasmine and Mocha are two of the most popular general-purpose JS test frameworks. They differ in that Jasmine has many features included (like assertions and spies) that Mocha does not. This makes Jasmine easier to get started (good for beginners) but makes Mocha more customizable (good for power users). Both claim to be behavior-driven because they structure tests using “describe” and “it-should” phrases in the code, but they do not have the advantage of separate, reusable steps like Gherkin. Personally, I consider Jasmine and Mocha to be behavior-inspired but not fully behavior-driven.

Other BDD frameworks are more true to form. Cucumber provides Cucumber.js for Gherkin-compliant happiness. Yadda is Gherkin-like but with a more flexible syntax. Vows provides a different way to approach behavior using more formalized phrase partitions for a unique form of reusability. The Cucumber blog argues that Cucumber.js is best due to its focus on good communication through plain language steps, whereas other JavaScript BDD frameworks are more code-y. (Keep in mind, though, that Cucumber would naturally boast of its own framework.) Other comparisons are posted here, here, here, and here.

PHP

The two major BDD frameworks for PHP are Behat and Codeception. Behat is the official Cucumber version for PHP, and as such is seen as the more “pure” BDD framework. Codeception is more programmer-focused and can handle other styles of testing. There are plenty of articles comparing the two – here, here, and here (although the last one seems out of date). Both seem like good choices, but Codeception seems more flexible.

Python

Python has a plethora of test frameworks, and many are BDD. behave and lettuce are probably the two most popular players. Feature comparison is analogous to Cucumber-JVM versus JBehave, respectively: behave is practically Gherkin compliant, while lettuce lacks a few language elements. Both have plugins for major IDEs. pytest-bdd is on the rise because it integrates with all the wonderful features of pytest. radish is another framework that extends the Gherkin language to include scenario loops, scenario preconditions, and variables. All these frameworks put scenarios into separate feature files. They all also implement step definitions as functions instead of classes, which not only makes steps feel simpler and more independent, but also avoids unnecessary object construction.

Other Python frameworks exist as well. pyspecs is a spec-oriented framework. Freshen was a BDD plugin for Nose, but both Freshen and Nose are discontinued projects.

Ruby

Cucumber, the gold standard for BDD frameworks, was first implemented in Ruby. Cucumber maintains the official Gherkin language standard, and all Cucumber versions are inspired by the original Ruby version. Spinach bills itself as an enhancement to Cucumber by encapsulating steps better. RSpec is a spec-oriented framework that does not use Gherkin.

Which One is Best?

There is no right answer – the best BDD framework is the one that best fits your needs. However, there are a few points to consider when weighing your options:

What programming language should I use for test automation?
Is it a popular framework that many others use?
Is the framework actively supported?
Is the spec language compliant with Gherkin?
What type of testing will you do with the framework?
What are the limitations as compared to other frameworks?

Frameworks that separate scenario text from implementation code are best for shift-left testing. Frameworks that put scenario text directly into the source code are better for white box testing, but they may look confusing to less experienced programmers.

Personally, my favorites are SpecFlow and pytest-bdd. At LexisNexis, I used SpecFlow and Cucumber-JVM. For Python, I used behave at MaxPoint, but I have since fallen in love with pytest-bdd since it piggybacks on the wonderfulness of pytest. (I can’t wait for this open ticket to add pytest-bdd support in PyCharm.) For skill transferability, I recommend Gherkin compliance, as well.

Reference Table

The table below categorizes BDD frameworks by language and type for quick reference. It also includes frameworks in languages not described above. Recommended frameworks are denoted with an asterisk (*). Inactive projects are denoted with an X (x).

Language	Framework	Type
C	Catch	In-line Spec
C++	Igloo	In-line Spec
C# and .NET	Concordion LightBDD NBehave x NSpec SpecFlow * xBehave.net	In-line Spec In-line Gherkin Separated semi-Gherkin In-line Spec Separated Gherkin In-line Gherkin
Golang	Ginkgo	In-line Spec
Java and JVM	Cucumber-JVM * JBehave JDave x JGiven * Scalatest Spock	Separated Gherkin Separated semi-Gherkin In-line Spec In-line Gherkin In-line Spec In-line Spec
JavaScript	Cucumber.js * Yadda Jasmine Mocha Vows	Separated Gherkin Separated semi-Gherkin In-line Spec In-line Spec In-line Spec
Perl	Test::BDD::Cucumber	Separated Gherkin
PHP	Behat Codeception *	Separated Gherkin Separated or In-line
Python	behave * freshen x lettuce pyspecs pytest-bdd * radish	Separated Gherkin Separated Gherkin Separated semi-Gherkin In-line Spec Separated semi-Gherkin Separated Gherkin-plus
Ruby	Cucumber * RSpec Spinach	Separated Gherkin In-line Spec Separated Gherkin
Swift / Objective C	Quick	In-line Spec

[4/22/2018] Update: I updated info for C# and Python frameworks.

The Best Programming Language for Test Automation

Which programming languages are best for writing test automation? There are several choices – just look at this list on Wikipedia and this cool decision graphs for choosing languages. While this topic can quickly devolve into a spat over personal tastes, I do believe there are objective reasons for why some languages are better for automating test cases than others.

Dividing Test Layers

First of all, unit tests should always be written in the same language as the product under test. Otherwise, they would definitionally no longer be unit tests! Unit tests are white box and need direct access to the product source code. This allows them to cover functions, methods, and classes.

The question at hand pertains more to higher-layer functional tests. These tests fall into many (potentially overlapping) categories: integration, end-to-end, system, acceptance, regression, and even performance. Since they are all typically black box, higher-layer tests do not necessarily need to be written in the same language as the product under test.

My Opinionated Choices

Personally, I think Python is today’s best all-around language for test automation. Python is wonderful because its conciseness lets the programmer expressively capture the essence of the test case. It also has very rich test support packages. Check out this article: Why Python is Great for Test Automation. Java is a good choice as well – it has a rich platform of tools and packages, and continuous integration with Java is easy with Maven/Gradle/ANT and Jenkins. I’ve heard that Ruby is another good choice for reasons similar to Python, but I have not used it myself.

Some languages are good in specific domains. For example, JavaScript is great for pure web app testing (à la Jasmine, Karma, and Protractor) but not so good for general purposes (despite Node.js running anywhere). A good reason to use JavaScript for testing would be MEAN stack development. TypeScript would be even better because it is safer and scales better. C# is great for Microsoft shops and has great test support, but it lives in the Microsoft bubble. .NET development tools are not always free, and command line operations can be painful.

Other languages are poor choices for test automation. While they could be used for automation, they likely should not be used. C and C++ are inconvenient because they are very low-level and lack robust frameworks. Perl is dangerous because it simply does not provide the consistency and structure for scalable, self-documenting code. Functional languages like LISP and Haskell are difficult because they do not translate well from test case procedures. They may be useful, however, for some lower-level data testing.

8 Criteria for Evaluation

There are eight major points to consider when evaluating any language for automation. These criteria specifically assess the language from a perspective of purity and usability, not necessarily from a perspective of immediate project needs.

Usability. A good automation language is fairly high-level and should handle rote tasks like memory management. Lower learning curves are preferable. Development speed is also important for deadlines.
Elegance. The process of translating test case procedures into code must be easy and clear. Test code should also be concise and self-documenting for maintainability.
Available Test Frameworks. Frameworks provide basic needs such as fixtures, setup/cleanup, logging, and reporting. Examples include Cucumber and xUnit.
Available Packages. It is better to use off-the-shelf packages for common operations, such as web drivers (Selenium), HTTP requests, and SSH.
Powerful Command Line. A good CLI makes launching tests easy. This is critical for continuous integration, where tests cannot be launched manually.
Easy Build Integration. Build automation should launch tests and report results. Difficult integration is a DevOps nightmare.
IDE Support. Because Notepad and vim just don’t cut it for big projects.
Industry Adoption. Support is good. If the language remains popular, then frameworks and packages will be maintained well.

Below, I rated each point for a few popular languages:

	Python	Java	JavaScript	C#	C/C++	Perl
Usability	awesome	good	good	good	terrible	poor
Elegance	awesome	good	okay	good	poor	poor
Available Test Frameworks	awesome	awesome	awesome	good	okay	poor
Available Packages	awesome	awesome	okay	good	good	good
Powerful Command Line	awesome	good	good	okay	poor	okay
Easy Build Integration	good	good	good	good	poor	poor
IDE Support	good	awesome	good	good	okay	terrible
Industry Adoption	awesome	awesome	awesome	good	terrible	terrible

Conclusion

I won’t shy away from my preference for Python, but I recognize that they may not be the right choice for all situations. For example, when I worked at LexisNexis, we used C# because management wanted developers, who wrote the app in C#, to contribute to test automation.

Now, a truly nifty idea would be to create a domain-specific language for test automation, but that must be a topic for another post.

UPDATE: I changed some recommendations on 4/18/2018.