Author: Andy Knight

I'm a software engineer who specializes in test automation.

BDD 101: Gherkin By Example

Gherkin is learned best by example. Whereas the previous post in this series focused on Gherkin syntax and semantics, this post will walk through a set of examples that show how to use all of the language parts. The examples cover basic Google searching, which is easy to explain and accessible to all. You can find other good example references from Cucumber and Behat. (Check the Automation Panda BDD page for the full table of contents.)

As a disclaimer, this post will focus entirely upon feature file examples and not upon automation through step definitions. Writing good Gherkin scenarios must come before implementing step definitions. Automation will be covered in future posts. (Note that these examples could easily be automated using Selenium.)

A Simple Feature File

Let’s start with the example from the previous post:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown

This is a complete feature file. It starts with a required Feature section and a description. The description is optional, and it may have as many or as few lines as desired. The description will not affect automation at all – think of it as a comment. As an Agile best practice, it should include the user story for the features under test. This feature file then has one Scenario section with a title and one each of GivenWhenThen steps in order. It could have more scenarios, but for simplicity, this example has only one. Each scenario will be run independently of the other scenarios – the output of one scenario has no bearing on the next! The indents and blank lines also make the feature file easy to read.

Notice how concise yet descriptive the scenario is. Any non-technical person can easily understand how Google searches should behave from reading this scenario. “Search for pandas? Get pandas!” The feature’s behavior is clear to the developer, the tester, and the product owner. Thus, this one feature file can be shared by all stakeholders and can dispel misunderstandings.

Another thing to notice is the ability to parameterize steps. Steps should be written for reusability. A step hard-coded to search for pandas is not very reusable, but a step parameterized to search for any phrase is. Parameterization is handled at the level of the step definitions in the automation code, but by convention, it is a best practice to write parameterized values in double-quotes. This makes the parameters easy to identify.

Additional Steps

Not all behaviors can be fully described using only three steps. Thankfully, scenarios can have any number of steps using And and But. Let’s extend the previous example:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown
    And the related results include "Panda Express"
    But the related results do not include "pandemonium"

Now, there are three Then steps to verify the outcome. And and But steps can be attached to any type of step. They are interchangeable and do not have any unique meaning – they exist simply to make scenarios more readable. For example, the scenario above could have been written as Given-When-Then-Then-Then, but Given-When-Then-And-But makes more sense. Furthermore, And and But do not represent any sort of conditional logic. Gherkin steps are entirely sequential and do not branch based on if/else conditions.

Doc Strings

In-line parameters are not the only way to pass inputs to a step. Doc strings can pass larger pieces of text as inputs like this:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown
    And the result page displays the text
      """
      Scientific name: Ailuropoda melanoleuca
      Conservation status: Endangered (Population decreasing)
      """

Doc strings are delimited by three double-quotes ‘”””‘.  They may fit onto one line, or they may be multiple lines long. The step definition receives the doc string input as a plain old string. Gherkin doc strings are reminiscent of Python docstrings in format.

Step Tables

Tables are a valuable way to provide data with concise syntax. In Gherkin, a table can be passed into a step as an input. The example above can be rewritten to use a table for related results like this:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown
    And the following related results are shown
      | related       |
      | Panda Express |
      | giant panda   |
      | panda videos  |

Step tables are delimited by the pipe symbol “|”. They may have as many rows or columns as desired. The  first row contains column names and is not treated as input data. The table is passed into the step definition as a data structure native to the language used for automation (such as an array). Step tables may be attached to any step, but they will be connected to that step only. For good formatting, remember to indent the step table and to space the delimiters evenly.

The Background Section

Sometimes, scenarios in a feature file may share common setup steps. Rather than duplicate these steps, they can be put into a Background section:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Background:
    Given a web browser is on the Google page

  Scenario: Simple Google search for pandas
    When the search phrase "panda" is entered
    Then results for "panda" are shown

  Scenario: Simple Google search for elephants
    When the search phrase "elephant" is entered
    Then results for "elephant" are shown

Since each scenario is independent, the steps in the Background section will run before each scenario is run, not once for the whole set. The Background section does not have a title. It can have any type or number of steps, but as a best practice, it should be limited to Given steps.

Scenario Outlines

Scenario outlines bring even more reusability to Gherkin. Notice in the example above that the two scenarios are identical apart from their search terms. They could be combined with a Scenario Outline section:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario Outline: Simple Google searches
    Given a web browser is on the Google page
    When the search phrase "<phrase>" is entered
    Then results for "<phrase>" are shown
    And the related results include "<related>"
    
    Examples: Animals
      | phrase   | related       |
      | panda    | Panda Express |
      | elephant | Elephant Man  |

Scenario outlines are parameterized using Examples tables. Each Examples table has a title and uses the same format as a step table. Each row in the table represents one test instance for that particular combination of parameters. In the example above, there would be two tests for this Scenario Outline. The table values are substituted into the steps above wherever the column name is surrounded by the “<” “>” symbols.

Scenario Outline section may have multiple Examples tables. This may make it easier to separate combinations. For example, tables could be added for “Planets” and “Food”. Each Examples table is connected to the Scenario Outline section immediately preceding it. A feature file can have any number of Scenario Outline sections, but make sure to write them well. (See Are Multiple Scenario Outlines in a Feature File Okay?)

Be careful not to confuse step tables with Examples tables! This is a common mistake for Gherkin beginners. Step tables provide input data structures, whereas Examples tables provide input parameterization.

Tags

Tags are a great way to classify scenarios. They can be used to selectively run tests based on tag name, and they can be used to apply before-and-after wrappers around scenarios. Most BDD frameworks support tags. Any scenario can be given tags like this:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  @automated @google @panda
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown

Tags start with the “@” symbol. Tag names are case-sensitive and whitespace-separated. As a best practice, they should be lowercase and use hyphens (“-“) between separate words. Tags must be put on the line before a Scenario or Scenario Outline section begins. Any number of tags may be used.

Comments

Comments allow the author to add additional information to a feature file. In Gherkin, comments must use a whole line, and each line must start with a hashtag “#”. Comment lines may appear anywhere and are ignored by the automation framework. For example:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  # Test ID: 12345
  # Author: Andy
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown

Since Gherkin is very self-documenting, it is a best practice to limit the use of comments in favor of more descriptive steps and titles.

Writing Good Gherkin

This post merely shows how to use the Gherkin syntax. The next post will cover how to write good Gherkin feature files.

BDD 101: The Gherkin Language

As mentioned in the previous post, behavior scenarios are the cornerstone of BDD. Each scenario is the formalized specification of a single behavior of a product or feature. Scenarios are both the requirements for the feature as well as the test cases. This post will show how to write behavior scenarios in Gherkin feature files. (Check the Automation Panda BDD page for the full table of contents.)

Introducing Gherkin

Gherkin is the domain-specific language for writing behavior scenarios. It is a simple programming language, and its “code” is written into feature files (text files with a “.feature” extension). The official Gherkin language standard is maintained by Cucumber, one of the most prevalent BDD automation frameworks. Most other BDD frameworks use Gherkin, but some may not conform 100% to Cucumber’s language standards.

Gherkin scenarios are meant to be short and to sound like plain English. Each scenario has the following structure:

  1. Given some initial state
  2. When an action is taken
  3. Then verify an outcome

A simple feature file example is shown below, with keywords in bold:

Feature: Google Searching
  As a web surfer, I want to search Google, so that I can learn new things.
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown

As you can see, it reads intuitively. Even non-technical people can understand it.

The Feature section has a title and a description, which are both used only for documentation purposes. When the feature is tied to an Agile user story, it is good practice to put the user story in the description. The Feature section has one or more Scenario sections, each with a unique title.

Each scenario is essentially a test case. The Given-When-Then format concisely frames the behavior under test. Each Given, When, or Then line is called a step. Steps must appear in the order of Given->When->Then and are executed sequentially. The Given step sets up the expected state before the main actions take place (like loading the Google home page). The When step contains the actions for exercising the behavior under test (running a Google search), and the Then step verifies that the behavior was successful (seeing the results page). The English-y phrase following the step keyword is a description of what the step will do, written by the test author. This description is linked to a step definition (a method/function that implements the operations for the step) in the automation code base using string or regular expression matching. (Feature files apart from step definitions are basically manual test case procedures.) Good steps are declarative in that they state what should happen at a high level, and not imperative because they shouldn’t focus on direct, low-level instructions.

Gherkin Keywords

Every programming language has its keywords, and Gherkin is no different. The table below explains how each keyword is used in the official Gherkin language. Note that some BDD frameworks may not be fully compliant. Cucumber provides a decent Gherkin language reference for its implementation.

Keyword Purpose
Feature
  • section denoting product or feature under test
  • contains a one-line title
  • contains extra lines for description
  • description should include the user story
  • may have one Background section
  • may have multiple Scenario and Scenario Outline sections
  • should be one Feature per feature file
Scenario
  • section for a specific behavior scenario
  • contains a one-line title
  • contains multiple Given, When, and Then steps
  • each type of step is optional
  • step order matters
  • each scenario runs independently
Given
  • step to define the preconditions (initial state or context)
  • should put the product under test into the desired state
  • may be parameterized
When
  • step to define the action to be performed
  • may be parameterized
Then
  • step to define the expected result from the action taken by When
  • may be parameterized
And
  • an additional step added to a Given, When, or Then
  • used instead of repeating Given, When, or Then
  • example: Given-Given-When-Then = Given-And-When-Then
  • associated with the immediately preceding step
  • order matters
But
  • functions the same as And, but might be easier to read
  • interchangeable with And
Background
  • a section of Given and And statements to run before each scenario
  • does not have a title or description
  • only one Background for each Feature section
Scenario Outline
  • a templated scenario section
  • uses “<” and “>” to identify parameter names
  • followed by Examples tables that provides parameter values
  • may have more than one Examples tables
  • parameters are substituted when the tests run
Examples
  • a section to provide a table of parameter values for a Scenario Outline
  • each table row represents a combination of values to test together
  • may have any positive number of rows
|
  • table delimeter used for Examples tables and step tables
  • use the escape sequence “\|” to use pipe characters as text within a column
“””
  • doc string delimiter for passing large text into a step
  • doc strings may be multi-line
@
  • prefix for a tag: @
  • tags may be placed before Feature or Scenario sections
  • tags are used to filter scenarios
#
  • prefix for a comment line
  • comments are not read by the Gherkin parser

The next post will walk through several Gherkin examples to show how to write good scenarios.

BDD 101: Introducing BDD

Series Overview

BDD 101 is a blog series to teach the basics of behavior-driven development. It is both a “getting started” guide for BDD beginners, as well as a best-practice reference for pros. I wrote this series for anyone involved in the daily duties of software development: developers, testers, scrum masters, product owners, and managers. The content in this series comes from my experiences using BDD for many projects. It focuses on Gherkin-based specification, and test automation will be a major theme. If this series is for you, then let’s dive in!

The BDD 101 table of contents is given on the Automation Panda BDD page. Note that some articles in the series were posted months apart and will not all appear together using the “previous” and “next” article arrows.

The Big BDD Picture: The main goals of BDD are collaboration and automation.

What is a Behavior?

behavior is how a product or feature operates. It is defined as a scenario of inputs, actions, and outcomes. A product or feature exhibits countless behaviors. Identifying behaviors individually brings clarity and simplicity. It also helps explain how behaviors are related. Below are examples of behaviors:

  • Logging into a web page
  • Clicking links on a navigation bar
  • Submitting forms
  • Making successful service calls
  • Receiving expected errors

Separating individual behaviors makes it easy to define a system without unnecessary repetition. For example, there may be multiple ways to navigate to the same page.

Nav Behaviors

Search from a text field and searching directly from URL parameters both lead to the same results page.

What is BDD?

Behavior­-Driven Development (BDD) is a test­-centric software development process that grew out of Test-­Driven Development (TDD). It has been around since roughly the mid-2000s. BDD focuses on clearly identifying the desired behavior of a feature from the very start. Behaviors are identified using specification by example: behavior specs are written to illustrate the desired behavior with realistic examples, rather than being written with abstract, generic jargon. They serve as both the product’s requirements/acceptance criteria (before development) and its test cases (after development). Gherkin is one of the most popular languages for writing formal behavior specifications – it captures behaviors as “Given-When-Then” scenarios. With the help of automation tools, scenarios can easily be turned into automated test cases. Anybody from engineers to product owners can write BDD scenarios, since they are just English phrases. BDD keeps developers focused on delivering precisely what the product owner wants. It also expedites testing. As such, BDD pairs nicely with Agile Software Development.

Quick Points

  • BDD is specification by example.
    • When someone says “BDD”, immediately think of “Given-When-Then”.
  • BDD focuses on behavior first.
    • Behavior scenarios are the cornerstone of BDD.
  • BDD is a refinement of the Agile process, not an overhaul.
    • It formalizes acceptance criteria and test coverage.
  • BDD is a paradigm shift.
    • Behaviors become the team’s main focus.

The Origins of BDD

The following quote comes from an article entitled Introducing BDD, written by Dan North (the “Father of BDD”) in March 2006:

I had a problem. While using and teaching agile practices like test-driven development (TDD) on projects in different environments, I kept coming across the same confusion and misunderstandings. Programmers wanted to know where to start, what to test and what not to test, how much to test in one go, what to call their tests, and how to understand why a test fails.

The deeper I got into TDD, the more I felt that my own journey had been less of a wax-on, wax-off process of gradual mastery than a series of blind alleys. I remember thinking “If only someone had told me that!” far more often than I thought “Wow, a door has opened.” I decided it must be possible to present TDD in a way that gets straight to the good stuff and avoids all the pitfalls.

My response is behaviour-driven development (BDD). It has evolved out of established agile practices and is designed to make them more accessible and effective for teams new to agile software delivery. Over time, BDD has grown to encompass the wider picture of agile analysis and automated acceptance testing.

12 Awesome Benefits

BDD improves the development process in a dozen ways:

Inclusion Anyone can write BDD scenarios, because they are written in plain English. Think of The Three Amigos.
Clarity Scenarios focus specifically on the expected behavior of the product under development, resulting in less ambiguity for what to develop.
Streamlining Requirements = acceptance criteria = test cases. Modular syntax expedites automation as well.
Shift-Left Test case definition inherently becomes part of grooming.
Artifacts Scenarios form a collection of test cases. Any tests not automated can be added to a known automation backlog.
Automation BDD frameworks make it easy to turn scenarios into automated tests.
Test­-Driven Most BDD frameworks can run scenarios to fail until the feature is implemented.
Code Reuse “Given-When­-Then” steps can be reused between scenarios.
Parameterization Steps can be parameterized. For example, a step to click a button can take in its ID.
Variation Using parameters, example tables make it easy to run the same scenario with different combinations of inputs.
Momentum Scenarios become easier and faster to write and automate as more step definitions are added.
Adaptability Scenarios are easy to rewrite as the products and features change.

Testing Recommendations

Since BDD focuses on actual feature behavior, behavior specs are best for higher-level, functional, black box tests. For example, BDD is great for testing APIs and web UIs. Gherkin excels for acceptance testing. However, behavior specs would be overkill for unit tests, and it is also not a good choice for performance tests that focus on metrics and not pass/fail results. Read more about this in the article BDD 101: Unit, Integration, and End-to-End Tests.

Next Steps

Lost yet? Don’t worry – this first post presented a lot of information. Things will make much more sense after learning how to write Gherkin test scenarios, which will be covered in the next post in this 101 series.

Why is Automation Full of Duplicate Code?

Copypasta
noun

A block of text that is duplicated repeatedly via "copy-paste",
    often causing annoyance or frustration.

One the the biggest problems (if not the biggest problem) I have seen in test automation is copypasta – the unnecessary duplication of code. It happens at all layers of testing. It happens in any type of project. It happens at companies big and small. And the consequences are stark: test development slows down, mistakes become more common, and maintenance becomes a nightmare. Although duplicate code can happen in any software project, it is especially prevalent in test automation. The reasons may or may not surprise you, but the solutions are clear.

4 Reasons Why Duplicate Code Pervades Automation

#1: Test cases are repetitive. For any given product, tests will share many of the same steps. For example, web app tests must all navigate to a start page at first, or API tests might cover a few variations for one call. Testing mechanics such as input parameters, setup/cleanup, logging, and assertions happen frequently. Put all of that together into test suites that have tens, hundreds, thousands, or even more test cases. It’s simply the nature of testing.

#2: Automation frameworks reinforce repetition. Most frameworks structure test cases as a class with methods (like JUnit) or as a collection of functions (like pytest), in which each method or function represents one test. Inherently, this basic structure is a good thing for making tests independent. However, lazy programmers may abuse the structure. Often, they put all test code inside these test methods, instead of extracting repetitive logic into helper methods or design patterns. Then, it becomes easier to simply duplicate an entire test case method and change a few things, rather than to implement a better overall design.

#3: Test code takes a backseat to product code. Business needs drive software development in the industry, and since test code is not part of the product delivered to customers, it is often deemed to be less important. Not as much devotion is given to developing good test code. Many best practices are abandoned for expediency.

#4: Testers often have weaker development skills. This is not a condemnation of testers, nor a universal labeling, but rather a distinction between disciplines: developers are developers because they are good at making software, and testers are testers because they are good at exercising software and finding bugs. Of course, I know plenty of testers who do indeed have strong dev skills. However, I also know solid testers who have limited programming experience. When automation responsibility falls upon testers with limited dev skills, poor development practices happen, and code duplication is typically rampant.

How to Avoid Duplicate Code in Automation

Code duplication is code cancer.  -Andy

There are a number of ways to slay the copypasta monster. The first line of defense is to check yourself before you wreck yourself. Always question yourself when you copy-paste blocks of code. Why did you do that? What are you changing in the pasted copy? Should you abstract that logic into a method or a class that can be reused? Can you parameterize it? Override your Ctrl-C, Ctrl-V keyboard shortcut if necessary.

Be a good programmer. Develop packages for reusable actions. Things like assertions, logging, setup, and cleanup should be shared by all test cases. In that shared code, keep action calls short. Long method names with too many parameters inhibit usability. Remember that automated test cases should be self-documenting so that they read like test procedures. Whenever possible, make repetitive actions happen automatically. For example, make library methods do internal logging, and use test framework setup/cleanup routines. For another example, I once wrote code to automatically reconnect SSH sessions whenever they dropped. These auto-actions allow test case code to focus less on the low-level mechanics and more on the high-level features under test.

Finally, be a team player. Use the same development practices for test code as for product code. Automation is a product, and its customers are the team. Use coding standards, design patterns, and revision control. Most importantly, reinforce good practices through code review. Use the review process as a constructive way to learn new tricks and even to mentor less experienced team members. Finally, divide testing roles between test formulation, test case automation, and test framework development. “QA” (quality assurance) is a wide discipline, and not everyone is equally skilled. Let people do what they do best. There is strength in diversity and in teamwork.

The Best Programming Language for Test Automation

Which programming languages are best for writing test automation? There are several choices – just look at this list on Wikipedia and this cool decision graphs for choosing languages. While this topic can quickly devolve into a spat over personal tastes, I do believe there are objective reasons for why some languages are better for automating test cases than others.

Dividing Test Layers

First of all, unit tests should always be written in the same language as the product under test. Otherwise, they would definitionally no longer be unit tests! Unit tests are white box and need direct access to the product source code. This allows them to cover functions, methods, and classes.

The question at hand pertains more to higher-layer functional tests. These tests fall into many (potentially overlapping) categories: integration, end-to-end, system, acceptance, regression, and even performance. Since they are all typically black box, higher-layer tests do not necessarily need to be written in the same language as the product under test.

My Opinionated Choices

Personally, I think Python is today’s best all-around language for test automation. Python is wonderful because its conciseness lets the programmer expressively capture the essence of the test case. It also has very rich test support packages. Check out this article: Why Python is Great for Test AutomationJava is a good choice as well – it has a rich platform of tools and packages, and continuous integration with Java is easy with Maven/Gradle/ANT and Jenkins. I’ve heard that Ruby is another good choice for reasons similar to Python, but I have not used it myself.

Some languages are good in specific domains. For example, JavaScript is great for pure web app testing (à la Jasmine, Karma, and Protractor) but not so good for general purposes (despite Node.js running anywhere). A good reason to use JavaScript for testing would be MEAN stack development. TypeScript would be even better because it is safer and scales better. C# is great for Microsoft shops and has great test support, but it lives in the Microsoft bubble. .NET development tools are not always free, and command line operations can be painful.

Other languages are poor choices for test automation. While they could be used for automation, they likely should not be used. C and C++ are inconvenient because they are very low-level and lack robust frameworks. Perl is dangerous because it simply does not provide the consistency and structure for scalable, self-documenting code. Functional languages like LISP and Haskell are difficult because they do not translate well from test case procedures. They may be useful, however, for some lower-level data testing.

8 Criteria for Evaluation

There are eight major points to consider when evaluating any language for automation. These criteria specifically assess the language from a perspective of purity and usability, not necessarily from a perspective of immediate project needs.

  1. Usability.  A good automation language is fairly high-level and should handle rote tasks like memory management. Lower learning curves are preferable. Development speed is also important for deadlines.
  2. Elegance. The process of translating test case procedures into code must be easy and clear. Test code should also be concise and self-documenting for maintainability.
  3. Available Test Frameworks. Frameworks provide basic needs such as fixtures, setup/cleanup, logging, and reporting. Examples include Cucumber and xUnit.
  4. Available Packages. It is better to use off-the-shelf packages for common operations, such as web drivers (Selenium), HTTP requests, and SSH.
  5. Powerful Command Line. A good CLI makes launching tests easy. This is critical for continuous integration, where tests cannot be launched manually.
  6. Easy Build Integration. Build automation should launch tests and report results. Difficult integration is a DevOps nightmare.
  7. IDE Support. Because Notepad and vim just don’t cut it for big projects.
  8. Industry Adoption. Support is good. If the language remains popular, then frameworks and packages will be maintained well.

Below, I rated each point for a few popular languages:

Python Java JavaScript C# C/C++ Perl
Usability  awesome  good  good  good  terrible  poor
Elegance  awesome  good  okay  good  poor  poor
Available Test Frameworks  awesome  awesome  awesome  good  okay  poor
Available Packages  awesome  awesome  okay  good  good  good
Powerful Command Line  awesome  good  good  okay  poor  okay
Easy Build Integration  good  good  good  good  poor  poor
IDE Support  good  awesome  good  good  okay  terrible
Industry Adoption  awesome  awesome  awesome  good  terrible  terrible

Conclusion

I won’t shy away from my preference for Python, but I recognize that they may not be the right choice for all situations. For example, when I worked at LexisNexis, we used C# because management wanted developers, who wrote the app in C#, to contribute to test automation.

Now, a truly nifty idea would be to create a domain-specific language for test automation, but that must be a topic for another post.

UPDATE: I changed some recommendations on 4/18/2018.

Should Gherkin Steps Use First-Person or Third-Person?

The Gherkin language allows the tester to write their own steps.  This is a blessing (for flexibility) and a curse (for poor grammar).  Although misspellings and out-of-place capitalization don’t affect the functionality of test scenarios, mixed point of view may cause ambiguity.  Consider the following two examples:

    Given I am at the Google search page
    When I search for “panda”
    Then I see web page links for “panda”
    Given the browser is at the Google search page
    When the user searches for “panda”
    Then web page links for “panda” are shown

Both scenarios do the same thing: they run a basic Google search.   However, the first one is written in first-person narrative, while the second one is written in third-person narrative.  What happens when we mix the steps together?

    Given I am at the Google search page
    When the user searches for “panda”
    Then I see web page links for “panda”

That scenario is confusing.  Am I the user, or is the user a different person?  Should there be a second browser for the user?  Why do I see what the user sees?  The English is poorly written due to the mixed point of view.

This may seem like a trivial example, but consider a project with multiple tests.  Gherkin scenarios will reuse steps.  Steps with different points of view will clash.  Therefore, all Gherkin scenarios for a project should use one point of view.

So, which point of view is better?  There is no definitively correct answer, but my strong conviction is that all Gherkin steps should use third-person perspective.  Third-person perspective is entirely generic and can expressively name any user or system component.  First-person semantically limits the expressive coverage of a step by forcing presumptions of who the speaker is.  For example, if “I” am a user, what profile or privileges do I have?  And are those attributes of who “I” am applicable when the step is used in other contexts?  It may be easier to write Gherkin scenarios in first-person perspective because it helps the author to frame himself or herself in the context of the user, but it makes the steps less reusable.  Even worse, first-person perspective can cause steps to be misunderstood.  As a workaround, scenarios could add an extra “Given” step to explicitly frame the context of the first person (such as, “Given I am an administrator, When …”), but this requires an extra step that would be unnecessary with third-person perspective.  Personally, I just don’t see the advantage to first-person point of view in Gherkin.  And I would definitely reject code reviews that mixed the point of view either way.

As techies, we can look to the humanities for one more reason to use third-person point of view in Gherkin. In middle school, in high school, and in college, every teacher emphasized time and time again that essays must be written in third-person perspective.  Every slip of “I think” and “I believe” and “you know” was dinged.  Why?  Third-person presents a more objective, more formal, and more powerful writing style.  Gherkin is meant to be expressive, so let’s write it like we mean it.

Gherkin Syntax Highlighting in Notepad++

Notepad++ is an excellent text editor for Windows. It is free, lightweight, feature-rich, and extendable. It can handle just about any programming language out there. I use it all the time, especially for config files and quick edits that don’t require a bulky IDE. Seriously, if you don’t have it, download it now. (Not a Windows user? Check out Gherkin Syntax Highlighting in Atom.)

One of the nifty features in Notepad++ is User Defined Language, which allows users to customize the syntax highlighting for any language. This is invaluable if you use an obscure language or even create your own. To access this feature, simply navigate to the Language menu option, go to User Defined Language near the bottom, and choose Define your language…. From there, you can create new user language and set stylers for keywords, operators, and other language facets. Stylers can set font color, size, and style. Users can also import and export UDLs as XML files for sharing. Since the highlighting doesn’t rely upon a context-free grammar, it has its limits. For example, keywords may still be highlighted when not actually being used as keywords in the language. Nevertheless, it’s better than nothing.

Since I do a lot of behavior-driven test automation development, I created a UDL for Gherkin. You can download it from the Automation Panda Github repository – the file is named gherkin_npp_udl.xml. Import it into Notepad++ through the User Defined Language window, and you’re ready to go! If you download my UDL file from GitHub, make sure to download it as a raw XML file.

Below is a screen shot of an example feature file:

npp_gherkin
An example feature file using my Notepad++ UDL for Gherkin

(Note: These instructions are based on NotePad++ 7.9.1.)

10 Things You Lose Without Automation

Automation has a lot of potential to improve software development. Unfortunately, though, automation is often seen as a luxury. Deadlines in the real word are unforgiving, and since test code isn’t product code, automation tasks are given lower priority and dunked into the black hole of the backlog. Some might argue that this is okay because it is lean or because a new project is just getting started. Once, I even heard it quipped that the first ones cut during a layoff are the automation folks. And it is true that automation requires a nontrivial resource investment.

However, I want to turn the tables. Instead of thinking about automation in terms of the opportunity, think about automation in terms of the opportunity cost. What happens if you don’t automate your tests from the get-go?  There are 10 major things you lose:

#1: Man Hours

Automated tests will automatically run.  Manual tests must be manually run.  That’s ontological.  If you only run a test one time, then automation has no return-on-investment.  But if you run a test more than once, automation saves a tester from repeating themselves. Plus, it’s easy: push the button and wait for results. Automated tests almost always run faster than manual tests, too.  Considering that time is money and engineer salaries aren’t cheap, man hours are a clear opportunity cost.

#2: Coverage

Automated tests can achieve greater coverage than manual tests, particularly for regression testing. As product development progresses, the sheer number of test cases increases. For example, in Agile, new tests will be created every sprint. Older tests must be run periodically to verify that new features don’t break existing features. If regression tests are manual, then testers must burn hours grinding through the same tests repeatedly.  Often, for expediency, this means that they skip some tests – not in the sense of being lazy, but rather as part of a risk-based approach.  Weaker coverage plus risk of missing bugs are accepted for the sake of shorter testing time.  If those regression tests were automated, then there would be no reason to shrink coverage, because they would be easy to run.

#3: Consistency

People make mistakes. It’s human nature – nobody’s perfect. And manual tests are prone to human error because humans run them. I remember how nervous I felt running manual on-call system checks at MaxPoint for the first time, afraid that I would miss a problem that could bring down a million-dollar bidding system.  Automated scripts run the same way every time.

#4: Protection

Continuous integration (CI) protects code against defects by building and testing every code change in real time. A CI system will automatically trigger tests all the time.Tests not running in CI (like manual tests) are effectively dead. At NetApp, failing code changes would immediately be kicked out of the code line, making automated tests act like a vaccine against bugs. On the other hand, I remember a project at MaxPoint that was riddled with bugs and perpetually delayed. When I asked the developers to see their unit tests, they said they never wrote unit tests because “it wasn’t a requirement.”

#5: Delivery Time

Continuous delivery (CD) is the natural extension of continuous integration, in which software products can automatically be delivered (and potentially even deployed) as the final step in a CI pipeline. This is how big companies like Google, Facebook, and Netflix can deliver so rapidly. No automation means no CD.

#6: Results and Metrics

Non-engineers (managers, product owners, scrum masters, oh my!) love to ask questions about tests.  “Are we red or green?” “How many tests do we have for this feature?” “What’s our coverage?” “How often do we run the tests?” Automated tests simply yield more accurate and more comprehensive results. Automation can also generate test reports, so engineers don’t need to waste time drafting emails or updating wiki pages.

#7: Accountability

Numbers don’t lie. Scripts don’t lie. Engineers typically don’t lie, but… results from manual tests can have a fudge factor, or a mistake in reporting, or any other sort of inconsistency. Inaccurate results may lead to poor business decisions. Automated results tell it like it is.

#8: Creativity

Manual testing can devolve into repetitive, menial labor: just follow steps 1-10 again and again and again. It would be much more effective for manual testers to focus on exploratory testing rather than deterministic testing. While automated tests can cover the fixed, repetitive test scenarios, exploratory testing lets testers find creative ways to uncover defects and judge how well a product actually works. Lack of automation ties up human capital.

#9: Peace of Mind

Are you sure that your product is “good”? Can you run enough tests to make sure? I learned the value of peace of mind while I was still in college. In my compiler theory course, I had to develop a simple programming language and build a compiler for it. Every week, we had to add new language features: arithmetic, strings, arrays, functions, etc. And every week, I wrote a slew of mini-programs to test grammar updates to my new language. By the time the project was complete, I had 1000+ automated test cases running through JUnit with 100% coverage, and the entire suite took a mere few minutes to run. And there were many late nights when the tests caught bugs in my language right away before committing code. There was no way I could have passed that class without my automated tests.

#10: Quality

The ultimate purpose of test automation is product quality. Having automation doesn’t necessarily mean product quality is good, but not having automation severely limits how quality can be pursued. Anecdotally, I’ve seen much better code quality come out of projects that have good test automation than ones without it. If I were a product owner, I know what I would want.

My Choice to Automate

Why am I an automation engineer? The answer became starkly clear to me one Tuesday morning that should have been like any other. At that time, I was a “Software Quality Engineer” on the QA Framework team of a fairly small software company. I arrived at the office at my normal time around 9am. The following two hours I spent in my daily groove with my headphones on, blissfully unaware as our director started escorting engineers out the door with boxes in hand. After an interrupted 11am standup, I found myself, heart racing, in a conference room with my manager and the VP of engineering who had flown in from our other site. On the table, I saw a pile of key fobs and bejeweled company phones. By this point, anyone who knows the software industry could predict what would happen next: the dreaded layoff. However, that’s where the story becomes more interesting.

I wasn’t laid off. Surprise!

In a daring reorganization, the VP eliminated all QA positions in the company. The justification given was to make teams purely Agile with no silos of division between development and quality. Everybody would own quality. Manual testers would be removed in order to prioritize automation. I also suspected that company finances may have encouraged the re-org. As a result, people lost their jobs. However, a few QA engineers, myself included, were spared the layoff and rebranded as regular “software engineers.” In that conference room, as the adrenaline rush subsided, the VP outlined a high-level plan in which I would move to a product development team that desperately needed automation help. Apparently, this team didn’t even have unit tests, and their product was weeks behind schedule. Once they got it going, I would become a regular web developer with the others.

So, why me? The survivor’s guilt bothered me, but the answer, apart from God’s grace, was pretty obvious: automation skill. I was the company’s automation champion, and everybody knew it. I consulted with each team to help improve their testing efforts. I gave a lunch-and-learn on behavior-driven development. I wrote several wiki pages and even example projects. Now, that’s not to brag on myself or belittle the skills of others, since many of the engineers who were cut also had automation talent, but my skills had been made highly visible to the people in power. And my skills continued to afford me great opportunity.

However, the last part of the VP’s new plan disconcerted me – did I want to become a standard web UI developer? I pondered the question deeply, but my gut answer never changed: no. Test automation was my specialty. It was the product I made, and I made it well. I knew the ins and outs, the frameworks, the best practices, the fundamental problem of test automation and the lowercase problems that derive. And I loved doing automation because it gave me the opportunity to set things right: to find an eliminate bugs, to protect the code line, and to make it look damn good. I could indeed switch over to web dev, but why? I didn’t have interest in web dev. That would be like making a hardwood floor installer do painting instead. With automation, I had both a specialty and an opportunity.

To conclude the story, I didn’t stick around much longer at that company. I quickly found a better opportunity as a senior-level automation engineer on an exciting new project at a different company. That’s the software industry – so it goes! Ironically, that layoff may have impacted me just as much as if I had actually lost my job. It forced me to make a choice. I definitively chose to be a software quality engineer. Automation was no longer merely a skill set but a career identity.