BDD 101: Writing Good Gherkin

So, you and your team have decided to make test automation a priority. You plan to use behavior-driven development to shift left with testing. You read the BDD 101 Series up through the previous post. You picked a good language for test automation. You even peeked at Cucumber-JVM or another BDD framework on your own. That’s great! Big steps! And now, you are ready to write your first Gherkin feature file.  You fire open Atom with a Gherkin plugin or Notepad++ with a Gherkin UDL, you type “Given” on the first line, and…

Writer’s block.  How am I supposed to write my Gherkin steps?

Good Gherkin feature files are not easy to write at first. Writing is definitely an art. With some basic pointers, and a bit of practice, Gherkin becomes easier. This post will cover how to write top-notch feature files. (Check the Automation Panda BDD page for the full table of contents.)

The Golden Gherkin Rule: Treat other readers as you would want to be treated. Write Gherkin so that people who don’t know the feature will understand it.

Proper Behavior

The biggest mistake BDD beginners make is writing Gherkin without a behavior-driven mindset. They often write feature files as if they are writing “traditional” procedure-driven functional tests: step-by-step instructions with actions and expected results. HP ALM, qTest, and many other test repository tools store tests in this format. These procedure-driven tests are often imperative and trace a path through the system that covers multiple behaviors. As a result, they may be unnecessarily long, which can delay failure investigation, increase maintenance costs, and create confusion.

For example, let’s consider a test that searches for images of pandas on Google. Below would be a reasonable test procedure:

  1. Open a web browser.
    1. Web browser opens successfully.
  2. Navigate to https://www.google.com/.
    1. The web page loads successfully and the Google image is visible.
  3. Enter “panda” in the search bar.
    1. Links related to “panda” are shown on the results page.
  4. Click on the “Images” link at the top of the results page.
    1. Images related to “panda” are shown on the results page.

I’ve seen many newbies translate a test like this into Gherkin like the following:

# BAD EXAMPLE! Do not copy.
Feature: Google Searching

  Scenario: Google Image search shows pictures
    Given the user opens a web browser
    And the user navigates to "https://www.google.com/"
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page
    When the user clicks on the "Images" link at the top of the results page
    Then images related to "panda" are shown on the results page

This scenario is terribly wrong. All that happened was that the author put BDD buzzwords in front of each step of the traditional test. This is not behavior-driven; it is still procedure-driven.

The first two steps are purely setup: they just go to Google, and they are strongly imperative. Since they don’t focus on the desired behavior, they can be reduced to one declarative step: “Given a web browser is at the Google home page.” This new step is friendlier to read.

After the Given step, there are two When-Then pairs. This is incorrect form: Given-When-Then steps should appear in order and should not repeat. A Given should not follow a When or Then, and a When should not follow a Then. The reason is simple: any single When-Then pair denotes an individual behavior. This makes it easy to see how, in the test above, there are actually two behaviors covered: (1) searching from the search bar, and (2) performing an image search. In Gherkin, one scenario covers one behavior. Thus, there should be two scenarios instead of one. Any time you want to write more than one When-Then pair, write separate scenarios instead. (Note: Some BDD frameworks allow disordered steps and will run them without complaint, but writing them is nevertheless anti-behavioral.)

This splitting technique also reveals unnecessary behavior coverage. For instance, the first behavior, searching from the search bar, may already be covered in another feature file. I once saw a scenario with about 30 When-Then pairs, and many were duplicate behaviors.

Do not be tempted to arbitrarily reassign step types to make scenarios follow strict Given-When-Then ordering. Respect the integrity of the step types: Givens set up initial state, Whens perform an action, and Thens verify outcomes. In the example above, the first Then step could have been turned into a When step, but that would be incorrect because it makes an assertion. Step types are meant to be guide rails for writing good behavior scenarios.
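One way to spot a scenario that covers more than one behavior is to count its When-Then pairs mechanically. Below is a minimal sketch in plain Java, assuming the steps are available as plain strings. The names StepOrderCheck and countBehaviors are hypothetical and are not part of Cucumber-JVM or any other framework.

```java
import java.util.List;

// Hypothetical checker for illustration; not part of any BDD framework.
class StepOrderCheck {

    // Counts how many times a When step begins after a step of another type.
    // "And" and "But" inherit the type of the step before them.
    static int countBehaviors(List<String> steps) {
        int behaviors = 0;
        String current = "";
        for (String step : steps) {
            String keyword = step.trim().split("\\s+")[0];
            if (keyword.equals("And") || keyword.equals("But")) {
                keyword = current;
            }
            // A When that follows a Given or a Then starts a new behavior.
            if (keyword.equals("When") && !current.equals("When")) {
                behaviors++;
            }
            current = keyword;
        }
        return behaviors;
    }
}
```

Run against the six steps of the bad example above, countBehaviors returns 2, which is the signal to split the scenario in two.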

The correct feature file would look something like this:

Feature: Google Searching

  Scenario: Search from the search bar
    Given a web browser is at the Google home page
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page

  Scenario: Image search
    Given Google search results for "panda" are shown
    When the user clicks on the "Images" link at the top of the results page
    Then images related to "panda" are shown on the results page

The second behavior arguably needs the first behavior to run first because the second needs to start at the search result page. However, since that is merely setup for the behavior of image searching and is not part of it, the Given step in the second scenario can simply declare that the “panda” search has already been done. Of course, this means that the “panda” search would be run redundantly at test time, but the separation of scenarios guarantees behavior-level independence.

The Cardinal Rule of BDD: One Scenario, One Behavior!

Remember, behavior scenarios are more than tests – they also represent requirements and acceptance criteria. Good Gherkin comes from good behavior.

(For deeper information about the Cardinal Rule of BDD and multiple When-Then pairs per scenario, please refer to my article, Are Gherkin Scenarios with Multiple When-Then Pairs Okay?)

Phrasing Steps

How you write a step matters. If you write a step poorly, it cannot easily be reused. Thankfully, some basic rules maintain consistent phrasing and maximum reusability.

Write all steps in third-person point of view. If first-person and third-person steps mix, scenarios become confusing. I even dedicated a whole blog post entirely to this point: Should Gherkin Steps Use First-Person or Third-Person? TL;DR: just use third-person at all times.

Write steps as a subject-predicate action phrase. It may be tempting to leave parts of speech out of a step line for brevity, especially when using Ands and Buts, but partial phrases make steps ambiguous and more likely to be reused improperly. For example, consider the following scenario:

# BAD EXAMPLE! Do not copy.
Feature: Google Searching

  Scenario: Google search result page elements
    Given the user navigates to the Google home page
    When the user entered "panda" at the search bar
    Then the results page shows links related to "panda"
    And image links for "panda"
    And video links for "panda"

The final two And steps lack the subject-predicate phrase format. Are the links meant to be subjects, meaning that they perform some action? Or, are they meant to be direct objects, meaning that they receive some action? Are they meant to be on the results page or not? What if someone else wrote a scenario for a different page that also had image and video links – could they reuse these steps? Writing steps without a clear subject and predicate is not only poor English but poor communication.

Also, use appropriate tense for each type of step. Givens should always use present or present perfect tense, and Whens and Thens should always use present tense. Rather than take a time warp back to middle school English class, let’s illustrate tense with a bad example:

# BAD EXAMPLE! Do not copy.
Feature: Google Searching

  Scenario: Simple Google search
    Given the user navigates to the Google home page
    When the user entered "panda" at the search bar
    Then links related to "panda" will be shown on the results page

The Given step above indicates an action when it says, “The user navigates.” Actions imply the exercise of behavior. However, Given steps are meant to establish an initial state, not exercise a behavior. This may seem like a trivial nuance, but it can confuse feature file authors who may not be able to tell if a step is a Given or When. Describing a state of being, in present or present perfect tense, indicates a state rather than an action.

The When step above uses past tense when it says, “The user entered.” This indicates that an action has already happened. However, When steps should indicate that an action is presently happening. Plus, past tense here conflicts with the tenses used in the other steps.

The Then step above uses future tense when it says, “Links related to ‘panda’ will be shown.” Future tense seems practical for Then steps because it indicates what the result should be after the current action is taken. However, future tense reinforces a procedure-driven approach because it treats the scenario as a time sequence. A behavior, on the other hand, is a present-tense aspect of the product or feature. Thus, it is better to write Then steps in the present tense.

The corrected example looks like this:

Feature: Google Searching

  Scenario: Simple Google search
    Given a web browser is at the Google home page
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page

And note, all steps are written in third-person.

Good Titles

Good titles are just as important as good steps. The title is like the face of a scenario – it’s the first thing people read. It must communicate in one concise line what the behavior is. Titles are often logged by the automation framework as well. Specific pointers for writing good scenario titles are given in my article, Good Gherkin Scenario Titles.

Choices, Choices

Another common misconception for beginners is thinking that Gherkin has an “Or” step for conditional or combinatorial logic. People may presume that Gherkin has “Or” because it has “And”, or perhaps programmers want to treat Gherkin like a structured language. However, Gherkin does not have an “Or” step. When automated, every step is executed sequentially.

Below is a bad example based on a classic Super Mario video game, showing how people might want to use “Or”:

# BAD EXAMPLE! Do not copy.
Feature: SNES Mario Controls

  Scenario: Mario jumps
    Given a level is started
    When the player pushes the "A" button
    Or the player pushes the "B" button
    Then Mario jumps straight up

Clearly, the author’s intent is to say that Mario should jump when the player pushes either of two buttons. The author wants to cover multiple variations of the same behavior. The right way to do this is with a Scenario Outline section, as shown below:

Feature: SNES Mario Controls

  Scenario Outline: Mario jumps
    Given a level is started
    When the player pushes the "<letter>" button
    Then Mario jumps straight up
    
    Examples: Buttons
      | letter |
      | A      |
      | B      |
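Conceptually, the outline runs once per Examples row, with each row's values substituted into the angle-bracket placeholders. The sketch below imitates that substitution in plain Java. OutlineExpander is a hypothetical helper written only for illustration; it is not how Cucumber-JVM is implemented internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of Scenario Outline substitution.
class OutlineExpander {

    // Replaces every <column> placeholder in the step with the row's value.
    static String fill(String step, Map<String, String> row) {
        String result = step;
        for (Map.Entry<String, String> cell : row.entrySet()) {
            result = result.replace("<" + cell.getKey() + ">", cell.getValue());
        }
        return result;
    }

    // Produces one concrete step per Examples row, in row order.
    static List<String> expand(String step, List<Map<String, String>> rows) {
        List<String> concrete = new ArrayList<>();
        for (Map<String, String> row : rows) {
            concrete.add(fill(step, row));
        }
        return concrete;
    }
}
```

With the two rows above, the When step expands into one step for the "A" button and one for the "B" button, so the same behavior is exercised twice without any “Or”.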

The Known Unknowns

Test data can be difficult to handle. Sometimes, it may be possible to seed data in the system and write tests to reference it, but other times, it may not. Google search is the prime example: the result list will change over time as both Google and the Internet change. To handle the known unknowns, write scenarios defensively so that changes in the underlying data do not cause test runs to fail. Furthermore, to be truly behavior-driven, think about data not as test data but as examples of behavior.

Consider the following example from the previous post:

Feature: Google Searching
  
  Scenario: Simple Google search
    Given a web browser is on the Google page
    When the search phrase "panda" is entered
    Then results for "panda" are shown
    And the following related results are shown
      | related       |
      | Panda Express |
      | giant panda   |
      | panda videos  |

This scenario uses a step table to explicitly name results that should appear for a search. The step with the table would be implemented to iterate over the table entries and verify each appeared in the result list. However, what if Panda Express were to go out of business and thus no longer be ranked as high in the results? (Let’s hope not.) The test run would then fail, not because the search feature is broken, but because a hard-coded variation became invalid. It would be better to write a step that more intelligently verified that each returned result somehow related to the search phrase, like this: “And links related to ‘panda’ are shown on the results page.” The step definition implementation could use regular expression parsing to verify the presence of “panda” in each result link.
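As a rough sketch of that defensive check, the helper below tests that every result link mentions the search phrase. It is plain Java with no browser attached: the link texts are assumed to have been collected already, and RelatedLinksCheck is a hypothetical name.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for the step:
//   Then links related to "panda" are shown on the results page
class RelatedLinksCheck {

    // Returns true only if there is at least one link and every link text
    // contains the phrase, ignoring case.
    static boolean allRelated(String phrase, List<String> linkTexts) {
        // Pattern.quote treats the phrase literally, so characters such as
        // "+" or "." in a search phrase are not read as regex syntax.
        Pattern pattern = Pattern.compile(Pattern.quote(phrase), Pattern.CASE_INSENSITIVE);
        if (linkTexts.isEmpty()) {
            return false;
        }
        for (String text : linkTexts) {
            if (!pattern.matcher(text).find()) {
                return false;
            }
        }
        return true;
    }
}
```

Whether every link must match, or only most of them, is a judgment call for the team. The point is that the check depends on the phrase, not on a hard-coded list of results.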

Another nice feature of Gherkin is that step definitions can hide data in the automation when it doesn’t need to be exposed. Step definitions may also pass data to future steps in the automation. For example, consider another Google search scenario:

Feature: Google Searching

  Scenario: Search result linking
    Given Google search results for "panda" are shown
    When the user clicks the first result link
    Then the page for the chosen result link is displayed

Notice how the When step does not explicitly name the value of the result link – it simply says to click the first one. The value of the first link may change over time, but there will always be a first link. The Then step must know something about the chosen link in order to successfully verify the outcome, but it can simply reference it as “the chosen result link”. Behind the scenes, in the step definitions, the When step can store the value of the chosen link in a variable and pass the variable forward to the Then step.
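Here is a minimal sketch of that hand-off, assuming both steps live in the same step definition class so that they can share a field. FakeResultsPage is a hypothetical stand-in for a real browser, and in a Cucumber-JVM project the two methods would carry @When and @Then annotations.

```java
import java.util.List;

// Hypothetical stand-in for a results page driven through a browser.
class FakeResultsPage {
    final List<String> links;
    String currentUrl = "results";

    FakeResultsPage(List<String> links) { this.links = links; }

    String firstLink() { return links.get(0); }
    void click(String url) { currentUrl = url; }
}

class ResultLinkSteps {
    private final FakeResultsPage page;
    private String chosenLink;  // set by the When step, read by the Then step

    ResultLinkSteps(FakeResultsPage page) { this.page = page; }

    // When the user clicks the first result link
    void userClicksFirstResultLink() {
        chosenLink = page.firstLink();
        page.click(chosenLink);
    }

    // Then the page for the chosen result link is displayed
    boolean chosenPageIsDisplayed() {
        return chosenLink != null && chosenLink.equals(page.currentUrl);
    }
}
```

The Gherkin never names the link, yet the Then step still verifies the right page because the When step remembered which link it clicked.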

Handling Test Data

Some types of test data should be handled directly within the Gherkin, but other types should not. Remember that BDD is specification by example – scenarios should be descriptive of the behaviors they cover, and any data written into the Gherkin should support that descriptive nature. Read Handling Test Data in BDD for comprehensive information on handling test data.

Less is More

Scenarios should be short and sweet. I typically recommend that scenarios should have a single-digit step count (<10). Long scenarios are hard to understand, and they are often indicative of poor practices. One such problem is writing imperative steps instead of declarative steps. I have touched on this topic before, but I want to thoroughly explain it here.

Imperative steps state the mechanics of how an action should happen. They are very procedure-driven. For example, consider the following When steps for entering a Google search:

  1. When the user scrolls the mouse to the search bar
  2. And the user clicks the search bar
  3. And the user types the letter “p”
  4. And the user types the letter “a”
  5. And the user types the letter “n”
  6. And the user types the letter “d”
  7. And the user types the letter “a”
  8. And the user types the ENTER key

Now, the granularity of actions may seem like overkill, but it illustrates the point that imperative steps focus very much on how actions are taken. Thus, they often need many steps to fully accomplish the intended behavior. Furthermore, the intended behavior is not always as self-documented as with declarative steps.

Declarative steps state what action should happen without providing all of the information for how it will happen. They are behavior-driven because they express action at a higher level. All of the imperative steps in the example above could be written in one line: “When the user enters ‘panda’ at the search bar.” The scrolling and keystroking are implied, and they will ultimately be handled by the automation in the step definition. When trying to reduce step count, ask yourself if your steps can be written more declaratively.
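To make that concrete, here is a small sketch of how one declarative step might wrap the mechanics. FakeSearchPage is a hypothetical page object that only records actions; a real one would drive a browser, and the step method would carry a @When annotation in Cucumber-JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical page object; it records actions instead of driving a browser.
class FakeSearchPage {
    final List<String> actions = new ArrayList<>();

    void clickSearchBar() { actions.add("click search bar"); }
    void typeText(String text) { actions.add("type " + text); }
    void pressEnter() { actions.add("press ENTER"); }
}

class DeclarativeSearchSteps {
    private final FakeSearchPage page;

    DeclarativeSearchSteps(FakeSearchPage page) { this.page = page; }

    // When the user enters "<phrase>" at the search bar
    void userEntersPhrase(String phrase) {
        // The "how" lives here, out of sight of the feature file reader.
        page.clickSearchBar();
        page.typeText(phrase);
        page.pressEnter();
    }
}
```

The feature file keeps one readable line, while the clicking and typing stay in code where they are easier to maintain.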

Another reason for lengthy scenarios is scenario outline abuse. Scenario outlines make it all too easy to add unnecessary rows and columns to their Examples tables. Unnecessary rows waste test execution time. Extra columns indicate complexity. Both should be avoided. Below are questions to ask yourself when facing an oversized scenario outline:

  • Does each row represent an equivalence class of variations?
    • For example, searching for “elephant” in addition to “panda” does not add much test value.
  • Does every combination of inputs need to be covered?
    • N columns with M inputs each generate M^N possible combinations.
    • Consider making each input appear only once, regardless of combination.
  • Do any columns represent separate behaviors?
    • This may be true if columns are never referenced together in the same step.
    • If so, consider splitting apart the scenario outline by column.
  • Does the feature file reader need to explicitly know all of the data?
    • Consider hiding some of the data in step definitions.
    • Some data may be derivable from other data.

These questions are meant to be sanity checks, not hard-and-fast rules. The main point is that scenario outlines should focus on one behavior and use only the necessary variations.
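To get a feel for the numbers, the sketch below compares full combinations against a table in which each input appears only once. It is plain Java with hypothetical names, and it assumes every column has at least one value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for sizing Examples tables.
class OutlineSize {

    // N columns with M inputs each: M multiplied by itself N times.
    static long fullCombinations(int inputsPerColumn, int columns) {
        long rows = 1;
        for (int i = 0; i < columns; i++) {
            rows *= inputsPerColumn;
        }
        return rows;
    }

    // Builds rows so that every input appears at least once. The row count
    // equals the longest column; shorter columns cycle through their values.
    // Assumes no column is empty.
    static List<List<String>> eachInputOnce(List<List<String>> columns) {
        int rowCount = 0;
        for (List<String> column : columns) {
            rowCount = Math.max(rowCount, column.size());
        }
        List<List<String>> table = new ArrayList<>();
        for (int r = 0; r < rowCount; r++) {
            List<String> row = new ArrayList<>();
            for (List<String> column : columns) {
                row.add(column.get(r % column.size()));
            }
            table.add(row);
        }
        return table;
    }
}
```

Four columns with three inputs each would need 81 rows for full coverage, but only three rows if each input just needs to appear once.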

Style and Structure

While style often takes a backseat during code review, it is a factor that differentiates good feature files from great feature files. In a truly behavior-driven team, non-technical stakeholders will rely upon feature files just as much as the engineers. Good writing style improves communication, and good communication skills are more than just resume fluff.

Below are a number of tidbits for good style and structure:

  1. Focus a feature on customer needs.
  2. Limit one feature per feature file. This makes it easy to find features.
  3. Limit the number of scenarios per feature. Nobody wants a thousand-line feature file. A good measure is a dozen scenarios per feature.
  4. Limit the number of steps per scenario to fewer than ten.
  5. Limit the character length of each step. Common limits are 80-120 characters.
  6. Use proper spelling.
  7. Use proper grammar.
  8. Capitalize Gherkin keywords.
  9. Capitalize the first word in titles.
  10. Do not capitalize words in the step phrases unless they are proper nouns.
  11. Do not use punctuation (specifically periods and commas) at the end of step phrases.
  12. Use single spaces between words.
  13. Indent the content beneath every section header.
  14. Separate features and scenarios by two blank lines.
  15. Separate examples tables by one blank line.
  16. Do not separate steps within a scenario by blank lines.
  17. Space table delimiter pipes (“|”) evenly.
  18. Adopt a standard set of tag names. Avoid duplicates.
  19. Write all tag names in lowercase, and use hyphens (“-”) to separate words.
  20. Limit the length of tag names.

Without these rules, you might end up with something like this:

# BAD EXAMPLE! Do not copy.

 Feature: Google Searching
     @AUTOMATE @Automated @automation @Sprint32GoogleSearchFeature
 Scenario outline: GOOGLE STUFF
Given a Web Browser is on the Google page,
 when The seach phrase "<phrase>" Enter,

 Then  "<phrase>" shown.
and The relatedd   results include "<related>".
Examples: animals
 | phrase | related |
| panda | Panda Express        |
| elephant    | elephant Man  |

Don’t do this. It looks horrible. Please, take pride in your profession. While the automation code may look hairy in parts, Gherkin files should look elegant.
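Some of these rules are simple enough to check automatically. The sketch below covers a handful of them (keyword capitalization, trailing punctuation, single spaces, step length, and tag naming) one line at a time. GherkinStyleCheck is a hypothetical helper, not an existing linter, and the 120-character limit and the tag pattern are my assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical style checker for a single line of a feature file.
class GherkinStyleCheck {

    static List<String> check(String line) {
        List<String> problems = new ArrayList<>();
        String text = line.trim();

        // Tag lines: every tag should be lowercase words joined by hyphens.
        if (text.startsWith("@")) {
            for (String tag : text.split("\\s+")) {
                if (!tag.matches("@[a-z0-9]+(-[a-z0-9]+)*")) {
                    problems.add("tag not lowercase-hyphenated: " + tag);
                }
            }
            return problems;
        }

        // Step lines: recognized by their keyword, in any letter case.
        if (text.matches("(?i)(given|when|then|and|but)\\b.*")) {
            if (!Character.isUpperCase(text.charAt(0))) {
                problems.add("keyword not capitalized");
            }
            if (text.endsWith(".") || text.endsWith(",")) {
                problems.add("trailing punctuation");
            }
            if (text.contains("  ")) {
                problems.add("multiple spaces between words");
            }
            if (text.length() > 120) {
                problems.add("step longer than 120 characters");
            }
        }
        return problems;
    }
}
```

For the line `Then  "<phrase>" shown.` from the bad example, this reports trailing punctuation and multiple spaces. Spelling and grammar still need a human reviewer.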

Gherkinize Those Behaviors!

With these best practices, you can write Gherkin feature files like a pro. Don’t be afraid to try: nobody does things perfectly the first time. As a beginner, I broke many of the guidelines I put in this post, but I learned as I went. Don’t give up if you get stuck. Always remember the Golden Gherkin Rule and the Cardinal Rule of BDD!

This is the last of three posts in the series focused exclusively on Gherkin. The next post will address how to adopt behavior-driven practices into the Agile software development process.

72 comments

  1. Hi,

    Thanks for all the information. I have a couple of questions.

    1. Some companies like to automate user scenarios/business cases since that is really handy before go-live. Rather than testing a collection of individual activities, testing a user scenario may cover lots of things. Also, if anything is broken, it can easily be found in user scenarios. In such a case, the number of steps in the scenario will be higher. Can you explain how to follow good practices at this point?
    (In my experience, we had to automate some user scenarios because they were really critical due to code changes. They are like business scenarios.)

    2. If you have a set of preconditions to run before executing your tests, what is the best way to automate them? Do you think it is better to write them with the test, or do you have another idea?

    Thank you,
    Dileepa Ranaweera
    Quality Assistant Analyst

    1. Hi Dileepa,

      Thanks for reading my blog and for commenting! You raise good questions.

      For your first question, when you say “user scenario,” I presume you mean, “a scenario covering multiple behaviors to test more intense user interaction”? Think of these types of scenarios as no different from “regular” scenarios. Every behavior scenario should still cover one main behavior. However, for these “user” scenarios, the intended behavior is not the individual actions (“behaviors”) themselves but rather the interaction of these actions in concert. You should still have separate scenarios to cover each individual behavior. With this mindset, when you write “user” scenarios, your steps should be more general and even more declarative than normal. “User” scenarios should not necessarily have higher step count. As a general rule, scenarios should not be more than a dozen lines long.

      For example, consider testing an ATM. You may have a scenario for entering your debit card:
      Given the ATM is at the start screen
      When the user enters their debit card
      And the user enters their PIN correctly
      Then the ATM is at the home screen

      And a scenario for withdrawing cash:
      Given the user is at the ATM home screen
      When the user touches the "withdraw" button
      And the user enters "20" dollars
      Then the ATM dispenses "20" dollars
      And the user’s account is debited by "20" dollars

      You may have a “user” scenario like this:
      Given there are "100" users with valid debit cards
      When the users each withdraw between "20" and "100" dollars
      Then the ATM dispenses each withdrawal correctly
      And each user’s account is debited correctly

      This is a trivial, spur-of-the-moment example, but it shows how “user” scenario steps can be more declarative. The part about entering the debit card is just assumed, and the mechanics of cash withdrawal are stated as “what” and not “how”. The “how” can be handled in the underlying automation code by calling the other steps or helpers internally. Thus, the behavior in focus for the third scenario is the multi-user intensity, not the authentication or withdrawal behaviors.

      For your second question, there are two primary ways to handle scenario preconditions. The first way is the Given step. The Given step is meant to set up initial state. A weak Given will check if a state is true and abort if not. A legitimate Given will put the system into the desired state. If scenarios within a feature file all have the same Givens, then they may be moved to a Background section. If there are a high number of Givens, then I recommend reducing them from being imperative to declarative. The second way to handle preconditions is with tagging and before hooks. For example, you could write a before scenario hook to construct a Selenium WebDriver instance for any scenario that has a “@selenium” tag. This strategy reduces a lot of redundant code, but it should be done only for preconditions/setup that are widely needed.

      I hope this helps!
      Andy

  2. Hello,

    Thanks for sharing such important information with us.
    Looking at this post I can say that I am certainly not using my FitNesse in the best possible way. I have a doubt, though.

    In one of the tests, we have to fill a form with around 20 questions. It looks something like this:
    Given the user has permissions
    When the user answers the session
    Then the status for the session is displayed as Complete

    However, for the When clause, we have 20 steps, one for each question:
    |Click radiobutton|yes|
    |Fill fields|Field1|Field2|…
    |Select from dropdown|Item1|
    |Fill fields|Field3|Field4|…
    |Type|Comments1|
    ….
    ….
    ….

    This makes the Test Page very lengthy and difficult to read. Could you please provide me with a better approach for scenarios like this?

    Thanks,
    Dinzy

    1. Hi Dinzy,

      Thanks for reading my blog! I’m sorry it took me so long to reply to your question.

      Your question is very pertinent for BDD: What is the best way to capture test data? And your sensibilities are in the right direction. My advice is this: Write only the most pertinent test data in Gherkin, and put the rest into the automation layer. BDD is meant to be “Specification by Example”, and thus data inputs for behavior scenarios should represent examples of behavioral circumstances. Based on the scenario you provided, it looks like the main behavior in focus is completing the form and getting a success message. The specific input values, though they must be entered, are not of first importance when defining the desired behavior. These specific inputs would best be defined in the automation layer in the step definitions. Automation code can easily have methods/functions to handle the specific input operations. You may also want to consider putting the test data into some sort of text-based file (like a .csv, .json, .xml, .properties, etc.) that the automation framework can read in and use for this step.
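As a rough sketch of the data-file idea, the step definition below reads its answers from properties-style text and applies them to a form. Everything named here is hypothetical: the keys, the values, and the SessionFormSteps class. The file contents are inlined as a string so the example stands alone, whereas a real project would read them from disk.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class SessionFormSteps {
    // Hypothetical contents of a .properties data file.
    static final String ANSWERS_FILE =
            "radio1=yes\n"
          + "field1=first answer\n"
          + "dropdown1=Item1\n";

    // Stand-in for the real form; a real step would drive the UI instead.
    final Map<String, String> filledForm = new TreeMap<>();

    // Parses properties-format text into key/value pairs.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // When the user answers the session
    void userAnswersSession() {
        Properties answers = parse(ANSWERS_FILE);
        for (String question : answers.stringPropertyNames()) {
            filledForm.put(question, answers.getProperty(question));
        }
    }
}
```

The Gherkin step stays a single line, and adding a twenty-first question means editing the data file rather than the scenario.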

      I hope this info helps!

      Andy

      1. Thanks for the response, Andy. Adding the test data to a text-based file looks like a good solution to our problem. I will definitely try this approach.

  3. Hello Andy,

    I have one more recurring question from my team.

    We have a scenario where:
    1) the user logs in and navigates to a certain menu,
    2) creates an instance (press add button on the page, a form is opened up, fill the form, press save button on the form),
    3) verifies that the newly created instance is displayed in the above page (only the name of the instance appears on the page as a link),
    4) click on this link,
    5) and verify that the fields have correct values. Ideally, this is one complete scenario.

    Given the user is logged in as administrator
    and the application is at the create page

    When the user creates an instance

    Then the instance is displayed at the page => This verification is important.

    (Given the instance is created and displayed) => not implemented currently

    When the user clicks on the new instance

    Then the fields are displayed with the correct values

    The first then is important to verify. Breaking the scenarios into several Given-When-Then scenarios would have several related scenarios, which would make them dependent to each other. In FitNesse, it would also increase the size of the tests.

    Also, the Given in the second case would be a repeated step. For the 2nd scenario, the Given assumes that the instance is created and displayed, which is basically the same as When and Then of the first scenario. We can still manage writing the GWT, but the step definition for 2nd Given would be the verification of the created instance, and hence, repeated code from 1st Then.
    Is it okay to have an empty Given in such case, i.e. a Given not doing anything or without any step definition?

    Thanks.

    1. Hi Dinzy,

      Sorry again for a slow response. Things have been busy!

      I think you could elegantly handle this example as one scenario like this:
      Given the user is logged in as administrator
      And the application is at the create page
      When the user creates an instance
      Then the instance is displayed at the page
      And the fields on the linked page are displayed with the correct values

      The final “And” step would click the link and do the required verification.

      You could split it into two scenarios. That would be advantageous if there are multiple ways to get to that link. Remember, BDD is about capturing DESIRED and INDIVIDUAL behaviors. Is the action of clicking the link and verifying its fields an individual behavior that could be reached in multiple ways and thus worthy of its own scenario, or is it totally dependent upon the instance creation?

      The empty Given, though, is not okay. Each scenario is independent, meaning the output of one does not become the input of another. The order scenarios appear in a feature file doesn’t matter. Think of it this way: Each scenario should start at login. If you were to write two scenarios, then the step definition for the second scenario’s Given must perform all of the same actions as the first scenario. That second Given could be “reduced” in that it includes login, navigation, and some initial interaction, but it still must be done. It’s best to write those basic initial actions as helper methods/classes so that they may be called by many different Given steps.

      Andy

  4. “The second behavior arguably needs the first behavior to run first because the second needs to start at the search result page. However, since that is merely setup for the behavior of image searching and is not part of it, the Given step in the second scenario can basically declare (declaratively) that the “panda” search must already be done. Of course, this means that the “panda” search would be run redundantly at test time, but the separation of scenarios guarantees behavior-level independence.”

    Hi Andy, how do you plan to address this dilemma in actual code? How do you intend to call the pre-requisite steps of launching the browser and doing the panda search from Scenario 1 before arriving at the Given state of Scenario 2? It would be nice if the “good examples” were paired with their corresponding step definitions. It’s easy to write idealistic Gherkin, only to realize code implementation is inefficient with the way your Gherkin is written. Thanks!

    1. The step definition code isn’t as much of a dilemma as you may think. For example, let’s say a scenario has:

      Scenario: 1
      Given A
      And B
      And C
      When …

      And another scenario has:

      Scenario: 2
      Given already at C
      When …

      In Scenario 2, what matters most is that C is reached. If it took going through A, B, and C, then the step definition for “already at C” could look like this:

      @Given("already at C")
      public void alreadyAtC() {
          A();
          B();
          C();
      }

      Programming is programming; methods are methods. You can call the methods for other step definitions. Or, you can write helper methods to be shared. The point of the step definition layer is to link plain language to programming code. And step definitions should be very short – any intense logic (like service calls or web element interaction) should be handled by other classes.

  5. Hi Andy,

    What if methods A(), B(), or C() take parameters from the original annotation in Scenario 1?

    @Given("^I log in using ([^\"]*) and ([^\"]*)$")
    public void methodA(String username, String password) {
    }

    How are you going to get those parameters for the Given at Scenario 2?

    @Given(“Already at C”)
    public void alreadyAtC() {
    methodA(); // How to get username and password here?
    }

    Thanks!

    1. Then you have two options: Either add the parameters to the new step (“Given Already at C using __ and __”), or hard-code the parameter values. Hard-coding is not so bad if the specific values need not be exposed at the Gherkin level.
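Both options can be sketched side by side. This is only an illustration under stated assumptions: the class name `LoginSteps`, the default credentials, and the logged strings are hypothetical, and the Cucumber annotations are written as comments so the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical step definition class; Cucumber annotations are shown as comments.
class LoginSteps {
    // Assumed default account for scenarios that do not care which user logs in.
    static final String DEFAULT_USERNAME = "testuser";
    static final String DEFAULT_PASSWORD = "not-a-real-password";

    final List<String> log = new ArrayList<>();

    // @Given("I log in using {string} and {string}")
    void logIn(String username, String password) {
        log.add("log in as " + username);
    }

    // Option 1: expose the values in the step text.
    // @Given("already at C using {string} and {string}")
    void alreadyAtC(String username, String password) {
        logIn(username, password);
        log.add("go to C");
    }

    // Option 2: keep the step text short and hard-code the values.
    // @Given("already at C")
    void alreadyAtC() {
        alreadyAtC(DEFAULT_USERNAME, DEFAULT_PASSWORD);
    }
}
```

Option 2 simply forwards to option 1 with fixed values, which keeps the login logic in one place whichever phrasing a scenario uses.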

  6. Hi Andy,

    Good day! Thanks for sharing your brilliant knowledge about Gherkin. This really helped our team standardize our Gherkin writing. 🙂

    However, we have a question about field validation.

    We need to enter multiple combinations of inputs (test data) for a specific field and verify the error message shown when the field loses focus.

    Given the user is at registration form page
    When the user enters username
    Then message is displayed below username field

    | input | error |
    | | This is a required field. |
    | AAA | Please enter atleast 5 characters. |
    | AAA111BBB222 | Please enter maximum of 10 characters.|

    We thought of writing it as a scenario outline. However, it would take additional seconds to go to the registration page again for each row, rather than entering the next input right after the first.

    We know that this may not be a good idea:

    Given the user is at registration form page
    When the user enters username and verifies message below username field
    | input | error |
    | | This is a required field. |
    | AAA | Please enter atleast 5 characters. |
    | AAA111BBB222 | Please enter maximum of 10 characters.|
    Then

    We are wondering if there’s another way to write it, or whether it is better to write it as a scenario outline.

    Thank you in advance! 🙂

    Regards,
    AC

    1. Hi AC,

      Thanks for reading my blog!

      Honestly, I would recommend using a scenario outline so that each validation check is independent. This will burn setup time but will make test reports cleaner and repro-runs (such as rerunning only failed tests) easier.

      There are a few ways you can ease the pain of additional time spent in scenario setup. First, I strongly recommend running tests in parallel on multiple threads. This could be done within the automation framework runner itself or with another tool/platform like Selenium Grid or SauceLabs. Parallelization is especially important when test suites take a long time to run. Each scenario will still take the additional setup time, but the complete start-to-end time for the test suite will be reduced drastically.

      Second, you could use a global hook to preserve the web driver session for these scenarios. Typically for web tests, each scenario launches a new instance of the web driver and then navigates to the starting page (which often means login). If the login will be the same across scenarios, then the web driver could be constructed once and shared across scenarios, so long as scenarios “leave no trace” and return the site back to the starting page. This implementation could get messy and potentially dangerous (because test case independence could be violated), but it is possible. I’d recommend it only if web setup time is really hurting your overall test suite runtime.
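The shared-session idea above might look roughly like this. It is a sketch only: `BrowserSession` is a stand-in for a real web driver, `SharedSession` is a hypothetical holder, and in a real Cucumber-JVM project the two static methods would be called from before and after hooks.

```java
// Stand-in for a real web driver session (hypothetical; no Selenium dependency).
class BrowserSession {
    static int sessionsCreated = 0;
    String currentPage = "start";

    BrowserSession() { sessionsCreated++; }

    void goTo(String page) { currentPage = page; }
}

// Hypothetical holder that scenarios would reach through before/after hooks.
class SharedSession {
    private static BrowserSession session;

    // Called at the start of every scenario: build once, then reuse.
    static synchronized BrowserSession get() {
        if (session == null) {
            session = new BrowserSession();
        }
        return session;
    }

    // Called at the end of every scenario: "leave no trace".
    static synchronized void resetToStart() {
        if (session != null) {
            session.goTo("start");
        }
    }
}
```

A single static session like this does not mix well with running scenarios on parallel threads in the same JVM, so it would have to be weighed against the parallelization suggested above.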

      Check out some of my other posts, which may help:
      https://automationpanda.com/2017/08/05/handling-test-data-in-bdd/
      https://automationpanda.com/2017/03/03/cucumber-jvm-global-hook-workarounds/

      Sincerely,
      Andy

  7. Hi Andy,

    I see now where the code implementation can be “hairy” at parts as you said in the article. I will probably go with sacrificing “elegant Gherkin” for practical step/code reuse for increased productivity. My Gherkin will be the “hairy” one. Usually, clients value how many tests you can automate in a limited span of time. Sadly, they do not give a hoot how pretty the Gherkin is nor do they even care to understand it.

    BUT, I can at least make my steps less imperative as advised from this blog. Instead of highly instructional steps like “I enter ” and “I enter “, simply “I log in as and ” will do. I am going to combine some of my imperative steps to a behavioral step where possible.

    Thanks for actively answering questions for all your readers. Keep it up.

  8. Hey Andy,

    Thanks for sharing such nice information. I have a question.

    Scenario: Google Image search shows pictures
    Given the user opens a web browser
    And the user navigates to “https://www.google.com/”
    When the user enters “panda” into the search bar
    Then links related to “panda” are shown on the results page
    When the user clicks on the “Images” link at the top of the results page
    Then images related to “panda” are shown on the results page

    The above scenario works even if I replace Given, When, and Then with And, i.e.,

    Scenario: Google Image search shows pictures
    And the user opens a web browser
    And the user navigates to “https://www.google.com/”
    And the user enters “panda” into the search bar
    And links related to “panda” are shown on the results page
    And the user clicks on the “Images” link at the top of the results page
    And images related to “panda” are shown on the results page

    If the work can be done with one keyword, what’s the use of having so many (Given, And, When, and Then)?

    Please do revert 🙂

    1. Gherkin keywords (Given, When, Then) are used entirely for convention. Each one documents the type of step:

      1. Given is used for setting initial state.
      2. When is used to perform some action.
      3. Then is used to verify an outcome.

      In theory, a test case doesn’t even need Gherkin. A test case could simply be a sequence of method/function calls that execute stuff. The value added by using Gherkin (and specifically by using step type keywords) is to frame the test case as a behavior scenario. It helps the test case author and readers know the intended behavior for the scenario. It also guides the author to write good behavior scenarios. For example, if step types are assigned appropriately, and the scenario has repeated when/then pairs, then it is obvious that the scenario covers more than one behavior and should be split into two scenarios.
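The point about repeated When/Then pairs can be made mechanical. The helper below is a hypothetical sketch, not part of any Gherkin tool: it resolves "And" and "But" to the type of the preceding step and counts how many times a scenario enters a block of When steps.

```java
import java.util.List;

// Hypothetical helper: counts how many separate blocks of When steps a scenario has.
class BehaviorCounter {
    static int countWhenBlocks(List<String> steps) {
        int blocks = 0;
        String previousType = "";
        for (String step : steps) {
            String keyword = step.trim().split("\\s+")[0];
            // "And" and "But" inherit the type of the step before them.
            boolean inherits = keyword.equals("And") || keyword.equals("But");
            String type = inherits ? previousType : keyword;
            if (type.equals("When") && !previousType.equals("When")) {
                blocks++;
            }
            previousType = type;
        }
        return blocks;
    }
}
```

For the six-step Google scenario quoted in the question, this returns 2, which is the signal that the scenario covers two behaviors. For the version written entirely with And, no step type can be resolved and it returns 0, which is exactly the information the keywords were carrying.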

      With BDD and Gherkin, you get out what you put in. If you follow good practices, you get good results. If you write sloppy scenarios, you’ll get bad tests. So, I strongly recommend following good conventions to get good results. 😉

  9. Hey There. I found your blog using msn. This is a really well written article. I’ll be sure to bookmark it and come back to read more of your useful information. Thanks for the post. I will definitely return.

  10. Hi Andy,

    I have a flight search and booking app where the scenarios have the flows below:
    1. The user can be logged in or a guest user
    2. The user can search flights: one way, round trip, or multicity
    3. The user can choose to pay using cash or miles
    4. Also, the steps to book a flight change depending on which combination you choose from above

    My question is: should I cover these flows in different scenarios, or is there a way I could frame this in one scenario?

    1. Both!

      To me, most of these steps sound like different behaviors. I would write separate scenarios to cover each step’s behavior in detail. For example, I’d probably write a scenario outline to handle searching for different types of flights to make sure each type yielded appropriate results. You may even want to cover some of these behaviors at the service level and simply verify the web UI displays the right stuff.

      I’d probably also write one or two web-level end-to-end scenarios to make sure the total flow works together. Something like this: “Given a user is logged in, When the user searches for a flight, And the user chooses to pay using , Then …” The fine details of correctness should be checked by the individual behavior steps, whereas the end-to-end scenario would cover the flow at a higher level to make sure the sequence of actions (as its own unique behavior) works properly.

  11. Hi Andy,

    Do you have any idea how to handle a failure that is caught in the middle of the steps when it is okay to proceed?

    For example (same gherkin I used before in my other question):

    Given the user is at registration form page
    When the user enters username
    Then message is displayed below username field
    And the success icon is displayed

    Let’s say it fails in the Then step, but it is okay to proceed to the And (last verification) step. Can Cucumber handle that? Thanks in advance!

    Regards,
    AC

    1. Hi AC,

      Yes, but not out-of-the-box.

      The concern you raise is about assertions. Specifically, what you want is a “soft” assertion – an assertion that records a failure but allows the test to keep going – versus a regular or “hard” assertion that will immediately abort upon failure. Most BDD frameworks do not provide their own assertion APIs but instead rely upon other libraries. This allows programmers to choose their own way to handle assertions. Most assertions function by simply throwing exceptions, anyway, and most test frameworks catch exceptions at the test case level and log them as failures.

      For Gherkin scenarios, you could create a soft assertion that spans multiple steps. If you program your automation in Java, then you can use AssertJ for soft assertions. You could also easily create your own soft assertion class: write an “add(condition, message)” method that appends the assertion to a list, and then write a “check()” method that throws an exception if any conditions are false.
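A hand-rolled soft assertion class along those lines could be sketched like this. The class name `SoftAssert` and the wording of the failure message are assumptions made for illustration, and this is not the AssertJ API. As a small variation on the description above, it stores only the messages of failed conditions rather than every assertion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled soft assertion collector (not the AssertJ API).
class SoftAssert {
    private final List<String> failures = new ArrayList<>();

    // Record the message if the condition is false; never throws.
    void add(boolean condition, String message) {
        if (!condition) {
            failures.add(message);
        }
    }

    // Throw once, listing every recorded failure; do nothing if all passed.
    void check() {
        if (failures.isEmpty()) {
            return;
        }
        String summary = failures.size() + " soft assertion(s) failed: "
                + String.join("; ", failures);
        failures.clear();   // reset so the instance can be reused
        throw new AssertionError(summary);
    }
}
```

To span multiple steps, the Then steps would each call `add(...)` on a shared instance, and `check()` would be called once at the end of the scenario, for example from a final step or an after hook.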

      I hope this helps!
      Andy

      1. Hi Andy,

        Thank you for your quick response! Appreciate it! 🙂

        We tried a soft assert, and it proceeds with the remaining steps. However, our problem now is with the report. Execution proceeds to the next step, but the step that failed its soft assert is reported as passed instead of failed, which makes the report confusing. Do you have any idea how we can make the report show that step as failed? Thanks a lot!

        Regards,
        Aiza

      2. What programming language and BDD framework are you using? Unfortunately, there’s no way to mark a step failed and continue test execution in the Cucumber frameworks: https://github.com/cucumber/cucumber/issues/79. I’m pretty sure this is not possible in SpecFlow (Cucumber for .NET) or behave (Cucumber-like framework for Python), either.

        If you don’t like how the log/report appears, you could rewrite your scenario in a few ways:
        * Combine all Then steps into one big, declarative Then step and do the soft assertions inside it.
        * Split the scenario into a different scenario for each Then step.
        * Turn the scenario into a scenario outline, and make each Then step a row in the examples table.

        You could also make your own log/report. That would be a lot of work, but you could do it if it matters to you. Extent Reports (http://extentreports.com/) make this pretty easy for Java and C#.

        If it were me, I’d say keep your scenario the way it was originally and just do hard assertions. The investment to pull off soft assertions and better reporting together probably isn’t worthwhile just to make sure all Thens are run despite failures. If it really did matter to me, then I’d go with a custom report. I wouldn’t want to rewrite the scenario because the original version is very natural and understandable.

  12. Hello Andy,

    Thanks for the detailed information about Gherkin.
    I am facing one issue with Content Assistance for feature files (Ctrl+Space), and I hope you can help me here.

    Issue – When I launch content assistance by pressing Ctrl+Space, I see a duplicate of each step I have written. In other words, all the steps are shown twice even though they are written only once.

    Eclipse version : Neon.1a Release (4.6.1)
    Selenium – 2.53

    I have searched a lot but could not find any clue.

    Regards,
    Atul

    1. Hi Atul,

      Thanks for reading my blog! I’m glad you find my posts helpful. Unfortunately, I don’t personally use Eclipse for my automation work, so I can’t really help you with the problem you shared. (I use IntelliJ IDEA for Java and Python and Visual Studio for C#.) I recommend reaching out to the owner of the Eclipse plugin you are using and possibly submitting a bug.

      Sincerely,
      Andy

  13. Thank you for the article and for answering the questions in the comments. I’m trying to explain to others how to break up or rewrite some feature files I’m reviewing into something more manageable (and 10 lines or less), and I’ve found your explanations and answer very helpful in clarifying my own thinking about best practices in Gherkin and what belongs in FF vs Step definitions as I’m trying to define the best practices that should be followed. I’m going back to read your other two posts in the series too, and will reference your blog as a good source of examples and information. Just wanted to leave a comment expressing thanks for taking the time to share this information.

  14. Thanks for such a helpful post!
    I am refactoring one of my feature scenarios. The business logic is like this: the client calls the server APIs in a particular order and gets proper responses. Say there are 3 APIs available. I have dedicated feature files for testing each API, but end-to-end testing is still valuable in my case.

    Should I do this?
    Given some context
    When the client calls API A
    Then the client gets response a
    When the client calls API B
    Then the client gets response b
    When the client calls API C
    Then the client gets response c

    Apparently this breaks your suggestion of one When/Then pair per scenario. Of course, I could change the Whens and Thens here to “And” (except the last Then) to make it look like there is only one “When”. But does that really change anything?

    What I am trying to test is a workflow in sequence. How should I organise such an end-to-end feature?

    1. Hi Murphy! Thanks for reading my blog.

      You could certainly write scenarios like this. Don’t change “When” steps to “And” steps – preserve the integrity of the step types. However, you said you already have testing for each API, which I presume means that you have covered all equivalence classes of API use cases. An end-to-end test like this might be a duplication of test coverage.

      Please read my article on end-to-end tests; specifically, point #2 about lengthy end-to-end tests:
      https://automationpanda.com/2017/10/14/bdd-101-unit-integration-and-end-to-end-tests/

      I also want to write an article specifically addressing times when duplicate When/Then pairs may and may not be appropriate.

  15. Hi Andy,
    I’m totally new to Gherkin – and glad I found your blog.

    Right now I’m struggling with some kind of “reusable vs. natural” question.
    For example, I might have been tempted to turn

    the user navigates to “https://www.google.com/”
    the user enters “panda” into the search bar

    into

    the user enters “https://www.google.com/” into the “address bar”
    the user enters “panda” into the “search bar”

    so that only one step definition is needed to enter text anywhere,

    Do I get it right that this is NOT proper Gherkin thinking?

    Thanks in advance,
    Flo

    1. Hi Flo,

      “Reusable vs. Natural” – finding the balance is the art of good Gherkin.

      My advice: parametrize only where it makes sense for the underlying step definition. If the parametrized value will simply be passed along, such as a name or a URL, then that’s good. However, if the parametrized value requires an if-else decision block for different actions, then it would be better to write different steps.

      In your example above, I don’t see value in parametrizing “address bar” and “search bar”. Even if you wanted to enter text at a different location, the actions and web elements would be different, so you might as well make separate steps.
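The difference between the two kinds of parameter can be sketched in code. Everything here is hypothetical (the class name `EntrySteps`, the step phrases in the comments, and the logged strings), and the Cucumber annotations are comments so the block compiles by itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical step definition class; Cucumber annotations are shown as comments.
class EntrySteps {
    final List<String> log = new ArrayList<>();

    // Good fit for a parameter: the value is passed straight through.
    // @When("the user enters {string} into the search bar")
    void enterSearchPhrase(String phrase) {
        log.add("type '" + phrase + "' into search bar");
    }

    // Poor fit: the second parameter selects between unrelated actions,
    // so the step definition grows a branch for every target.
    // @When("the user enters {string} into the {string}")
    void enterText(String text, String target) {
        if (target.equals("address bar")) {
            log.add("navigate to " + text);
        } else if (target.equals("search bar")) {
            enterSearchPhrase(text);
        } else {
            throw new IllegalArgumentException("Unknown target: " + target);
        }
    }
}
```

The generic step saves one step definition on paper, but the if-else block inside it is really two step definitions sharing a name, plus an error path for targets nobody has written yet.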

      Sincerely,
      Andy

      1. Thanks a lot, this seems like good advice I can stick to.

        In my scenario, I used some “returnObjectForName” function that I called from within the step definition, and that I could reuse for yet another step definition that would check the contents of a field.

        I especially wondered about entering values into a table – having a different step definition for each column to be filled seems awkward, but on the other hand those writing the feature file would not necessarily know the internal name of the column.

        But well, I’ll just go ahead and try and learn.

        Sincerely,
        Flo

      2. I just found something that seems like a middle ground for “reusable vs. natural”:

        I can write something like
        [Given(@"The user ""(.*)"" for ""(.*)""")]
        [Given(@"The user ""(.*)"" the text ""(.*)""")]
        [Given(@"The user ""(.*)"" a ""(.*)""")]
        public void EnterTextToField(string p0, string p1){
        […]
        }

        to have one step definition for the following steps:
        Given The user “searches” for “panda”
        Given The user “translates” the text “panda”
        Given The user “orders” a “panda”

        While there might be good reasons NOT to do it (it does seem rather pointless in this example), I think it can come in handy in some less artificial cases – whenever parametrizing seems like a good idea anyway, it can be achieved without enforcing artificial step formulations like “the user performs action X on object Y”.

  16. Hi Andy!

    Thanks for the articles. I’ve found a lot of useful information!
    But I’ve just started in the BDD and have two questions.

    1. As I understand it, a Gherkin file shouldn’t contain any “developer” things like HTML element IDs, CSS selectors, and so on. For instance, I’ve decided to add a reusable step

    When user clicks the button

    As a developer, I would use something like this

    When user clicks the button “#btnUpload”

    But it is not right. How can I resolve it correctly?

    When user clicks the button “upload”

    where upload – the text of the button?

    Or it is the wrong way and I have to add the concrete step

    When user clicks the upload button

    And I have to develop such steps for any button which I will use in the gherkin?

    2. Our product supports several languages (english, french, german etc.) and we want to test each language separately. How can we reuse one gherkin file for every supported language without duplication?

    For example:

    Given the user has created a new account
    When the user logs in to the account
    Then message “Welcome” is displayed

    In EN site version text will be “Welcome”. In FR will be “Bienvenue” and so on. I don’t want to develop separate gherkin file with hardcoded text for every supported language. Also I don’t want to develop several steps like

    message welcome is displayed
    message goodbye is displayed
    message error is displayed

    for every situation that occurs. How can I solve this issue?

    Thank you very much!

    P.S: Sorry for my bad english. It’s not my native language.

    1. Hi Andrey,

      I’m glad that you find my content useful!

      Gherkin is meant to be naturally expressive, high-level, and business-oriented. Gherkin is a description language, not a programming language. All of the developer details like web element IDs and XPaths should be written into the automation code (meaning, the step definition classes and automation support classes). I strongly recommend using the Page Object Model for web testing – step defs would call page object methods, and page objects would have the developer-like pieces.
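The layering described above can be sketched as follows, using the upload button from the question. This is an illustration under stated assumptions: `FakeDriver` stands in for a real browser driver such as Selenium WebDriver, `UploadPage` and `UploadSteps` are hypothetical names, and the Cucumber annotation is shown as a comment so the block compiles without dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a browser driver (hypothetical; no Selenium dependency).
class FakeDriver {
    final List<String> clicked = new ArrayList<>();

    void click(String cssSelector) { clicked.add(cssSelector); }
}

// Hypothetical page object: the only place that knows the selector.
class UploadPage {
    private static final String UPLOAD_BUTTON = "#btnUpload";
    private final FakeDriver driver;

    UploadPage(FakeDriver driver) { this.driver = driver; }

    void clickUpload() { driver.click(UPLOAD_BUTTON); }
}

// Hypothetical step definition class; the Cucumber annotation is a comment.
class UploadSteps {
    private final UploadPage page;

    UploadSteps(UploadPage page) { this.page = page; }

    // @When("the user clicks the upload button")
    void userClicksUploadButton() { page.clickUpload(); }
}
```

The Gherkin step mentions only "the upload button"; if the element's ID changes, only the constant in the page object has to change.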

      To handle the same scenario with multiple variations, use Scenario Outlines! Parametrize the language choice and all phrases that should be checked. I’d also recommend keeping language checking to a minimum. Find the key places in the web site where language choice is triggered/reflected – check those only, and skip the rest. If you try to test it all, you’ll significantly increase test execution time with little chance of catching unique bugs.

      Sincerely,
      Andy

  17. “Given a web browser is at the Google home page.” This new step is friendlier to read.

    I’d even suggest it be taken a step further — unless you’re explicitly testing GOOGLE, you could reduce the specificity/imperativity (to coin a word) to:

    “Given a web browser is at a search page.”

    1. The examples do explicitly intend Google to be the product under test, so it is appropriate to state Google in the steps. Nevertheless, it’s good that you’re thinking about being highly declarative! Always remove unnecessary details.

  18. Hi!
    Thank you for your post; it has helped us to standardise our scenario writing.
    Now, just a quick question. In your phrasing steps section you mentioned that the Given step should always use the present perfect tense, and in your corrected examples you seem to use the present simple tense. So which do you prefer? The user has pressed the buy button (present perfect) or the user presses the buy button(present simple)?
    Thank you from Brazil!

    1. That’s a good point about phrasing. I’m okay with present simple tense in Given steps when the phrase is declaring a state of being. For example, “the login page is displayed”.

  19. Hey Andy,
    In my quest to ramp up my BDD knowledge, I’m discovering that nuanced differences abound. I think this is inevitable given the “polyglot” nature of Cucumber. Contributors on the Cucumber Slack channel tell me to focus Gherkin on behavior. In other words, remove the implementation-specific language like “press this button” and “navigate to this page”. In your experience, is that too pedantic and not how Gherkin works “in the real world”, so to speak?

    1. The Cucumber folks are on point with their advice. Try to be more declarative than imperative. Gherkin scenarios should focus on behavior rather than minute actions. Do your best to make this work in the “real world” – there’s tons of imperative Gherkin out there, but that doesn’t make it good.

      Also, keep in mind that the language used should fit the scenario’s intention. There may be times when slightly more imperative steps are needed to make important behavior facets clear. For example, most scenarios shouldn’t include login steps, but scenarios that explicitly test login behavior should.

      In my experience, automation-focused engineers using BDD/Gherkin frameworks primarily write overly-imperative steps, while feature-focused product owners tend to write overly-declarative steps.

  20. Hi Andy,
    since I often find useful advice here, I’d like to ask another question: Right now at our company, we regularly rally some manual testers to perform predefined steps in a load test, and we want to automate.
    Do you think it is a good idea to use Gherkin for such load or performance tests? For example:

    Given a web browser is on the Google page
    When the search phrase “panda” is entered
    Then the results should appear after at most “3” seconds

    I think it’s possible from a technical point of view, but would you say it is a good idea?

    On the one hand, it sounds promising if agents and manual testers could be guided by the same description.
    On the other hand, we are just about to introduce BDD into our company culture, and of course the first implementation will shape general expectations and understanding for the future. And maybe this would be a rather misleading way of using Gherkin…

    Thanks in advance,
    Flo

    1. Hi Flo,

      I get this question a lot. My answer is this: No, Gherkin should not be used for load or performance testing. Functional testing (a la Gherkin) tests that a product works /correctly/, while performance testing (and load testing as a subset of performance testing) tests that a product works /optimally/. They have different purposes. Performance tests are really about gathering wide metrics and doing data analysis. They are thus nondeterministic. It is possible to reduce performance tests into functional tests, but I oppose this on the principle of the separation of concerns. Furthermore, the purpose of the tests should be considered as well. If Gherkin tests will be run in a CI/CD pipeline, then nondeterministic results are unacceptable.

      Sincerely,
      Andy

  21. Hello Andy, I am learning a lot of new things with your blog, man. Thank you very much!

    I wanna ask your opinion about my case

    I have an event/enrollments system.

    On the event page, if the event is for a couple, the user should create one enrollment for both himself and his partner (already registered in the system) at the same time.

    So, to represent this feature, I wrote the scenario below.

    ==========================
    Scenario: Enroll in couple event

    Given that the user “John” is logged in
    And he is at “Power Couple” event page
    And he is married with “Marie”
    When John enrolls in the “Power Couple” event
    Then an enrollment is created for “John” and “Marie” in the “Power Couple” event
    ==========================

    Do you think these scenario steps are OK?

    Thanks in advance

    (Sorry for my bad english)

    1. Thanks for reading my blog! I think your steps are okay. Your scenario follows the Cardinal Rule of BDD, and I can understand exactly what needs to be done. Kudos for that! My one suggestion is to try to be less imperative. You could try something like this:

      Given “John” and “Marie” are married
      And “John” is at the “Power Couple” event page
      When “John” enrolls in the “Power Couple” event
      Then an enrollment is created for “John” and “Marie” in the “Power Couple” event

      You may also want to state more about what the enrollment validation involves. Is a special message or page displayed? Will the users get an email? You may also want to write another scenario in which John checks his enrollment status.

    1. I would not recommend that style of Gherkin. It is very imperative, and the intended behaviors are not intuitive at first glance. Unfortunately, it is fairly common to see imperative Gherkin like that in the industry, though.

    1. Hi Beto,

      That’s a scary amount of data for one step, let alone one scenario! I’m going to be honest: after reading this one scenario outline, I really do not understand the behavior it seeks to exercise. It’s likely that some of this data is not necessary or could be provided within step def code, or that more than one behavior is being covered by this scenario. Remember, Gherkin is not a programming language – it’s a description language.

      Sincerely,
      Andy
