Example Mapping

How Do We Write Good Gherkin as Part of BDD? (Webinar + Q&A)

On July 23, 2019, I gave a webinar entitled, “How Do We Write Good Gherkin as Part of BDD?” in collaboration with Paul Merrill and his company, Beaufort Fairmont. This webinar was the follow-up to a previous webinar, What Is BDD, and How Do We Practice It? It was an honor to partner with Paul again to go further into BDD practices. (If you want to learn more about BDD, check out Beaufort Fairmont’s two-day BDD training offering, as well as their blog and other webinars.)

To see my webinar recording, register here. Definitely watch the previous webinar first.

Just like last time, attendees asked several great questions that we simply could not answer live. I categorized all questions we received and answered them below. Please note that some questions might be rephrased or combined with others.

Questions about BDD

What is BDD?

Behavior-Driven Development! Read more here.

In a typical Agile development process, who should write feature files?

The Three Amigos! Product owners, developers, and testers should all come together to figure out behaviors. I recommend doing Example Mapping to formulate examples before writing Gherkin scenarios. The green example cards should be turned into feature files. The specific person who writes the feature files is up to team preference. It could be a collaborative effort, or it could be divided-and-conquered. Any one of the Three Amigos can do it.

How can we apply BDD to SAFe (Scaled Agile Framework) teams?

BDD practices like Three Amigos meetings, Example Mapping, Behavior Specification with Gherkin, and Behavior Implementation can become part of any process. All of these practices happen at the level of the development teams. Teams could even share Gherkin steps and test frameworks wherever sharing makes sense. Check out BDD 101: Behavior-Driven Agile.

What advice can you give to teams that use BDD test frameworks solely as an automation tool and not part of a greater BDD process?

Do the best with what you’ve got. Try to show how other BDD practices can pragmatically improve your team’s development and delivery work.

Questions about Gherkin Syntax

What is the difference between a scenario and a scenario outline?

A scenario is a procedure of Given-When-Then steps that covers one example for one behavior. If there are any parameters for steps, then a scenario has exactly one combination of possible inputs. A scenario outline is a Given-When-Then procedure that can have multiple examples of one behavior provided as a table of input combos. Each input row will run the same steps once, just with different parameter inputs. See BDD 101: Gherkin by Example to see examples.
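To make the "same steps, different inputs" idea concrete, here is a minimal Python sketch of how an outline's example rows could be expanded into concrete scenarios. It is not how any particular BDD framework does it; the step text, the `<phrase>` placeholder, and the `expand_outline` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical outline: step templates with <placeholders>, plus a table of example rows.
OUTLINE_STEPS = [
    'Given the ShoeStore home page is displayed',
    'When the shopper searches for "<phrase>"',
    'Then results for "<phrase>" are shown',
]
EXAMPLES = [
    {"phrase": "red pumps"},
    {"phrase": "blue sneakers"},
]


def expand_outline(steps, examples):
    """Return one concrete list of steps per example row."""
    scenarios = []
    for row in examples:
        concrete = []
        for step in steps:
            # Substitute every <name> placeholder with the value from this row.
            for name, value in row.items():
                step = step.replace(f"<{name}>", value)
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios


if __name__ == "__main__":
    for number, scenario in enumerate(expand_outline(OUTLINE_STEPS, EXAMPLES), start=1):
        print(f"Example {number}:")
        for step in scenario:
            print(f"  {step}")
```

A plain scenario corresponds to the case with exactly one row: one behavior, one combination of inputs.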

What do you think about long tables in scenarios?

Long tables in Gherkin usually look terrible. They’re hard to read, and they create a wall of text. They may also include unnecessary variations. Stick to the Unique Example rule.

Are Given steps mandatory, or can scenarios start directly with When steps?

None of the step types are mandatory. It is valid to write a scenario that skips the Given and has only When-Then steps. It is also valid to write scenarios that are Given-Then or Given-When. In fact, it is syntactically valid to put steps in any order. However, I strongly recommend keeping Given-When-Then step order to properly frame behaviors.

Are quotation marks required for parameters?

No, quotation marks are not required for parameters, but they are a popular convention, and one that I recommend. Quotes make parameters easy to identify.

Questions about Gherkin Scenarios

How do we make sure each scenario focuses on an individual, independent behavior?

Do Example Mapping first as a team. Write scenarios together, or review them with others. Ask, “What makes this behavior unique?” Make sure to use strict Given-When-Then step order when defining the behavior. Rethink the scenario if it is more than 10 lines long. Look out for unnecessary complication.

What does it mean for a scenario to be “chronological”?

Scenario steps should be written as if they were on a timeline. Each step will be executed after the previous one, so its description must start where the previous one ended. Remember, steps will be automated as if they were scripts.

How do we write a very low-level scenario without having a wall of text?

Don’t write low-level scenarios! Gherkin is best for feature testing, not unit testing. Steps should focus on intention and business value. Instead of writing “type, type, click, wait,” write “log into the app.” If you absolutely must write a low-level scenario, remember that the same principles apply. Be intuitively descriptive. Focus on individual behaviors. Keep scenarios concise.

If all scenarios in a feature file have only one user, is it okay to use first-person perspective instead of third-person?

In my opinion, no. I favor third-person perspective universally. Trying to limit usage to one feature file won’t work because any step can be used by any feature file within a test project. The entire solution must be either first-person or third-person. There’s no middle ground.

Can we write Gherkin scenarios with personas?

Yes! Personas can make scenarios more meaningful and understandable. Make sure to define the personas well – they could be described under the Feature section or in a separate text file.

How do we write Gherkin scenarios that need to validate lots of information on a page?

Pick the most important pieces of information to check. You could write separate Then steps for each assertion, or you could push small-but-similar validations down to the automation level to avoid Gherkin clutter.

How do we write Gherkin scenarios for validating Web UI fields?

Typically, I treat each field validation as an independent behavior, and thus I write separate scenarios to check each field. If the scenario steps simply enter a textual value and verify a specific message, then I might make a Scenario Outline with example rows for each equivalence class of inputs.

How do we write Gherkin scenarios that have multiple inputs and setup steps? (Example: an API with ten parameters)

Gherkin allows multiple steps of the same type to be written using “And” and “But” keywords. It’s not a problem to have “Given-And-And” or “When-And-And”. If you discover that different scenarios repeat the same setup steps, then I recommend either moving those common steps to a Background section or writing a new step that covers multiple calls (for conciseness).

One example from the webinar showed searching for shoes and adding them to a shopping cart as part of one scenario. Aren’t those two different behaviors?

Here’s the scenario in question:

Scenario: Add shoes to the shopping cart
  Given the ShoeStore home page is displayed
  When the shopper searches for “red pumps”
  And the shopper adds the first result to the cart
  Then the cart has one pair of “red pumps”

We could have split this scenario into two. I just chose to define the behavior this way. This scenario is a bit more end-to-end because it covers a basic but typical workflow. It may also have leveraged existing steps, which expedites automation development. Overall, the scenario is still concise, chronological, and intuitively understandable. Remember, there is an art as well as a science to writing good Gherkin.

Questions about Automation

Do scenarios need to be independent of each other?

Yes, unequivocally. Tests that are not independent could interfere with each other and cause unexpected failures. Independence also reinforces singular behavioral focus.

How do we start a scenario “in medias res” without it depending on other tests?

At the Gherkin level, write Given steps that define a new starting point for the behavior. For example, many teams develop Web apps. It’s common to think that the starting point for all tests is login. However, the starting point can be a few pages after login.

At the automation level, it may be useful to implement the Given steps by calling other steps. For example, if a Given step should start at a user’s profile page, then perhaps it could internally call the login step and the click-the-profile-link step. Test steps may repetitively do the same operations for different tests, but test case independence will be preserved, and unique failures will be reported.
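As a rough illustration of a Given step that calls other steps internally, here is a small Python sketch. The function names, the `context` dictionary, and the "pandy" username are assumptions made up for this example, not part of any real framework or of the webinar's code.

```python
def log_in(context, username):
    """Lower-level step: pretend to log in and land on the home page."""
    context["user"] = username
    context["page"] = "home"


def click_profile_link(context):
    """Lower-level step: navigate from any page to the profile page."""
    assert context.get("user"), "must be logged in before opening the profile"
    context["page"] = "profile"


def given_profile_page_is_displayed(context, username="pandy"):
    """Given the user's profile page is displayed.

    The scenario starts "a few pages after login" without depending on any
    other test, because this step performs the login and navigation itself.
    """
    log_in(context, username)
    click_profile_link(context)


if __name__ == "__main__":
    context = {}  # a fresh context for each scenario
    given_profile_page_is_displayed(context)
    print(context["page"])
```

Each scenario that uses this Given step repeats the login, but it never relies on state left behind by another scenario.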

What is the best way to handle preconditions like logging into a Web app?

The simplest way to handle preconditions is to write Given steps. If those Given steps are shared by all scenarios in a feature file, then move them to a Background section. Automation hooks can also perform common setup and cleanup actions, depending upon the test framework. Personally, I prefer to use hooks to do automatic login rather than repeat Given steps for many scenarios.

Is it better to set up and tear down new test objects for each test case, or is it better to use shared, pre-created objects?

That depends upon the object. Most objects like WebDrivers and page objects should have scenario scope, meaning they are created fresh for each scenario and then torn down when the scenario ends. The only time an object should be shared across scenarios is if it is immutable or very expensive to create. For example, configuration data could be read in once before all tests and then injected immutably into each scenario. The safe position is always to use fresh objects; justify why sharing is needed before trying it.
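Here is one way the scoping rule might look in plain Python, as a sketch under stated assumptions: `FakeBrowser` is a stand-in for a real WebDriver, `run_scenario` is a hypothetical runner, and the config values are invented. Real frameworks would typically do this with hooks or fixtures instead.

```python
from types import MappingProxyType

# Shared across scenarios: read once before all tests and exposed read-only.
# (Hypothetical values for illustration.)
CONFIG = MappingProxyType({"base_url": "https://shoestore.example", "timeout": 10})


class FakeBrowser:
    """Stand-in for a WebDriver; a real one would open and close a browser."""

    def __init__(self):
        self.visited = []
        self.closed = False

    def get(self, url):
        self.visited.append(url)

    def quit(self):
        self.closed = True


def run_scenario(scenario_body):
    """Create a fresh browser for one scenario and always tear it down."""
    browser = FakeBrowser()
    try:
        scenario_body(browser, CONFIG)
    finally:
        browser.quit()  # teardown runs even if the scenario fails
    return browser


if __name__ == "__main__":
    first = run_scenario(lambda browser, config: browser.get(config["base_url"]))
    second = run_scenario(lambda browser, config: None)
    print(first is second)  # separate objects, so no state leaks between scenarios
```

The read-only mapping is what makes sharing the configuration safe: a scenario that tries to assign to `CONFIG` gets a `TypeError` instead of silently affecting later tests.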

I want to use Serenity for BDD and testing. Should I use Cucumber-like Gherkin feature files, or should I use Serenity’s native methods?

That’s up to you and your team. Personally, I would still use Gherkin feature files with Serenity. I like to separate my test case from my test code. Everyone can read Gherkin feature files, but not everyone can read Java or JavaScript test methods.

If a company already has a large BDD test solution that is poorly implemented, would it be better to keep it going or try to change it?

This question can be applied to all software projects, not just BDD test solutions. The answer is situational. Personally, I favor doing things right, even if it means refactoring. Please read Our Test Automation Has Problems. Should We Start Over? for a thorough answer.

Final Questions

Why do you call yourself “Pandy” and the “Automation Panda”?

Pandas are awesome. Everybody loves them. And nobody forgets my moniker. The nickname “Pandy” came about in the Python community to distinguish me from other folks named “Andy.”

Where can I get team training in BDD?

Beaufort Fairmont provides a one- or two-day course in BDD and writing Gherkin. Sign up for more information here.

What is BDD, and How Do We Practice It? (Webinar + Q&A)

On March 18, 2019, I gave a webinar entitled, “What is Behavior-Driven Development, and How Do We Practice It?” in collaboration with Paul Merrill and his company, Beaufort Fairmont. It was both a pleasure and an honor to do this webinar with them. Paul is a top-notch test automation expert, and Beaufort Fairmont is doing really exciting things. Check out their two-day BDD training offering, as well as their blog and other webinars.

To see my webinar recording, register here.

During the webinar, attendees asked more questions than we could answer. I’m excited that so many people asked questions. My answers are below.

Questions about Process

How is BDD different from TDD (Test-Driven Development)?

BDD is an evolution of TDD. In TDD, developers (1) write unit tests and watch them fail, (2) develop the feature to make the tests pass, (3) refactor the code to make it stronger, and (4) repeat the cycle. In BDD, teams do this same loop with feature tests (a.k.a. “acceptance” or “black-box” tests) as well as unit tests. Furthermore, BDD adds shift-left practices like Example Mapping and Specification by Example so that teams know what they are doing and focus on developing the right things.

Check out Dan North’s article, Introducing BDD, for a more thorough answer.

Can BDD be used with manual testing?

Yes! BDD is not merely an automation tool – it is a set of pragmatic practices to help teams develop better software. Gherkin scenarios are first and foremost behavior specs that help a team’s collaboration and accountability. They function secondarily as test cases that can be executed either manually or with automation.

Can we use BDD with technical stories or backend features?

Yes! If you can describe it, then you can do it.

How many Gherkin scenarios should one story have?

There’s no hard rule, but I recommend no more than a handful of rules per story, and no more than a handful of examples per rule. If you do Example Mapping and feel overwhelmed by the number of cards for a story, then the story should probably be broken into smaller stories.

Should we do Example Mapping for every story? Spending 20-30 minutes for each story would take a long time.

Try doing Example Mapping on one or two stories to start. The first time is always rough, but as you iterate on it, you’ll get better as a team. Even though Example Mapping has an upfront time cost, it will save a lot of time later in the sprint because (a) acceptance criteria are clear, (b) tests are already written, and (c) everyone has a mutual understanding of the story. The team won’t suffer through the inefficiencies of miscommunication and poor planning. You may even want to replace planning meetings with Example Mapping meetings.

What metrics should we use with BDD?

All metrics are flawed, but some metrics are useful. All the standard testing and Agile metrics still apply: code coverage, story velocity, etc. Here are some additional metrics you may consider for BDD:

the percentage of stories that undergo Example Mapping before the sprint
the number of rules and examples that get “missed” during Example Mapping and need to be added later
the percentage of Gherkin scenarios that get automated in the sprint

If you choose to track metrics, make sure their feedback is used to improve team practices. For more info on metrics, please read my Quality Metrics 101 series.

What were the resources you recommended at the end of the webinar?

Questions about Tools

What test management tools should we use with BDD?

I’m sure there are BDD plugins for test management tools, but I don’t have any that I can personally recommend. To be honest, I try to stay away from large test management tools like HP ALM, qTest, and VersionOne. When doing BDD, the Gherkin feature files themselves should be the single source of truth for feature-level tests, and they should be version-controlled in a repository. Don’t fall into the trap of slapping “Given-When-Then” keywords onto existing functional tests – that’s not BDD.

Does Jira support Example Mapping?

I have not personally used any Jira plugin for Example Mapping. It looks like there is an Easy Agile User Story Maps plugin that is similar to but slightly different from Example Mapping.

Are there other good tools for BDD and Example Mapping?

Cycle Automation provides a nice app with Gherkin steps out of the box, so you can automate tests without needing a programming language.
TeamUp Labs provides an online Example Mapping tool.
IDEs from JetBrains and Eclipse provide BDD plugins.
Gherkin Syntax Highlighting in Notepad++
Gherkin Syntax Highlighting in Visual Studio Code
Gherkin Syntax Highlighting in Atom
Gherkin Syntax Highlighting in Chrome

What’s the difference between Gherkin, Cucumber, and SpecFlow?

Gherkin is the Given-When-Then spec language.
Cucumber is a company and its eponymous test framework that uses Gherkin.
SpecFlow is Cucumber for .NET.

Questions about Testing

Can BDD test frameworks be used for unit testing?

Yes, but I don’t recommend it. BDD frameworks shine for black-box feature testing. They’re a bit too verbose for code-level unit tests. Read BDD 101: Unit, Integration, and End-to-End Tests for more info.

Can BDD test frameworks be used for integration testing?

Yes! See BDD 101: Unit, Integration, and End-to-End Tests.

How long should Gherkin scenarios be?

Scenarios should be bite-sized. Each scenario should focus on one individual behavior. There’s no hard rule, but I recommend single-digit step counts. Read BDD 101: Writing Good Gherkin for more info.

What are “step definitions” in Cucumber?

Step definitions are the methods in the automation code that execute the steps. When a BDD framework runs a Gherkin scenario as a test, it “glues” each step to a step definition based on some sort of string matching.
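To show what that "gluing" might look like, here is a toy Python sketch that matches step text against regular expressions. It is not Cucumber's actual implementation or API: the `step` decorator, the `STEP_DEFINITIONS` registry, and `run_step` are hypothetical names invented for this example, and the steps are adapted from the shopping cart scenario.

```python
import re

# Registry of (compiled pattern, function) pairs; a stand-in for a real framework's glue.
STEP_DEFINITIONS = []


def step(pattern):
    """Register a function as the step definition for text matching the pattern."""
    def decorator(func):
        STEP_DEFINITIONS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r'the shopper searches for "(.+)"')
def search(context, phrase):
    context["results"] = [phrase]  # pretend the search found one matching product


@step(r'the shopper adds the first result to the cart')
def add_first_result(context):
    context["cart"] = [context["results"][0]]


@step(r'the cart has one pair of "(.+)"')
def verify_cart(context, product):
    assert context["cart"] == [product]


def run_step(context, line):
    """Strip the Gherkin keyword, find the matching step definition, and call it."""
    text = re.sub(r"^(Given|When|Then|And|But)\s+", "", line.strip())
    for pattern, func in STEP_DEFINITIONS:
        match = pattern.fullmatch(text)
        if match:
            # Captured groups become the step's parameters.
            return func(context, *match.groups())
    raise LookupError(f"No step definition matches: {text}")


if __name__ == "__main__":
    context = {}
    for line in [
        'When the shopper searches for "red pumps"',
        'And the shopper adds the first result to the cart',
        'Then the cart has one pair of "red pumps"',
    ]:
        run_step(context, line)
    print(context["cart"])
```

Because the keyword is stripped before matching, the same step definition serves a step whether it appears after Given, When, Then, And, or But.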

How can we minimize duplicate code within a BDD test framework?

Know your steps. Always search for existing steps before writing new steps. Refactor existing steps whenever appropriate. Reuse steps when writing new scenarios. Do pair programming or mob programming when writing scenarios. Put scenarios through code reviews. Apply good coding practices – remember, test automation is software.

I write Gherkin scenarios, but I don’t write test automation code. What’s the best way to write Gherkin scenarios so that they can be automated?

Do pair programming with the automation engineers to write Gherkin scenarios together. Become familiar with existing steps by reading and searching feature files. Otherwise, the Gherkin steps you write in isolation might not be usable. Remember, BDD is a team effort!

The examples in the webinar were all fairly basic. Do you have any examples with more complex systems?

I have some example projects on GitHub in Python and Java with some basic unit, integration, and end-to-end tests, but I don’t have any large-scale examples that I can share publicly.

We wrote hundreds of SpecFlow tests without the other Amigos. Now, there are large test gaps, and many steps aren’t reusable. What should we do?

I’m sorry to hear that. It’s not an uncommon story. There are two paths: (1) refactoring or (2) starting over. Without really knowing the situation, I don’t think it’s my place to say which way is better. Here are some questions to help guide your decision:

What are your goals for testing and automation?
What’s your overall quality and testing strategy?
What parts of the code base are salvageable?
What parts of the code base should be removed?
If you started again from scratch, what would you do differently to make sure the same problems don’t reoccur?

I strongly recommend taking the Setting a Foundation for Successful Test Automation course from Test Automation University. (It’s free.) I also gave a talk about this very problem, Egad! How Do We Start Writing (Better) Tests?, at a few Python conferences.

We have a large BDD test suite with heavy coupling and slow execution times. The business amigos have also left the company. Should we try to fix what we have or just start over?

Sorry to hear that; same answer as before.

Final Questions

Why do you call yourself the “Automation Panda”?

Pandas are awesome. Everybody loves them. And nobody forgets my moniker.

Where can I get team training in BDD?

Beaufort Fairmont provides a one- or two-day course in BDD and writing Gherkin. Sign up for more information here.

Sprint Planning Sucks. Can It Be Fixed?

Warning: This article contains strong opinions that might not be suitable for all audiences. Reader discretion is advised.

It’s Monday morning. After an all-too-short weekend and rush hour traffic, you finally arrive at the office. You throw your bag down at your desk, run to the break room, and queue up for coffee. As the next pot is brewing, you check your phone. It’s 8:44am… now 8:45am, and DING! A meeting reminder appears:

Sprint Planning – 9am to 3pm.

What’s your visceral reaction?

I can’t tell you mine, because I won’t put profanity on my blog.

Real Talk

In the capital-A Agile Scrum process, sprint planning is the kick-off meeting for the next iteration. The whole team comes together to talk about features, size work items with points, and commit to deliverables for the next “sprint” (typically 2 weeks long). Idealistically, team members collaborate freely as they learn about product needs and give valued input.

Let’s have some real talk, though: sprint planning sucks. Maybe that’s a harsh word, but, if you’re reading this article, then it caught your attention. Personally, my sprint planning experiences have been lousy. Why? Am I just bellyaching, or are there some serious underlying problems?

Sprint planning is a huge time commitment. 9am to 3pm is not an exaggeration. Sprint planning meetings are typically half-day to full-day affairs. Most people can’t stay focused on one thing for that long. Plus, when a sprint is only two weeks long, one hour is a big chunk of time, let alone 3, or 6, or a whole day. The longer the meeting, the higher the opportunity cost, and the deeper the boredom.

Collaboration is a farce. Planning meetings typically devolve into one “leader” (like a scrum master, product owner, or manager) pulling teeth to get info for a pre-determined list of stories. Only two people, the leader and the story-owner, end up talking, while everyone else just stares at their laptops until it’s their turn. Discussions typically don’t follow any routine beyond, “What’s the acceptance criteria?” and, “Does this look right?” with an interloper occasionally chiming in. Each team member typically gets only a few minutes of value out of an hours-long ordeal. That’s an inefficient use of everyone’s time.

No real planning actually happens. These meetings ought to be called “guessing” meetings, instead. Story point sizes are literally made up. Do they measure time or complexity? No, they really just measure groupthink. Teams even play a game called planning poker that subliminally encourages bluffing. Then, point totals are used to guess how much work can be done during the sprint. When the guess turns out to be wrong at the end of the sprint (and it always does), the team berates itself in retro for letting points slip. Every. Time.

Does It Spark Joy?

I’ve long wondered to myself if sprint planning is a good concept just implemented poorly, or if it’s conceptually flawed at its root. I’m pretty sure it’s just flawed. The meetings don’t facilitate efficient collaboration relative to their time commitments, and estimates are based on poor models. Retros can’t fix that. And gut reactions don’t lie.

So, what should we do? Should we KonMari our planning meetings to see if they spark joy? Should we get rid of our ceremonies and start over? Is this an indictment of the whole Agile Scrum process? But then, how will we know what to do, and when things can get done?

I think we can evolve our Agile process with more effective practices than sprint planning. And I don’t think that evolution would be terribly drastic.

Behavior-Driven Planning

What we really want out of a planning meeting is planning, not pulling and not predicting. Planning is the time to figure out what will be done and how it will be done. The size of the work should be based on the size of the blueprint. Enter Example Mapping.

Example Mapping is a Behavior-Driven Development practice for clarifying and confirming stories. The process is straightforward:

Write the story on a yellow card.
Write each rule that the story must satisfy on a blue card.
Illustrate each rule with examples written on green cards.
Got stuck on a question? Write it on a red card and move on.

One story should take about 20-30 minutes to map. The whole team can participate, or the team can split up into small groups to divide-and-conquer. Rules become acceptance criteria, examples become test cases, and questions become spikes.

Here’s a good walkthrough of Example Mapping.

What about story size? That’s easy – count the cards. How many cards does a story have? That’s a rough size for the work to be done based on the blueprint, not bluffing. More cards = more complexity. It’s objective. No games. Frankly, it can’t be any worse than made-up point values.
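Counting cards is simple enough to do by hand, but a short Python sketch shows the arithmetic. The `Rule` and `StoryMap` classes and the sample story are hypothetical, and the choice to count rule, example, and question cards (but not the single story card) is my assumption; a team could count differently as long as it stays consistent.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    text: str                                       # blue card
    examples: list = field(default_factory=list)    # green cards


@dataclass
class StoryMap:
    story: str                                      # yellow card
    rules: list = field(default_factory=list)
    questions: list = field(default_factory=list)   # red cards

    def card_count(self):
        """Rough story size: rule cards + example cards + question cards."""
        examples = sum(len(rule.examples) for rule in self.rules)
        return len(self.rules) + examples + len(self.questions)


if __name__ == "__main__":
    story = StoryMap(
        story="Search for shoes",
        rules=[
            Rule("Results match the search phrase", ["red pumps", "blue sneakers"]),
            Rule("Empty searches show a message", ["blank phrase"]),
        ],
        questions=["Should searches be case-sensitive?"],
    )
    # 2 rules + 3 examples + 1 question
    print(story.card_count())  # prints 6
```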

This is real planning: a blueprint with a course of action.

So, rather than doing traditional sprint planning meetings, try doing Example Mapping sessions. Actually plan the stories, and use card counts for point sizes. Decisions about priority and commitments can happen between rounds of story mapping, too. The Scrum process can otherwise remain the same.

If you want to evolve further, you could eliminate the time boxes of sprints in favor of Kanban. Two-week work item boundaries can arbitrarily fall in the middle of progress, which is not only disruptive to workflow but can also encourage bad responses (like cramming to get things done or shaming for not being complete). Kanban treats work items as a continuous flow of prioritized work fed to a team in bite-sized pieces. When a new story comes up, it can have its own Example Mapping “planning” meeting. Now, Kanban is not for everyone, but it is popular among post-Agile practitioners. What’s important is to find what works for your team.

Rant Over

I know I expressed strong, controversial opinions in this article. And I also recognize that I’m arguing against bad examples of Agile Scrum. Nevertheless, I believe my points are fair: planning itself is not a waste of time, but the way many teams plan their sprints uses time inefficiently and sets poor expectations. There are better ways to do planning – let’s give them a try!