BDD

How Q2 uses BDD with SpecFlow for testing PrecisionLender

This case study was written by Andrew Knight, Lead Software Engineer in Test for Q2’s PrecisionLender product, in collaboration with Q2 and Tricentis. It explains the PrecisionLender team’s continuous testing journey and how SpecFlow served as a cornerstone for success.

What is PrecisionLender?

PrecisionLender is a web application that empowers commercial bankers with in-the-moment insights that help them structure and price commercial deals. Andi®, PrecisionLender’s intelligent virtual analyst, delivers these hyper-focused recommendations in real-time, allowing relationship managers to make data-driven decisions while pricing their commercial deals. PrecisionLender is owned and developed by Q2, a financial experience software company dedicated to providing digital banking and lending solutions to banks, credit unions, alternative finance, and fintech companies in the U.S. and internationally.

The PrecisionLender Opportunity Screen
(Picture taken from the PrecisionLender Support Center)

The starting point

The PrecisionLender team had a robust Continuous Integration (CI) delivery pipeline with strong unit test coverage, but they lacked end-to-end feature coverage. Developers would fill this gap by manually inspecting their changes in a shared development environment. However, as the PrecisionLender app grew, manual checks could not cover all possible integrations. The team knew they needed continuous automated testing to provide a safety net for development to remain lean and efficient. In April 2018, they hired Andrew Knight as their first Software Engineer in Test (SET) – a new role for the company – to lead the effort.

Automating tests with SpecFlow

The PrecisionLender team developed the Boa test solution – a project for automating end-to-end tests at scale. Boa would become PrecisionLender’s internal platform for test automation development. The name “Boa” is a loose acronym for “Behavior-Oriented Automation.”

The team chose SpecFlow to be the core framework for Boa tests. Since the PrecisionLender app’s backend is developed using .NET, SpecFlow was a natural fit. SpecFlow’s Gherkin syntax made tests readable and understandable, even to product owners and product support specialists who do not code.

The SpecFlow framework integrates with tools like Selenium WebDriver for testing Web UIs and RestSharp for testing REST APIs to exercise vital pathways for thorough app coverage. SpecFlow’s dependency injection mechanisms are solid yet simple, and the online docs are thorough. Plus, SpecFlow is an open-source project, so anyone can look at its code to learn how things work, open requests for new features, and even offer code contributions.
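To illustrate how these pieces fit together, here is a minimal sketch of SpecFlow bindings that register a Selenium WebDriver instance through SpecFlow's dependency injection and then use it in step definitions. The step text, URL, and locator are hypothetical placeholders rather than steps from the actual Boa test solution.

using BoDi;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using TechTalk.SpecFlow;

[Binding]
public class WebDriverHooks
{
    private readonly IObjectContainer _container;

    public WebDriverHooks(IObjectContainer container) => _container = container;

    // Register a WebDriver instance before each scenario so that
    // step definition classes can receive it via context injection.
    [BeforeScenario]
    public void CreateWebDriver() =>
        _container.RegisterInstanceAs<IWebDriver>(new ChromeDriver());

    [AfterScenario]
    public void QuitWebDriver() =>
        _container.Resolve<IWebDriver>().Quit();
}

[Binding]
public class SearchSteps
{
    private readonly IWebDriver _driver;

    // SpecFlow injects the WebDriver instance registered in the hook above.
    public SearchSteps(IWebDriver driver) => _driver = driver;

    [Given(@"the home page is displayed")]
    public void GivenTheHomePageIsDisplayed() =>
        _driver.Navigate().GoToUrl("https://www.example.com");

    [When(@"the user searches for ""(.*)""")]
    public void WhenTheUserSearchesFor(string phrase) =>
        _driver.FindElement(By.Name("q")).SendKeys(phrase + Keys.Enter);
}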

An example Boa test, written in Gherkin using SpecFlow.

Executing tests with SpecFlow+ Runner

Writing good tests was only part of the challenge. The PrecisionLender team needed to execute Boa tests continuously to provide fast feedback on changes to the app. The team chose to run Boa tests using SpecFlow+ Runner, which is tailored for SpecFlow tests. The team uses SpecFlow+ Runner to launch tests in parallel in TeamCity any time a developer deploys a code change to internal pre-production environments. The entire test suite also runs every night against multiple product configurations. SpecFlow+ Runner produces a helpful test report with everything needed to triage test failures: pass-and-fail tallies overall and per feature, a visual execution timeline, and full system logs. If engineers need to investigate certain failures more closely, they can use SpecFlow tags and SpecFlow+ Runner profiles to selectively filter tests for reruns. SpecFlow+ Runner’s multiple features help the team expedite test execution and investigation.

The SpecFlow+ Runner report for a dozen smoke tests.

Sharing features with SpecFlow+ LivingDoc

Good test cases are more than just verification procedures – they are behavior specifications. They define how features should work. Instead of keeping testing work siloed by role, the PrecisionLender team wanted to share Boa tests as behavior specs with all stakeholders to foster greater collaboration and understanding around features. The team also wanted to share Boa tests with specific customers without sharing the entire automation code.

SpecFlow+ LivingDoc enabled the PrecisionLender team to turn Gherkin feature files into living documentation. Whereas the SpecFlow+ Runner report focuses on automation execution, the SpecFlow+ LivingDoc report focuses on behavior specification apart from coding and automation details. LivingDoc displays Gherkin scenarios in a readable, searchable way that both internal folks and customers can consume. It can also optionally include high-level pass-and-fail results for each scenario, providing just enough information to be helpful and not overwhelming. LivingDoc has also helped PrecisionLender’s engineers identify and eliminate unused step definitions within the automation code. PrecisionLender benefits greatly from complementary reports from SpecFlow+ Runner and SpecFlow+ LivingDoc.

The SpecFlow+ LivingDoc report for a dozen smoke tests with their pass-and-fail results.

Improving interactions with Boa Constrictor

The Boa test solution initially used the Page Object Model to model interactions with the PrecisionLender app. However, as the PrecisionLender team automated more and more Boa tests, it became apparent that page objects did not scale well. Many page object classes had duplicative methods, making automation code messy. Some methods also did not include appropriate waiting mechanisms, introducing flaky failures.

PrecisionLender’s SETs developed Boa Constrictor, a .NET implementation of the Screenplay Pattern, to make better interactions for better automation. In Screenplay, actors use abilities to perform interactions. For example, an ability could be using Selenium WebDriver, and an interaction could be clicking an element. The Screenplay Pattern can be seen as a refactoring of the Page Object Model that minimizes duplicate code through a better separation of concerns. Individual interactions can be hardened for robustness, eliminating flaky hotspots. The Boa test solution now exclusively uses Boa Constrictor for interactions.

In October 2020, Q2 released Boa Constrictor as an open-source project so that anyone can use it. It is fully compatible with SpecFlow and other .NET test frameworks, and it provides rich interactions for Selenium WebDriver and RestSharp out of the box.

Boa Constrictor, the .NET Screenplay Pattern.

Scaling massively with Selenium Grid

When the PrecisionLender team first started automating Boa tests, they ran tests one at a time. That soon became too slow since the average Boa test took 20 to 50 seconds to complete. The team then started running up to 3 tests in parallel on one machine, but that also was not fast enough. They turned to Selenium Grid, a tool for running WebDriver sessions remotely across multiple machines.

PrecisionLender built a set of internal Selenium Grid instances using Microsoft Azure virtual machines to run Boa tests at high scale. As of July 2021, PrecisionLender has over 1800 unique Boa tests that run across four distinct product configurations. Whenever TeamCity detects a code change, it triggers a “continuous” Boa test suite of over 1000 tests that runs 50 tests in parallel using Google Chrome on Selenium Grid and completes in about 10 minutes. TeamCity launches the full test suite every night against all product configurations with 64-100 parallel tests on Selenium Grid. Continuous Integration currently runs up to 10K Boa tests daily against the PrecisionLender app with SpecFlow+ Runner and Selenium Grid.

The Boa test solution architecture, including Continuous Integration through TeamCity and parallel testing with SpecFlow+ Runner and Selenium Grid.
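For illustration, pointing a test at a Selenium Grid instead of a local browser only requires creating a RemoteWebDriver aimed at the Grid hub. This is a generic Selenium sketch; the hub URL below is a placeholder, not PrecisionLender's actual Grid address.

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;

public static class GridSessions
{
    public static IWebDriver StartChrome(string hubUrl)
    {
        // The RemoteWebDriver sends WebDriver commands to the Grid hub,
        // which forwards them to an available node running Chrome.
        return new RemoteWebDriver(new Uri(hubUrl), new ChromeOptions());
    }
}

// Example usage:
// IWebDriver driver = GridSessions.StartChrome("http://grid-hub.example:4444/wd/hub");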

Shifting left with BDD

Better testing and automation practices eventually inspired better development practices. Product owners would create user stories, but developers would struggle to understand requirements and business purposes fully. PrecisionLender’s SETs started bringing together the Three Amigos – business, development, and testing roles – to discuss product behaviors proactively while creating user stories. They introduced Behavior-Driven Development (BDD) activities like Example Mapping to explore behaviors together. Then, well-defined stories could be easily connected to SpecFlow tests written in Gherkin following Specification by Example (SBE). Teams repeatedly saved time by thinking before coding and specifying before testing. They built higher quality into features from the beginning, and they stopped half-baked stories with unjustified value propositions before development work began. Developers who participated in these behavior-driven practices were also more likely to automate Boa tests on their own. Furthermore, one of PrecisionLender’s developers loved BDD practices so much that he joined the team of SETs! Through Gherkin, SpecFlow provided a foundation that enabled quality work to shift left.

Challenges along the way

Achieving true continuous testing had its challenges along the way. Intermittent failure was the most significant issue PrecisionLender faced at scale. With so many tests, environments, and infrastructural pieces, arbitrary failures were statistically unavoidable. The PrecisionLender team took a two-pronged approach to handle intermittent failures: (1) eliminate race conditions in automation using good interactions with Boa Constrictor, and (2) use SpecFlow+ Runner to automatically retry failed tests to determine if failures were consistent or intermittent. These two approaches reduced the frequency of flaky failures and helped engineers quickly resolve any remaining issues. As a result, Boa tests enjoy well above a 99% success rate, and most failures are due to actual bugs.

PrecisionLender app performance at scale was a second big challenge. Running up to 100 tests in parallel turned functional tests into de facto load tests. Testing at scale repeatedly uncovered performance bottlenecks in the app. Performance issues caused widespread test failures that were difficult to diagnose because they appeared intermittently. Still, the visual timeline and timestamps in the SpecFlow+ Runner report helped the team identify periods of failure that could be cross-checked against backend logs, metrics, and database queries. Developers resolved many performance issues and significantly boosted the app’s response times and load capacity.

Training team members to develop solid test automation was the third challenge. At the start of the journey, test automation, Gherkin, and BDD were all new to PrecisionLender. The PrecisionLender SETs took active steps to train others on how to develop good tests and good automation through group workshops, Three Amigos meetings, and one-on-one mentoring sessions. They shared resources like the Automation Panda blog for how to write good tests and good Gherkin. The investment in education paid off: many developers have joined the SETs in writing readable, reliable Boa tests that run continuously.

Benefits to the business

Developing a continuous testing solution brought many incredible benefits to PrecisionLender. First, the quality of the PrecisionLender app improved because continuous testing provided fast feedback on failures that developers could quickly fix. Instead of relying on manual spot checks, the team could trust the comprehensive safety net of Boa tests to catch bugs. Many issues would be caught within an hour of a developer making a code commit, and the longest feedback cycle would be only one business day for the full nightly test suites to run. Boa tests catch failures before customers ever experience them. The continuous nature of testing enables PrecisionLender to publish new releases every two weeks.

Second, the high reliability of the Boa test solution means that the PrecisionLender team can trust test results. When a test passes, the behavior is working. When a test fails, there is a real bug. Reliability also means that engineers spend less time on automation maintenance and more time on more valuable activities, like developing new features and adding new tests. Quality is present in both the product code and the test code.

Third, continuous testing boosts customer confidence in PrecisionLender. Customers trust the software quality because they know that PrecisionLender thoroughly tests every release. The PrecisionLender team also shares SpecFlow+ LivingDoc reports with specific clients to prove quality.

A bright future

PrecisionLender’s continuous testing journey is not over. Since the PrecisionLender team hired its first SET, it has hired three more, in addition to a testing manager, to grow quality improvement efforts. Multiple development teams have written their own Boa tests, and they plan to write more tests independently. SpecFlow’s tools have been indispensable in helping the PrecisionLender team achieve successful quality assurance. As PrecisionLender welcomes more customers, the Boa solution will be ready to scale with more tests, more configurations, and more executions.

Should Gherkin Steps use Past, Present, or Future Tense?

Gherkin’s Given-When-Then syntax is a great structure for specifying behaviors. However, while writing Gherkin may seem easy, writing good Gherkin can be a challenge. One aspect to consider is the tense used for Gherkin steps. Should Gherkin steps use past, present, or future tense?

One approach is to use present tense for all steps, like this:

Scenario: Simple Google search
    Given the Google home page is displayed
    When the user searches for "panda"
    Then the results page shows links related to "panda"

Notice the tense of each verb:

  1. the home page is – present
  2. the user searches – present
  3. the results page shows – present

Present tense is the simplest verb tense to use. It is the least “wordy” tense, and it makes the scenario feel active.

An alternative approach is to use past-present-future tense for Given-When-Then steps respectively, like this:

Scenario: Simple Google search
    Given the Google home page was displayed
    When the user searches for "panda"
    Then the results page will show links related to "panda"

Notice the different verb tenses in this scenario:

  1. the home page was – past
  2. the user searches – present
  3. the result page will show – future

Scenarios exercise behavior. Writing When steps using present tense centers the scenario’s main actions in the present. Since Given steps must happen before the main actions, they would be written using past tense. Likewise, since Then steps represent expected outcomes after the main actions, they would be written using future tense.

Both of these approaches – using all present tense or using past-present-future in order – are good. Personally, I prefer to write all steps using present tense. It’s easier to explain to others, and it frames the full scenario in the moment. However, I don’t think other approaches are good. For example, writing all steps using past tense or future tense would seem weird, and writing steps in order of future-present-past tense would be illogical. Scenarios should be centered in the present because they should timelessly represent the behaviors they cover.

Want to learn more? Check out my other BDD articles, especially Writing Good Gherkin.

Solving: How to write good UI interaction tests? #GivenWhenThenWithStyle

Writing good Gherkin is a passion of mine. Good Gherkin means good behavior specification, which results in better features, better tests, and ultimately better software. To help folks improve their Gherkin skills, Gojko Adzic and SpecFlow are running a series of #GivenWhenThenWithStyle challenges. I love reading each new challenge, and in this article, I provide my answer to one of them.

The Challenge

Challenge 20 states:

This week, we’re looking into one of the most common pain points with Given-When-Then: writing automated tests that interact with a user interface. People new to behaviour driven development often misunderstand what kind of behaviour the specifications should describe, and they write detailed user interactions in Given-When-Then scenarios. This leads to feature files that are very easy to write, but almost impossible to understand and maintain.

Here’s a typical example:

Scenario: Signed-in users get larger capacity
 
Given a user opens https://www.example.com using Chrome
And the user clicks on "Upload Files"
And the page reloads
And the user clicks on "Spreadsheet Formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "500kb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 
And the user clicks on "Login"
And the user enters "testuser123" into the "username" field
And the user enters "$Pass123" into the "password" field
And the user clicks on "Sign in"
And the page reloads
Then the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 
And the user clicks on "spreadsheet formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "1mb-sheet.xlsx" 
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx"

A common way to avoid such issues is to rewrite the specification to avoid the user interface completely. We’ve looked into that option several times in this article series. However, that solution only applies if the risk we’re testing is not in the user interface, but somewhere below. To make this challenge more interesting, let’s say that we actually want to include the user interface in the test, since the risk is in the UI interactions.

Indeed, most behavior-driven practitioners would generally recommend against phrasing steps using language specific to the user interface. However, there are times when testing a user interface itself is valid. For example, I work at PrecisionLender, a Q2 Company, and our main web app is very heavy on the front end. It has many, many interconnected fields for pricing commercial lending opportunities. My team has quite a few tests to cover UI-centric behaviors, such as verifying that entering a new interest rate triggers recalculation for summary amounts. If the target behavior is a piece of UI functionality, and the risk it bears warrants test coverage, then so be it.

Let’s break down the example scenario given above to see how to write Gherkin with style for user interface tests.

Understanding Behavior

Behavior is behavior. If you can describe it, then you can do it. Everything exhibits behavior, from the source code itself to the API, UIs, and full end-to-end workflows. Gherkin scenarios should use verbiage that reflects the context of the target behavior. Thus, the example above uses words like “click,” “select,” and “open.” Since the scenario explicitly covers a user interface, I think it is okay to use these words here. What bothers me, however, are two apparent code smells:

  1. The wall of text
  2. Out-of-order step types

The first issue is the wall of text this scenario presents. Walls of text are hard to read because they present too much information at once. The reader must take time to read through the whole chunk. Many readers simply read the first few lines and then skip the remainder. The example scenario has 27 Given-When-Then steps. Typically, I recommend that Gherkin scenarios have single-digit step counts. A scenario with fewer than 10 steps is easier to understand and less likely to include unnecessary information. Longer scenarios are not necessarily “wrong,” but their length suggests that, perhaps, they could be rewritten more concisely.

The second issue in the example scenario is that step types are out of order. Given-When-Then is a formula for success. Gherkin steps should follow strict Given → When → Then ordering because this ordering demarcates individual behaviors. Each Gherkin scenario should cover one individual behavior so that the target behavior is easier to understand, easier to communicate, and easier to investigate whenever the scenario fails during testing. When scenarios break the order of steps, such as Given → Then → Given → Then in the example scenario, it shows that either the scenario covers multiple behaviors or that the author did not bring a behavior-driven understanding to the scenario.

The rules of good behavior don’t disappear when the type of target behavior changes. We should still write Gherkin with best practices in mind, even if our scenarios cover user interfaces.

Breaking Down Scenarios

If I were to rewrite the example scenario, I would start by isolating individual behaviors. Let’s look at the first half of the original example:

Given a user opens https://www.example.com using Chrome
And the user clicks on "Upload Files"
And the page reloads
And the user clicks on "Spreadsheet Formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "500kb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx"

Here, I see four distinct behaviors covered:

  1. Clicking “Upload Files” reloads the page.
  2. Clicking “Spreadsheet Formats” displays new buttons.
  3. Uploading a spreadsheet file makes the filename appear on the page.
  4. Attempting to upload a spreadsheet file that is 1MB or larger fails.

If I wanted to purely retain the same coverage, then I would rewrite these behavior specs using the following scenarios:

Feature: Example site
 
 
Scenario: Choose to upload files
 
Given the Example site is displayed
When the user clicks the "Upload Files" link
Then the page displays the "Spreadsheet Formats" link
 
 
Scenario: Choose to upload spreadsheets
 
Given the Example site is ready to upload files
When the user clicks the "Spreadsheet Formats" link
Then the page displays the "XLS" and "XLSX" buttons
 
 
Scenario: Upload a spreadsheet file that is smaller than 1MB
 
Given the Example site is ready to upload spreadsheet files
When the user clicks the "XLSX" button
And the user selects "500kb-sheet.xlsx" from the file upload dialog
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
 
 
Scenario: Upload a spreadsheet file that is larger than or equal to 1MB
 
Given the Example site is ready to upload spreadsheet files
When the user clicks the "XLSX" button
And the user selects "1mb-sheet.xlsx" from the file upload dialog
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx"

Now, each scenario covers each individual behavior. The first scenario starts with the Example site in a “blank” state: “Given the Example site is displayed”. The second scenario inherently depends upon the outcome of the first scenario. Rather than repeat all the steps from the first scenario, I wrote a new starting step to establish the initial state more declaratively: “Given the Example site is ready to upload files”. This step’s definition method may need to rerun the same operations as the first scenario, but it guarantees independence between scenarios. (The step could also optimize the operations, but that should be a topic for another challenge.) Likewise, the third and fourth scenarios have a Given step to establish the state they need: “Given the Example site is ready to upload spreadsheet files.” Both scenarios can share the same Given step because they have the same starting point. All three of these new steps are descriptive more than prescriptive. They declaratively establish an initial state, and they leave the details to the automation code in the step definition methods to determine precisely how that state is established. This technique makes it easy for Gherkin scenarios to be individually clear and independently executable.
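If these scenarios were automated with SpecFlow, the declarative Given steps could be implemented along the lines of the sketch below, where each step definition reuses the same lower-level operations as setup. The ExampleSitePages class and its methods are hypothetical page objects, not code from a real project.

using TechTalk.SpecFlow;

[Binding]
public class ExampleSiteSetupSteps
{
    // Hypothetical bundle of page objects injected by SpecFlow.
    private readonly ExampleSitePages _pages;

    public ExampleSiteSetupSteps(ExampleSitePages pages) => _pages = pages;

    [Given(@"the Example site is displayed")]
    public void GivenTheExampleSiteIsDisplayed() =>
        _pages.Home.Load();

    [Given(@"the Example site is ready to upload files")]
    public void GivenTheExampleSiteIsReadyToUploadFiles()
    {
        // Rerun the operations the first scenario exercises,
        // but as setup rather than as the behavior under test.
        _pages.Home.Load();
        _pages.Home.ClickUploadFilesLink();
    }

    [Given(@"the Example site is ready to upload spreadsheet files")]
    public void GivenTheExampleSiteIsReadyToUploadSpreadsheetFiles()
    {
        _pages.Home.Load();
        _pages.Home.ClickUploadFilesLink();
        _pages.Upload.ClickSpreadsheetFormatsLink();
    }
}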

I also added my own writing style to these scenarios. First, I wrote concise, declarative titles for each scenario. The titles emphasize interaction over mechanics. For example, the first scenario’s title uses the word “choose” rather than “click” because, from the user’s perspective, they are “choosing” an action to take. The user will just happen to mechanically “click” a link in the process of making their choice. The titles also provide a level of example. Note that the third and fourth scenarios spell out the target file sizes. For brevity, I typically write scenario titles using active voice: “Choose this,” “Upload that,” or “Do something.” I try to avoid including verification language in titles unless it is necessary to distinguish behaviors.

Another stylistic element of mine was to remove explicit details about the environment. Instead of hard coding the website URL, I gave the site a proper name: “Example site.” I also removed the mention of Chrome as the browser. These details are environment-specific, and they should not be specified in Gherkin. In theory, this site could have multiple instances (like an alpha or a beta), and it should probably run in any major browser (like Firefox and Edge). Environmental characteristics should be specified as inputs to the automation code instead.

I also refined some of the language used in the When and Then steps. When I must write steps for mechanical actions like clicks, I like to specify element types for target elements. For example, “When the user clicks the “Upload Files” link” specifies a link by a parameterized name. Saying the element is a link helps provide context to the reader about the user interface. I wrote other steps that specify a button, too. These steps also specify the element name as a parameter so that the step definition method could possibly perform the same interaction for different elements. Keep in mind, however, that these linguistic changes are neither “required” nor “perfect.” They make sense in the immediate context of this feature. While automating step definitions or writing more scenarios, I may revisit the verbiage and do some refactoring.

Determining Value for Each Behavior

The four new scenarios I wrote each covers an independent, individual behavior of the fictitious Example site’s user interface. They are thorough in their level of coverage for these small behaviors. However, not all behaviors may be equally important to cover. Some behaviors are simply more important than others, and thus some tests are more valuable than others. I won’t go into deep detail about how to measure risk and determine value for different tests in this article, but I will offer some suggestions regarding these example scenarios.

First and foremost, you as the tester must determine what is worth testing. These scenarios aptly specify behavior, and they will likely be very useful for collaborating with the Three Amigos, but not every scenario needs to be automated for testing. You as the tester must decide. You may decide that all four of these example scenarios are valuable and should be added to the automated test suite. That’s a fine decision. However, you may instead decide that certain user interface mechanics are not worth explicitly testing. That’s also a fine decision.

In my opinion, the first two scenarios could be candidates for the chopping block:

  1. Choose to upload files
  2. Choose to upload spreadsheets

Even though these are existing behaviors in the Example site, they are tiny. The tests simply verify that certain clicks make certain links or buttons appear. It would be nice to verify them, but test execution time is finite, and user interface tests are notoriously slow compared to other tests. Consider the Rule of 1’s: typically, by orders of magnitude, a unit test takes about 1 millisecond, a service API test takes about 1 second, and a web UI test takes about 1 minute. Furthermore, these behaviors are implicitly exercised by the other scenarios, even if they don’t have explicit assertions.

One way to condense the scenarios could be like this:

Feature: Example site
 
 
Background:
 
Given the Example site is displayed
When the user clicks the "Upload Files" link
And the user clicks the "Spreadsheet Formats" link
And the user clicks the "XLSX" button
 
 
Scenario: Upload a spreadsheet file that is smaller than 1MB
 
When the user selects "500kb-sheet.xlsx" from the file upload dialog
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
 
 
Scenario: Upload a spreadsheet file that is larger than or equal to 1MB
 
When the user selects "1mb-sheet.xlsx" from the file upload dialog
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 

This new feature file eliminates the first two scenarios and uses a Background section to cover the setup steps. It also eliminates the need for special Given steps in each scenario to set unique starting points. Implicitly, if the “Upload Files” or “Spreadsheet Formats” links fail to display the expected elements, then those steps would fail.

Again, this modification is not necessarily the “best” way or the “right” way to cover the desired behaviors, but it is a reasonably good way to do so. However, I would assert that both the 4-scenario feature file and the 2-scenario feature file are much better approaches than the original example scenario.

More Gherkin

What I showed in my answer to this Gherkin challenge is how I would handle UI-centric behaviors. I try to keep my Gherkin scenarios concise and focused on individual, independent behaviors. Try using these style techniques to rewrite the second half of Gojko’s original scenario. Feel free to drop your Gherkin in the comments below. I look forward to seeing how y’all write #GivenWhenThenWithStyle!

Using Domain-Specific Languages for Security Testing

I love programming languages. They have fascinated me ever since I first learned to program my TI-83 Plus calculator in ninth grade, many years ago. When I studied computer science in college, I learned how parsers, interpreters, and compilers work. During my internships at IBM, I worked on a language named Enterprise Generation Language as both a tester and a developer. At NetApp, I even developed my own language named DS for test automation. Languages are so much fun to learn, build, and extend.

Today, even though I do not actively work on compilers, I still do some pretty interesting things with languages and testing. I strongly advocate for Behavior-Driven Development and its domain-specific language (DSL) Gherkin. In fact, as I wrote in my article Behavior-Driven Blasphemy, I support using Gherkin-based BDD test frameworks for test automation even if a team is not also doing BDD’s collaborative activities. Why? Gherkin is the world’s first major off-the-shelf DSL for test automation, and it doesn’t require the average tester to know the complexities of compiler theory. DSLs like Gherkin can make tests easier to read, faster to write, and more reliable to run. They provide a healthy separation of concerns between test cases and test code. After working on successful large-scale test automation projects with C# and SpecFlow, I don’t think I could go back to traditional test frameworks.

I’m not the only one who thinks this way. Here’s a tweet from Dinis Cruz, CTO and CISO at Glasswall, after he read one of my articles:

Dinis then tweeted at me to invite me to speak about using DSLs for testing at the Open Security Summit in 2021:

Now, I’m not a “security guy” at all, but I do know a thing or two about DSLs and testing. So, I gladly accepted the invitation to speak! I delivered my talk, “Using DSLs for Security Testing” virtually on Thursday, January 14, 2021 at 10am US Eastern. I also uploaded my slides to GitHub at AndyLPK247/using-dsls-for-security-testing. Check out the YouTube recording here:

This talk was not meant to be a technical demo or tutorial. Instead, it was meant to be a “think big” proposal. The main question I raised was, “How can we use DSLs for security testing?” I used my own story to illustrate the value languages deliver, particularly for testing. My call to action breaks that question down into three parts:

  1. Can DSLs make security testing easier to do and thereby more widely practiced?
  2. Is Gherkin good enough for security testing, or do we need to make a DSL specific to security?
  3. Would it be possible to write a set of “standard” or “universal” security tests using a DSL that anyone could either run directly or use as a template?

My goal for this talk was to spark a conversation about DSLs and security testing. Immediately after my talk, Luis Saiz shared two projects he’s working on regarding DSLs and security: SUSTO and Mist. Dinis also invited me back for a session at the Open Security Summit Mini Summit in February to have a follow-up roundtable discussion for my talk. I can’t wait to explore this idea further. It’s an exciting new space for me.

If this topic sparks your interest, be sure to watch my talk recording, and then join us live in February 2021 for the next Open Security Summit event. Virtual sessions are free to join. Many thanks again to Dinis and the whole team behind Open Security Summit for inviting me to speak and organizing the events.

SpecFlow’s Online Gherkin Editor

Finding a good Gherkin editor is difficult. Some editors like Visual Studio Code and similar IDEs work great for engineers but aren’t suitable for product owners and non-programmer Amigos who want to contribute. Other editors like Notepad++ and Atom are lighter in weight but still require extensions and a little expertise. Fancy BDD tools like CucumberStudio and Cucumber for Jira provide Gherkin editors together with a bunch of other nifty features, but they require paid licenses.

For years, I’ve wanted a lightweight Gherkin editor that’s easy to use and accessible to all. Now, one finally exists: the Online Gherkin Editor by SpecFlow!

SpecFlow is the most popular BDD test automation framework for .NET. It’s also my favorite BDD framework. Over the past few years, I’ve built two large-scale test automation solutions with SpecFlow.

The Online Gherkin Editor by SpecFlow is just an editor on a web page. When you first load the page, the editor has example scenarios for you to reference. You can type your own Gherkin into the text area, and the editor highlights it for you. The editor provides line numbers and visual scrolling, too. My language is English, but if you happen to speak German, French, Spanish, or Dutch, then you can change the language setting via a dropdown. Once you’re done writing your Gherkin, you can clear it, copy it to the clipboard, or download it as a feature file using icons in the top-right corner. Be warned, though, that this editor won’t save your Gherkin in the cloud.

If you want to give this new editor a try, here’s the link: https://specflow.org/gherkin-editor/

You can also read SpecFlow’s official announcement here: https://specflow.org/blog/introducing-the-specflow-online-gherkin-editor/

Thanks, SpecFlow! Happy “Gherk-ing”!

Introducing Boa Constrictor: The .NET Screenplay Pattern

Today, I’m excited to announce the release of a new open source project for test automation: Boa Constrictor, the .NET Screenplay Pattern!

The Screenplay Pattern helps you make better interactions for better automation. The pattern can be summarized in one line: Actors use Abilities to perform Interactions.

  • Actors initiate Interactions. Every test has an Actor.
  • Abilities enable Actors to perform Interactions. They hold objects that Interactions need, like WebDrivers or REST API clients.
  • Interactions exercise behaviors under test. They could be clicks, requests, commands, and anything else.

This separation of concerns makes Screenplay code very reusable and scalable, much more so than traditional page objects. Check it out, here’s a C# script to test a search engine:

// Create the Actor
IActor actor = new Actor(logger: new ConsoleLogger());

// Add an Ability to use a WebDriver
actor.Can(BrowseTheWeb.With(new ChromeDriver()));

// Load the search engine
actor.AttemptsTo(Navigate.ToUrl(SearchPage.Url));

// Get the page's title
string title = actor.AsksFor(Title.OfPage());

// Search for something
actor.AttemptsTo(Search.For("panda"));

// Wait for results
actor.AttemptsTo(Wait.Until(
    Appearance.Of(ResultPage.ResultLinks),
    IsEqualTo.True()));

Boa Constrictor provides many interactions for Selenium WebDriver and RestSharp out of the box, like Navigate, Title, and Appearance shown above. It also lets you compose interactions together, like how Search is a composition of typing and clicking.
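As a rough sketch, a composed interaction might look like the following task, whose PerformAs method simply delegates to lower-level interactions. The interaction names (SendKeys, Click) and the SearchPage locators are illustrative assumptions rather than a verbatim copy of the Boa Constrictor API.

public class Search : ITask
{
    private readonly string _phrase;

    private Search(string phrase) => _phrase = phrase;

    // Builder method keeps call sites fluent: actor.AttemptsTo(Search.For("panda"));
    public static Search For(string phrase) => new Search(phrase);

    public void PerformAs(IActor actor)
    {
        // A composed Task delegates to lower-level interactions.
        actor.AttemptsTo(SendKeys.To(SearchPage.SearchInput, _phrase));
        actor.AttemptsTo(Click.On(SearchPage.SearchButton));
    }
}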

Over the past two years, my team and I at PrecisionLender, a Q2 Company, developed Boa Constrictor internally as the cornerstone of Boa, our comprehensive end-to-end test automation solution. We were inspired by Serenity BDD‘s Screenplay implementation. After battle-hardening Boa Constrictor with thousands of automated tests, we are releasing it publicly as an open source project. Our goal is to help everyone make better interactions for better test automation.

If you’d like to give Boa Constrictor a try, then start with the tutorial. You’ll implement that search engine test from above in full. Then, once you’re ready to use it for some serious test automation, add the Boa.Constrictor NuGet package to your .NET project and go!

You can view the full source code on GitHub at q2ebanking/boa-constrictor. Check out the repository for full information. In the coming weeks, we’ll be developing more content and code. Since Boa Constrictor is open source, we’d love for you to contribute to the project, too!

Test-Driving TestProject’s New Python SDK

TestProject recently released its new OpenSDK, and one of its major features is the inclusion of Python testing support! Since I love using Python for test automation, I couldn’t wait to give it a try. This article is my crash-course tutorial on writing Web UI tests in Python with TestProject.

What is TestProject?

TestProject is a free end-to-end test automation platform for Web, mobile, and API tests. It provides a cloud-based way for teams to build, run, share, and analyze tests. Manual testers can visually build tests for desktop or mobile sites using TestProject’s in-browser recorder and test builder. Automation engineers can use TestProject’s SDKs in Java, C#, and now Python for developing coded test automation solutions, and they can use packages already developed by others in the community through TestProject’s add-ons. Whether manual or automated, TestProject displays all test results in a sleek reporting dashboard with helpful analytics. And all of these features are legitimately free – there’s no tiered model or service plan.

Recently, TestProject announced the official release of its new OpenSDK. This new SDK (“software development kit”) provides a simple, unified interface for running tests with TestProject across multiple platforms and languages (now including Python). Things look exciting for the future of TestProject!

What’s My Interest?

It’s no secret that I love testing with Python. When I heard that TestProject added Python support, I knew I had to give it a try. I had never used TestProject before, but I was interested to learn what it could do. Specifically, I wanted to see the value it could bring to reporting automated tests. In the Python space, test automation is slick, but reporting can be rough since frameworks like pytest and unittest are command-line-focused. I also wanted to see if TestProject’s SDK would genuinely help me automate tests or if it would get in my way. Furthermore, I know some great people in the TestProject community, so I figured it was time to jump in myself!

The Python SDK

TestProject’s Python SDK is an open-source project. It was originally developed by Bas Dijkstra, with the support of the TestProject team, and its code is hosted on GitHub. The Python SDK supports Selenium for Web UI automation (which will be the focus for this tutorial) and Appium for Android and iOS UI automation as well!

Since I’d never used TestProject before, let alone this new Python SDK, I wanted to review the code to see how to use it. Thankfully, the README included lots of helpful information and example code. When I looked at the code for TestProject’s BaseDriver, I discovered that it simply extends Selenium WebDriver’s RemoteDriver class. That means all the TestProject WebDrivers use exactly the same API as Python’s Selenium WebDriver implementation. To me, that was a big relief. I know WebDriver’s API very well, so I wouldn’t need to learn anything different in order to use TestProject. It also means that any test automation project can be easily retrofitted to use TestProject’s SDKs – they just need to swap in a new WebDriver object!

Setup Steps

TestProject has a straightforward architecture. Users sign up for free TestProject accounts online. Then, they set up their own machines for running tests. Each testing machine must have the TestProject agent installed and linked to a user’s account. When tests run, agents automatically push results to the TestProject cloud. Users can then log into the TestProject portal to view and analyze results. They can invite team mates to share results, and they can also set up multiple test machines with agents. Users can even integrate TestProject with other tools like Jenkins, qTest, and Sauce Labs. The TestProject docs, especially the ecosystem diagram, explain everything in more detail.

When I did my test drive, I created a TestProject account, installed the agent on my Mac, and ran Python Web UI tests from my Mac. I already had the latest version of Python installed (Python 3.8 at the time of writing this article). I also already had my target browsers installed: Google Chrome and Mozilla Firefox.

Below are the precise steps I followed to set up TestProject:

1. Sign up for an account

TestProject accounts are “free forever.” Use this signup link.

The TestProject signup page

2. Download the TestProject Agent

The signup wizard should direct you to download the TestProject agent. If not, you can always download it from the TestProject dashboard. Be warned, the download package is pretty large – the macOS package was 345 MB. Alternatively, you can fetch the agent as a container image from Docker Hub.

The TestProject agent download page

The TestProject agent contains all the stuff needed to run tests and upload results to the TestProject app in the cloud. You don’t need to install WebDriver executables like ChromeDriver or geckodriver. Once the agent is downloaded, install it on the machine and register the agent with your account. For me, registration happened automatically.

This is what the TestProject agent looks like when running on macOS. You can also close this window to let it run in the background.

3. Find your developer token

You’ll need to use your developer token to connect your automated tests to your account in the TestProject app. The signup wizard should reveal it to you, but you can always find it (and also reset it) on the Integrations page.

The Integrations page. Check here for your developer token. No, you can’t use mine.

4. Install the Python SDK

TestProject’s Python SDK is distributed as a package through PyPI. To install it, simply run pip install testproject-python-sdk at the command line. This package will also install dependencies like selenium and requests.

A Classic Web UI Test

After setting up my Mac to use TestProject, it was time to write some Web UI tests in Python! Since I discovered that TestProject’s WebDriver objects could easily retrofit any existing test automation project, I thought, “What if I try to run my PyCon 2020 tutorial project with TestProject?” For PyCon 2020, I gave an online tutorial about building a Web UI test automation project in Python from the ground up using pytest and Selenium WebDriver. The tutorial includes one test case: a DuckDuckGo web search and verification. I thought it would be easy to integrate with TestProject since I already had the code. Thankfully, it was!

Below, I’ll walk through my code. You can check out my example project repository from GitHub at AndyLPK247/testproject-python-sdk-example. My code will be a bit more advanced than the examples shown in the Python SDK’s README or in Bas Dijkstra’s tutorial article because it uses the Page Object Model and pytest fixtures. Make sure to pip install pytest, too.

1. Write the test steps

The test case covers a simple DuckDuckGo web search. Whenever I automate tests, I always write out the steps in plain language. Good tests follow the Arrange-Act-Assert pattern, and I like to use Gherkin’s Given-When-Then phrasing. Here’s the test case:

Scenario: Basic DuckDuckGo Web Search
    Given the DuckDuckGo home page is displayed
    When the user searches for "panda"
    Then the search result query is "panda"
    And the search result links pertain to "panda"
    And the search result title contains "panda"

2. Specify automation inputs

Inputs configure how automated tests run. They can be passed into a test automation solution using configuration files. Testers can then easily change input values in the config file without changing code. Automation should read config files once at the start of testing and inject necessary inputs into every test case.

In Python, I like to use JSON for config files. JSON data is simple and hierarchical, and Python includes a module in its standard library named json that can parse a JSON file into a Python dictionary in one line. I also like to put config files either in the project root directory or in the tests directory.

Here’s the contents of config.json for this test:

{
  "browser": "Chrome",
  "implicit_wait": 10,
  "testproject_projectname": "TestProject Python SDK Example",
  "testproject_token": ""
}
  • browser is the name of the browser to test
  • implicit_wait is the implicit waiting timeout for the WebDriver instance
  • testproject_projectname is the project name to use for this test suite in the TestProject app
  • testproject_token is the developer token

3. Read automation inputs

Automation code should read inputs one time before any tests run and then inject inputs into appropriate tests. pytest fixtures make this easy to do.

I created a fixture named config in the tests/conftest.py module to read config.json:

import json
import pytest


@pytest.fixture(scope='session')
def config():

  # Read the file
  with open('config.json') as config_file:
    config = json.load(config_file)
  
  # Assert values are acceptable
  assert config['browser'] in ['Firefox', 'Chrome', 'Headless Chrome']
  assert isinstance(config['implicit_wait'], int)
  assert config['implicit_wait'] > 0
  assert config['testproject_projectname'] != ""
  assert config['testproject_token'] != ""

  # Return config so it can be used
  return config

Setting the fixture’s scope to “session” means that it will run only one time for the whole test suite. The fixture reads the JSON config file, parses its text into a Python dictionary, and performs basic input validation. Note that Firefox, Chrome, and Headless Chrome will be supported browsers.

4. Set up WebDriver

Each Web UI test should have its own WebDriver instance so that it remains independent from other tests. Once again, pytest fixtures make setup easy.

The browser fixture in tests/conftest.py initializes the appropriate TestProject WebDriver type based on inputs returned by the config fixture:

from selenium.webdriver import ChromeOptions
from src.testproject.sdk.drivers import webdriver


@pytest.fixture
def browser(config):

  # Initialize shared arguments
  kwargs = {
    'projectname': config['testproject_projectname'],
    'token': config['testproject_token']
  }

  # Initialize the TestProject WebDriver instance
  if config['browser'] == 'Firefox':
    b = webdriver.Firefox(**kwargs)
  elif config['browser'] == 'Chrome':
    b = webdriver.Chrome(**kwargs)
  elif config['browser'] == 'Headless Chrome':
    opts = ChromeOptions()
    opts.add_argument('headless')
    b = webdriver.Chrome(chrome_options=opts, **kwargs)
  else:
    raise Exception(f'Browser "{config["browser"]}" is not supported')

  # Make its calls wait for elements to appear
  b.implicitly_wait(config['implicit_wait'])

  # Return the WebDriver instance for the setup
  yield b

  # Quit the WebDriver instance for the cleanup
  b.quit()

This was the only section of code I needed to change to make my PyCon 2020 tutorial project work with TestProject. I had to change the WebDriver invocations to use the TestProject classes. I also had to add arguments for the project name and developer token, which come from the config file. (Note: you may alternatively set the developer token as an environment variable.)

5. Create page objects

Automated tests could make direct calls to the WebDriver interface to interact with the browser, but WebDriver calls are typically low-level and wordy. The Page Object Model is a much better design pattern. Page object classes encapsulate WebDriver gorp so that tests can call simpler, more readable methods.

The DuckDuckGo search test interacts with two pages: the search page and the result page. The pages package contains a module for each page. Here’s pages/search.py:

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys


class DuckDuckGoSearchPage:

  URL = 'https://www.duckduckgo.com'

  SEARCH_INPUT = (By.ID, 'search_form_input_homepage')

  def __init__(self, browser):
    self.browser = browser

  def load(self):
    self.browser.get(self.URL)

  def search(self, phrase):
    search_input = self.browser.find_element(*self.SEARCH_INPUT)
    search_input.send_keys(phrase + Keys.RETURN)

And here’s pages/result.py:

from selenium.webdriver.common.by import By

class DuckDuckGoResultPage:
  
  RESULT_LINKS = (By.CSS_SELECTOR, 'a.result__a')
  SEARCH_INPUT = (By.ID, 'search_form_input')

  def __init__(self, browser):
    self.browser = browser

  def result_link_titles(self):
    links = self.browser.find_elements(*self.RESULT_LINKS)
    titles = [link.text for link in links]
    return titles
  
  def search_input_value(self):
    search_input = self.browser.find_element(*self.SEARCH_INPUT)
    value = search_input.get_attribute('value')
    return value

  def title(self):
    return self.browser.title

Notice that this code uses the “regular” WebDriver interface because TestProject’s WebDriver classes extend the Selenium WebDriver classes.

To make setup easier, I added fixtures to tests/conftest.py to construct each page object, too. They call the browser fixture and inject the WebDriver instance into each page object:

from pages.result import DuckDuckGoResultPage
from pages.search import DuckDuckGoSearchPage


@pytest.fixture
def search_page(browser):
  return DuckDuckGoSearchPage(browser)


@pytest.fixture
def result_page(browser):
  return DuckDuckGoResultPage(browser)

6. Automate the test case

All the automation plumbing is finally in place. Here’s the test case in tests/traditional/test_duckduckgo.py:

import pytest


@pytest.mark.parametrize('phrase', ['panda', 'python', 'polar bear'])
def test_basic_duckduckgo_search(search_page, result_page, phrase):
  
  # Given the DuckDuckGo home page is displayed
  search_page.load()

  # When the user searches for the phrase
  search_page.search(phrase)

  # Then the search result query is the phrase
  assert phrase == result_page.search_input_value()
  
  # And the search result links pertain to the phrase
  titles = result_page.result_link_titles()
  matches = [t for t in titles if phrase.lower() in t.lower()]
  assert len(matches) > 0

  # And the search result title contains the phrase
  assert phrase in result_page.title()

I parametrized the test to run it for three different phrases. The test function does not interact with the WebDriver instance directly. Instead, it interacts exclusively with the page objects.

7. Run the tests

The tests run like any other pytest tests: python -m pytest at the command line. If everything is set up correctly, then the tests will run successfully and upload results to the TestProject app.

In the TestProject dashboard, the Reports tab shows all the tests you have run. It also shows the different test projects you have.

Check out those results!

You can also drill into results for individual test case runs. TestProject automatically records the browser type, timestamps, pass-or-fail results, and every WebDriver call. You can also download PDF reports!

Results for an individual test

What if … BDD?

I was delighted to see how easily I could run a traditional pytest suite using TestProject. Then, I thought to myself, “What if I could use a BDD test framework?” I personally love Behavior-Driven Development, and Python has multiple BDD test frameworks. There is no reason why a BDD test framework wouldn’t work with TestProject!

So, I rewrote the DuckDuckGo search test as a feature file with step definitions using pytest-bdd. The BDD-style test uses the same fixtures and page objects as the traditional test.

Here’s the Gherkin scenario in tests/bdd/features/duckduckgo.feature:

Feature: DuckDuckGo
  As a Web surfer,
  I want to search for websites using plain-language phrases,
  So that I can learn more about the world around me.


  Scenario Outline: Basic DuckDuckGo Web Search
    Given the DuckDuckGo home page is displayed
    When the user searches for "<phrase>"
    Then the search result query is "<phrase>"
    And the search result links pertain to "<phrase>"
    And the search result title contains "<phrase>"

    Examples:
      | phrase     |
      | panda      |
      | python     |
      | polar bear |

And here’s the step definition module in tests/bdd/step_defs/test_duckduckgo_bdd.py:

from pytest_bdd import scenarios, given, when, then, parsers
from selenium.webdriver.common.keys import Keys


scenarios('../features/duckduckgo.feature')


@given('the DuckDuckGo home page is displayed')
def load_duckduckgo(search_page):
  search_page.load()


@when(parsers.parse('the user searches for "{phrase}"'))
@when('the user searches for "<phrase>"')
def search_phrase(search_page, phrase):
  search_page.search(phrase)


@then(parsers.parse('the search result query is "{phrase}"'))
@then('the search result query is "<phrase>"')
def check_search_result_query(result_page, phrase):
  assert phrase == result_page.search_input_value()


@then(parsers.parse('the search result links pertain to "{phrase}"'))
@then('the search result links pertain to "<phrase>"')
def check_search_result_links(result_page, phrase):
  titles = result_page.result_link_titles()
  matches = [t for t in titles if phrase.lower() in t.lower()]
  assert len(matches) > 0


@then(parsers.parse('the search result title contains "{phrase}"'))
@then('the search result title contains "<phrase>"')
def check_search_result_title(result_page, phrase):
  assert phrase in result_page.title()

There’s one more nifty trick I added with pytest-bdd. I added a hook to report each Gherkin step to TestProject with a screenshot! That way, testers can trace each test case step more easily in the TestProject reports. Capturing screenshots also greatly assists test triage when failures arise. This hook is located in tests/conftest.py:

def pytest_bdd_after_step(request, feature, scenario, step, step_func):
  browser = request.getfixturevalue('browser')
  browser.report().step(description=str(step), message=str(step), passed=True, screenshot=True)

Since pytest-bdd is just a pytest plugin, its tests run using the same python -m pytest command. TestProject will group these test results into the same project as before, but it will separate the traditional tests from the BDD tests by name. Here’s what the Gherkin steps with screenshots look like:

Custom Gherkin step with screenshot reported in the TestProject app

This is Awesome!

As its name denotes, TestProject is a great platform for handling project-level concerns for testing work: reporting, integrations, and fast feedback. Adding TestProject to an existing automation solution feels seamless, and its sleek user experience gives me what I need as a tester without getting in my way. The one word that keeps coming to mind is “simple” – TestProject simplifies setup and sharing. Its design takes to heart the renowned Python adage, “Simple is better than complex.” As such, TestProject’s new Python SDK is a welcome addition to the Python testing ecosystem.

I look forward to exploring Python support for mobile testing with Appium soon. I also look forward to seeing all the new Python add-ons the community will develop.

Using Multiple Test Frameworks Simultaneously

Someone recently asked me the following question, which I’ve paraphrased for better context:

Is it good practice to use multiple test frameworks simultaneously? For example, I’m working on a Python project. I want to do BDD with behave for feature testing, but pytest would be better for unit testing. Can I use both? If so, how should I structure my project(s)?

The short answer: Yes, you should use the right frameworks for the right needs. Using more than one test framework is typically not difficult to set up. Let’s dig into this.

The F-word

I despise the F-word – “framework.” Within the test automation space, people use the word “framework” to refer to different things. Is the “framework” only the test package like pytest or JUnit? Does it include the tests themselves? Does it refer to automation tools like Selenium WebDriver?

For clarity, I prefer to use two different terms: “framework” and “solution.” A test framework is a software package that lets programmers write tests as methods or functions, run the tests, and report the results. A test solution is a software implementation for a testing problem. It typically includes frameworks, tools, and test cases. To me, a framework is narrow, but a solution is comprehensive.

The original question used the word “framework,” but I think it should be answered in terms of solutions. There are two potential solutions at hand: one for unit tests written with pytest, and another for feature tests written with behave.

One Size Does Not Fit All

Always use the right tools or frameworks for the right needs. Unit tests and feature tests are fundamentally different. Unit tests directly access internal functions and methods in product code, whereas feature tests interact with live versions of the product as an external user or caller. Thus, they need different kinds of testing solutions, which most likely will require different tools and frameworks.

For example, behave is a BDD framework for Python. Programmers write test cases in plain-language Gherkin with step definitions as Python functions. Gherkin test cases are intuitively readable and understandable, which makes them great for testing high-level behaviors like interacting with a Web page. However, BDD frameworks add complexity that hampers unit test development. Unit tests are inherently “code-y” and low-level because they directly call product code. The pytest framework would be a better choice for unit testing. Conversely, feature tests could be written using raw pytest, but behave provides a more natural structure for describing features. Hence, separate solutions for different test types would be ideal.
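
To make the contrast concrete, here is a minimal sketch (the product function, page object, and step wording are all hypothetical): a pytest unit test calls product code directly, while a behave step definition drives the live product as a user would.

# test_pricing.py - a pytest unit test calls product code directly
from pricing import calculate_rate          # hypothetical product module

def test_calculate_rate_for_standard_loan():
  assert calculate_rate(principal=100000, term_months=60) > 0


# steps/pricing_steps.py - a behave step definition automates one Gherkin step
from behave import when

@when('the banker prices a standard loan')
def step_price_standard_loan(context):
  context.pricing_page.price_loan('standard')   # hypothetical page object call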

Same or Separate Repositories?

If more than one test solution is appropriate for a given software project, the next question is where to put the test code. Should all test code go into the same repository as the product code, or should they go into separate repositories? Unfortunately, there is no universally correct answer. Here are some factors to consider.

Unit tests should always be located in the same repository as the product code they test. Unit tests directly depend upon the product code. They must be written in the same language. Any time the product code is refactored, unit tests must be updated.

Feature tests can be placed in the same repository or a separate repository. I recommend putting feature tests in the same repository as product code if feature tests are written in the same language as the product code and if all the product code under test is located in the same repository. That way, tests are version-controlled together with the product under test. Otherwise, I recommend putting feature tests in their own separate repository. Mixed language repositories can be confusing to maintain, and version control must be handled differently with multi-repository products.

Same Repository Structure

One test solution in one repository is easy to set up, but multiple test solutions in one repository can be tricky. Thankfully, it’s not impossible. Project structure ultimately depends upon the language. Regardless of language, I recommend separating concerns. A repository should have clearly separate spaces (e.g., subdirectories) for product code and test code. Test code should be further divided by test types and then coverage areas. Testers should be able to run specific tests using convenient filters.

Here are ways to handle multiple test solutions in a few different languages:

  • In Python, project structure is fairly flexible. Conventionally, all tests belong under a top-level directory named “tests.” Subdirectories may be added thereunder, such as “unit” and “feature”. Frameworks like pytest and behave can take search paths so they run the proper tests. Furthermore, if using pytest-bdd instead of behave, pytest can use markers/tags instead of search paths for filtering tests (see the sketch after this list).
  • In .NET (like C#), the terms “project” and “solution” have special meanings. A .NET project is a collection of code that is built into one artifact. A .NET solution is a collection of projects that interrelate. Typically, the best practice in .NET would be to create a separate project for each test type/suite within the same .NET solution. I have personally set up a .NET solution that included separate projects for NUnit unit tests and SpecFlow feature tests.
  • In Java, project structure depends upon the project’s build automation tool. Most Java projects seem to use Maven or Gradle. In Maven’s Standard Directory Layout, tests belong under “src/test”. Different test types can be placed under separate packages there. The POM might need some extra configuration to run tests at different build phases.
  • In JavaScript, test placement depends heavily upon the project type. For example, Angular creates separate directories for unit tests using Jasmine and end-to-end tests using Protractor when initializing a new project.
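
Here is a rough sketch of the Python case (the directory names and the “feature” marker are hypothetical choices, not requirements): with pytest-bdd, feature tests are ordinary pytest tests, so they can be selected by search path or by marker.

# tests/feature/step_defs/test_login_bdd.py
import pytest
from pytest_bdd import scenarios

pytestmark = pytest.mark.feature          # register the marker in pytest.ini to avoid warnings
scenarios('../features/login.feature')    # bind every scenario in the feature file to pytest

# With a layout like tests/unit/ and tests/feature/, either filter works:
#   python -m pytest tests/unit       (run only unit tests, by path)
#   python -m pytest -m feature       (run only feature tests, by marker)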

Do What’s Best

Different test tools and frameworks meet different needs. No single one can solve all problems. Make sure to use the right tools for the problems at hand. Don’t force yourself to use the wrong thing simply because it is already used elsewhere.

4 Rules for Writing Good Gherkin

In Behavior-Driven Development, Gherkin is a specification language for defining behaviors. Gherkin scenarios can easily be automated into test cases. There are many points to writing good Gherkin, but to keep things simple, I like to focus on four major rules:

  1. Gherkin’s Golden Rule
  2. The Cardinal Rule of BDD
  3. The Unique Example Rule
  4. The Good Grammar Rule

Check out my TechBeacon article to learn about these rules in depth!

How Do We Write Good Gherkin as Part of BDD? (Webinar + Q&A)

On July 23, 2019, I gave a webinar entitled, “How Do We Write Good Gherkin as Part of BDD?” in collaboration with Paul Merrill and his company, Beaufort Fairmont. This webinar was the follow-up to a previous webinar, What Is BDD, and How Do We Practice It? It was an honor to partner with Paul again to go further into BDD practices. (If you want to learn more about BDD, check out Beaufort Fairmont’s two-day BDD training offering, as well as their blog and other webinars.)

To see my webinar recording, register here. Definitely watch the previous webinar first.

Just like last time, attendees asked several great questions that we simply could not answer live. I categorized all the questions we received and answered them below. Please note that some questions might be rephrased or combined with others.

Questions about BDD

What is BDD?

Behavior-Driven Development! Read more here.

In a typical Agile development process, who should write feature files?

The Three Amigos! Product owners, developers, and testers should all come together to figure out behaviors. I recommend doing Example Mapping to formulate examples before writing Gherkin scenarios. The green example cards should be turned into feature files. The specific person who writes the feature files is up to team preference. It could be a collaborative effort, or it could be divided-and-conquered. Any one of the Three Amigos can do it.

How can we apply BDD to SAFe (Scaled Agile Framework) teams?

BDD practices like Three Amigos meetings, Example Mapping, Behavior Specification with Gherkin, and Behavior Implementation can become part of any process. All of these practices happen at the level of the development teams. Teams could even share Gherkin steps and test frameworks wherever sharing makes sense. Check out BDD 101: Behavior-Driven Agile.

What advice can you give to teams that use BDD test frameworks solely as an automation tool and not as part of a greater BDD process?

Do the best with what you’ve got. Try to show how other BDD practices can pragmatically improve your team’s development and delivery work.

Questions about Gherkin Syntax

What is the difference between a scenario and a scenario outline?

A scenario is a procedure of Given-When-Then steps that covers one example for one behavior. If there are any parameters for steps, then a scenario has exactly one combination of possible inputs. A scenario outline is a Given-When-Then procedure that can have multiple examples of one behavior provided as a table of input combos. Each input row will run the same steps once, just with different parameter inputs. See BDD 101: Gherkin by Example for examples.

What do you think about long tables in scenarios?

Long tables in Gherkin usually look terrible. They’re hard to read, and they create a wall of text. They may also include unnecessary variations. Stick to the Unique Example rule.

Are Given steps mandatory, or can scenarios start directly with When steps?

None of the step types are mandatory. It is valid to write a scenario that skips the Given and has only When-Then steps. It is also valid to write scenarios that are Given-Then or Given-When. In fact, it is syntactically valid to put steps in any order. However, I strongly recommend keeping Given-When-Then step order to properly frame behaviors.

Are quotation marks required for parameters?

No, quotation marks are not required for parameters, but they are a popular convention, and one that I recommend. Quotes make parameters easy to identify.

Questions about Gherkin Scenarios

How do we make sure each scenario focuses on an individual, independent behavior?

Do Example Mapping first as a team. Write scenarios together, or review them with others. Ask, “What makes this behavior unique?” Make sure to use strict Given-When-Then step order when defining the behavior. Rethink the scenario if it is more than 10 lines long. Look out for unnecessary complication.

What does it mean for a scenario to be “chronological”?

Scenario steps should be written as if they were on a timeline. Each step will be executed after the previous one, so its description must start where the previous one ended. Remember, steps will be automated as if they were scripts.

How do we write a very low-level scenario without having a wall of text?

Don’t write low-level scenarios! Gherkin is best for feature testing, not unit testing. Steps should focus on intention and business value. Instead of writing “type, type, click, wait,” write “log into the app.” If you absolutely must write a low-level scenario, remember that the same principles apply. Be intuitively descriptive. Focus on individual behaviors. Keep scenarios concise.

If all scenarios in a feature file have only one user, is it okay to use first-person perspective instead of third-person?

In my opinion, no. I favor third-person perspective universally. Trying to limit usage to one feature file won’t work because any step can be used by any feature file within a test project. The entire solution must be either first-person or third-person. There’s no middle ground.

Can we write Gherkin scenarios with personas?

Yes! Personas can make scenarios more meaningful and understandable. Make sure to define the personas well – they could be described under the Feature section or in a separate text file.

How do we write Gherkin scenarios that need to validate lots of information on a page?

Pick the most important pieces of information to check. You could write separate Then steps for each assertion, or you could push small-but-similar validations down to the automation level to avoid Gherkin clutter.

How do we write Gherkin scenarios for validating Web UI fields?

Typically, I treat each field validation as an independent behavior, and thus I write separate scenarios to check each field. If the scenario steps simply enter a textual value and verify a specific message, then I might make a Scenario Outline with example rows for each equivalence class of inputs.
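
For example, here is a hedged sketch of such an outline (the field, messages, and values are made up for illustration), with one example row per equivalence class of invalid input:

Scenario Outline: Loan amount field validation
  Given the new deal form is displayed
  When the user enters "<amount>" into the loan amount field
  Then the field shows the message "<message>"

  Examples:
    | amount | message                 |
    | -500   | Amount must be positive |
    | abc    | Amount must be a number |
    |        | Amount is required      |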

How do we write Gherkin scenarios that have multiple inputs and setup steps? (Example: an API with ten parameters)

Gherkin allows multiple steps of the same type to be written using “And” and “But” keywords. It’s not a problem to have “Given-And-And” or “When-And-And”. If you discover that different scenarios repeat the same setup steps, then I recommend either moving those common steps to a Background section or writing a new step that covers multiple calls (for conciseness).
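
As a sketch of both ideas (the service, parameters, and step wordings are hypothetical), common setup moves into a Background, and related inputs chain with And or collapse into a single step with a data table:

Feature: Pricing API

  Background:
    Given the pricing service is available
    And the client is authenticated

  Scenario: Price a deal with multiple parameters
    When the client submits a pricing request with:
      | parameter   | value  |
      | principal   | 100000 |
      | term_months | 60     |
      | rate_type   | fixed  |
    Then the response status is 200
    And the response contains a priced deal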

One example from the webinar showed searching for shoes and adding them to a shopping cart as part of one scenario. Aren’t those two different behaviors?

Here’s the scenario in question:

Scenario: Add shoes to the shopping cart
  Given the ShoeStore home page is displayed
  When the shopper searches for “red pumps”
  And the shopper adds the first result to the cart
  Then the cart has one pair of “red pumps”

We could have split this scenario into two. I just chose to define the behavior this way. This scenario is a bit more end-to-end because it covers a basic but typical workflow. It may also have leveraged existing steps, which expedites automation development. Overall, the scenario is still concise, chronological, and intuitively understandable. Remember, there is an art as well as a science to writing good Gherkin.

Questions about Automation

Do scenarios need to be independent of each other?

Yes, unequivocally. Tests that are not independent could interfere with each other and cause unexpected failures. Independence also reinforces singular behavioral focus.

How do we start a scenario “in media res” without it depending on other tests?

At the Gherkin level, write Given steps that define a new starting point for the behavior. For example, many teams develop Web apps. It’s common to think that the starting point for all tests is login. However, the starting point can be a few pages after login.

At the automation level, it may be useful to implement the Given steps by calling other steps. For example, if a Given step should start at a user’s profile page, then perhaps it could internally call the login step and the click-the-profile-link step. Test steps may repetitively do the same operations for different tests, but test case independence will be preserved, and unique failures will be reported.
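
Here is a minimal pytest-bdd sketch of that idea (the page-object fixtures, step wordings, and credentials handling are hypothetical): both Given steps share the same login logic, so a scenario can start on the profile page without depending on another test.

from pytest_bdd import given

def _log_in(login_page):
  # Shared helper used by more than one Given step
  login_page.load()
  login_page.log_in('pat', 's3cret')

@given('the user is logged into the app')
def user_is_logged_in(login_page):
  _log_in(login_page)

@given("the user's profile page is displayed")
def profile_page_is_displayed(login_page, top_nav):
  # Start "in media res": log in, then navigate straight to the profile page
  _log_in(login_page)
  top_nav.click_profile_link()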

What is the best way to handle preconditions like logging into a Web app?

The simplest way to handle preconditions is to write Given steps. If those Given steps are shared by all scenarios in a feature file, then move them to a Background section. Automation hooks can also perform common setup and cleanup actions, depending upon the test framework. Personally, I prefer to use hooks to do automatic login rather than repeat Given steps for many scenarios.
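
As one hedged example in pytest terms (the page object and credentials are hypothetical), an autouse fixture in the feature tests’ conftest.py can act as the hook that performs login before every scenario:

import pytest

@pytest.fixture(autouse=True)
def logged_in_user(login_page):
  # Runs before each scenario under this conftest's directory: perform the login precondition
  login_page.load()
  login_page.log_in('pat', 's3cret')
  yield
  # Common cleanup (e.g., logging out) could go here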

Is it better to set up and tear down new test objects for each test case, or is it better to use shared, pre-created objects?

That depends upon the object. Most objects like WebDrivers and page objects should have scenario scope, meaning they are created fresh for each scenario and then torn down when the scenario ends. The only time an object should be shared across scenarios is if it is immutable or very expensive to create. For example, configuration data could be read in once before all tests and then injected immutably into each scenario. The safe position is always to use fresh objects; justify why sharing is needed before trying it.
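
In pytest terms, this maps onto fixture scopes; here is a rough sketch (file names and fixtures are hypothetical):

import json
import pytest
from selenium import webdriver

@pytest.fixture(scope='session')
def config():
  # Immutable, expensive-to-load data can be shared across all scenarios
  with open('config.json') as f:
    return json.load(f)

@pytest.fixture
def browser(config):
  # Function scope (the default): a fresh WebDriver for every scenario
  driver = webdriver.Chrome()
  yield driver
  driver.quit()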

I want to use Serenity for BDD and testing. Should I use Cucumber-like Gherkin feature files, or should I use Serenity’s native methods?

That’s up to you and your team. Personally, I would still use Gherkin feature files with Serenity. I like to separate my test case from my test code. Everyone can read Gherkin feature files, but not everyone can read Java or JavaScript test methods.

If a company already has a large BDD test solution that is poorly implemented, would it be better to keep it going or try to change it?

This question can be applied to all software projects, not just BDD test solutions. The answer is situational. Personally, I favor doing things right, even if it means refactoring. Please read Our Test Automation Has Problems. Should We Start Over? for a thorough answer.

Final Questions

Why do you call yourself “Pandy” and the “Automation Panda”?

Pandas are awesome. Everybody loves them. And nobody forgets my moniker. The nickname “Pandy” came about in the Python community to distinguish me from other folks named “Andy.”

Where can I get team training in BDD?

Beaufort Fairmont provides a one- or two-day course in BDD and writing Gherkin. Sign up for more information here.