Managing the Test Data Nightmare

On April 22, 2021, I delivered a talk entitled “Managing the Test Data Nightmare” at SauceCon 2021. SauceCon is Sauce Labs’ annual conference for the testing community. Due to the COVID-19 pandemic, the conference was virtual, but I still felt a bit of that exciting conference buzz.

My talk covers the topic of test data, which can be a nightmare to handle. Data must be prepped in advance, loaded before testing, and cleaned up afterwards. Sometimes, teams don’t have much control over the data in their systems under test—it’s just dropped in, and it can change arbitrarily. Hard-coding values that reference the system under test can make tests brittle, especially when running them in different environments.

In this talk, I covered strategies for managing each type of test data: test case variations, test control inputs, config metadata, and product state. I also covered how to “discover” test data instead of hard-coding it, how to pass inputs into automation (including secrets like passwords), and how to manage data in the system. After watching this talk, you can wake up from the nightmare and handle test data cleanly and efficiently like a pro!
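As a small illustration of passing inputs into automation instead of hard-coding them, here is a minimal shell sketch. The variable name `TEST_BASE_URL` and the default URL are hypothetical, not from the talk:

```shell
# Read an environment-specific input from an environment variable,
# falling back to a default when it is not set. Secrets like passwords
# should come from a CI secret store, never from the test code itself.
BASE_URL="${TEST_BASE_URL:-https://staging.example.com}"
echo "Running tests against $BASE_URL"
```

The same test code can then run against any environment just by changing the variable, with no changes to the tests themselves.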

Here are some other articles I wrote about test data:

As usual, I hit up Twitter throughout the conference. Here are some action shots:

Many thanks to Sauce Labs and all the organizers who made SauceCon 2021 happen. If SauceCon was this awesome as a virtual event, then I can’t wait to attend in person (hopefully) in 2022!

Announcing Boa Constrictor Docs!

Doc site:

Boa Constrictor is a C# implementation of the Screenplay Pattern. My team and I at PrecisionLender, a Q2 Company, developed Boa Constrictor as part of our test automation solution. Its primary use case is Web UI and REST API test automation. Boa Constrictor helps you make better interactions for better automation!

Our team released Boa Constrictor as an open source project on GitHub in October 2020. This week, we published a full documentation site for Boa Constrictor. It includes an introduction to the Screenplay Pattern, a quick-start guide, a full tutorial, and ways to contribute to the project. The doc site itself uses GitHub Pages, Jekyll, and Minimal Mistakes.

Our team hopes that the docs help you with testing and automation. Enjoy!

Testing GitHub Pages without Local Jekyll Setup

TL;DR: If you want to test your full GitHub Pages site before publishing but don’t want to set up Ruby and Jekyll on your local machine, then:

  1. Commit your doc changes to a new branch.
  2. Push the new branch to GitHub.
  3. Temporarily change the repository’s GitHub Pages publishing source to the new branch.
  4. Reload the GitHub Pages site, and review the changes.

If you have a GitHub repository, did you know that you can create your own documentation site for it within GitHub? Using GitHub Pages, you can write your docs as a set of Markdown pages and then configure your repository to generate and publish a static web site for those pages. All you need to do is configure a publishing source for your repository. Your doc site will go live at:


If this is new to you, then you can learn all about it from the GitHub docs here: Working with GitHub Pages. I just discovered this feature myself!

GitHub Pages are great because they make it easy to develop docs and code together as part of the same workflow without needing extra tools. Docs can be written as Markdown files, Liquid templates, or raw assets like HTML and CSS. The docs will be version-controlled for safety and shared from a single source of truth. GitHub Pages also provides free hosting with a decent domain name for the doc site. Clearly, the theme is simplicity.

Unfortunately, I hit one challenge while trying GitHub Pages for the first time: How could I test the doc site before publishing it? A repository using GitHub Pages must be configured with a specific branch and folder (/ (root) or /docs) as the publishing source. As soon as changes are committed to that source, the updated pages go live. However, I wanted a way to view the doc site in its fullness before committing any changes so that I wouldn’t accidentally publish any mistakes.

One way to test pages is to use a Markdown editor. Many IDEs have Markdown editors with preview panes. Even GitHub’s web editor lets you preview Markdown before committing it. Unfortunately, while editor previews may help catch a few typos, they won’t test the full end result of static site generation and deployment. They may also have trouble with links or templates.

GitHub’s docs recommend testing your site locally using Jekyll. Jekyll is a static site generator written in Ruby. GitHub Pages uses Jekyll behind the scenes to turn doc pages into full doc sites. If you want to keep your doc development simple, you can just edit Markdown files and let GitHub do the dirty work. However, if you want to do more hands-on things with your docs like testing site generation, then you need to set up Ruby and Jekyll on your local machine. Thankfully, you don’t need to know any Ruby programming to use Jekyll.
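For context, local setup looks roughly like this. This is a sketch based on GitHub's published instructions, assuming Ruby and RubyGems are already installed; the /docs folder is one of the two supported publishing locations:

```shell
# Install Bundler and Jekyll (requires a working Ruby installation)
gem install bundler jekyll

# Scaffold a Jekyll site in the repository's /docs folder
cd docs
jekyll new --skip-bundle .

# Install the site's gems, then build and serve it locally
bundle install
bundle exec jekyll serve    # preview at http://localhost:4000
```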

I followed GitHub’s instructions for setting up a GitHub Pages site with Jekyll. I installed Ruby and Jekyll and then created a Jekyll site in the /docs folder of my repository. I verified that I could edit and run my site locally in a branch. However, the setup process felt rather hefty. I’m not a Ruby programmer, so setting up a Ruby environment with a few gems felt like a lot of extra work just to verify that my doc pages looked okay. Plus, I could foresee some developers getting stuck while trying to set up these doc tools, especially if the repository’s main code isn’t a Ruby project. Even if setting up Jekyll locally would be the “right” way to develop and test docs, I still wanted a lighter, faster alternative.

Thankfully, I found a workaround that didn’t require any tools outside of GitHub: Commit doc changes to a branch, push the branch to GitHub, and then temporarily change the repository’s GitHub Pages source to the branch! I originally configured my repository to publish docs from the /docs folder in the main branch. When I changed the publishing source to another branch, it regenerated and refreshed the GitHub Pages site. When I changed it back to main, the site reverted without any issues. Eureka! This is a quick, easy hack for testing changes to docs before merging them. You get to try the full site in the main environment without needing any additional tools or setup.

Above is a screenshot of the GitHub Pages settings for one of my repositories. You can find these settings under Settings -> Options for any repository, as long as you have the administrative rights. In this screenshot, you can see how I changed the publishing source’s branch from main to docs/test. As soon as I selected this change, GitHub Pages republished the repository’s doc site.

Now, I recognize that this solution is truly a hack. Changing the publishing source affects the “live”, “production” version of the site. It effectively does publish the changes, albeit temporarily. If some random reader happens to visit the site during this type of testing, they may see incorrect or even broken pages. I’d recommend changing the publishing source’s branch only for small projects and for short periods of time. Don’t forget to revert the branch once testing is complete, too. If you are working on a larger, more serious project, then I’d recommend doing full setup for local doc development. Local setup would be safer and would probably make it easier to try more advanced tricks, like templates and themes.
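The branch-swap workflow can be sketched in a few git commands. The following is a runnable sketch using a throwaway local repository; the branch name docs/test and the file names are hypothetical:

```shell
set -e

# Stand up a throwaway repository to demonstrate the branch workflow
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "panda@example.com"
git config user.name "Panda"

# Simulate a repo that publishes GitHub Pages from the /docs folder
mkdir docs
echo "# Home" > docs/index.md
git add docs
git commit -qm "Initial docs"

# Steps 1-2: commit doc changes to a new branch
# (in a real repo, follow with: git push origin docs/test)
git checkout -qb docs/test
echo "New section" >> docs/index.md
git add docs
git commit -qm "Test doc changes"

# Steps 3-4: in Settings -> Pages, temporarily switch the publishing
# source to docs/test, review the regenerated site, then revert.
git log --oneline
```

Once the branch looks good on the live site, revert the publishing source and merge the branch as usual.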

The Automation Panda Origin Story

In February 2021, Matthew Weeks interviewed me for the Work in Programming podcast. Matthew asked all sorts of questions about my story – how I got into programming, what I learned at different companies, and why I started blogging and speaking. I greatly enjoyed our conversation, so much so that it lasted an hour and a half!

If you’re interested in hearing how my career has gone from high school to the present day, please give it a listen. There are some juicy anecdotes along the way. The link is below. Many thanks to Matthew for hosting and editing the interview. Be sure to listen to other Work in Programming interviews, too!

Work in Programming by Matthew Weeks: Andy Knight – Automation Panda origin story, BDD, test automation before it was cool

Solving: How to write good UI interaction tests? #GivenWhenThenWithStyle

Writing good Gherkin is a passion of mine. Good Gherkin means good behavior specification, which results in better features, better tests, and ultimately better software. To help folks improve their Gherkin skills, Gojko Adzic and SpecFlow are running a series of #GivenWhenThenWithStyle challenges. I love reading each new challenge, and in this article, I provide my answer to one of them.

The Challenge

Challenge 20 states:

This week, we’re looking into one of the most common pain points with Given-When-Then: writing automated tests that interact with a user interface. People new to behaviour driven development often misunderstand what kind of behaviour the specifications should describe, and they write detailed user interactions in Given-When-Then scenarios. This leads to feature files that are very easy to write, but almost impossible to understand and maintain.

Here’s a typical example:

Scenario: Signed-in users get larger capacity
Given a user opens using Chrome
And the user clicks on "Upload Files"
And the page reloads
And the user clicks on "Spreadsheet Formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "500kb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 
And the user clicks on "Login"
And the user enters "testuser123" into the "username" field
And the user enters "$Pass123" into the "password" field
And the user clicks on "Sign in"
And the page reloads
Then the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 
And the user clicks on "spreadsheet formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "1mb-sheet.xlsx" 
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx"

A common way to avoid such issues is to rewrite the specification to avoid the user interface completely. We’ve looked into that option several times in this article series. However, that solution only applies if the risk we’re testing is not in the user interface, but somewhere below. To make this challenge more interesting, let’s say that we actually want to include the user interface in the test, since the risk is in the UI interactions.

Indeed, most behavior-driven practitioners would generally recommend against phrasing steps using language specific to the user interface. However, there are times when testing a user interface itself is valid. For example, I work at PrecisionLender, a Q2 Company, and our main web app is very heavy on the front end. It has many, many interconnected fields for pricing commercial lending opportunities. My team has quite a few tests to cover UI-centric behaviors, such as verifying that entering a new interest rate triggers recalculation for summary amounts. If the target behavior is a piece of UI functionality, and the risk it bears warrants test coverage, then so be it.

Let’s break down the example scenario given above to see how to write Gherkin with style for user interface tests.

Understanding Behavior

Behavior is behavior. If you can describe it, then you can do it. Everything exhibits behavior, from the source code itself to the API, UIs, and full end-to-end workflows. Gherkin scenarios should use verbiage that reflects the context of the target behavior. Thus, the example above uses words like “click,” “select,” and “open.” Since the scenario explicitly covers a user interface, I think it is okay to use these words here. What bothers me, however, are two apparent code smells:

  1. The wall of text
  2. Out-of-order step types

The first issue is the wall of text this scenario presents. Walls of text are hard to read because they present too much information at once. The reader must take time to read through the whole chunk. Many readers simply read the first few lines and then skip the remainder. The example scenario has 27 Given-When-Then steps. Typically, I recommend that Gherkin scenarios have a single-digit step count. A scenario with fewer than 10 steps is easier to understand and less likely to include unnecessary information. Longer scenarios are not necessarily “wrong,” but their length indicates that, perhaps, they could be rewritten more concisely.

The second issue in the example scenario is that step types are out of order. Given-When-Then is a formula for success. Gherkin steps should follow strict Given → When → Then ordering because this ordering demarcates individual behaviors. Each Gherkin scenario should cover one individual behavior so that the target behavior is easier to understand, easier to communicate, and easier to investigate whenever the scenario fails during testing. When scenarios break the order of steps, such as Given → Then → Given → Then in the example scenario, it shows that either the scenario covers multiple behaviors or that the author did not bring a behavior-driven understanding to the scenario.

The rules of good behavior don’t disappear when the type of target behavior changes. We should still write Gherkin with best practices in mind, even if our scenarios cover user interfaces.

Breaking Down Scenarios

If I were to rewrite the example scenario, I would start by isolating individual behaviors. Let’s look at the first half of the original example:

Given a user opens using Chrome
And the user clicks on "Upload Files"
And the page reloads
And the user clicks on "Spreadsheet Formats"
Then the buttons "XLS" and "XLSX" show
And the user clicks on "XLSX"
And the user selects "500kb-sheet.xlsx"
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
And the user clicks on "XLSX"
And the user selects "1mb-sheet.xlsx"
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx"

Here, I see four distinct behaviors covered:

  1. Clicking “Upload Files” reloads the page.
  2. Clicking “Spreadsheet Formats” displays new buttons.
  3. Uploading a spreadsheet file makes the filename appear on the page.
  4. Attempting to upload a spreadsheet file that is 1MB or larger fails.

If I wanted to purely retain the same coverage, then I would rewrite these behavior specs using the following scenarios:

Feature: Example site
Scenario: Choose to upload files
Given the Example site is displayed
When the user clicks the "Upload Files" link
Then the page displays the "Spreadsheet Formats" link
Scenario: Choose to upload spreadsheets
Given the Example site is ready to upload files
When the user clicks the "Spreadsheet Formats" link
Then the page displays the "XLS" and "XLSX" buttons
Scenario: Upload a spreadsheet file that is smaller than 1MB
Given the Example site is ready to upload spreadsheet files
When the user clicks the "XLSX" button
And the user selects "500kb-sheet.xlsx" from the file upload dialog
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
Scenario: Upload a spreadsheet file that is larger than or equal to 1MB
Given the Example site is ready to upload spreadsheet files
When the user clicks the "XLSX" button
And the user selects "1mb-sheet.xlsx" from the file upload dialog
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx"

Now, each scenario covers one individual behavior. The first scenario starts with the Example site in a “blank” state: “Given the Example site is displayed”.

The second scenario inherently depends upon the outcome of the first. Rather than repeat all the steps from the first scenario, I wrote a new starting step to establish the initial state more declaratively: “Given the Example site is ready to upload files”. This step’s definition method may need to rerun the same operations as the first scenario, but it guarantees independence between scenarios. (The step could also optimize the operations, but that is a topic for another challenge.) Likewise, the third and fourth scenarios have a Given step to establish the state they need: “Given the Example site is ready to upload spreadsheet files.” Both scenarios can share the same Given step because they have the same starting point.

All three of these new steps are descriptive more than prescriptive. They declaratively establish an initial state, and they leave the details to the automation code in the step definition methods to determine precisely how that state is established. This technique makes it easy for Gherkin scenarios to be individually clear and independently executable.

I also added my own writing style to these scenarios. First, I wrote concise, declarative titles for each scenario. The titles emphasize interaction over mechanics. For example, the first scenario’s title uses the word “choose” rather than “click” because, from the user’s perspective, they are “choosing” an action to take. The user just happens to mechanically “click” a link in the process of making their choice. The titles also provide a level of example detail: note that the third and fourth scenarios spell out the target file sizes. For brevity, I typically write scenario titles using active voice: “Choose this,” “Upload that,” or “Do something.” I try to avoid including verification language in titles unless it is necessary to distinguish behaviors.

Another stylistic element of mine was to remove explicit details about the environment. Instead of hard-coding the website URL, I gave the site a proper name: “Example site.” I also removed the mention of Chrome as the browser. These details are environment-specific, and they should not be specified in Gherkin. In theory, this site could have multiple instances (like an alpha or a beta), and it should probably run in any major browser (like Firefox and Edge). Environmental characteristics should be specified as inputs to the automation code instead.

I also refined some of the language used in the When and Then steps. When I must write steps for mechanical actions like clicks, I like to specify element types for target elements. For example, “When the user clicks the “Upload Files” link” specifies a link by a parameterized name. Saying the element is a link helps provide context to the reader about the user interface. I wrote other steps that specify a button, too. These steps also specified the element name as a parameter so that the step definition method could possibly perform the same interaction for different elements. Keep in mind, however, that these linguistic changes are neither “required” nor “perfect.” They make sense in the immediate context of this feature. While automating step definitions or writing more scenarios, I may revisit the verbiage and do some refactoring.

Determining Value for Each Behavior

Each of the four new scenarios I wrote covers an independent, individual behavior of the fictitious Example site’s user interface. They are thorough in their level of coverage for these small behaviors. However, not all behaviors may be equally important to cover. Some behaviors are simply more important than others, and thus some tests are more valuable than others. I won’t go into deep detail about how to measure risk and determine value for different tests in this article, but I will offer some suggestions regarding these example scenarios.

First and foremost, you as the tester must determine what is worth testing. These scenarios aptly specify behavior, and they will likely be very useful for collaborating with the Three Amigos, but not every scenario needs to be automated for testing. You as the tester must decide. You may decide that all four of these example scenarios are valuable and should be added to the automated test suite. That’s a fine decision. However, you may instead decide that certain user interface mechanics are not worth explicitly testing. That’s also a fine decision.

In my opinion, the first two scenarios could be candidates for the chopping block:

  1. Choose to upload files
  2. Choose to upload spreadsheets

Even though these are existing behaviors in the Example site, they are tiny. The tests simply verify that clicking certain links makes other links or buttons appear. It would be nice to verify them, but test execution time is finite, and user interface tests are notoriously slow compared to other tests. Consider the Rule of 1’s: typically, by orders of magnitude, a unit test takes about 1 millisecond, a service API test takes about 1 second, and a web UI test takes about 1 minute. Furthermore, these behaviors are implicitly exercised by the other scenarios, even if they don’t have explicit assertions.

One way to condense the scenarios could be like this:

Feature: Example site
Background:
Given the Example site is displayed
When the user clicks the "Upload Files" link
And the user clicks the "Spreadsheet Formats" link
And the user clicks the "XLSX" button
Scenario: Upload a spreadsheet file that is smaller than 1MB
When the user selects "500kb-sheet.xlsx" from the file upload dialog
Then the upload completes
And the table "Uploaded Files" contains a cell with "500kb-sheet.xlsx" 
Scenario: Upload a spreadsheet file that is larger than or equal to 1MB
When the user selects "1mb-sheet.xlsx" from the file upload dialog
Then the upload fails
And the table "Uploaded Files" does not contain a cell with "1mb-sheet.xlsx" 

This new feature file eliminates the first two scenarios and uses a Background section to cover the setup steps. It also eliminates the need for special Given steps in each scenario to set unique starting points. Implicitly, if the “Upload Files” or “Spreadsheet Formats” links fail to display the expected elements, then those steps would fail.

Again, this modification is not necessarily the “best” way or the “right” way to cover the desired behaviors, but it is a reasonably good way to do so. However, I would assert that both the 4-scenario feature file and the 2-scenario feature file are much better approaches than the original example scenario.

More Gherkin

What I showed in my answer to this Gherkin challenge is how I would handle UI-centric behaviors. I try to keep my Gherkin scenarios concise and focused on individual, independent behaviors. Try using these style techniques to rewrite the second half of Gojko’s original scenario. Feel free to drop your Gherkin in the comments below. I look forward to seeing how y’all write #GivenWhenThenWithStyle!

Extending Grace in Small Ways

Back in 2011, I was a recent college grad working at IBM as a “performance engineer” for z/OS mainframe software. Now, I didn’t know anything about mainframes, but I was thankful to have a job on the heels of the Great Recession.

At the time, IBM had recently released the Jazz platform with Rational Team Concert (RTC), a collaborative project management tool geared towards Agile software development. Teams company-wide started adopting RTC whether they wanted it or not. My team was no different: we created a team project in RTC and started writing work items in it. In my opinion, RTC was decent. It was very customizable, and its aesthetics and user experience were better than other tools at the time.

One day, I made a typo while trying to assign a work item to myself. When typing a name into the “owner” field, RTC would show a list of names from which to choose. For whatever reason, the list included all IBM employees, not just members from my team. IBM had nearly 400,000 employees worldwide at the time. I accidentally selected someone else with a similar name to mine. Blissfully unaware of my mistake, I proceeded to save the work item and start doing the actual work for it.

About a day later, I received a nastygram from another IBMer named Andrea Knight, demanding to know why I assigned her this work item in RTC. I had never met this person before, and she certainly wasn’t on my team. (To be honest, I don’t remember exactly what her name was, but for the sake of the story, we can call her Andrea.) At first, I felt perplexed. Then, once I read her message, I quickly realized that I must have accidentally listed her as the owner of the work item. I immediately corrected the mistake and humbly replied with an apology for my typo. No big deal, right?

Well, Andrea replied to my brief apology later that day to inform me that she was NOT responsible for that work item because she had NEVER seen it before and that she would NOT do any work for it.


I was quite taken aback by her response.

I let it go, but I couldn’t help but wonder why she would answer that way. Perhaps she was having a bad day? Perhaps her manager scrutinized all work items bearing her name? Perhaps the culture in her part of the company was toxic? Was my mistake that bad?

Even though this incident was small, it taught me one important lesson early in my career: a little bit of grace goes a long way. Poor reactions create awkward situations, hurt feelings, and wasted time. If we make a mistake, we should fix it and apologize. If someone else makes a mistake, we should strive to be gracious instead of unpleasant. I try to practice this myself, though, sometimes, I fail.

Nobody is perfect. That’s why we all need grace.

Improving Teamwork with SpecFlow+ LivingDoc

SpecFlow is an excellent Behavior-Driven Development test framework for .NET. Recently, SpecFlow released a new reporting tool called SpecFlow+ LivingDoc, which generates living documentation for features. It combines all scenarios from all SpecFlow feature files into one central HTML report. The report looks crisp and professional. It is filterable and can optionally show test results. Teams can generate updated reports as part of their Continuous Integration pipelines. The best part is that SpecFlow+ LivingDoc, along with all other features, is completely free to use – all you need to do is register for a free SpecFlow account. There is no reason for any SpecFlow project to not also use LivingDoc.

SpecFlow provides rich documentation on all of SpecFlow+ LivingDoc’s benefits, features, and configurations. In this article, I won’t simply repeat what the official docs already state. Instead, I’m going to share how my team and I at PrecisionLender, a Q2 Company, adopted SpecFlow+ LivingDoc into our test automation solution. I’ll start by giving a brief overview of how we test the PrecisionLender web app. Then, I’ll share why we wanted to make LivingDoc part of our quality workflow. Next, I’ll walk through how we added the new report to our testing pipelines. Finally, as an advanced technique, I’ll show how we modified some of the LivingDoc data files to customize our reports. My goal for this article is to demonstrate the value SpecFlow+ LivingDoc adds to BDD collaboration and automation practices.

PrecisionLender’s Test Automation

PrecisionLender is a web application that empowers commercial bankers with in-the-moment insights that help them structure and price commercial deals. Andi®, PrecisionLender’s intelligent virtual analyst, delivers these hyper-focused recommendations in real time, allowing relationship managers to make data-driven decisions while pricing their commercial deals.

The PrecisionLender Opportunity Screen
(Picture taken from the PrecisionLender Support Center)

The PrecisionLender app is quite complex. It has several rich features to help bankers price any possible nuance for loan opportunities. Some banks also have unique configurations and additional features that make testing challenging.

On top of thorough unit testing, we run suites of end-to-end tests against the PrecisionLender web app. Our test automation solution is named “Boa,” and it is written in C# using SpecFlow for test cases and Boa Constrictor for Web UI and REST API interactions. We use BDD practices like Three Amigos, Example Mapping, and Good Gherkin to develop behaviors and cover them with automated tests. As of January 2021, Boa has over 1400 unique tests that target multiple test bank configurations. We run Boa tests continuously (for every code change), nightly (across all test banks), and “release-ly” (every two weeks before production deployments) at a rate of ~15K test iterations per week. Each test takes roughly half a minute to complete, and we run tests in parallel with up to 32 threads using Selenium Grid.

Introducing SpecFlow+ LivingDoc

SpecFlow+ LivingDoc is living documentation for features. SpecFlow started developing the tool a few years ago, but in recent months under Tricentis, they have significantly ramped up its development with the standalone generator and numerous feature enhancements. To learn about LivingDoc, watch this short introduction video:

When I saw the new SpecFlow+ LivingDoc reports, I couldn’t wait to try them myself. I love SpecFlow, and I’ve used it daily for the past few years. I knew it would bring value to my team at PrecisionLender.

Why Adopt SpecFlow+ LivingDoc?

My team and I wanted to bring SpecFlow+ LivingDoc into our testing workflow for a few reasons. First and foremost, we wanted to share our features with every team member, whether they were in business, development, or testing roles. I originally chose SpecFlow to be the core test framework for our Boa tests because I wanted to write all tests in plain-language Gherkin. That way, product owners and managers could read and understand our tests. We could foster better discussions about product behaviors, test coverage, and story planning. However, even though tests could be understood by anyone, we didn’t have an effective way to share them.

Feature files for tests must be stored together with automation code in a repository. Folks must use Visual Studio or a version control tool like Git to view them. That’s fine for developers, but it’s inaccessible for folks who don’t code. SpecFlow+ LivingDoc breaks down that barrier. It combines all scenarios from all feature files into one consolidated HTML report. Folks could use a search bar to find the tests they need instead of plunging through directories of feature files. The report could be generated by Continuous Integration pipelines, published to a shared dashboard, or emailed directly to stakeholders. Pipelines could also update LivingDoc reports any time features change. SpecFlow+ LivingDoc would enable us to actually share our features instead of merely saying that we could.
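For context, generating the report in a pipeline takes only two commands with the LivingDoc CLI. The following is a sketch; the assembly path and output name are hypothetical, and flags may vary by CLI version:

```shell
# One-time install of the LivingDoc CLI as a .NET global tool
dotnet tool install --global SpecFlow.Plus.LivingDoc.CLI

# Generate LivingDoc.html from the SpecFlow test assembly and the
# TestExecution.json results file that SpecFlow writes after a run
livingdoc test-assembly bin/Debug/net48/Boa.Tests.dll \
    --test-execution-json bin/Debug/net48/TestExecution.json \
    --output LivingDoc.html
```

A CI pipeline can run these commands after the test stage and publish LivingDoc.html as a build artifact or to a shared dashboard.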

SpecFlow+ LivingDoc Living Documentation for PrecisionLender Features

Second, we liked the concise test reporting that SpecFlow+ LivingDoc offered. The SpecFlow+ Runner report, which our team already used, provides comprehensive information about test execution: full log messages, duration times, and a complete breakdown of pass-or-fail results by feature and scenario. That information is incredibly helpful when determining why tests fail, but it is too much information when reporting failures to managers. LivingDoc provides just the right amount of information for reporting high-level status: the tests, the results per test, and the pass-or-fail totals. Folks can see test status at a glance. The visuals look nice, too.

SpecFlow+ LivingDoc Analytics for PL App Boa Tests

Third, we wanted to discover any unused step definitions in our C# automation code. The Boa test solution is a very large automation project. As of January 2020, it has over 1400 unique tests and over 1100 unique step definitions, and those numbers will increase as we continue to add new tests. Maintaining any project of this size is a challenge. Sometimes, when making changes to scenarios, old step definitions may no longer be needed, but testers may not think to remove them from the code. These unused step definitions then become “dead code” that bloats the repository. It’s easy to lose track of them. SpecFlow+ LivingDoc offers a special option to report unused step definitions on the Analytics tab. That way, the report can catch dead steps whenever they appear. When I generated the LivingDoc report for the Boa tests, I discovered over a hundred unused steps!

SpecFlow+ LivingDoc Unused Step Definitions

Fourth and finally, my team and I needed a test report that we could share with customers. At PrecisionLender, our customers are banks – and banks are very averse to risk. Some of our customers ask for our test reports so they can take confidence in the quality of our web app. When sharing any information with customers, we need to be careful about what we do share and what we don’t share. Internally, our Boa tests target multiple different system configurations, and we limit the test results we share with customers to the tests for the features they use. For example, if a bank doesn’t factor deposits into their pricing calculations, then that bank’s test report should not include any tests for deposits. The reports should also be high-level instead of granular: we want to share the tests, their scenarios, and their pass-or-fail results, but nothing more. SpecFlow+ LivingDoc fits this need perfectly. It provides Gherkin scenarios with their steps in a filterable tree, and it visually shows results for each test as well as in total. With just a little bit of data modification (as shown later in this article), the report can include exactly the intended scenarios. Our team could use LivingDoc instead of generating our own custom report for customers. LivingDoc would look better than any report we would try to make, too!

Setting Up SpecFlow+ LivingDoc

At PrecisionLender, we currently use JetBrains TeamCity to schedule and launch Boa tests. Some tests launch immediately after app deployments, while others are triggered based on the time of day. When a test pipeline is launched, it follows these steps:

  1. Check out the code repository.
  2. Build the Boa test automation solution.
  3. For each applicable bank configuration, run appropriate Boa tests.

We wanted to add SpecFlow+ LivingDoc in two places: after the build completes and after tests run for each configuration. The LivingDoc generated for the build step would not include test results. It would show all scenarios in all features, and it would also include the unused step definitions. This report would be useful for showing folks our tests and our coverage. The LivingDoc generated for each test run, however, would include test results. Since we run tests against multiple configurations, each run would need its own LivingDoc report, and not all tests run on every configuration. The LivingDoc reports generated at each pipeline step therefore serve different needs.

Adding SpecFlow+ LivingDoc to our testing pipelines required only a few things to set up. The first step was to add the SpecFlow.Plus.LivingDocPlugin NuGet package to the .NET test project. Adding this NuGet package makes SpecFlow automatically save test results to a file named TestExecution.json every time tests run. The docs say you can customize this output path using specflow.json, too.
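To make that concrete, here is a sketch of what such a specflow.json customization might look like. The field names below reflect the `livingDocGenerator` setting as I recall it from the SpecFlow docs, so verify them against the current documentation before relying on them:

```json
{
  "livingDocGenerator": {
    "enabled": true,
    "filePath": "TestResults/TestExecution.json"
  }
}
```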

Required SpecFlow NuGet packages, including SpecFlow.Plus.LivingDoc
An example snippet of TestExecution.json

The next step was to install the LivingDoc CLI tool on our TeamCity agents. The CLI tool is a dotnet command line tool, so you need the .NET Core SDK 3.1 or higher. Also, note that you cannot install this package as a NuGet dependency for your .NET test project. (I tried to do that in the hopes of simplifying my build configuration, but NuGet blocks it.) You must install it to your machine’s command line. The installation command looks like this:

dotnet tool install --global SpecFlow.Plus.LivingDoc.CLI

After installing the LivingDoc CLI tool, the final step was to invoke it after each build and each test run. There are three sources from which to generate LivingDoc reports:

  1. Using a folder of feature files
  2. Using a SpecFlow test assembly (.dll)
  3. Using a feature data JSON file previously generated by the LivingDoc CLI tool

For generating LivingDoc after the build, I used this command in PowerShell to include unused steps but exclude test results:

livingdoc test-assembly "$TestAssemblyPath" --binding-assemblies "$TestAssemblyPath" --output-type HTML --output "$LivingDocDir\PLAppLivingDoc.html"

Then, for generating LivingDoc after test runs, I used this PowerShell command, which includes the TestExecution.json results:

livingdoc test-assembly "$TestAssemblyPath" --test-execution-json "$TestExecutionPath" --output-type HTML --output "$HtmlReportPath" --title "PL App Boa Tests"

All the “$” variables are paths configured in our TeamCity projects. I chose to generate reports using the test assembly because I discovered that results wouldn’t appear in the report if I generated them from the feature folder.

Here’s what SpecFlow+ LivingDoc looks like when published as a TeamCity report:

SpecFlow+ LivingDoc report in TeamCity for the build (without test results)

Our team can view reports from TeamCity, or they can download them to view them locally.

Modifying SpecFlow+ LivingDoc Data

As I mentioned previously in this article, my team and I wanted to share SpecFlow+ LivingDoc reports with some of our customers. We just needed to tweak the contents of the report in two ways. First, we needed to remove scenarios that were inapplicable (meaning not executed) for the bank. Second, we needed to remove certain tags that we use internally at PrecisionLender. Scrubbing this data from the reports would give our customers what they need without including information that they shouldn’t see.

Thankfully, SpecFlow+ LivingDoc has a “backdoor” in its design that makes this kind of data modification easy. When generating a LivingDoc report, you can set the --output-type parameter to be “JSON” instead of “HTML” to generate a feature data JSON file. The feature data file contains all the data for the LivingDoc report in JSON format, including scenarios and tags. You can modify the data in this JSON file and then use it to generate an HTML LivingDoc report. Modifying JSON data is much simpler and cleaner than painfully splicing HTML text.

An example snippet of a feature data JSON file

I wrote two PowerShell scripts to modify feature data. Both are publicly available on GitHub at AndyLPK247/SpecFlowPlusLivingDocScripts. You can copy them from the repository to use them for your project, and you can even enhance them with your own changes. Note that the feature data JSON files they use must be generated from test assemblies, not from feature data folders.

The first script is RemoveSkippedScenarios.ps1. It takes in both a feature data JSON file and a test execution JSON file, and it removes all scenarios from the feature data that did not have results in the test execution data. It uses recursive functions to traverse the feature data JSON “tree” of folders, features, and scenarios. Removing unexecuted scenarios means that the LivingDoc report will only include scenarios with test results – none of the scenarios in it should be “skipped.” For my team, this means a LivingDoc report for a particular bank configuration will not include a bunch of skipped tests for other banks. Even though we currently have over 1400 unique tests, any given bank configuration may run only 1000 of those tests. The extra 400 skipped tests would be noise at best and a data privacy violation at worst.
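To illustrate the idea behind RemoveSkippedScenarios.ps1 (the real script is PowerShell and works against the actual LivingDoc schema), here is a simplified Python sketch. The tree shape and key names below are hypothetical, chosen only to show the recursive pruning technique:

```python
import json

# Hypothetical, simplified feature-data shape for illustration only;
# the real schema produced by the LivingDoc CLI differs.
feature_data = {
    "Folders": [
        {
            "Features": [
                {
                    "Title": "Pricing",
                    "Scenarios": [
                        {"Title": "Price a loan"},
                        {"Title": "Price a deposit"},
                    ],
                }
            ]
        }
    ]
}

# Scenario titles that actually ran, as pulled from TestExecution.json.
executed = {"Price a loan"}

def prune_folder(folder):
    """Keep only scenarios with execution results; drop features left empty."""
    for feature in folder.get("Features", []):
        feature["Scenarios"] = [
            s for s in feature["Scenarios"] if s["Title"] in executed
        ]
    folder["Features"] = [
        f for f in folder.get("Features", []) if f["Scenarios"]
    ]
    for subfolder in folder.get("Folders", []):
        prune_folder(subfolder)

for top_folder in feature_data["Folders"]:
    prune_folder(top_folder)

print(json.dumps(feature_data, indent=2))
```

The pruned feature data can then be fed back to the LivingDoc CLI to generate the final HTML report.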

The second script is RemoveTags.ps1. It takes in a list of tags and a feature data JSON file, and it removes all appearances of those tags from every feature, scenario, and example table. Like the script for removing skipped scenarios, it uses recursive functions to traverse the feature data JSON “tree.” The tags must be given as literal names, but the script could easily be adjusted to handle wildcard patterns or regular expressions.
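The same traversal idea applies to tag removal. Again as an illustrative Python sketch (not the actual PowerShell script, and with hypothetical key names), a generic recursive walk can strip matching tags from every node that carries a "Tags" list, whether feature, scenario, or example table:

```python
# Tags to scrub from the report; these example names are made up.
tags_to_remove = {"@internal", "@pl-only"}

def strip_tags(node):
    """Recursively remove matching tags from any dict in the JSON tree."""
    if isinstance(node, dict):
        if "Tags" in node:
            node["Tags"] = [t for t in node["Tags"] if t not in tags_to_remove]
        for value in node.values():
            strip_tags(value)
    elif isinstance(node, list):
        for item in node:
            strip_tags(item)

feature = {
    "Tags": ["@internal", "@pricing"],
    "Scenarios": [
        {
            "Tags": ["@pl-only", "@smoke"],
            "Examples": [{"Tags": ["@internal"]}],
        }
    ],
}
strip_tags(feature)
```

Because the walk is fully generic, it keeps working even if the JSON schema nests tags at new levels.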

With these new scripts, our test pipelines now look like this:

  1. Check out the code repository.
  2. Build the Boa test automation solution.
  3. Generate the SpecFlow+ LivingDoc report with unused steps but without test results.
  4. For each applicable bank configuration:
    1. Run appropriate Boa tests and get the test execution JSON file.
    2. Generate the feature data JSON file.
    3. Remove unexecuted scenarios from the feature data.
    4. Remove PrecisionLender-specific tags from the feature data.
    5. Generate the SpecFlow+ LivingDoc report using the modified feature data and the test results.

Below is an example of what the modified LivingDoc report looks like when we run our 12 smoke tests:

SpecFlow+ LivingDoc report using modified feature data after running only 12 smoke tests

(Note: At the time of writing this article, the most recent version of SpecFlow+ LivingDoc now includes a filter for test results in addition to its other filters. Using the test result filter, you can remove unexecuted scenarios from view. This feature is very helpful and could be used for our internal testing, but it would not meet our needs of removing sensitive data from reports for our customers.)


Ever since acquiring SpecFlow from TechTalk in January 2020, Tricentis has done great things to improve SpecFlow’s features and strengthen its community. SpecFlow+ LivingDoc is one of the many fruits of that effort. My team and I at PrecisionLender love these slick new reports, and we are already getting significant value out of them.

If you like SpecFlow+ LivingDoc, then I encourage you to check out some of SpecFlow’s other products. Everything SpecFlow offers is now free to use forever – you just need to register a free SpecFlow account. SpecFlow+ Runner is by far the best way to run SpecFlow tests (and, believe me, I’ve used the other runners for NUnit and MSTest). SpecMap is great for mapping and planning stories with Azure Boards. SpecFlow’s Online Gherkin Editor is also one of the best and simplest ways to write Gherkin without needing a full IDE.

Finally, if you use SpecFlow for test automation, give Boa Constrictor a try. Boa Constrictor is a .NET implementation of the Screenplay Pattern that my team and I developed at PrecisionLender. It helps you make better interactions for better automation, and it’s a significant step up from the Page Object Model. It’s now an open source project – all you need to do is install the Boa.Constrictor NuGet package! If you’re interested, be sure to check out the SpecFlow livestream in which Andi Willich and I teamed up to convert an existing SpecFlow project from page objects and drivers to Boa Constrictor’s Screenplay calls. SpecFlow and Boa Constrictor work together beautifully.

Using Domain-Specific Languages for Security Testing

I love programming languages. They have fascinated me ever since I first learned to program my TI-83 Plus calculator in ninth grade, many years ago. When I studied computer science in college, I learned how parsers, interpreters, and compilers work. During my internships at IBM, I worked on a language named Enterprise Generation Language as both a tester and a developer. At NetApp, I even developed my own language named DS for test automation. Languages are so much fun to learn, build, and extend.

Today, even though I do not actively work on compilers, I still do some pretty interesting things with languages and testing. I strongly advocate for Behavior-Driven Development and its domain-specific language (DSL) Gherkin. In fact, as I wrote in my article Behavior-Driven Blasphemy, I support using Gherkin-based BDD test frameworks for test automation even if a team is not also doing BDD’s collaborative activities. Why? Gherkin is the world’s first major off-the-shelf DSL for test automation, and it doesn’t require the average tester to know the complexities of compiler theory. DSLs like Gherkin can make tests easier to read, faster to write, and more reliable to run. They provide a healthy separation of concerns between test cases and test code. After working on successful large-scale test automation projects with C# and SpecFlow, I don’t think I could go back to traditional test frameworks.

I’m not the only one who thinks this way. Here’s a tweet from Dinis Cruz, CTO and CISO at Glasswall, after he read one of my articles:

Dinis then tweeted at me to invite me to speak about using DSLs for testing at the Open Security Summit in 2021:

Now, I’m not a “security guy” at all, but I do know a thing or two about DSLs and testing. So, I gladly accepted the invitation to speak! I delivered my talk, “Using DSLs for Security Testing” virtually on Thursday, January 14, 2021 at 10am US Eastern. I also uploaded my slides to GitHub at AndyLPK247/using-dsls-for-security-testing. Check out the YouTube recording here:

This talk was not meant to be a technical demo or tutorial. Instead, it was meant to be a “think big” proposal. The main question I raised was, “How can we use DSLs for security testing?” I used my own story to illustrate the value languages deliver, particularly for testing. My call to action breaks that question down into three parts:

  1. Can DSLs make security testing easier to do and thereby more widely practiced?
  2. Is Gherkin good enough for security testing, or do we need to make a DSL specific to security?
  3. Would it be possible to write a set of “standard” or “universal” security tests using a DSL that anyone could either run directly or use as a template?
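To make that third question concrete, a “universal” security test expressed in Gherkin might look something like this. This scenario is entirely my own hypothetical sketch, not something from the talk:

```gherkin
Feature: HTTP Security Headers

  Scenario: Responses include basic security headers
    Given the target web app at "https://example.com"
    When I request the home page
    Then the response includes the "Content-Security-Policy" header
    And the response includes the "X-Content-Type-Options" header
```

A team could run a scenario like this directly against any web app, or use it as a template for deeper checks.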

My goal for this talk was to spark a conversation about DSLs and security testing. Immediately after my talk, Luis Saiz shared two projects he’s working on regarding DSLs and security: SUSTO and Mist. Dinis also invited me back for a session at the Open Security Summit Mini Summit in February to have a follow-up roundtable discussion for my talk. I can’t wait to explore this idea further. It’s an exciting new space for me.

If this topic sparks your interest, be sure to watch my talk recording, and then join us live in February 2021 for the next Open Security Summit event. Virtual sessions are free to join. Many thanks again to Dinis and the whole team behind Open Security Summit for inviting me to speak and organizing the events.

I’m Writing a Software Testing Book!

That’s right! You read the title. I’m writing a book about software testing!

One of the most common questions people ask me is, “What books can you recommend on software testing and automation?” Unfortunately, I don’t have many that I can recommend. There are plenty of great books, but most of them focus on a particular tool, framework, or process. I haven’t found a modern book that covers software testing as a whole. Trust me, I looked – when I taught my college course on software testing at Wake Tech, the textbook’s copyright date was 2002. Its content felt just as antiquated.

I want to write a book worthy of answering that question. I want to write a treatise on software testing for our current generation of software professionals. My goal is ambitious, but I think I can do it. It will probably take a year to write. I hope to find deep joy in this endeavor.

Manning Publications will be the publisher. They accepted my proposal, and we signed a contract. The working title of the book is The Way to Test Software. The title pays homage to Julia Child’s classic, The Way to Cook. Like Julia Child, I want to teach “master recipes” that can be applied to any testing situation.

I don’t want to share too many details this early in the process, but the tentative table of contents has the following parts:

  1. Orientation
  2. Testing Code
  3. Testing Features
  4. Testing Performance
  5. Running Tests
  6. Development Practices

Python will be the language of demonstration. This should be no surprise to anyone. I chose Python because I love the language. I also think it’s a great language for test automation. Python is easy for beginners to learn, yet it scales well for experts. Besides, the book is about testing, not programming – Python will be just the linguistic tool for automation.

If you’re as excited about this book as I am, please let me know! I need all the encouragement I can get. This book probably won’t enter print until 2022, given the breadth of its scope. I’ll work to get it done as soon as I can.

Learning Python Test Automation

Do you want to learn how to automate tests in Python? Python is one of the best languages for test automation because it is easy to learn, concise to write, and powerful to scale. These days, there’s a wealth of great content on Python testing. Here’s a brief reference to help you get started.
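To give a taste of that conciseness, here is a minimal pytest example (the file and function names are my own). pytest discovers any function named `test_*` and treats plain `assert` statements as checks, so a complete test needs no boilerplate:

```python
# test_math.py - a minimal pytest example.

def multiply(a, b):
    """The code under test."""
    return a * b

def test_multiply():
    # pytest reports a failure with a full diff if this assertion is false.
    assert multiply(3, 4) == 12
```

Running `python -m pytest` from the project directory discovers and runs the test automatically.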

If you are new to Python, read How Do I Start Learning Python? to find the best way to start learning the language.

If you want to roll up your sleeves, check out Test Automation University. I developed a “trifecta” of Python testing courses for TAU with videos, transcripts, quizzes, and example code. You can take them for FREE!

  1. Introduction to pytest
  2. Selenium WebDriver with Python
  3. Behavior-Driven Python with pytest-bdd

If you want some brief articles for reference, check out my Python Testing 101 blog series:

  1. Python Testing 101: Introduction
  2. Python Testing 101: unittest
  3. Python Testing 101: doctest
  4. Python Testing 101: pytest
  5. Python Testing 101: behave
  6. Python Testing 101: pytest-bdd
  7. Python BDD Framework Comparison

RealPython also has excellent guides:

I’ve given several talks about Python testing:

If you prefer to read books, here are some great titles:

Here are links to popular Python test tools and frameworks:

Do you have any other great resources? Drop them in the comments below! Happy testing!