automation

BDD Gherkin Guidelines for AI Coding and Testing

AI coding agents following BDD (Behavior-Driven Development) principles can write great Gherkin scenarios if they are given the proper rules. Without explicit rules, AI-generated Gherkin often drifts into vague Then steps, UI-heavy scripts, multi-behavior scenarios, and placeholder examples that read like filler. That is not a model problem alone; it is a missing context problem.

I have written Gherkin Guidelines for AI, an open-source context file created to become a default BDD reference for AI-assisted scenario writing, AI code review, and Gherkin-based test automation. It is one Markdown file you can attach to Cursor, Claude, Copilot, Codex, or any tool that accepts project context.

The context file is gherkin-guidelines.md, located in the GitHub repository at https://github.com/AutomationPanda/gherkin-guidelines-for-ai.

To add these guidelines to your project:

Download gherkin-guidelines.md from the GitHub repository.
Place it in your own project alongside your specs or context files.
Wire it up to your project’s rules, skills, or sub-agents so your AI coding agents will abide by it.

Please review the repository’s README for full setup and usage instructions.

When you are ready, try it on your next user story: load the guidelines, ask your agent for scenarios, and see how it feels when everyone is reading from the same playbook. Good specs are a team sport, and this file is here to make your first pass a little lighter and a lot clearer. Whether you want to simply code the vibes or do all-out Spec-Driven Development, I hope that my Gherkin guidelines can help!

The Testing Skyscraper: A Modern Alternative to the Testing Pyramid

Every good software tester knows that a good testing strategy should adhere to the classic Testing Pyramid structure: a strong base of unit tests at the bottom, a solid layer of API tests in the middle, and a few UI tests at the top for good measure. The Testing Pyramid has been around longer than I’ve been working in the software industry, and it is arguably the most prevalent mental model in the discipline of testing.

For years, I abided by the Testing Pyramid. I formed my test plans based upon it. Heck, I even wrote a popular article about it. However, after many years of blindly accepting it, I’m ready to make a rather bold claim: the Testing Pyramid is an antiquated scheme that deceives testers. I’m leaving the pyramid scheme and embracing a new, more modern approach. Even if you think this is heresy, please allow me to explain my rationale.

The Testing Pyramid: A Relic of History

I started my professional career in software in 2007. Back then, Apple had just released the first iPhone, and Facebook was so new that they only allowed college students to create accounts. Web applications, RESTful architecture, and Selenium were all new things. Developing and testing software systems looked much different.

The Testing Pyramid evolved as a simple mental model to help testers decide what to test and how to test it based on the constraints of the time. Web UI testing was notoriously difficult. Browsers were not as standardized as they are today. Selenium WebDriver enabled UI automation but required testers to write their own frameworks around it. Test execution was often slow and flaky. As a result, testers called UI tests “bad” and did everything they could to avoid them, favoring lower-level tests instead. Unit tests were “good” because they were fast, reliable, and close to the code they covered. Plus, code coverage tools could automatically quantify coverage levels and identify gaps. API tests were “okay” because they were typically small and fast, even if they needed to make a network hop to a live environment. Thus, a “proper” test strategy took a pyramidal shape that favored lower level tests for their speed and reliability. It made sense at the time.

Are We Stuck in the Past?

The factors that pushed strategies to take a triangular shape have changed since the inception of the Testing Pyramid all those years ago. Pyramids now feel like relics of ancient history. Let’s take a reality check.

UI testing tools are better, faster, and more reliable. New frameworks like Playwright and Cypress provide greater stability through automatic waiting, faster execution times, and overall better testing experiences. Selenium is still kicking with the BiDi protocol for better testing support, Selenium Manager for automatic driver management, and a plethora of community projects (like Boa Constrictor) helping testers maximize Selenium’s potential.
Traditional API testing can largely be replaced by other kinds of tests. Internal handler unit tests can cover the domain logic for what happens inside the services. Contract tests can cover the handshakes between different services to make sure updates to one won’t break the integrations with others. And UI tests can make sure the system works as a whole.
Test orchestration can now run tests continuously. Tests can run for every code change. They can run for pull requests. Some developers even run end-to-end tests locally before committing changes. The ability to deliver fast feedback on important areas matters far more than the types or times of tests.

Therefore, it is wrong to say a test is bad simply based on its type. All test types are good because they mitigate different kinds of risks. We should focus on building robust continuous feedback loops rather than quotas for test types.

The Testing Skyscraper: A New Model

I think a better mental model for modern testing is the Testing Skyscraper. The skyscraper is a symbol of industrial might and technological advancement. Each skyscraper has a unique architecture that makes it stand out against the skyline. Pyramids get narrower as you approach the top, but skyscrapers have several levels of varying sizes and layouts, where each floor is tailored to the needs of the building’s tenants.

Skyscrapers are a great analogy for testing strategies because one size does not fit all:

Testers can architect their strategies to meet their needs. They can design it as they see fit.
Testers can build out tests at any level they need. Every level of testing is deemed good if it meets the business needs.
Testers can build out as much testing at each level as they like. A floor may have zero-to-many “tenants.”
Testers can choose to skip tests at different levels as a calculated risk. They’ll just be empty floors in the building until needs change.
New testing tools are as strong as steel. Testers can build strategies that scale upwards and onwards, faster and higher than ever before.

The shape of the skyscraper does NOT imply that there should be an equal number of tests at each level. Instead, the metaphor implies that each test strategy is unique and that each level can be built as needed with the freedom of modern architecture. It’s not about quantities or quotas.

I’ve seen anti-pattern models such as ice cream cones and cupcakes. Testing Pyramid might now be an anti-pattern as well.

Modern Architecture for the Present Day

Pyramids were great for their time. The ancients like the Egyptians, the Sumerians, and the Mayans built impressive pyramids that still stand today. However, no civilization has built new pyramids for centuries, unless you count the ones at the Louvre or in Las Vegas. They’re impractical. They’re short. They require an enormous base. Let’s let go of the past and embrace the modern future. Let’s build structures that reflect our times. Let’s build Testing Skyscrapers that reach for the stars – and look snazzy while doing it.

Boa Constrictor Update (May 2025)

Boa Constrictor, the .NET Screenplay Pattern, started in 2018 and is still actively used in 2025. In total, its NuGet packages have over half a million downloads! However, the project’s activity has slowed down significantly in recent times. It’s been two and a half years since I gave my last major update about Boa Constrictor. In this article, I want to cover major developments, explain why things have been slow, and suggest a path for the future.

Major developments

We accomplished many of the goals for Boa Constrictor 3. In fact, all the Boa Constrictor packages are currently set at version 4! Here’s a quick summary of what’s available:

Each set of interactions has its own dedicated NuGet package. For example, Boa.Constrictor.Selenium contains all the Selenium WebDriver interactions for Web UIs, and Boa.Constrictor.RestSharp contains all the RestSharp interactions for APIs. That way, testers can configure their test projects to download only the packages that are needed.
We released a new XUnit package that provides special loggers for XUnit tests.
We recently released a new Playwright package that provides Abilities and Interactions for Playwright. This package should be treated as a beta version initially.
We updated all unit test projects to run on .NET 8.
We made minor fixes and updated various dependency versions.

Many thanks to all our contributors for all their hard work. Special thanks goes to “thePantz” for implementing the XUnit and Playwright packages.

Unfortunately, there are some things we did not accomplish. We did not add shadow DOM support for the Selenium package as hoped. We also did not add support for Applitools, and we no longer plan to add it. There is also a lot of information that is now missing from the doc site.

Why have things been slow?

The answer is simple: the maintainers are no longer actively using Boa Constrictor. I haven’t used Boa Constrictor in my day-to-day work since I left Q2 in November 2021, which was 3.5 years ago. The other maintainers have also either moved onto new jobs or new responsibilities. Things moved quickly from about 2020-2021 because the maintainers and I used Boa Constrictor on a daily basis. Now, it’s difficult for us to find time to work on the project because we just don’t use it ourselves anymore. Frankly, I have barely touched .NET since leaving Q2.

A path for the future

The Boa Constrictor project is neither “dead” nor “abandoned,” but there need to be some changes for it to continue in the future.

First, we have invited new maintainers to the project who have demonstrated a sense of ownership and contributed meaningful changes to the codebase. These new maintainers use Boa Constrictor actively and have the right stuff to keep the project going. Note that “maintainers” are folks who have the power to approve and complete pull requests. Anyone can submit code contributions to the project – you do not need to be a maintainer to participate.

Second, I will focus more on project management and less on coding. I haven’t made any serious or significant code contributions to Boa Constrictor in about two years, and since I’m not actively using the project itself (let alone working in the .NET stack), it is unlikely that I will be contributing any code in the foreseeable future. For transparency, I should recognize the reality and publicly state it. I still want Boa Constrictor to be useful to others for .NET test automation. I think the best way for me to serve the project and the community now is to empower others to contribute.

Third, the community should be empowered to make their own Boa Constrictor packages. Boa Constrictor’s design adheres to SOLID principles, which enables developers to add new Abilities, Tasks, and Questions without needing to modify the core pattern or any of the other Interactions. New packages do not necessarily need to be added to the main Boa Constrictor repository, either. Developers could build and maintain new Screenplay Interactions privately for themselves and their teams. They could also release them to NuGet as their own open source packages separate from the main project. If they feel like their package adds major value, they could contribute it to the main repository through a pull request. All of these are valid options, and I would support any of them. Just note that contributing to the main repository will likely be the slowest path because the maintainers would need to review the code, which could take a long time. So, if you want a particular Boa Constrictor package, be empowered to build it yourself. Don’t wait for the maintainers to build it for you.

The Screenplay Pattern’s place in test automation

When I first started implementing the Screenplay Pattern in C#, I was looking for a better alternative to page objects with Selenium WebDriver. Previous test projects had burned me with unintelligible page object classes that stretched for thousands of lines each with duplicative methods and no proper waiting. I saw the Screenplay Pattern as a way to define better web interactions that could handle waiting automatically and be composed together. Then, I realized that it could be used for any kind of testing, not just Web UI. The pattern provided a natural way to join multiple kinds of interactions together into seamless workflows. It made test automation for large, complex system manageable and well-organized, rather than the mess it typically becomes.

One of my biggest goals in releasing Boa Constrictor as an open source project was to reinvigorate interest in the Screenplay Pattern in our industry. I believed it was the way to make better interactions for better automation, and I wanted to make it simpler for folks to understand. Based on the number of package downloads, the event talks, and the Discord server, I think we accomplished that goal.

Running tests in a Java Maven project

Java continues to be one of the most popular languages for test automation, and Maven continues to be its most popular build tool. Adding tests in the right place to a Java project with Maven can be a bit tricky, however. Let’s briefly learn how to do it. These steps work for both JUnit and TestNG.

Test code location

Maven project follow the Standard Directory Layout. The main code should go under src/main, while all test code should go under src/test:

src/test/java should hold Java test classes
src/test/resources should hold resource files that tests use

Your project directory should look something like this:

src/
+-- main/
|   +-- java/
|   \-- resources/
|-- test/
|   +-- java/
|   \-- resources/
\-- pom.xml

Don’t put test code in the main source folder. You don’t want to include it with the final build artifact. The project might have other files as well, like a README.

Unit tests

The Maven Surefire Plugin runs unit tests during Maven’s test phase. To run unit tests:

Add maven-surefire-plugin to the plugins section of your pom.xml
Name your unit tests *Tests.java
Put them under src/test/java
Mirror the package structure from the main code always
Run tests with mvn test

There are a bunch of options for configuring the Maven Surefire Plugin. If you don’t want to configure anything special, you actually don’t need to add the plugin to the POM file. Nevertheless, it’s good practice to add it to the POM file anyway. Here’s what that would look like:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>3.2.5</version>
            </plugin>
        </plugins>
    </build>

Integration tests

The Maven Failsafe Plugin runs integration tests during Maven’s integration-test phase. Integration tests are distinct from unit tests due to their external dependencies and should be treated differently. To run integration tests:

Add maven-failsafe-plugin to the plugins section of your pom.xml
Name your unit tests *IT.java
Put them under src/test/java
Mirror the package structure from the main code as appropriate
Run tests with mvn verify

Maven actually has multiple integration test phases: pre-integration-test, integration-test, and post-integration-test to handle appropriate setup, testing, and cleanup. However, none of these phases will cause the build to fail. Instead, use the verify goal to make the build fail when ITs fail.

Like the Maven Surefire Plugin, the Maven Failsafe Plugin has a bunch of options. However, to run integration tests, you must configure the Failsafe plugin in the POM file. Here’s what it looks like:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>3.2.5</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>integration-test</goal>
                            <goal>verify</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

Maven phases

Phases in the Maven Build Lifecycle are cumulative:

Running mvn test includes the compile phase
Running mvn verify includes the compile and test phases

It is also a good practice to run mvn clean before other phases to delete the build output (target/) directory. That way, old class files and test reports are removed before generating new ones. You may also include clean with commands to run tests, like this: mvn clean test or mvn clean verify.

Customizations

You can customize how tests run. For example, you can create a separate directory for integration tests (like src/it instead of src/test). However, I recommend avoiding customizations like this. They require complicated settings in the POM file that are difficult to get right and confusing to maintain. Others who join the project later will expect Maven standards.

Modern Web Testing with Playwright

Playwright is an awesome new web testing framework, and it can help you take a modern approach to web development. In this article, let’s learn how.

Asking tough questions about testing

Let me ask you a series of questions:

Question 1: Do you like it when bugs happen in your code? Most likely not. Bugs are problems. They shouldn’t happen in the first place, and they require effort to fix. They’re a big hassle.

Question 2: Would you rather let those bugs ship to production? Absolutely not! We want to fix bugs before users ever see them. Serious bugs could cause a lot of damage to systems, businesses, and even reputations. Whenever bugs do slip into production, we want to find them and fix them ASAP.

Question 3: Do you like to create tests to catch bugs before that happens? Hmmm… this question is tougher to answer. Most folks understand that good tests can provide valuable feedback on software quality, but not everyone likes to put in the work for testing.

Why the distaste for testing?

Why doesn’t everyone like to do testing? Testing is HARD! Here are common complaints I hear:

Tests are slow – they take too long to run!
Tests are brittle – they break whenever the app changes!
Tests are flaky – they crash all the time!
Tests don’t make sense – they are complicated and unreadable!
Tests don’t make money – we could be building new features instead!
Tests require changing context – they interrupt my development workflow!

These are all valid reasons. To mitigate these pain points, software teams have historically created testing strategies around the Testing Pyramid, which separates tests by layer from top to bottom:

UI tests
API tests
Component tests
Unit tests

Tests at the bottom were considered “better” because they were closer to the code, easier to automate, and faster to execute. They were also considered to be less susceptible to flakiness and, therefore, easier to maintain. Tests at the top were considered just the opposite: big, slow, and expensive. The pyramid shape implied that teams should spent more time on tests at the base of the pyramid and less time on tests at the top.

End-to-end tests can be very valuable. Unfortunately, the Testing Pyramid labeled them as “difficult” and “bad” primarily due to poor practices and tool shortcomings. It also led teams to form testing strategies that emphasized categories of tests over the feedback they delivered.

Rethinking modern web testing goals

Testing doesn’t need to be hard, and it doesn’t need to suffer from the problems of the past. We should take a fresh, new approach in testing modern web apps.

Here are three major goals for modern web testing:

Focus on building fast feedback loops rather than certain types of tests.
Make test development as fast and painless as possible.
Choose test tooling that naturally complements dev workflows.

These goals put emphasis on results and efficiency. Testing should just be a natural part of development without any friction.

Introducing Playwright

Playwright is a modern web testing framework that can help us meet these goals.

It is an open source project from Microsoft.
It manipulates the browser via (superfast) debug protocols
It works with Chromium/Chrome/Edge, Firefox, and WebKit
It provides automatic waiting, test generation, UI mode, and more
It can test UIs and APIs together
It provides bindings for JavaScript/TypeScript, Python, Java, and C#

Playwright takes a unique approach to browser automation. First of all, it uses browser projects rather than full browser apps. For example, this means you would test Chromium instead of Google Chrome. Browser projects are smaller and don’t use as many resources as full browsers. Playwright also manages the browser projects for you, so you don’t need to install extra stuff.

Second, it uses browsers very efficiently:

Instead of launching a full, new browser instance for each test, Playwright launches one browser instance for the entire suite of tests.
It then creates a unique browser context from that instance for each test. A browser context is essentially like an incognito session: it has its own session storage and tabs that are not shared with any other context. Browser contexts are very fast to create and destroy.
Then, each browser context can have one or more pages. All Playwright interactions happen through a page, like clicks and scrapes. Most tests only ever need one page.

Playwright browsers, contexts, and pages

Playwright handles all this setup automatically for you.

Comparing Playwright to other tools

Playwright is not the only browser automation tool out there. The other two most popular tools are Selenium and Cypress. Here is a chart with high-level comparisons:

All three are good tools, and each one has their advantages. Playwright’s main advantages are that offers excellent developer experience with the fastest execution time, multiple language bindings, and several quality-of-life features.

Learning Playwright

If you want to learn how to automate your web tests with Playwright, take my tutorial, Awesome Web Testing with Playwright. All instructions and example code for the tutorial are located in GitHub. This tutorial is designed to be self-guided, so give it a try!

Test Automation University also has a Playwright learning path with introductory and advanced courses:

Playwright is an awesome new framework for modern web testing. Give it a try, and let me know what you automate!

Which web testing tool should I use?

This article is based on my talk at PyCon US 2023. The web app under test and most of the example code is written in Python, but the information presented is applicable to any stack.

There are several great tools and frameworks for automating browser-based web UI testing these days. Personally, I gravitate towards open source projects that require coding skills to use, rather than low-code/no-code automation tools. The big three browser automation tools right now are Selenium, Cypress, and Playwright. There are other great tools, too, but these three seem to be the ones everyone is talking about the most.

It can be tough to pick right right tool for your needs. In this article, let’s compare and contrast these tools.

Choosing a web app to test

I developed a small web app named Bulldoggy, the reminders app. You can clone the repository and run it yourself. The repository URL is https://github.com/AutomationPanda/bulldoggy-reminders-app.

Bulldoggy is a full-stack Python app:

It uses FastAPI for APIs.
It uses Jinja templates for HTML and CSS files.
It uses HTMX for handling dynamic interactions without needing any explicit JavaScript.
It uses TinyDB to store data.
It uses Pydantic to model data.

If you want to run it locally, all you need is Python!

The app is pretty simple. When you first load it, it presents a standard login page. I actually used ChatGPT to help me write the HTML and CSS:

After logging in, you’ll see the reminders page:

The title card at the top has the app’s name, the logo, and a logout button. On the left, there is a card for reminder lists. Here, I have different lists for Chores and Projects. On the right, there is a card for all the reminders in the selected list. So, when I click the Chores list, I see reminders like “Buy groceries” and “Walk the dog.” I can click individual reminder rows to strike them out, indicating that they are complete. I can also add, edit, or delete reminders and lists through the buttons along the right sides of the cards.

Now that we have a web app to test, let’s learn how to use the big three web testing tools to automate tests for it.

Selenium

Selenium WebDriver is the classic and still the most popular browser automation tool. It’s the original. It carries that old-school style and swagger. Selenium manipulates the browser using the WebDriver protocol, a W3C Recommendation that all major browsers have adopted. The Selenium project is fully open source. It relies on open standards, and it is run by community volunteers according to open governance policies. Selenium WebDriver offers language bindings for Java, JavaScript, C#, and – my favorite language – Python.

Selenium WebDriver works with real, live browsers through a proxy server running on the same machine as the target browser. When test automation starts, it will launch the WebDriver executable for the proxy and then send commands through it via the WebDriver protocol.

To set up Selenium WebDriver, you need to install the WebDriver executables on your machine’s system path for the browsers you intend to test. Make sure the versions all match!

Then, you’ll need to add the appropriate Selenium package(s) to your test automation project. The names for the packages and the methods for installation are different for each language. For example, in Python, you’ll probably run pip install selenium.

In your project, you’ll need to construct a WebDriver instance. The best place to do that is in a setup method within a test framework. If you are using Python with pytest, that would go into a fixture like this:

We could hardcode the browser type we want to use as shown here in the example, or we could dynamically pick the browser type based on some sort of test inputs. We may also set options on the WebDriver instance, such as running it headless or setting an implicit wait. For cleanup after the yield command, we need to explicitly quit the browser.

Here’s what a login test would look like when using Selenium in Python:

The test function would receive the WebDriver instance through the browser fixture we just wrote. When I write tests, I follow the Arrange-Act-Assert pattern, and I like to write my test steps using Given-When-Then language in comments.

The first step is, “Given the login page is displayed.” Here, we call “browser dot get” with the full URL for the Bulldoggy app running on the local machine.

The second step is, “When the user logs into the app with valid credentials.” This actually requires three interactions: typing the username, typing the password, and clicking the login button. For each of these, the test must first call “browser dot find element” with a locator to get the element object. They locate the username and password fields using CSS selectors based on input name, and they locate the login button using an XPath that searches for the text of the button. Once the elements are found, the test can call interactions on them like “send keys” and “click”.

Now, one thing to note is that these calls should probably use page objects or the Screenplay Pattern to make them reusable, but I chose to put raw Selenium code here to keep it basic.

The third step is, “Then the reminders page is displayed.” These lines perform assertions, but they need to wait for the reminders page to load before they can check any elements. The WebDriverWait object enables explicit waiting. With Selenium WebDriver, we need to handle waiting by ourselves, or else tests will crash when they can’t find target elements. Improper waiting is the main cause for flakiness in tests. Furthermore, implicit and explicit waits don’t mix. We must choose one or the other. Personally, I’ve found that any test project beyond a small demo needs explicit waits to be maintainable and runnable.

Selenium is great because it works well, but it does have some paint points:

Like we just said, there is no automatic waiting. Folks often write flaky tests unintentionally because they don’t handle waiting properly. Therefore, it is strongly recommended to use a layer on top of raw Selenium like Pylenium, SeleniumBase, or a Screenplay implementation. Selenium isn’t a full test framework by itself – it is a browser automation tool that becomes part of a test framework.
Selenium setup can be annoying. We need to install matching WebDriver executables onto the system path for every browser we test, and we need to keep their versions in sync. It’s very common to discover that tests start failing one day because a browser automatically updated its version and no longer matched its WebDriver executable. Thankfully, a new part of the Selenium project named Selenium Manager now automatically handles the executables.
Selenium-based tests have a bad reputation for slowness. Usually, poor performance comes more from the apps under test than the tool itself, but Selenium setup and cleanup do cause a performance hit.

Cypress

Cypress is a modern frontend test framework with rich developer experience. Instead of using the WebDriver protocol, it manipulates the browser via in-browser JavaScript calls. The tests and the app operate in the same browser process. Cypress is an open source project, and the company behind it sells advanced features for it as a paid service. It can run tests on Chrome, Firefox, Edge, Electron, and WebKit (but not Safari). It also has built-in API testing support. Unfortunately, due to its design, Cypress tests must be written exclusively in JavaScript (or TypeScript).

Here’s the code for the Bulldoggy login test in Cypress in JavaScript:

The steps are pretty much the same as before. Instead of creating some sort of browser object, all Cypress calls go to its cy object. The syntax is very concise and readable. We could even fit in a few more assertions. Cypress also handles waiting automatically, which makes the code less prone to flakiness.

The rich developer experience comes alive when running Cypress tests. Cypress will open a browser window that will visually execute the test in front of us. Every step is traced so we can quickly pinpoint failures. Cypress is essentially a web app that tests web apps.

While Cypress is awesome, it is JavaScript-only, which stinks for folks who use other programming languages. For example, I’m a Pythonista at heart. Would I really want to test a full-stack Python web app like Bulldoggy with a browser automation tool that doesn’t have a Python language binding? Cypress is also trapped in the browser. It has some inherent limitations, like the fact that it can’t handle more than one open tab.

Playwright

Playwright is similar to Cypress in that it’s a modern, open source test framework that is developed and maintained by a company. Playwright manipulates the browser via debug protocols, which make it the fastest of the three tools we’ve discussed today. Playwright also takes a unique approach to browsers. Instead of testing full browsers like Chrome, Firefox, and Safari, it tests the corresponding browser engines: Chromium, Firefox (Gecko), and WebKit. Like Cypress, Playwright can also test APIs, and like Selenium, Playwright offers bindings for multiple popular languages, including Python.

To set up Playwright, of course we need to install the dependency packages. Then, we need to install the browser engines. Thankfully, Playwright manages its browsers for us. All we need to do is run the appropriate “Playwright install” for the chosen language.

Playwright takes a unique approach to browser setup. Instead of launching a new browser instance for each test, it uses one browser instance for all tests in the suite. Each test then creates a unique browser context within the browser instance, which is like an incognito session within the browser. It is very fast to create and destroy – much faster than a full browser instance. One browser instance may simultaneously have multiple contexts. Each context keeps its own cookies and session storage, so contexts are independent of each other. Each context may also have multiple pages or tabs open at any given time. Contexts also enable scalable parallel execution. We could easily run tests in parallel with the same browser instance because each context is isolated.

Let’s see that Bulldoggy login test one more time, but this time with Playwright code in Python. Again, the code is pretty similar to what we saw before. The major differences between these browser automation tools is not so much the appearance of the code but rather how they work and perform:

With Playwright, all interactions happen with the “page” object. By default, Playwright will create:

One browser instance to be shared by all tests in a suite
One context for each test case
One page within the context for each test case

When we read this code, we see locators for finding elements and methods for acting upon found elements. Notice how, like Cypress, Playwright automatically handles waiting. Playwright also packs an extensive assertion library with conditions that will wait for a reasonable timeout for their intended conditions to become true.

Again, like we said for the Selenium example code, if this were a real-world project, we would probably want to use page objects or the Screenplay Pattern to handle interactions rather than raw calls.

Playwright has a lot more cool stuff, such as the code generator and the trace viewer. However, Playwright isn’t perfect, and it also has some pain points:

Playwright tests browser engines, not full browsers. For example, Chrome is not the same as Chromium. There might be small test gaps between the two. Your team might also need to test full browsers to satisfy compliance rules.
Playwright is still new. It is years younger than Selenium and Cypress, so its community is smaller. You probably won’t find as many StackOverflow articles to help you as you would for the other tools. Features are also evolving rapidly, so brace yourself for changes.

Which one should you choose?

So, now that we have learned all about Selenium, Cypress, and Playwright, here’s the million-dollar question: Which one should we use? Well, the best web test tool to choose really depends on your needs. They are all great tools with pros and cons. I wanted to compare these tools head-to-head, so I created this table for quick reference:

In summary:

Selenium WebDriver is the classic tool that historically has appealed to testers. It supports all major browsers and several programming languages. It abides by open source, standards, and governance. However, it is a low-level browser automation tool, not a full test framework. Use it with a layer on top like Serenity, Boa Constrictor, or Pylenium.
Cypress is the darling test framework for frontend web developers. It is essentially a web app that tests web apps, and it executes tests in the same browser process as the app under test. It supports many browsers but must be coded exclusively in JavaScript. Nevertheless, its developer experience is top-notch.
Playwright is gaining popularity very quickly for its speed and innovative optimizations. It packs all the modern features of Cypress with the multilingual support of Selenium. Although it is newer than Cypress and Selenium, it’s growing fast in terms of features and user base.

If you want to know which one I would choose, come talk with me about it! You can also watch my PyCon US 2023 talk recording to see which one I would specifically choose for my personal Python projects.

Passing Test Inputs into pytest

Someone recently asked me this question:

I’m developing a pytest project to test an API. How can I pass environment information into my tests? I need to run tests against different environments like DEV, TEST, and PROD. Each environment has a different URL and a unique set of users.

This is a common problem for automated test suites, not just in Python or pytest. Any information a test needs about the environment under test is called configuration metadata. URLs and user accounts are common configuration metadata values. Tests need to know what site to hit and how to authenticate.

Using config files with an environment variable

There are many ways to handle inputs like this. I like to create JSON files to store the configuration metadata for each environment. So, something like this:

dev.json
test.json
prod.json

Each one could look like this:

{
  "base_url": "http://my.site.com/",
  "username": "pandy",
  "password": "DandyAndySugarCandy"
}

The structure of each file must be the same so that tests can treat them interchangeably.

I like using JSON files because:

they are plain text files with a standard format
they are easy to diff
they store data hierarchically
Python’s standard json module turns them into dictionaries in 2 lines flat

Then, I create an environment variable to set the desired config file:

export TARGET_ENV=dev.json

In my pytest project, I write a fixture to get the config file path from this environment variable and then read that file as a dictionary:

import json
import os
import pytest

@pytest.fixture
def target_env(scope='session'):
  config_path = os.environ['TARGET_ENV']
  with open(config_path) as config_file:
    config_data = json.load(config_file)
  return config_data

I’ll put this fixture in a conftest.py file so all tests can share it. Since it uses session scope, pytest will execute it one time before all tests. Test functions can call it like this:

import requests

def test_api_get(target_env):
  url = target_env['base_url']
  creds = (target_env['username'], target_env['password'])
  response = requests.get(url, auth=creds)
  assert response.status_code == 200

Selecting the config file with a command line argument

If you don’t want to use environment variables to select the config file, you could instead create a custom pytest command line argument. Bas Dijkstra wrote an excellent article showing how to do this. Basically, you could add the following function to conftest.py to add the custom argument:

def pytest_addoption(parser):
  parser.addoption(
    '--target-env',
    action='store',
    default='dev.json',
    help='Path to the target environment config file')

Then, update the target_env fixture:

import json
import pytest

@pytest.fixture
def target_env(request):
  config_path = request.config.getoption('--target-env')
  with open(config_path) as config_file:
    config_data = json.load(config_file)
  return config_data

When running your tests, you would specify the config file path like this:

python -m pytest --target-env dev.json

Why bother with JSON files?

In theory, you could pass all inputs into your tests with pytest command line arguments or environment variables. You don’t need config files. However, I find that storing configuration metadata in files is much more convenient than setting a bunch of inputs each time I need to run my tests. In our example above, passing one value for the config file path is much easier than passing three different values for base URL, username, and password. Real-world test projects might need more inputs. Plus, configurations don’t change frequency, so it’s okay to save them in a file for repeated use. Just make sure to keep your config files safe if they have any secrets.

Validating inputs

Whenever reading inputs, it’s good practice to make sure their values are good. Otherwise, tests could crash! I like to add a few basic assertions as safety checks:

import json
import os
import pytest

@pytest.fixture
def target_env(request):
  config_path = request.config.getoption('--target-env')
  assert os.path.isfile(config_path)

  with open(config_path) as config_file:
    config_data = json.load(config_file)

  assert 'base_url' in config_data
  assert 'username' in config_data
  assert 'password' in config_data

  return config_data

Now, pytest will stop immediately if inputs are wrong.

Democratizing the Screenplay Pattern

I started Boa Constrictor back in 2018 because I loathed page objects. On a previous project, I saw page objects balloon to several thousand lines long with duplicative methods. Developing new tests became a nightmare, and about 10% of tests failed daily because they didn’t handle waiting properly.

So, while preparing a test strategy at a new company, I invested time in learning the Screenplay Pattern. To be honest, the pattern seemed a bit confusing at first, but I was willing to try anything other than page objects again. Eventually, it clicked for me: Actors use Abilities to perform Interactions. Boom! It was a clean separation of concerns.

Unfortunately, the only major implementations I could find for the Screenplay Pattern at the time were Serenity BDD in Java and JavaScript. My company was a .NET shop. I looked for C# implementations, but I didn’t find anything that I trusted. So, I took matters into my own hands and implemented the Screenplay Pattern myself in .NET. Initially, I implemented Selenium WebDriver interactions. Later, my team and I added RestSharp interactions. We eventually released Boa Constrictor as an open source project in October 2020 as part of Hacktoberfest.

With Boa Constrictor, I personally sought to reinvigorate interest in the Screenplay Pattern. By bringing the Screenplay Pattern to .NET, we enabled folks outside of the Java and JavaScript communities to give it a try. With our rich docs, examples, and videos, we made it easy to onboard new users. And through conference talks and webinars, we popularized the concepts behind Screenplay, even for non-C# programmers. It’s been awesome to see so many other folks in the testing community start talking about the Screenplay Pattern in the past few years.

I also wanted to provide a standalone implementation of the Screenplay Pattern. Since the Screenplay Pattern is a design for automating interactions, it could and should integrate with any .NET test framework: SpecFlow, MsTest, NUnit, xUnit.net, and any others. With Boa Constrictor, we focused singularly on making interactions as excellent as possible, and we let other projects handle separate concerns. I did not want Boa Constrictor to be locked into any particular tool or system. In this sense, Boa Constrictor diverged from Serenity BDD – it was not meant to be a .NET version of Serenity, despite taking much inspiration from Serenity.

Furthermore, in the design and all the messaging for Boa Constrictor, I strived to make the Screenplay Pattern easy to understand. So many folks I knew gave up on Screenplay in the past because they thought it was too complicated. I wanted to break things down so that any automation developer could pick it up quickly. Hence, I formed the soundbite, “Actors use Abilities to perform Interactions,” to describe the pattern in one line. I also coined the project’s slogan, “Better Interactions for Better Automation,” to clearly communicate why Screenplay should be used over alternatives like raw calls or page objects.

So far, Boa Constrictor has succeeded modestly well in these goals. Now, the project is pursuing one more goal: democratizing the Screenplay Pattern.

At its heart, the Screenplay Pattern is a generic pattern for any kind of interactions. The core pattern should not favor any particular tool or package. Anyone should be able to implement interaction libraries using the tools (or “Abilities”) they want, and each of those libraries should be treated equally without preference. Recently, in our plans for Boa Constrictor 3, we announced that we want to create separate packages for the “core” pattern and for each library of interactions. We also announced plans to add new libraries for Playwright and Applitools. The existing libraries – Selenium WebDriver and RestSharp – need not be the only libraries. Boa Constrictor was never meant to be merely a WebDriver wrapper or a superior page object. It was meant to provide better interactions for any kind of test automation.

In version 3.0.0, we successfully separated the Boa.Constrictor project into three new .NET projects and released a NuGet package for each:

This separation enables folks to pick the parts they need. If they only need Selenium WebDriver interactions, then they can use just the Boa.Constrictor.Selenium package. If they want to implement their own interactions and don’t need Selenium or RestSharp, then they can use the Boa.Constrictor.Screenplay package without being forced to take on those extra dependencies.

Furthermore, we continued to maintain the “classic” Boa.Constrictor package. Now, this package simply claims dependencies on the other three packages in order to preserve backwards compatibility for folks who used previous version of Boa Constrictor. As part of the upgrade from 2.0.x to 3.0.x, we did change some namespaces (which are documented in the project changelog), but the rest of the code remained the same. We wanted the upgrade to be as straightforward as possible.

The core contributors and I will continue to implement our plans for Boa Constrictor 3 over the coming weeks. There’s a lot to do, and we will do our best to implement new code with thoughtfulness and quality. We will also strive to keep everything documented. Please be patient with us as development progresses. We also welcome your contributions, ideas, and feedback. Let’s make Boa Constrictor excellent together.

Plans for Boa Constrictor 3

Boa Constrictor is the .NET Screenplay Pattern. It helps you make better interactions for better test automation!

I originally created Boa Constrictor starting in 2018 as the cornerstone of PrecisionLender‘s end-to-end test automation project. In October 2020, my team and I released it as an open source project hosted on GitHub. Since then, the Boa Constrictor NuGet package has been downloaded over 44K times, and my team and I have shared the project through multiple conference talks and webinars. It’s awesome to see the project really take off!

Unfortunately, Boa Constrictor has had very little development over the past year. The latest release was version 2.0.0 in November 2021. What happened? Well, first, I left Q2 (the company that acquired PrecisionLender) to join Applitools, so I personally was not working on Boa Constrictor as part of my day job. Second, Boa Constrictor didn’t need much development. The core Screenplay Pattern was well-established, and the interactions for Selenium WebDriver and RestSharp were battle-hardened. Even though we made no new releases for a year, the project remained alive and well. The team at Q2 still uses Boa Constrictor as part of thousands of test iterations per day!

The time has now come for new development. Today, I’m excited to announce our plans for the next phase of Boa Constrictor! In this article, I’ll share the vision that the core contributors and I have for the project – tentatively casting it as “version 3.” We will also share a rough timeline for development.

Separate interaction packages

Currently, the Boa.Constrictor NuGet package has three main parts:

The Screenplay Pattern’s core interfaces and classes
Interactions for Selenium WebDriver
Interactions for RestSharp

This structure is convenient for a test automation project that uses Selenium and RestSharp, but it forces projects that don’t use them to take on their dependencies. What if a project uses Playwright instead of Selenium, or RestAssured.NET instead of RestSharp? What if a project wants to make different kinds of interactions, like mobile interactions with Appium?

At its heart, the Screenplay Pattern is a generic pattern for any kind of interactions. In theory, the core pattern should not favor any particular tool or package. Anyone should be able to implement interaction libraries using the core pattern.

With that in mind, we intend to split the current Boa.Constrictor package into three separate packages, one for each of the existing parts. That way, a project can declare dependencies only on the parts of Boa Constrictor that it needs. It also enables us (and others) to develop new packages for different kinds of interactions.

Playwright support

One of the new interaction packages we intend to create is a library for Playwright interactions. Playwright is a fantastic new web testing framework from Microsoft. It provides several advantages over Selenium WebDriver, such as faster execution, automatic waiting, and trace logging.

We want to give people the ability to choose between Selenium WebDriver or Playwright for their web UI interactions. Since a test automation project would use only one, and since there could be overlap in the names and types of interactions, separating interaction packages as detailed in the previous section will be a prerequisite for developing Playwright support.

We may also try to develop an adapter for Playwright interactions that uses the same interfaces as Selenium interactions so that folks could switch from Selenium to Playwright without rewriting their interactions.

Applitools support

Another new interaction package we intend to create is a library for Applitools interactions. Applitools is the premier visual testing platform. Visual testing catches UI bugs that are difficult to catch with traditional assertions, such as missing elements, broken styling, and overlapping text. A Boa Constrictor package for Applitools interactions would make it easier to capture visual snapshots together with Selenium WebDriver interactions. It would also be an “optional” feature since it would be its own package.

Shadow DOM support

Shadow DOM is a technique for encapsulating parts of a web page. It enables a hidden DOM tree to be attached to an element in the “regular” DOM tree so that different parts between the two DOMs do not clash. Shadow DOM usage has become quite prevalent in web apps these days.

We intend to add support for Selenium interactions to pierce the shadow DOM. Selenium WebDriver requires extra calls to pierce the shadow DOM. Unfortunately, Boa Constrictor’s Selenium interactions currently do not support shadow DOM interactivity. Most likely, we will add new builder methods for Selenium-based Tasks and Questions that take in a locator for the shadow root element and then update the action methods to handle the shadow DOM if necessary.

.NET 7 targets

The main Boa Constrictor project, the unit tests project, and the example project all target .NET 5. Unfortunately, NET 5 is no longer supported by Microsoft. The latest release is .NET 7.

We intend to add .NET 7 targets. We will make the library packages target .NET 7, .NET 5 (for backwards compatibility), and .NET Standard 2.0 (again, for backwards compatibility). We will change the unit test and example projects to target .NET 7 exclusively. In fact, we have already made this change in version 2.0.2!

Dependency updates

Many of Boa Constrictor’s dependencies have released new versions over the past year. GitHub’s Dependabot has also flagged some security vulnerabilities. It’s time to update dependency versions. This is standard periodic maintenance for any project. Already, we have updated our Selenium WebDriver dependencies to version 4.6.

Documentation enhancements

Boa Constrictor has a doc site hosted using GitHub Pages. As we make the changes described above, we must also update the documentation for the project. Most notably, we will need to update our tutorial and example project, since the packages will be different, and we will have support for more kinds of interactions.

What’s the timeline?

The core contributors and I plan to implement these enhancements within the next three months:

Today, we just released two new versions with incremental changes: 2.0.1 and 2.0.2.
This week, we hope to split the existing package into three, which we intend to release as version 3.0.
In December, we will refresh the GitHub Issues for the project.
In January, the core contributors and I will host an in-person hackathon (a “Constrictathon”) in Cary, NC.

There is tons of work ahead, and we’d love for you to join us. Check out the GitHub repository, read our contributing guide, and join our Discord server!

Making Great Waves: 8 Software Testing Convictions

The Great Wave Off Kanagawa.

Katsushika Hokusai, 1830.

It is one of the most recognizable works of art in the world. It is so famous, it has an emoji: 🌊.

The Great Wave Off Kanagawa is a Japanese woodblock print. It is not a painting or a drawing but a print. In Japanese, the term for this type of art is ukiyo-e, which means “pictures of the floating world.” Ukiyo-e prints first appeared around the 1660s and did not decline in popularity until the Meiji Restoration two centuries later. While most artists focused on subjects of people, late masters like Hokusai captured perspectives of landscapes and nature. Here, in The Great Wave, we see a giant wave, full of energy and ferocity, crashing down onto three fast boats attempting to transport live fish to market. Its vibrant blue water and stark white peaks contrast against a yellowish-gray sky. In the distance is Mount Fuji, the highest mountain in Japan, yet it is dwarfed in perspective by the waves. In fact, the water spray from the waves appears to fall over Mount Fuji like snow. If you didn’t look closely, you might presume that Mount Fuji is just the crest of another wave.

The Great Wave is absolutely stunning. It is arguably Hokusai’s finest work. The colors and the lines reflect boldness. The claws of the wave impart vitality. The men on the boat show submission and possibly fear. The spray from the wave reveals delicacy and attention to detail. Personally, I love ukiyo-e prints like this. I travel the world to see them in person. The quality, creativity, and craftsmanship they exhibit inspire me to instill the highest quality possible into my own work.

As software quality professionals, there are several lessons we can learn from ukiyo-e masters like Hokusai. Testing is an art as much as it is engineering. We can take cues from these prolific artists in how we approach quality in our own work. In this article, I will share how we can make our own “Great Waves” using 8 software testing convictions inspired by ukiyo-e prints like The Great Wave. Let’s begin!

Conviction #1: Focus on behavior

Although we hold these Japanese woodblock prints today in high regard, they were seen as anything but fancy centuries ago in Japan. Ukiyo-e was “low” art for the common people, whereas paintings on silk scrolls were considered “high” art for the high classes.

Folks would buy these prints from local merchants for slightly more than the cost of a bowl of noodles – about $5 to $10 US dollars today – and they would use these prints to decorate their homes. By comparison, a print of The Great Wave sold at auction for $1.11 million in September 2020.

These prints weren’t very large, either. The Great Wave measures 10 inches tall by 15 inches wide, and most prints were of similar size. That made them convenient to buy at the market, carry them home, and display on the wall. To understand how the Japanese people treated these prints in their day, think about the decorations in your homes that you bought at stores like Home Goods and Target. You probably have some screen prints or posters on your walls.

Since the target consumer for ukiyo-e prints were ordinary people with working-class budgets, they needed to be affordable, popular, and recognizable. When Hokusai published The Great Wave, it wasn’t a standalone piece. It was the first print in a series named Thirty-six Views of Mount Fuji. Below are three other prints from that series. The central feature in each print is Mount Fuji, which would be instantly recognizable to any Japanese person. The various views would also be relatable.

*Fine Wind, Clear Morning* shows nice weather against the slopes of the mountain with a powerful contrast of colors.

*Thunderstorm Beneath the Summit* depicts Mount Fuji from a nearly identical profile, but with lightning striking the lower slopes of the mountain amidst a far darker palate.

*Kajikazawa in Kai Province* depicts two fisherman with Mount Fuji in the background.

The features of these prints made them valuable. Anyone could find a favorite print or two out of a series of 36. They made art accessible. They were inexpensive yet impressive. They were artsy yet accessible. Artists like Hokusai knew what people wanted, and they delivered the goods.

This isn’t any different from software development. Features add value for the users. For example, if you’re developing a banking app, folks better be able to log in securely and view their latest transactions. If those features are broken or unintuitive, folks might as well move their accounts to other banks! We, as the developers and testers, are like the ukiyo-e artists: we need to know what our customers need. We need to make products that they not only want, but they also enjoy.

Features add value. However, I would use a better word to describe this aspect of a product: behavior. Behavior is the way one acts or conducts oneself. In software, we define behaviors in terms of inputs and responses. For example, login is a behavior: you enter valid credentials, and you expect to gain access. You gave inputs, the app did something, and you got the result.

My conviction on software testing AND development is that if you focus on good software behaviors, then everything else falls into place. When you plan development work, you prioritize the most important behaviors. When you test the features, you cover the most important behaviors. When users get your new product, they gain value from those features, and hopefully you make that money, just like Hokusai did.

This is why I strongly believe in the value of Behavior-Driven Development, or BDD for short. As a set of pragmatic practices, BDD helps you and your team stay focused on the things that matter. BDD involves activities like Three Amigos collaboration, Example Mapping, and writing Gherkin. When you focus on behavior – not on shiny new tech, or story points, or some other distractions – you win big.

Conviction #2: Prioritize on risk

Ukiyo-e artists depicted more than just views of Mount Fuji. In fact, landscape scenes became popular only during the late period of woodblock printing – the 1830s to the 1860s. Before then, artists focused primarily on people: geisha, courtesans, sumo wrestlers, kabuki actors, and legendary figures. These were all characters from the “floating world,” a world of pleasure and hedonism apart from the dreary everyday life of feudal Japan.

Here is a renowned print of a kabuki actor by Sharaku, printed in 1794:

*Kabuki Actor Ōtani Oniji III as Yakko Edobei in the Play The Colored Reins of a Loving Wife*
Tōshūsai Sharaku, 1794

Sharaku was active only for one year, but he produced some of the most expressive portraits seen during ukiyo-e’s peak period. A yakko was a samurai’s henchman. In this portrait, we see Edobei ready for dirty deeds, with a stark grimace on his face and hands pulsing with anger.

Why would artists like Sharaku print faces like these? Because they would sell. Remember, ukiyo-e was not high-class art. It was a business. Artists would make a series of prints and sell them on the streets of Edo (now Tokyo). They needed to make prints that people wanted to buy. If they picked lousy or boring subjects, their prints wouldn’t sell. No soba noodles for them! So, what subjects did they choose? Celebrities. Actors. “Female beauties.” And some content that was not safe for work.

Artists prioritized their work based on business risk. They chose subjects that would be easy to sell. They pursued value. As testers, we should also prioritize test coverage based on risk.

I know there’s a popular slogan saying, “Test all the things!”, but that’s just impossible. It’s like saying, “Print all the pictures!” Modern apps are too complex to attempt any sort of “complete” or “100%” coverage. Instead, we should focus our testing efforts on the most important behaviors, the ones that would cause the most problems if they broke. Testing is ultimately a risk-mitigating activity. We do testing to de-risk problems that enter during development.

So, what does a risk-based testing strategy look like? Well, start by covering the most valuable behaviors. You can call them the MVBs. These are behaviors that are core to your app. If they break, then it’s game over. No soba noodles. For example, if you can’t log in, you’re done-zo. The MVBs should be tested before every release. They are non-negotiable test coverage. If your team doesn’t have enough resources to run these tests, then get more resources.

In addition to the MVBs, cover areas that were changed since the previous release. For example, if your banking app just added mobile deposits, then you should test mobile deposits. Things break where developers make changes. Also, look at testing different layers and aspects of the product. Not every test should be a web UI test. Add unit tests to pinpoint failures in the code. Add API tests to catch problems at the service layer. Consider aspects like security, accessibility, and visuals.

When planning these tests, try to keep them fast and atomic, covering individual behaviors instead of long workflows. Shorter tests are more reliable and give space for more coverage. And if you do have the resources for more coverage beyond the MVBs and areas of change, expand your coverage as resources permit. Keep adding coverage for the next most valuable behaviors until you either run out of time or the coverage isn’t worth the time.

Overall, ask yourself this when weighing risks: How painful would it be if a particular behavior failed? Would it ruin a user’s experience, or would they barely notice?

Conviction #3: Automate

The copy of The Great Wave shown at the top of this article is located at the Metropolitan Museum of Art in New York City. However, that’s not the only version. When ukiyo-e artists produced their prints, they kept printing copies until the woodblocks wore out! Remember, these weren’t precious paintings for the rich, they were posters for the commoners. One set of woodblocks could print thousands of impressions of popular designs for the masses. It’s estimated that there were five to eight thousand original impressions of The Great Wave, but nobody knows for sure. To this day, only a few hundred have survived. And much to my own frustration, museums that have copies do not put them on public display because the pieces are so fragile.

Here are different copies of The Great Wave from different museums:

The Great Wave Off Kanagawa — From The Metropolitan Museum of Art

Print production had to be efficient and smooth. Remember, this was a business. Publishers would make more money if they could print more impressions from the same set of woodblocks. They’d gain more renown if their prints maintained high quality throughout the lifetime of the blocks. And the faster they could get their prints to market, the sooner they could get paid and enjoy all the soba noodles.

What can we learn from this? Automate! That’s our third conviction.

What can we learn from this? Automate! Automation is a force multiplier. If Hokusai spent all his time manually laboring over one copy of The Great Wave, then we probably wouldn’t be talking about it today. But because woodblock printing was a whole process, he produced thousands of copies for everyone to enjoy. I wouldn’t call the woodblock printing process fully “automated” because it had several tedious steps with manual labor, but in Edo period Japan, it was about as automated as you could get.

Compare this to testing. If we run a test manually, we cover the target behavior one time. That’s it: lots of labor for one instance. However, if we automate that test, we can run it thousands of times. It can deliver value again and again. That’s the difference between a painting and a print.

So, how should we go about test automation? First, you should define your goals. What do you hope to achieve with automation? Do you want to speed up your testing cycles? Are you looking to widen your test coverage? Perhaps you want to empower Continuous Delivery through Continuous Testing? Carefully defining your goals from the start will help you make good decisions in your test automation strategy.

When you start automating tests, treat it like full software development. You aren’t just writing a bunch of scripts, you are developing a software system. Follow recommended practices. Use design patterns. Do code reviews. Fix bugs quickly. These principles apply whether you are using coded or codeless tools.

Another trap to avoid is delaying test automation. So many times, I’ve heard teams struggle to automate their tests because they schedule automation work as their lowest priority. They wish they could develop automation, but they just never have the time. Instead, they grind through testing their MVBs manually just to get the job done. My advice is flip that attitude right-side up. Automate first, not last. Instead of planning a few tests to automate if there’s time, plan to automate first and cover anything that couldn’t be automated with manual testing.

Furthermore, integrate automated tests into the team’s Continuous Integration system as soon as possible. Automated tests that aren’t running are dead to me. Get them running automatically in CI so they can deliver value. Running them nightly or even weekly can be a good start, as long as they run on a continuous cadence.

Finally, learn good practices. Test automation technologies are ever-evolving. It seems like new tools and frameworks hit the market all the time. If you’re new to automation or you want to catch up with the latest trends, then take time to learn. One of the best resources I can recommend is Test Automation University. TAU has about 70 courses on everything you can imagine, taught by the best instructors in the world, and it’s 100% FREE!

Now, you might be thinking, “Andy, come on, you know everything can’t be automated!” And that’s true. There are times when human intervention adds value. We see this in ukiyo-e prints, too. Here is Plum Garden at Kameido by Utagawa Hiroshige, Hokusai’s main rival. Notice the gradient colors of green and red in the background:

Plum Garden in Kameido — *Plum Garden at Kameido*
Utagawa Hiroshige, 1857

Printers added these gradients using a technique called bokashi, in which they would apply layers of ink to the woodblocks by hand. Sometimes, they would even paint layers directly on the prints. In these cases, the “automation” of the printing process was insufficient, and humans needed to manually intervene.

It’s always good to have humans test-drive software. Automation is great for functional verification, but it can’t validate user experience. Exploratory testing is an awesome complement to automated testing because it mitigates different risks.

Nevertheless, automation is able to do things it could never do before. As I said before, I work at Applitools, where we specialize in automated visual testing. Take a look at these two prints of Matsumoto Hoji’s Frog from Meika Gafu. Notice anything different between the two?

Two different versions of Matsumoto Hoji’s *Frog*.

If we use Visual AI to compare these two prints, it will quickly identify the main difference:

Applitools Visual AI identifying visual differences (highlighted in magenta) between two prints.

The signature block is in a different location! Small differences like small pixel offsets are ignored, while major differences are highlighted. If you apply this style of visual testing to your web and mobile apps, you could catch a ton of visual bugs before they cause problems for your users. Modern test automation can do some really cool tricks!

Conviction #4: Shift left and right

Mokuhanga, or woodblock printing, was a huge process with multiple steps. Artists like Hokusai and Hiroshige did not print their artwork themselves. In fact, printing required multiple roles to be successful: a publisher, an artist, a carver, and a printer.

The publisher essentially ran the process. They commissioned, financed, and distributed prints. They would even collaborate with artists on print design to keep them up with the latest trends.
The artist designed the patterns for the prints. They would sketch the patterns on washi paper and give instructions to the carver and printer on how to properly produce the prints.
The carver would chisel the artist’s pattern into a set of wooden printing blocks. Each layer of ink would have its own block. Carvers typically used a smooth, hard wood like cherry.
The printer used the artist’s patterns and carver’s woodblocks to actually make the prints. They would coat the blocks in appropriately-colored water-based inks and then press paper onto the blocks.

Quality had to be considered at every step in the process, not just at the end. If the artist was not clear about colors, then the printer might make a mistake. If the carver cut a groove too deep, then ink might not adhere to the paper as intended. If the printer misaligned a page during printing, then they’d need to throw it away – wasting time, supplies, and woodblock life – or risk tarnishing everyone’s reputation with a misprint. Hokusai was noted for his stringent quality standards for carvers and printers.

The words of W. Edwards Deming ring true:

Inspection does not improve the quality, nor guarantee quality. Inspection is too late. The quality, good or bad, is already in the product. As Harold F. Dodge said, “You cannot inspect quality into a product.”
W. Edwards deming

This is just like software development. We can substitute the word “testing” for “inspection” in Deming’s quote. Testers don’t exclusively “own” quality. Every role – business, development, and testing – has a responsibility for high-caliber work. If a product owner doesn’t understand what the customer needs, or a developer skips code reviews, or if a tester neglects an important feature, then software quality will suffer.

How do we engage the whole team in quality work? Shift left and right.

Most testers are probably familiar with the term shift left. It means, start doing testing work earlier in the development process. Don’t wait until developers are “done” and throw their code “over the fence” to be tested. Run tests continuously during development. Automate tests in-sprint. Adopt test-driven and behavior-driven practices. Require unit tests. Add test implementation to the “Definition of Done.”

But what about shift right? This is a newer phase, but not necessarily a newer practice. Shift right means, continue to monitor software quality during and after releases. Build observability into apps. Monitor apps for bugs, failures, and poor performance. Do canary deployments to see how systems respond to updates. Perform chaos testing to see how resilient environments are to outages. Issue different UIs to user groups as part of A/B testing to find out what’s most effective. And feed everything you learn back into development a la “shift left.”

The DevOps Infinity Loop
(Source: https://www.atlassian.com/devops)

The famous DevOps infinity loop shows how “shift left” and “shift right” are really all part of the same flow. If you start in the middle where the paths cross, you can see arrows pointing leftward for feedback, planning, and building. Then, they push rightward with continuous integration, deployment, monitoring, and operations. We can (and should) take all the quality measures we said before as we spin through this loop perpetually. When we plan, we should build quality in with good design and feedback from the field. When we develop, we should do testing together with coding. As we deploy, automated safety checks should give thumbs-up or thumbs-down. Post-deployment, we continue to watch, learn, and adjust.

Conviction #5: Give fast feedback

The acronym CI/CD is ubiquitous in our industry, but I feel like it’s missing something important: “CT”, or Continuous Testing. CI and CD are great for pushing code fast, but without testing, they could be pushing garbage. Testing does not improve quality directly, but continuous revelation of quality helps teams find and resolve issues fast. It demands response. Continuous Testing keeps the DevOps infinity loop safe.

Fast feedback is critical. The sooner and faster teams discover problems, the less pain those problems will cause. Think about it: if a developer is notified that their code change caused a failure within a minute, they can immediately flip back to their code, which is probably still open in an editor. If they find out within an hour, they’ll still have their code fresh in their mind. Within a day, it’ll still be familiar. A week or more later? Fuggedaboutit! Heaven forbid the problem goes undetected until a customer hits it.

Continuous testing enables fast feedback. Automation enables continuous testing. Test automation that isn’t running continuously is worthless because it provides no feedback.

Japanese woodblock printers also relied on fast feedback. If they noticed anything wrong with the prints as they pressed them, they could scrap the misprint and move on. However, since they were meticulous about quality, misprints were rare. Nevertheless, each print was unique because each impression was done manually. The amount, placement, and hue of ink could vary slightly from print to print. Over time, the woodblocks themselves wore down, too.

Here, you can see differences in the title cartouche between different prints of The Great Wave:

Differences in the title cartouche between two prints of *The Great Wave*.
(Source: https://blog.britishmuseum.org/the-great-wave-spot-the-difference/)

On the left, the outline around the title is solid, whereas on the right, the outline has breaks. This is because the keyblock had very fine ridges for printing outlines, which suffered the most from wear and tear during repeated impressions. Furthermore, if you look very closely, you can see that the Japanese characters appear bolder on the right than the left. The printer must have used more ink or pressed the title harder for the impression on the right.

Printers would need to spot these issues quickly so they could either correct their action for future prints or warn the publisher that the woodblocks were wearing down. If the print was popular, the publisher could commission a carver to carve new woodblocks to keep production going.

Conviction #6: Go lean

As I’ve said many times now, woodblock printing was a business. Ukiyo-e was commercial art, and competition was fierce. By the 1840s, production peaked with about 250 different publishers. Artists like Hokusai and Hiroshige were rivals. While today we recognize famous prints like The Great Wave, countless other prints were also made.

Publishers competed in a rat race for the best talent and the best prints. They had to be savvy. They had to build good reputations. They needed to respond to market demands for subject material. For example, Kitagawa Utamaro was famous for prints of “female beauties.”

*Two Beauties with Bamboo*
Kitagawa Utamaro, 1795

Ukiyo-e artists also took inspiration from each other. If one artist made a popular design, then other artists would copy their style. Here is a print from Hiroshige’s series, Thirty-Six Views of Mount Fuji. That’s right, Hokusai’s biggest rival made his own series of 36 prints about Mount Fuji, and he also made his own version of The Great Wave. If you can’t beat ‘em, join ‘em!

*The Sea off Satta in Suruga Province*
Utagawa Hiroshige, 1858

Publishers also had to innovate. Oftentimes, after a print had been in production for a while, they would instruct the printer to change the color scheme. Here are two versions of Hokusai’s Kajikazawa in Kai Province, from Thirty-six Views of Mount Fuji:

Kajikazawa in Kai Province — An early impression

The print on the left is an early impression. The only colors used were shades of blue. This was Hokusai’s original artistic intention. However, later prints, like the one on the right, added different colors to the palette. The fishermen now wear red coats. The land has a bokashi green-yellow gradient. The sky incorporates orange tones to contrast the blue. Publishers changed up the colors to squeeze more money out of existing designs without needing to pay artists for new work or carvers for new woodblocks.

However, sometimes when doing this, artistic quality was lost. Compare the fine detail in the land between these two prints. In the early impression, you can see dark blue shading used to pronounce the shadows on the side of the rocks, giving them height and depth, and making the fisherman appear high above the water. However, in the later impression, the green strip of land has almost no shading, making it appear flat and less prominent.

Ukiyo-e publishers would have completely agreed with today’s lean business model. Seek first and foremost to deliver value to your customers. Learn what they want. Try some designs, and if they fail, pivot to something else. When you find what works, get a full end-to-end process in place, and then continuously improve as you go. Respond quickly to changes.

Going lean is very important for software testing, too. Testing is engineering, and it has serious business value. At the same time, testing activities never seem to have as many resources as they should. Testers must be scrappy to deliver valuable quality feedback using the resources they have.

When I think about software testing going lean, I’m not implying that testers should skip tests or skimp on coverage. Rather, I’m saying that world-class systems and processes cannot be built overnight. The most important thing a team can do is build basic end-to-end feedback loops from the start, especially for test automation.

So many times, I’ve seen teams skew their test automation strategy entirely towards implementation. They spend weeks and weeks developing suites of automated tests before they set up any form of Continuous Testing. Instead of triggering tests as part of Continuous Integration, folks must manually push buttons or run commands to make them start. Other folks on the team see results sporadically, if ever. When testers open bug reports, developers might feel surprised.

I recommend teams set up Continuous Testing with feedback loops from the start. As soon as you automate your first test, move onto running it from CI and sending you notifications for results before automating your second test. Close the feedback loop. Start delivering results immediately. As you find hotspots, add more coverage. Talk with developers about the kinds of results they find most valuable. Then, grow your suite once you demonstrate its value. Increase the throughput. Turn those sidewalks into highways. Continue to iteratively improve upon the system as you go. Don’t waste time on tests that don’t matter or dashboards that nobody reads. Going lean means allocating your resources to the most valuable activities. What you’ll find is that success will snowball!

Conviction #7: Open up

Once you have a good thing going, whether it’s woodblock printing or software testing, how can you take it to the next level? Open up! Innovation stalls when you end up staring at your own belly button for too long. Outside influences inspire new creativity.

Ukiyo-e prints had a profound impact on Western art. After Japan opened up to the rest of the world in the mid-1800s, Europeans became fascinated by Japanese art, and European artists began incorporating Japanese styles and subjects into their work. This phenomenon became known as Japonisme. Here, Claude Monet, famous for his impressionist paintings, painted a picture of his wife wearing a kimono with fans adorning the wall behind her:

Vincent van Gogh in particular loved Japanese woodblock prints. He painted his own versions of different prints. Here, we see Hiroshige’s Plum Garden at Kameido side-by-side with Van Gogh’s Flowering Plum Orchard (after Hiroshige):

Plum Garden at Kameido — Hiroshige’s original print

Flowering Plum Orchard (after Hiroshige) — Hiroshige’s original print

Van Gogh was drawn to the bold lines and vibrant colors of ukiyo-e prints. There is even speculation that The Great Wave inspired the design of The Starry Night, arguably Van Gogh’s most famous painting:

The Starry Night — Hokusai’s *The Great Wave Off Kanagawa*

Notice how the shapes of the waves mirror the shapes of the swirls in the sky. Notice also how deep shades of blue contrast yellows in each. Ukiyo-e prints served as great inspiration for what became known as Modern art in the West.

Influence was also bidirectional. Not only did Japan influence the West, but the West influenced Japan! One thing common to all of the prints in Thirty-six Views of Mount Fuji is the extensive use of blue ink. Prussian blue pigment had recently come to Japan from Europe, and Hokusai’s publisher wanted to make extensive use of the new color to make the prints stand out. Indeed, they did. To this day, Hokusai is renowned for popularizing the deep shades of Prussian blue in ukiyo-e prints.

It’s important in any line of work to be open to new ideas. If Hokusai had not been willing to experiment with new pigments, then we wouldn’t have pieces like The Great Wave.

That’s why I’m a huge proponent of Open Testing. What if we open our tests like we open our source? There are so many great advantages to open source software: helping folks learn, helping folks develop better software, and helping folks become better maintainers. If we become more open in our testing, we can improve the quality of our testing work, and thus also the quality of the software products we are building. Open testing involves many things: building open source test frameworks, getting developers involved in testing, and even publicly sharing test cases and results.

Conviction #8: Show empathy

In this article, we’ve seen lots of great artwork, and we’ve learned lots of valuable lessons from it. I think ukiyo-e prints remain popular today because their subject matter focuses on the beauty of the world. Artists strived to make pieces of the “floating world” tangible for the common people.

Ukiyo-e prints revealed the supple humanity of the Japanese people, like in this print by Utagawa Kunisada:

*Twilight Snowfall at Ueno*
Utagawa Kunisada, 1850

They revealed the serene beauty of nature in harmony with civilization, like in these prints from Hiroshige’s One Hundred Famous Views of Edo:

Prints from *One Hundred Famous Views of Edo*
Utagawa Hiroshige, 1856-1858

Ukiyo-e prints also revealed ordinary people living out their lives, like this print from Hokusai’s Thirty-six Views of Mount Fuji:

*Fuji View Field in Owari Province*
Katsushika Hokusai, 1830

Art is compelling. And software, like art, is meant for people. Show empathy. Care about your customers. Remember, as a tester, you are advocating for your users. Try to help solve their problems. Do things that matter for them. Build things that actually bring them value. Be thoughtful, mindful, and humble. Don’t be a jerk.

The Golden Conviction

These eight convictions are things I’ve learned the hard way throughout my career:

Focus on behavior
Prioritize on risk
Automate
Shift left and right
Give fast feedback
Go lean
Open up
Show empathy

I live and breathe these convictions every day. Whether you are making woodblock prints or running test cases, these principles can help you do your best work.

If I could sum up these eight convictions in one line, it would be this: Be excellent in all things. If you test software, then you are both an artist and an engineer. You have a craft. Do it with excellence.