Author: Andy Knight

I'm a software engineer who specializes in test automation.

Python Testing 101: behave

Warning: If you are new to BDD, then I strongly recommend reading the BDD 101 series before trying to use the behave framework.

Overview

behave is a behavior-driven development (BDD) test framework that is very similar to Cucumber, Cucumber-JVM, and SpecFlow. BDD frameworks are unique in that test cases are not written in raw programming code but rather in a plain specification language that is then “glued” to code. The “behavior specs” help to define what the behavior is, and steps can be reused by multiple test cases (or “scenarios”). This is very different from more traditional frameworks like unittest and pytest. Although behave is not an official Cucumber variant, it still uses the Gherkin language (“Given-When-Then”) for behavior specification.

Test scenarios are written in Gherkin “.feature” files. Each Given, When, and Then step is “glued” to a step definition – a Python function decorated by a matching string in a step definition module. The behave framework essentially runs feature files like test scripts. Hooks (in “environment.py”) and fixtures can also insert helper logic for test execution.

behave is officially supported on Python 2, but it also runs just fine on Python 3.

Installation

Use pip to install the behave module.

pip install behave

Project Structure

Since behave is an opinionated framework, it has a very opinionated project structure. All code must be located under a directory named “features”. Gherkin feature files and the “environment.py” file for hooks must appear under “features”, and step definition modules must appear under “features/steps”. Configuration files can store common execution settings and even override the path to the “features” directory.

Note: Step definition module names do not need to be the same as feature file names. Any step definition can be used by any feature file within the same project.

[project root directory]
|-- [product code packages]
|-- features
|   |-- environment.py
|   |-- *.feature
|   `-- steps
|       `-- *_steps.py
`-- [behave.ini|.behaverc|tox.ini|setup.cfg]
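
For example, a minimal “behave.ini” could set common options like these (a sketch; the “paths” option can override the location of the features directory):

# behave.ini (a minimal sketch)
[behave]
paths = features
junit = true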

Example Code

An example project named behavior-driven-python, located on GitHub, shows how to write tests using behave. This section will explain how the Web tests are designed.

The top layer in a behave project is the set of Gherkin feature files. Notice how the scenario below is concise, focused, meaningful, and declarative:

@web @duckduckgo
Feature: DuckDuckGo Web Browsing
  As a web surfer,
  I want to find information online,
  So I can learn new things and get tasks done.

  # The "@" annotations are tags
  # One feature can have multiple scenarios
  # The lines immediately after the feature title are just comments

  Scenario: Basic DuckDuckGo Search
    Given the DuckDuckGo home page is displayed
    When the user searches for "panda"
    Then results are shown for "panda"

Each scenario step is “glued” to a decorated Python function called a step definition. Step defs can use different types of step matchers and can also take parametrized inputs:

from behave import *
from selenium.webdriver.common.keys import Keys

DUCKDUCKGO_HOME = 'https://duckduckgo.com/'

@given('the DuckDuckGo home page is displayed')
def step_impl(context):
  context.browser.get(DUCKDUCKGO_HOME)

@when('the user searches for "{phrase}"')
def step_impl(context, phrase):
  search_input = context.browser.find_element_by_name('q')
  search_input.send_keys(phrase + Keys.RETURN)

@then('results are shown for "{phrase}"')
def step_impl(context, phrase):
  links_div = context.browser.find_element_by_id('links')
  assert len(links_div.find_elements_by_xpath('.//div')) > 0
  search_input = context.browser.find_element_by_name('q')
  assert search_input.get_attribute('value') == phrase

The “environment.py” file can specify hooks to execute additional logic before and after steps, scenarios, features, and even the whole test suite. Hooks should handle automation concerns that should not be exposed through Gherkin. For example, Selenium WebDriver setup and cleanup should be handled by hooks instead of step definitions because hooks always run even when a failure aborts the scenario, whereas any steps after the failing step are skipped.

from selenium import webdriver

def before_scenario(context, scenario):
  if 'web' in context.tags:
    context.browser = webdriver.Firefox()
    context.browser.implicitly_wait(10)

def after_scenario(context, scenario):
  if 'web' in context.tags:
    context.browser.quit()

Test Launch

behave boasts a powerful command line with many options. Below are common use case examples when running tests from the project root directory:

# Run all scenarios in the project
behave

# Run all scenarios in a specific feature file
behave features/web.feature

# Filter tests by tag
behave --tags-help
behave --tags @duckduckgo
behave --tags ~@unit
behave --tags @basket --tags @add,@remove

# Write a JUnit report (useful for Jenkins and other CI tools)
behave --junit

# Don't print skipped scenarios
behave -k

Pros and Cons

Like all BDD test frameworks, behave is opinionated. It works best for black box testing due to its behavior focus. Web testing would be a great use case because user interactions can easily be described using plain language. Reusable steps also foster a snowball effect for automation development. However, behave would not be good for unit testing or low-level integration testing – the verbosity would become more of a hindrance than a helper.

My recommendation is to use behave for black box testing if the team has bought into BDD. I would also strongly consider pytest-bdd as an alternative BDD framework because it leverages all the goodness of pytest.

5 Things I Love About SpecFlow

SpecFlow, a.k.a. “Cucumber for .NET,” is a leading BDD test automation framework for .NET. Created by Gáspár Nagy and maintained as a free, open source project on GitHub by TechTalk, SpecFlow presently has almost 3 million total NuGet downloads. I’ve used it myself at a few companies, and, I must say as an automationeer, it’s awesome! SpecFlow shares a lot in common with other Cucumber frameworks like Cucumber-JVM, but it is not a knockoff – it excels in many ways. Below are five features I love about SpecFlow.

#1: Declarative Specification by Example

SpecFlow is a behavior-driven test framework. Test cases are written as Given-When-Then scenarios in Gherkin “.feature” files. For example, imagine testing a cucumber basket:

Feature: Cucumber Basket
  As a gardener,
  I want to carry many cucumbers in a basket,
  So that I don’t drop them all.
  
  @cucumber-basket
  Scenario: Fill an empty basket with cucumbers
    Given the basket is empty
    When "10" cucumbers are added to the basket
    Then the basket is full

Notice a few things:

  • It is declarative in that steps indicate what should be done at a high level.
  • It is concise in that a full test case is only a few lines long.
  • It is meaningful in that the coverage and purpose of the test are intuitively obvious.
  • It is focused in that the scenario covers only one main behavior.

Gherkin makes it easy to specify behaviors by example. That way, everybody can understand what is happening. C# code will implement each step in lower layers. Even if your team doesn’t do the full-blown BDD process, using a BDD framework like SpecFlow is still great for test automation. Test code naturally abstracts into separate layers, and steps are reusable, too!

#2: Context is King

Safely sharing data (i.e., “context”) between steps is a big challenge in BDD test frameworks. Using static variables is a simple yet terrible solution – any class can access them, but they create collisions for parallel test runs. SpecFlow provides much better patterns for sharing context.

Context injection is SpecFlow’s simple yet powerful mechanism for inversion of control (using BoDi). Any POCOs can be injected into any step definition class, either using default values or using a specific initialization, by declaring the POCO as a step def constructor argument. Those instances will also be shared instances, meaning steps across different classes can share the same objects! For example, steps for Web tests will all need a reference to the scenario’s one WebDriver instance. The context-injected objects are also created fresh for each scenario to protect test case independence.

Another powerful context mechanism is ScenarioContext. Every scenario has a unique context: title, tags, feature, and errors. Arbitrary objects can also be stored in the context object like a Dictionary, which is a simple way to pass data between steps without constructor-level context injection. Step definition classes can access the current scenario context using the static ScenarioContext.Current variable, but a better, thread-safe pattern is to make all step def classes extend the Steps class and simply reference the ScenarioContext instance variable.

#3: Hooks for Any Occasion

Hooks are special methods that insert extra logic at critical points of execution. For example, WebDriver cleanup should happen after a Web test scenario completes, no matter the result. If the cleanup routine were put into a Then step, then it would not be executed if the scenario had a failure in a When step. Hooks are reminiscent of Aspect-Oriented Programming.

Most BDD frameworks have some sort of hooks, but SpecFlow stands out for its hook richness. Hooks can be applied before and after steps, scenario blocks, scenarios, features, and even around the whole test run. (Cucumber-JVM, by contrast, does not support global hooks.) Hooks can be selectively applied using tags, and they can be assigned an order if a project has multiple hooks of the same type. Hook methods will also be picked up from any step definition class. SpecFlow hooks are just awesome!

#4: Thorough Outline Templating

Scenario Outlines are a standard part of Gherkin syntax. They’re very useful for templating scenarios with multiple input combinations. Consider the cucumber basket again:

Feature: Cucumber Basket
  
  Scenario Outline: Add cucumbers to the basket
    Given the basket has "<initial>" cucumbers
    When "<some>" cucumbers are added to the basket
    Then the basket has "<total>" cucumbers

    Examples: Counts
      | initial | some | total |
      | 1       | 2    | 3     |
      | 5       | 3    | 8     |

All BDD frameworks can parametrize step inputs (shown in double quotes). However, SpecFlow can also parametrize the non-input parts of a step!

Feature: Cucumber Basket
  
  Scenario Outline: Use the cucumber basket
    Given the basket has "<initial>" cucumbers
    When "<some>" cucumbers are <handled-with> the basket
    Then the basket has "<total>" cucumbers

    Examples: Counts
      | initial | some | handled-with | total |
      | 1       | 2    | added to     | 3     |
      | 5       | 3    | removed from | 2     |

The step definitions for the add and remove steps are separate. The step text for the action is parametrized, even though it is not a step input:

[When(@"""(\d+)"" cucumbers are added to the basket")]
public void WhenCucumbersAreAddedToTheBasket(int count) { /* */ }

[When(@"""(\d+)"" cucumbers are removed from the basket")]
public void WhenCucumbersAreRemovedFromTheBasket(int count) { /* */ }

That’s cool!

#5: Test Thread Affinity

SpecFlow can use any unit test runner (like MSTest, NUnit, and xUnit.net), but TechTalk provides the official SpecFlow+ Runner as a paid, licensed product. I’m not associated with TechTalk in any way, but the SpecFlow+ Runner is worth the cost for enterprise-level projects. It has a friendly command line, a profile file to customize execution, parallel execution, and nice integrations.

The major differentiator, in my opinion, is its test thread affinity feature. When running tests in parallel, the major challenge is avoiding collisions. Test thread affinity is a simple yet powerful way to control which tests run on which threads. For example, consider testing a website with user accounts. No two tests should use the same user at the same time, for fear of collision. Scenarios can be tagged for different users, and each thread can have the affinity to run scenarios for a unique user. Some sort of parallel isolation management like test thread affinity is absolutely necessary for test automation at scale. Given that the SpecFlow+ Runner can handle up to 64 threads (according to TechTalk), massive scale-up is possible.

But Wait, There’s More!

SpecFlow is an all-around great test automation framework, whether or not your team is doing full BDD. Feel free to add comments below about other features you love (or *gasp* hate) about SpecFlow!

 

Django Admin Translations

Django is a fantastic Python Web framework, and one of its great out-of-the-box features is internationalization (or “i18n” for short). It’s pretty easy to add translations to nearly any string in a Django app, but what about translating admin site pages? Titles, names, and actions all need translations. Those admin pages are automatically generated, so how can their words be translated? This guide shows you how to do it easily.


Want an internationalized admin site like this? Follow this guide to find out how!

i18n Review

If you are new to translations in Django, definitely read the official Translation page first. In a nutshell, mark all strings that need translation by passing them into a translation function (in Python code) or a translation block (in Django template code). Django management commands then generate language-specific message files, translators fill in translations for the marked strings in those files, and a final command compiles the messages for app use. Note that translations require the gettext tools to be installed on your machine. Django also provides advanced logic for handling special cases like date formats and pluralization. It’s really that simple!
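
For example, marking a string for translation in Python code is a one-liner (a minimal sketch):

# views.py (a minimal sketch)
from django.utils.translation import gettext as _

def welcome_message():
    # The wrapped string is looked up in the compiled message file at runtime
    return _('Welcome to my site!')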

Initial Setup

A Django project needs some basic config before doing translations, which is needed for both the main site and the admin.

Enabling Internationalization

Make sure the following settings are given in settings.py:

# settings.py

LANGUAGE_CODE = 'en-us'  # or other appropriate code
USE_I18N = True
USE_L10N = True

They were probably added by default. The Booleans could be set to False to give apps with no internationalization a small performance boost, but we need them to be True so that translations happen.

Changing Locale Paths

By default, message files will be generated into locale directories for each app with strings marked for translation. You may optionally want to set LOCALE_PATHS to change the paths. For example, it may be easiest to put all message files into one directory like this, rather than splitting them out by app:

# settings.py

LOCALE_PATHS = [os.path.join(BASE_DIR, 'locale')]

This will avoid translation duplication between apps. It’s a good strategy for small projects, but be warned that it won’t scale well for larger projects.

Middleware for Automatic Translation

Django provides LocaleMiddleware to automatically translate pages using “context clues” like URL language prefixes, session values, and cookies. (The full pecking order is documented under How Django discovers language preference on the official doc page.) So, if a user accesses the site from China, then they should automatically receive Chinese translations! To use the middleware, add django.middleware.locale.LocaleMiddleware to the MIDDLEWARE setting in settings.py. Make sure it comes after SessionMiddleware and CacheMiddleware and before CommonMiddleware, if those other middlewares are used.

# settings.py

MIDDLEWARE = [
    # ...
    'django.middleware.locale.LocaleMiddleware',
    # ...
]

URL Pattern Language Prefixes

Getting automatic translations from context clues is great, but it’s nevertheless useful to have direct URLs to different page translations. The i18n_patterns function can easily add the language code as a prefix to URL patterns. It can be applied to all URLs for the site or only a subset of URLs (such as the admin site). Optionally, patterns can be set so that URLs without a language prefix will use the default language. The main caveat for using i18n_patterns is that it must be used from the root URLconf and not from included ones. The project’s root urls.py file should look like this:

# urls.py

from django.conf.urls.i18n import i18n_patterns
from django.contrib import admin
from django.urls import path

urlpatterns = i18n_patterns(
    # ...
    path('admin/', admin.site.urls),
    # ...

    # If no prefix is given, use the default language
    prefix_default_language=False
)

Limiting Language Choices

When adding language prefixes to URLs, I strongly recommend limiting the available languages. Django includes ready-made message files for several languages. A site would look bad if, for example, the “/fr/” prefix were available without any French translations. Set the available languages using LANGUAGES in settings.py:

# settings.py

from django.utils.translation import gettext_lazy as _

LANGUAGES = [
    ('en', _('English')),
    ('zh-hans', _('Simplified Chinese')),
]

Note that language codes follow the ISO 639-1 standard.

Doing the Translations

With the configurations above, translations can now be added for the main site! The steps below show how to add translations specifically for the admin. Unless there is a specific need, use lazy translation for all cases.

Out-of-the-Box Phrases

Admin site pages are automatically generated using out-of-the-box templates with lots of canned phrases for things like “login,” “save,” and “delete.” How do those get translated? Thankfully, Django already has translations for many major languages. Check out the list under django/contrib/admin/locale for available languages. Django will automatically use translations for these languages in the admin site – there’s nothing else you need to do! If you need a language that’s not available, I strongly encourage you to contribute new translations to the Django project so that everyone can share them. (I suspect that you could also try to manually create messages files in your locale directory, but I have not tested that myself.)

Custom Admin Titles

There are a few ways to set custom admin site titles. My preferred method is to set them in the root urls.py file. Wherever they are set, mark them for lazy translation. It’s easy to overlook them!

from django.contrib import admin
from django.utils.translation import gettext_lazy as _

admin.site.index_title = _('My Index Title')
admin.site.site_header = _('My Site Administration')
admin.site.site_title = _('My Site Management')

App Names

App names are another set of phrases that can be easily missed. Add a verbose_name field with a translatable string to every AppConfig class in the project. Do not simply try to translate the string given for the name field: Django will yield a runtime exception!

from django.apps import AppConfig
from django.utils.translation import gettext_lazy as _

class CustomersConfig(AppConfig):
    name = 'customers'
    verbose_name = _('Customers')

Model Names

Models are full of strings that need translations. Here are the things to look for:

  • Give each field a verbose_name value, since the identifiers cannot be translated.
  • Mark help texts, choice descriptions, and validator messages as translatable.
  • Add a Meta class with verbose_name and verbose_name_plural values.
  • Look out for any other strings that might need translations.

Here is an example model:

from django.db import models
from django.core.validators import RegexValidator
from django.utils.translation import gettext_lazy as _

class Customer(models.Model):
    name = models.CharField(
        max_length=100,
        help_text=_('First and last name.'),
        verbose_name=_('name'))
    address = models.CharField(
        max_length=100,
        verbose_name=_('address'))
    phone = models.CharField(
        max_length=10,
        validators=[RegexValidator(
            r'^\d{10}$',
            _('Phone must be exactly 10 digits.'))],
        verbose_name=_('phone number'))

    class Meta:
        verbose_name = _('customer')
        verbose_name_plural = _('customers')

Run the Commands

Once all strings are marked for translation, generate the message files:

# Generate message files for a desired language
python manage.py makemessages -l zh_Hans

# After adding translations to the .po files, compile the messages
python manage.py compilemessages

Warning: The language code and the locale name may be different! For example, take Simplified Chinese: the language code is “zh-hans”, but the locale name is “zh_Hans”. Notice the underscore and the caps. Locale names often include a country code to differentiate language nuances, like American English vs. British English. Refer to django/contrib/admin/locale for a list of examples.

Bonus: Admin Language Buttons

With LocaleMiddleware and i18n_patterns, pages should be automatically translated based on context or URL prefix. However, it would still be great to let the user manually switch the language from the admin interface. Clicking a button is more intuitive than fumbling with URL prefixes.

There are many ways to add language switchers to the admin site. To me, the most sensible way is to add flag icons to the title bar. Behind the scenes, each flag icon would be linked to a language-prefixed URL for the page. That way, whenever a user clicks the flag, then the same page is loaded in the desired language.


It’s pretty easy to make something like this, but it needs a few steps.

Language Code Prefix Switcher

Since URL paths use i18n_patterns, their language codes can be trusted to be uniform. A utility function can easily add or substitute the desired language code as a URL path prefix. For example, it would convert “/admin/” and “/en/admin/” into “/zh-hans/admin/” for Simplified Chinese. This function should also validate that the path and language are correct. It can be put anywhere in the project. Below is the code:

from django.conf import settings

def switch_lang_code(path, language):

    # Get the supported language codes
    lang_codes = [c for (c, name) in settings.LANGUAGES]

    # Validate the inputs
    if path == '':
        raise Exception('URL path for language switch is empty')
    elif path[0] != '/':
        raise Exception('URL path for language switch does not start with "/"')
    elif language not in lang_codes:
        raise Exception('%s is not a supported language code' % language)

    # Split the parts of the path
    parts = path.split('/')

    # Add or substitute the new language prefix
    if parts[1] in lang_codes:
        parts[1] = language
    else:
        parts[0] = "/" + language

    # Return the full new path
    return '/'.join(parts)
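
For example, a quick sanity check of the function (assuming the LANGUAGES setting shown earlier):

assert switch_lang_code('/admin/', 'zh-hans') == '/zh-hans/admin/'
assert switch_lang_code('/en/admin/', 'zh-hans') == '/zh-hans/admin/'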

Prefix Switch Template Filter

Ultimately, this function must be called by Django templates in order to provide links to language-specific pages. Thus, we need a custom template filter. The filter implementation module can be put into any app, but it must be in a sub-package named templatetags – that’s how Django knows to look for custom template tags and filters. The new filters will be easy to write because we already have the switch_lang_code function. (Separating the logic to handle the prefix from the filter itself makes both more testable and reusable.) The code is below:

# [app]/templatetags/i18n_switcher.py

from django import template
from django.template.defaultfilters import stringfilter

# switch_lang_code must be imported from wherever it was defined;
# this import path is just a placeholder for your project's layout
from i18n_switcher.utils import switch_lang_code

register = template.Library()

@register.filter
@stringfilter
def switch_i18n_prefix(path, language):
    """takes in a string path"""
    return switch_lang_code(path, language)

@register.filter
def switch_i18n(request, language):
    """takes in a request object and gets the path from it"""
    return switch_lang_code(request.get_full_path(), language)

Admin Template Override

Finally, admin templates must be overridden so that we can add new elements to the admin pages. Any admin template can be overridden by creating new templates of the same name under [project-root]/templates/admin. Parent content will be used unless explicitly overridden within the child template file. Since we want to change the title bar, create a new template file for base_site.html. A minimal sketch of such an override (using the flag image files from the project tree shown later) could look like this:
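
{# templates/admin/base_site.html – a minimal sketch of the override #}
{# assumes the "django.template.context_processors.request" context processor is enabled #}
{% extends "admin/base.html" %}
{% load i18n static i18n_switcher %}

{% block extrastyle %}
  <link rel="stylesheet" type="text/css" href="{% static 'css/custom_admin.css' %}"/>
{% endblock %}

{% block userlinks %}
  <a href="{{ request|switch_i18n:'en' }}">
    <img class="i18n-flag" src="{% static 'images/flag-usa-16.png' %}" alt="English"/>
  </a>
  <a href="{{ request|switch_i18n:'zh-hans' }}">
    <img class="i18n-flag" src="{% static 'images/flag-china-16.png' %}" alt="简体中文"/>
  </a> /
  {% if user.has_usable_password %}
    <a href="{% url 'admin:password_change' %}">{% trans 'Change password' %}</a> /
  {% endif %}
  <a href="{% url 'admin:logout' %}">{% trans 'Log out' %}</a>
{% endblock %}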

The static CSS file named css/custom_admin.css then needs only a little styling for the flag icons; something simple like this would do:
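
/* static/css/custom_admin.css – a minimal sketch */
.i18n-flag {
    vertical-align: middle;
    margin: 0 2px;
}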

Notice that the whole userlinks block had to be rewritten to fit the flag into place. The static image files for the flags are simply free flag emojis. They are hyperlinked to the appropriate language URL for the page: the switch_i18n filter is applied to the active request object to get the desired language-prefixed path. (Note: In my example code, I removed the “View Site” link because my site didn’t need it.)

Completed View

The admin site should now show the flag buttons in the title bar, with pages translated into whichever language is selected.


The files in my project needed for the admin language buttons are organized like this (without showing other files in the project):

[root]
|- i18n_switcher
|  |- templatetags
|  |  |- __init__.py
|  |  `- i18n_switcher.py
|  |- __init__.py
|  `- apps.py
|- locale
|  `- zh_Hans
|     `- LC_MESSAGES
|        |- django.mo
|        `- django.po
|- static
|  |- css
|  |  `- custom_admin.css
|  `- images
|     |- flag-china-16.png
|     `- flag-usa-16.png
`- templates
   `- admin
      `- base_site.html

As mentioned before, flag icons in the title bar are simply one way to provide easy links to translated pages. It works well when there are only a few language choices available. A different view would be better for more languages, like a dropdown, a second line in the title bar, or even a page footer.

With a bit more polishing, this would also make a nifty little Django app package that others could use for their projects. Maybe I’ll get to that someday.

Pipenv: Python Packagement for Champions!

While recently deploying a new Python Django app to Heroku, I noticed the documentation mentioned a tool I hadn’t known before: pipenv. I thought to myself, “Great, now I need to learn a new tool. What was so bad about pip and virtualenv?” So, I did my research, and BOOM! Yes. Mind blown. Life changed. This.

What It Is

Pipenv is the Python packaging and environments tool for champions.

  • It unites pip, Pipfile, and virtualenv into a sophisticated workflow with simple commands.
  • It automatically creates virtual environments for projects.
  • It automatically updates package dependencies (and their dependencies).
  • It locks versions for deterministic builds.

I strongly recommend using pipenv for all new Python projects. Python.org officially recommends it, too.

What It’s About

Packages and environments (“packagement”) are essential to Python development. Typically, Pythoneers create a virtual environment for each project and install dependent packages into it locally using pip. They then “freeze” the dependencies into a requirements.txt file so that others can easily recreate the environment. Virtual environments thus enable different projects to use different package versions without global conflict.
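
For comparison, the traditional workflow looks something like this (using Python 3’s built-in venv module):

# The traditional virtual environment workflow
python -m venv venv
source venv/bin/activate
pip install requests
pip freeze > requirements.txt
deactivate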

Unfortunately, this traditional workflow has some problems:

  • It uses multiple tools instead of one and requires many commands.
  • Different projects can do the workflow differently, which can be confusing.
  • The requirements.txt file must be manually generated and can easily fall out of date.
  • Dev-only dependencies are a hassle to separate.
  • Uninstalling packages will not remove sub-packages.
  • Dependencies with version ranges instead of fixed versions cause nondeterministic builds.

Pipenv solves these problems by combining pip, Pipfile, and virtualenv into a standard workflow that automatically handles and locks package updates.

How to Use It

See how simple it is to use pipenv with a Python project:

# Install pipenv
pip install pipenv

# Create a new project directory
mkdir panda_project
cd panda_project
echo "print('hello')" > main.py

# Init pipenv:
# Creates a virtual environment
# Then creates Pipfile and Pipfile.lock files
pipenv install

# Install a package:
# Updates the Pipfiles
pipenv install requests

# Install a dev-only package:
# Updates the Pipfiles
pipenv install --dev pytest

# Run commands in the environment
pipenv run python --version
pipenv run python main.py
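
After running the commands above, the generated Pipfile would look roughly like this (exact contents vary by pipenv and Python versions):

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
requests = "*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.7"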

More Info

There’s no need for me to repeat what other people have already said: the official pipenv documentation covers everything else you need.

Quality Metrics 101: Product Quality

New to the series? Start from the beginning!

Product quality metrics measure the excellence of a product and its features. They measure the “goodness” inherent in the product, apart from how the product was developed. High-quality processes and tests contribute to, but do not alone guarantee, high-quality products. That’s why quality must be built into the product from the start and checked throughout all phases of development. Below are metrics for assuring quality in the delivered products.


Functionality

Quality Aspect: Does the product work correctly?
Desired State: True – Features either work, or they don’t.
Metrics: Test Failure Rate – The whole purpose of functional testing is to determine which features work and which don’t. Assuming test quality is high, the test failure rate is the single best indicator of product functionality. Higher failure rates mean more broken features. Teams should target low-to-zero test failures. It may be useful to keep a failure history for each test. For large products, it may also be useful to break down failure rates by feature area.

It is imperative to recognize, however, that the test failure rate is meaningful only if test quality is high – meaning that tests have good coverage and reliability. Poor-quality tests will give untrustworthy results. For example, weak coverage could mean that failure rate is low because functionality is not truly exercised, and poor reliability could mean that failure rate is high because tests always crash. Be sure to back up any reporting on test failure rate with assurance that test quality is high (using test quality metrics).


Stability

Quality Aspect: Does the product work reliably?
Desired State: High – Product functionality should be consistently good and available.
Metrics: Build Failure Rate – The build failure rate is the proportion of builds that have failed for whatever reason over a given period of time. While process metrics focus on response times to fix broken builds, the build failure rate itself indicates the health of the product while it is being developed. It does not track how badly a build failed like test failure rate does, but instead it impartially tracks ultimate success or failure. Make sure to limit the history of builds included in the calculation to keep it relevant (such as the last 30 days or so). Occasional build failures are acceptable as long as they are fixed quickly. High build failure rates indicate product instability, which could be due to design flaws, weak pre-check-in testing, tricky bugs, or even pipeline faults.

Uptime – Uptime refers to the total time a system is usable. For example, consider a website that must go down for a one-hour service window every week – its uptime would be 167/168 = 99.4%. Not all downtime is planned, however. A bad deployment during maintenance could knock that website offline for an additional 3 hours – dragging uptime down to 97.6% for the week. This may not seem bad at first, but it’s quite terrible when considering that (a) lost time is lost money and (b) the goal of Six Sigma is 99.99966%. A product should have near-perfect availability. System monitoring tools can easily measure uptime. Low uptime indicates either poor design or lack of failover redundancy.
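
The arithmetic behind these uptime numbers is simple enough to sketch:

def uptime_percentage(hours_in_period, hours_down):
    """Percentage of the period during which the system was usable."""
    return 100.0 * (hours_in_period - hours_down) / hours_in_period

print(uptime_percentage(168, 1))  # 1-hour weekly service window: ~99.4
print(uptime_percentage(168, 4))  # plus a 3-hour bad deployment: ~97.6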


Performance

Quality Aspect: Does the product work optimally?
Desired State: Optimal – Performance should be at its best in all areas.
Metrics: There are four classic software performance metrics. They may be applied in various ways to aspects of product behavior. Ultimately, software products should have a minimal impact on the system while providing a maximal capacity for work.

Processor Usage – Processor cycles should not be needlessly wasted. Make sure algorithms are efficient in terms of computational complexity (big O) and implementation details.

Memory Usage – Watch out for both memory bloat (when features take up a lot of memory unnecessarily) and memory leaks (when memory is not freed after it is no longer needed).

Response Time – Response time, or latency, measures the turnaround time from when an action is taken to when the actor receives feedback that the action is completed. Common examples of response time are web page loading, REST API call responses, and database queries. Response time should be as short as possible.

Throughput – Throughput measures how much load a system can handle. It could refer to data I/O bandwidth, transactions per time unit, number of concurrent users, etc. Typically, higher stress on a system will cause other performance metrics to degrade. The “sweet spot” to find is the maximum throughput value that does not unacceptably impact other performance aspects.


 

Complexity

Quality Aspect: Is the software code unnecessarily complicated?
Desired State: Minimal – Simple is better than complex. Complex is better than complicated. (See The Zen of Python.)
Metrics: There are a number of code metrics that indicate complexity in various ways.

Lines of Code – One of the most rudimentary metrics is to count the lines of code. All things equal, line count indicates the magnitude of the software product, with the assumption that fewer lines will be easier to maintain. Any modern IDE (or, worst case, shell scripting) can yield line counts. However, all things are not equal, and line count alone does not indicate quality or efficiency.
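
For example, a quick shell command can tally line counts (here, for Python files under the current directory):

find . -name "*.py" | xargs wc -l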

Cyclomatic Complexity – Cyclomatic complexity measures the number of linearly independent execution paths through the code. It is more meaningful than counting sheer lines of code because it indicates the magnitude of testing needed for full coverage. Lower values are better. Cyclomatic complexity is a popular code metric, and many modern analysis tools can measure it.

Depth of Inheritance – For object-oriented languages, the depth of inheritance measures the maximum length of a class inheritance tree from child class to its ultimate root. For example, in the class inheritance tree of Tiger > Cat > Animal > Object, Tiger would have an inheritance depth of 3. Lower values are desirable because they make classes easier to understand.

There are countless other code metrics available. For example, Microsoft Visual Studio calculates the metrics above plus a maintainability index and class coupling. Halstead metrics are another way to measure complexity.


Satisfaction

Quality Aspect: Does the product satisfy the end user?
Desired State: High – The product should meet the end user’s needs, and the end user should like using it.
Metrics: Customer satisfaction is inherently subjective, so trying to measure it is difficult. Ultimately, the end users must find compelling value in the product over other alternatives, or else they won’t use it or buy it. There are many ways to attempt to gauge customer satisfaction: surveys, interviews, A/B testing, etc. Statistics and psychology also play a part, and plenty of published articles offer further ideas.

Quality Metrics 101: Process Quality

New to the series? Start from the beginning!

Process quality metrics make sure that software development practices build good, high-quality features. Healthy software processes identify and resolve issues as early as possible because later bug discovery means higher cost to fix. Quality starts at inception, when features are first brainstormed, and it carries through design, implementation, and testing. Every step in the development process should have quality checkpoints: acceptance criteria for planning, reviews for design and implementation, and reports for testing. Process quality metrics primarily focus on delivery speed or the effectiveness of feedback loops to make sure a team is responding appropriately to change.

Note: Standard software development methodologies often come with canned metrics. For example, Agile Scrum focuses heavily on velocity for determining a team’s capacity for work, while Agile Kanban focuses heavily on lead time and cycle time for measuring how fast work gets done. This article will not cover methodology-specific metrics – please refer to external resources to learn more about them. Instead, this article will cover generic aspects of process quality.


Delivery Speed

Quality Aspect: How fast are new features with high quality delivered to the end user?
Desired State: ASAP – Deliver them fast without compromising quality.
Metrics: People are impatient – they always want things as soon as possible. Fast delivery speed is thus crucial for businesses to meet client expectations and respond quickly to change. However, delivery speed is not the sole metric for success: it must be counterbalanced with safety measures. Delivery time could be minimized by committing changes directly to production, but that’s a terrible practice because the damage risk is too high. The best strategy is to pursue the fastest speed without sacrificing too much coverage.

Time to Production – Time to production focuses on the time it takes for a developer’s checked-in code to become useful to end users. It’s a decent way to judge from a business perspective how quickly new stuff gets out the door. Measure the total time for each code check-in from when it is first committed to when it is deployed to production. Source control logs and deployment histories can be pieced together to measure the total time. It may be beneficial to split check-ins by feature area and to review distributions rather than averages. Short, consistent times are desirable. Long times reveal delays in testing, fixing, and deploying changes.

Pipeline Speed – Pipeline speed is a DevOps-y metric. Measure the total start-to-end time from triggering the build pipeline to the final deployment, and measure the time taken by each stage. This will give insights into bottlenecks, such as: system resource exhaustion, network delays, being stuck in job queues, tests that are too long, etc. Knowing each stage will indicate where the greatest optimizations can occur. For example, parallel test execution can significantly reduce total pipeline time. Use pipeline speed metrics to find efficiencies, not to justify cutting vital stages. Most modern continuous integration systems should provide time metrics.

Test Coverage per Time Period – There is always a tradeoff between test coverage and delivery speed. Assuming tests have optimally efficient execution times, higher coverage means slower delivery. Whenever time periods are fixed (such as CI pipeline limits or release deadlines), the best strategy is to maximize test coverage during the available time. For this purpose, coverage should be heuristically scored in terms of feature coverage priority (or the importance of the behaviors under test), not so much in terms of numerical code coverage. Then, for each test, divide the coverage score by the execution time. Sort tests by this ratio, and select the tests with the greatest ratios until the total test execution time reaches the time limit. This greedy, knapsack-style approach comes close to maximizing test coverage for the given period. It may also be advantageous to determine a threshold score for minimal coverage – if the maximum score for a given time period is below the minimal coverage threshold, then the time period should be increased. This metric is compelling if, for example, a CI pipeline needs more time for tests but managers are hesitant to slow down delivery.
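
This greedy selection is easy to sketch in Python (assuming each test has a heuristic coverage score and a known execution time):

def select_tests(tests, time_limit):
    """Pick tests to maximize total coverage score within a time budget.

    `tests` is a list of (name, coverage_score, exec_time) tuples.
    """
    # Rank tests by coverage score per unit of execution time
    ranked = sorted(tests, key=lambda t: t[1] / t[2], reverse=True)
    selected = []
    total_time = 0.0
    for name, score, exec_time in ranked:
        if total_time + exec_time <= time_limit:
            selected.append(name)
            total_time += exec_time
    return selected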

Note: The metrics here cover speed after code is checked in, focusing on operational excellence. Metrics covering speed before code is checked in are important but are typically already covered by standard processes (like Scrum’s velocity). There are several ways to measure speed before code check-in: development time, backlog age, story completion rate, etc. Slow times before check-in indicate that a team is overloaded with work, lacks focus on priorities, or is being disrupted too frequently. However, one major caution for these metrics is that they are difficult to accurately measure, and they presume artifacts are logged precisely at event times. For example, if a story ticket is not created until a week after a new feature was first inspired, then the actual times measured will be inaccurate.


Feedback Notification

Quality Aspect: How quickly does a team identify problems?
Desired State: Fast – Fast feedback helps teams resolve issues quickly before they become more costly.
Metrics: Software development is the poster child for Murphy’s Law: anything that can go wrong will. Problems will happen. Metrics targeting perfection (such as 100% pass rates or 0-bug counts) are foolishly impossible and hopelessly destructive. Instead, metrics should gauge feedback loops – how well a team handles problems as they arise. Feedback has two parts: (1) notification time to discover and report problems, and (2) response time to fix problems. Ultimately, the sum should be minimal, but separating the parts identifies bottlenecks. This section covers notification.

Code Review Effectiveness – Code reviews are often the second line of defense against bugs (the first line being the author themselves). They grant an opportunity for other experts to inspect code for problems before fully committing changes. However, measuring the effectiveness of code reviews can be tricky. A few metrics to consider are:

  • Percentage of code check-ins that undergo review, if the team notoriously skips reviews
  • Average review turnaround time, if reviews are ignored
  • Code change size in terms of line number or another similar unit, if reviews are too large for teams to handle effectively
  • Issues caught, whenever a review successfully identifies and resolves an issue

Issue Discovery Time – The sooner issues are discovered, the less costly they are to resolve. “Issues” typically mean defects in the product (e.g., “bugs”), but they could include problems with the environment, deployment, or tests. The simplest form of issue discovery time is the measurement from when a pipeline starts to the time the issue is discovered. More advanced measurements can track time back to the root cause, such as when code containing a bug was committed, but these may be difficult to gather or may be less accurate. Issue types should be analyzed as separate distributions. Look specifically for blocking issues that appear late in the pipeline, such as critical services being down, and add checks early in the pipeline to discover them ASAP.

Bugs per Phase – Raw bug counts, like test counts, are not helpful beyond soundbites, but the proportions of bug counts per phase are useful for determining test effectiveness. A well-engineered pipeline should have meaningful phases (or “stages” or “steps”) with feedback after each one. A typical pipeline could have phases for build, unit tests, integration tests, end-to-end tests, and production deployment. Ideally, bugs should be caught in the shortest time, at the lowest level, and in the earliest phase. For example, if the majority of bugs are caught by end-to-end tests or (gasp!) in production, then the lower-level tests might need stronger coverage.


Feedback Response

Quality Aspect: How quickly does a team resolve problems once they are found?
Desired State: Fast – Again, resolve issues quickly before they become more costly.
Metrics: Time to Fix a Broken Build – Build health is vital for successful software development, especially in continuous integration. After a build is broken, it must be fixed ASAP so that it does not block progress. “Fixing” a build means that the pipeline can run to completion with an acceptable test passing rate. Fixing a build may mean:

  • Fixing a bug in the product
  • Fixing a problem in the environment, deployment, or tests
  • Reverting a code check-in that caused a bug
  • Updating tests to somehow flag the failure

Subverting safety checks (like removing tests or skipping phases) is not acceptable because it doesn’t truly fix the build’s underlying problems.

Measure the time it takes from when a pipeline reports a broken build to when the pipeline produces the first subsequent working build. The distribution of these times will reveal the team’s dedication to build stability. Clearly, shorter times are better. When broken builds are caused by code changes, the author should favor reverting check-ins over attempting fixes for faster recovery speed.

Time to Resolve Bugs – While the time to fix a broken build focuses on immediate product stability, the time to resolve bugs focuses instead on ultimate correctness. Just because a build is fixed does not mean a bug is necessarily fixed – tests may mark it as an acceptable failure, or the code containing the bug may simply be reverted. The time to resolve a bug is the total time from when the bug was first discovered to when it is fixed or otherwise closed (such as being marked as invalid or won’t fix). Bug tracker tools should easily provide this data. Bugs should be separated by severity when analyzing resolution times. Bugs should be resolved quickly, with priority given to higher-severity bugs. Resolution time metrics indicate if bugs are addressed adequately and in the proper order. Long resolution times may indicate overloaded teams, tolerance of low quality, or the need for redesign/refactoring.