Almost every major programming language has BDD test frameworks, and Python is no exception. In fact, Python has several! So, how do they compare, and which one is best? Let’s find out.
Head-to-Head Comparison
behave
behave is one of the most popular Python BDD frameworks. Although it is not officially part of the Cucumber project, it functions very similarly to Cucumber frameworks.
Resources
- Python Testing 101: behave
- Behavior-Driven Python
- The behave project on GitHub
- The behave-parallel project on GitHub
- Read The Docs for behave
Pros
- It fully supports the Gherkin language.
- Environmental functions and fixtures make setup and cleanup easy.
- It has Django and Flask integrations.
- It is popular with Python BDD practitioners.
- Online docs and tutorials are great.
- It has PyCharm Professional Edition support.
Cons
- There’s no built-in support for parallel execution; behave-parallel, a spinoff framework, is needed for that.
- It’s a standalone framework.
- Sharing steps between feature files can be a bit of a hassle.
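To give a feel for the environmental functions mentioned above, here is a minimal sketch of a features/environment.py file. The hook names (before_all, before_scenario, after_scenario) are behave’s own, but the base_url setting and the per-scenario scratch directory are hypothetical stand-ins for whatever setup a real project would need.

```python
# features/environment.py
# A minimal sketch, not a complete setup. The resources below are illustrative.
import shutil
import tempfile


def before_all(context):
    # Runs once before any feature file is executed.
    # Attributes set here stay available to all steps.
    context.base_url = "http://localhost:8000"  # hypothetical setting


def before_scenario(context, scenario):
    # Runs before each scenario: create a fresh scratch directory.
    context.workdir = tempfile.mkdtemp(prefix="bdd-")


def after_scenario(context, scenario):
    # Runs after each scenario: remove the scratch directory again.
    shutil.rmtree(context.workdir, ignore_errors=True)
```

behave picks this file up when it sits in the features directory, and step functions can then read values such as context.base_url and context.workdir from the context object they receive.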
pytest-bdd
pytest-bdd is a plugin for pytest that lets users write tests as Gherkin feature files rather than test functions. Because it integrates with pytest, it can work with any other pytest plugins, such as pytest-html for pretty reports and pytest-xdist for parallel testing. It also uses pytest fixtures for dependency injection.
Pros
- It is fully compatible with pytest and major pytest plugins.
- It benefits from pytest’s community, growth, and goodness.
- Fixtures are a great way to manage context between steps.
- Tests can be filtered and executed together with other pytest tests.
- Step definitions and hooks are easily shared using conftest.py.
- Tabular data can be handled better for data-driven testing.
- Online docs and tutorials are great.
- It has PyCharm Professional Edition support.
Cons
- Step definition modules must have explicit declarations for feature files (via “@scenario” or the “scenarios” function).
- Scenario outline steps must be parsed differently.
radish
radish is a BDD framework with a twist: it adds new syntax to the Gherkin language. Language features like scenario loops, scenario preconditions, and constants make radish’s Gherkin variant more programmatic for test cases.
Resources
- The radish project on GitHub
- radish-bdd.io
- Read The Docs for radish
Pros
- Gherkin language extensions empower testers to write better tests.
- The website, docs, and logo are on point.
- Feature files and step definitions come out very clean.
Cons
- It’s a standalone framework with limited extensions.
- BDD purists may not like the additions to the Gherkin syntax.
lettuce
lettuce is another vegetable-themed Python BDD framework that’s been around for years. However, the website and the code haven’t been updated for a while.
Resources
- The lettuce project on GitHub
- lettuce.it
Pros
- Its code is simpler than that of the other frameworks.
- It’s tried and true.
Cons
- It lacks the feature richness of the other frameworks.
- It doesn’t appear to have much active, ongoing support.
freshen
freshen was one of the first BDD test frameworks for Python. It was a plugin for nose. However, both freshen and nose are no longer maintained, and their doc pages explicitly tell readers to use other frameworks.
My Recommendations
None of these frameworks are perfect, but some have clear advantages. Overall, my top recommendation is pytest-bdd because it benefits from the strengths of pytest. I believe pytest is one of the best test frameworks in any language because of its conciseness, fixtures, assertions, and plugins. The 2018 Python Developers Survey showed that pytest is, by far, the most popular Python test framework, too. Even though pytest-bdd doesn’t feel as polished as behave, I think some TLC from the open source community could fix that.
Here are my recommendations by use case:
- Use behave if you want a robust, clean experience with the largest community.
- Use pytest-bdd if you need to integrate with other plugins, already have a bunch of pytest tests, or want to run tests in parallel.
- Use radish if you want more programmatic control of testing at the Gherkin layer.
- Don’t use lettuce or freshen.