Python Testing 101: pytest


pytest is an awesome Python test framework. According to its homepage:

pytest is a mature full-featured Python testing tool that helps you write better programs.

Pytests may be written either as functions or as methods in classes – unlike unittest, which forces tests to be inside classes. Test classes must be named “Test*”, and test functions/methods must be named “test_*”. Test classes also need not inherit from unittest.TestCase or any other base class. Thus, pytests tend to be more concise and more Pythonic. pytest can also run unittest and nose tests.

pytest provides many advanced test framework features:

pytest is actively supported for both Python 2 and 3.


Use pip to install the pytest module. Optionally, install other plugins as well.

pip install pytest
pip install pytest-cov
pip install pytest-xdist
pip install pytest-bdd

Project Structure

The modules containing pytests should be named “test_*.py” or “*”. While the pytest discovery mechanism can find tests anywhere, pytests must be placed into separate directories from the product code packages. These directories may either be under the project root or under the Python package. However, the pytest directories must not be Python packages themselves, meaning that they should not have “” files. (My recommendation is to put all pytests under “[project root]/tests”.) Test configuration may be added to configuration files, which may go by the names “pytest.ini”, “tox.ini”, or “setup.cfg”.

[project root directory]
|‐‐ [product code packages]
|-- [test directories]
|   |-- test_*.py
|   `-- *
`-- [pytest.ini|tox.ini|setup.cfg]

Example Code

An example project named example-py-pytest is located in my GitHub python-testing-101 repository. The project has the following structure:

|-- com.automationpanda.example
|   |--
|   |--
|   `--
|-- tests
|   |--
|   `--
`-- pytest.ini

The pytest.ini file is simply a configuration file stub. Feel free to add contents for local testing needs.

# Add pytest options here

view raw
hosted with ❤ by GitHub

The com.automationpanda.example.calc_func module contains basic math functions.

def add(a, b):
return a + b
def subtract(a, b):
return a b
def multiply(a, b):
return a * b
def divide(a, b):
return a * 1.0 / b
def maximum(a, b):
return a if a >= b else b
def minimum(a, b):
return a if a <= b else b

view raw
hosted with ❤ by GitHub

The calc_func tests located in tests/ are written as functions. Test functions are preferable to test classes when testing functions without side effects.

import pytest
from com.automationpanda.example.calc_func import *
NUMBER_1 = 3.0
NUMBER_2 = 2.0
def test_add():
value = add(NUMBER_1, NUMBER_2)
assert value == 5.0
def test_subtract():
value = subtract(NUMBER_1, NUMBER_2)
assert value == 1.0
def test_subtract_negative():
value = subtract(NUMBER_2, NUMBER_1)
assert value == 1.0
def test_multiply():
value = multiply(NUMBER_1, NUMBER_2)
assert value == 6.0
def test_divide():
value = divide(NUMBER_1, NUMBER_2)
assert value == 1.5

view raw
hosted with ❤ by GitHub

The divide-by-zero test uses pytest.raises:

def test_divide_by_zero():
with pytest.raises(ZeroDivisionError) as e:
divide(NUMBER_1, 0)
assert "division by zero" in str(e.value)

view raw
hosted with ❤ by GitHub

And the min/max tests use parameterization:

@pytest.mark.parametrize("a,b,expected", [
def test_maximum(a, b, expected):
assert maximum(a, b) == expected
@pytest.mark.parametrize("a,b,expected", [
def test_minimum(a, b, expected):
assert minimum(a, b) == expected

view raw
hosted with ❤ by GitHub

The com.automationpanda.example.calc_class module contains the Calculator class, which uses the math functions from calc_func. Keeping the functional spirit, the private _do_math method takes in a reference to the math function for greater code reusability.

from com.automationpanda.example.calc_func import *
class Calculator(object):
def __init__(self):
self._last_answer = 0.0
def last_answer(self):
return self._last_answer
def _do_math(self, a, b, func):
self._last_answer = func(a, b)
return self.last_answer
def add(self, a, b):
return self._do_math(a, b, add)
def subtract(self, a, b):
return self._do_math(a, b, subtract)
def multiply(self, a, b):
return self._do_math(a, b, multiply)
def divide(self, a, b):
return self._do_math(a, b, divide)
def maximum(self, a, b):
return self._do_math(a, b, maximum)
def minimum(self, a, b):
return self._do_math(a, b, minimum)

view raw
hosted with ❤ by GitHub

While tests for the Calculator class could be written using a test class, pytest test functions are just as capable. Fixtures enable a more fine-tuned setup/cleanup mechanism than the typical xUnit-like methods found in test classes. Fixtures can also be used in conjunction with parameterized methods. The tests/ module is very similar to tests/ and shows how to use fixtures for testing a class.

import pytest
from com.automationpanda.example.calc_class import Calculator
# "Constants"
NUMBER_1 = 3.0
NUMBER_2 = 2.0
# Fixtures
def calculator():
return Calculator()
# Helpers
def verify_answer(expected, answer, last_answer):
assert expected == answer
assert expected == last_answer
# Test Cases
def test_last_answer_init(calculator):
assert calculator.last_answer == 0.0
def test_add(calculator):
answer = calculator.add(NUMBER_1, NUMBER_2)
verify_answer(5.0, answer, calculator.last_answer)
def test_subtract(calculator):
answer = calculator.subtract(NUMBER_1, NUMBER_2)
verify_answer(1.0, answer, calculator.last_answer)
def test_subtract_negative(calculator):
answer = calculator.subtract(NUMBER_2, NUMBER_1)
verify_answer(1.0, answer, calculator.last_answer)
def test_multiply(calculator):
answer = calculator.multiply(NUMBER_1, NUMBER_2)
verify_answer(6.0, answer, calculator.last_answer)
def test_divide(calculator):
answer = calculator.divide(NUMBER_1, NUMBER_2)
verify_answer(1.5, answer, calculator.last_answer)
def test_divide_by_zero(calculator):
with pytest.raises(ZeroDivisionError) as e:
calculator.divide(NUMBER_1, 0)
assert "division by zero" in str(e.value)
@pytest.mark.parametrize("a,b,expected", [
def test_maximum(calculator, a, b, expected):
answer = calculator.maximum(a, b)
verify_answer(expected, answer, calculator.last_answer)
@pytest.mark.parametrize("a,b,expected", [
def test_minimum(calculator, a, b, expected):
answer = calculator.minimum(a, b)
verify_answer(expected, answer, calculator.last_answer)

view raw
hosted with ❤ by GitHub

Personally, I prefer to write pytests as functions because they are usually cleaner and more flexible than classes. Plus, test functions appeal to my affinity for functional programming.

Test Launch

Basic Test Execution

pytest has a very powerful command line for launching tests. Simply run the pytest module from within the project root directory, and pytest will automatically discover tests.

# Find and run all pytests from the current directory
python -m pytest

# Run pytests under a given path
python -m pytest 

# Run pytests in a specific module
python -m pytest tests/

# Generate JUnit-style XML test reports
python -m pytest --junitxml=[path-to-file]

# Get command help
python -m pytest -h

The terminal output looks like this:

python -m pytest
=============================== test session starts ===============================
platform darwin -- Python 2.7.13, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /Users/andylpk247/Programming/automation-panda/python-testing-101/example-py-pytest, inifile: pytest.ini
plugins: cov-2.4.0
collected 25 items

tests/ .............
tests/ ............

============================ 25 passed in 0.11 seconds ============================

pytest also provides shorter “pytest” and “py.test” command that may be run instead of the longer “python -m pytest” module form. However, the shorter commands do not append the current path to PYTHONPATH, meaning modules under test may not be importable. Make sure to update PYTHONPATH before using the shorter commands.

# Update the Python path

# Discover and run tests using the shorter command

Code Coverage

To run code coverage with the pytest-cov plugin module, use the following command. The report types are optional, but all four types are show below. Specific paths for each report may be appended using “:”.

# Run tests with code coverage
python -m pytest [test-path] [other-options] \
      --cov= \
      --cov-report=annotate \
      --cov-report=html \
      --cov-report=term \

Code coverage output on the terminal (“term” cov-report) looks like this:

python -m pytest --cov=com --cov-report=term
============================= test session starts ==============================
platform darwin -- Python 3.6.5, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /Users/andylpk247/Programming/automation-panda/python-testing-101/example-py-pytest, inifile: pytest.ini
plugins: cov-2.4.0
collected 25 items 

tests/ .............
tests/ ............

---------- coverage: platform darwin, python 3.6.5-final-0 -----------
Name                                        Stmts   Miss  Cover
com/                                 0      0   100%
com/automationpanda/                 0      0   100%
com/automationpanda/example/         0      0   100%
com/automationpanda/example/      21      0   100%
com/automationpanda/example/       12      0   100%
TOTAL                                          33      0   100%

========================== 25 passed in 0.11 seconds ===========================<span id="mce_SELREST_start" style="overflow:hidden;line-height:0;"></span>

Parallel Testing

Parallel testing is vital for more intense testing, such as web testing. The pytest-xdist plugin makes it possible both to scale-up tests by running more than one test process and to scale-out by running tests on other machines. (As a prerequisite, machines need rsync and SSH.) The command below shows how to run multiple test sub-processes; refer to official documentation for multi-machine setup

python -m pytest -n 4
============================= test session starts ==============================
platform darwin -- Python 3.6.5, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /Users/andylpk247/Programming/automation-panda/python-testing-101/example-py-pytest, inifile: pytest.ini
plugins: xdist-1.22.2, forked-0.2, cov-2.4.0
gw0 [25] / gw1 [25] / gw2 [25] / gw3 [25]
scheduling tests via LoadScheduling
========================== 25 passed in 1.30 seconds ===========================

Pros and Cons

I’ll say it again: pytest is awesome. It is a powerful test framework with many features, yet its tests are concise and readable. It is very popular and actively supported for both versions of Python. It can handle testing at the unit, integration, and end-to-end levels. It can also be extended with plugins, notably ones for code coverage, parallel execution, and BDD. The only challenge with pytest is that advanced features (namely fixtures) have a learning curve.

My recommendation is to use pytest for standard functional testing in Python. It is one of the best and most popular test frameworks available, and it beats the pants off of alternatives like unittest and nosepytest is my go-to Python framework, period.

This article is meant to be an introduction. Check out Python Testing with pytest by Brian Okken for deeper study.


Update: On 4/21/2018, I added pytest-xdist and pytest-bdd plugins, and I made some cosmetic changes.

Update: On 7/29/2018, I added the book recommendation for “Python Testing with pytest.”

BDD 101: Frameworks

Every major programming language has a BDD automation framework. Some even have multiple choices. Building upon the structural basics from the previous post, this post provides a survey of the major frameworks available today. Since I cannot possibly cover every BDD framework in depth in this 101 series, my goal is to empower you, the reader, to pick the best framework for your needs. Each framework has support documentation online justifying its unique goodness and detailing how to use it, and I would prefer not to duplicate documentation. Use this post primarily as a reference. (Check the Automation Panda BDD page for the full table of contents.)

Major Frameworks

Most BDD frameworks are Cucumber versions, JBehave derivatives inspired by Dan North, or non-Gherkin spec runners. Some put behavior scenarios into separate files, while others put them directly into the source code.

C# and Microsoft .NET

SpecFlow, created by Gáspár Nagy, is arguably the most popular BDD framework for Microsoft .NET languages. Its tagline is “Cucumber for .NET” – thus fully compliant with Gherkin. SpecFlow also has polished, well-designed hookscontext injection, and parallel execution (especially with test thread affinity). The basic package is free and open source, but SpecFlow also sells licenses for SpecFlow+ extensions. The free version requires a unit test runner like MsTest, NUnit, or in order to run scenarios. This makes SpecFlow flexible but also feels jury-rigged and inelegant. The licensed version provides a slick runner named SpecFlow+ Runner (which is BDD-friendly) and a Microsoft Excel integration tool named SpecFlow+ Excel. Microsoft Visual Studio has extensions for SpecFlow to make development easier.

There are plenty of other BDD frameworks for C# and .NET, too. is an alternative that pairs nicely with A major difference of is that scenario steps are written directly in the code, instead of in separate text (feature) files. LightBDD bills itself as being more lightweight than other frameworks and basically does some tricks with partial classes to make the code more readable. NSpec is similar to RSpec and Mocha and uses lambda expressions heavily. Concordion offers some interesting ways to write specs, too. NBehave is a JBehave descendant, but the project appears to be dead without any updates since 2014.

Java and JVM Languages

The main Java rivalry is between Cucumber-JVM and JBehave. Cucumber-JVM is the official Cucumber version for Java and other JVM languages (Groovy, Scala, Clojure, etc.). It is fully compliant with Gherkin and generates beautiful reports. The Cucumber-JVM driver can be customized, as well. JBehave is one of the first and foremost BDD frameworks available. It was originally developed by Dan North, the “father of BDD.” However, JBehave is missing key Gherkin features like backgrounds, doc strings, and tags. It was also a pure-Java implementation before Cucumber-JVM existed. Both frameworks are widely used, have plugins for major IDEs, and distribute Maven packages. This popular but older article compares the two in slight favor of JBehave, but I think Cucumber-JVM is better, given its features and support.

The Automation panda article Cucumber-JVM for Java is a thorough guide for the Cucumber-JVM framework.

Java also has a number of other BDD frameworks. JGiven uses a fluent API to spell out scenarios, and pretty HTML reports print the scenarios with the results. It is fairly clean and concise. Spock and JDave are spec frameworks, but JDave has been inactive for years. Scalatest for Scala also has spec-oriented features. Concordion also provides a Java implementation.


Almost all JavaScript BDD frameworks run on Node.js. Jasmine and Mocha are two of the most popular general-purpose JS test frameworks. They differ in that Jasmine has many features included (like assertions and spies) that Mocha does not. This makes Jasmine easier to get started (good for beginners) but makes Mocha more customizable (good for power users). Both claim to be behavior-driven because they structure tests using “describe” and “it-should” phrases in the code, but they do not have the advantage of separate, reusable steps like Gherkin. Personally, I consider Jasmine and Mocha to be behavior-inspired but not fully behavior-driven.

Other BDD frameworks are more true to form. Cucumber provides Cucumber.js for Gherkin-compliant happiness. Yadda is Gherkin-like but with a more flexible syntax. Vows provides a different way to approach behavior using more formalized phrase partitions for a unique form of reusability. The Cucumber blog argues that Cucumber.js is best due to its focus on good communication through plain language steps, whereas other JavaScript BDD frameworks are more code-y. (Keep in mind, though, that Cucumber would naturally boast of its own framework.) Other comparisons are posted here, here, here, and here.


The two major BDD frameworks for PHP are Behat and Codeception. Behat is the official Cucumber version for PHP, and as such is seen as the more “pure” BDD framework. Codeception is more programmer-focused and can handle other styles of testing. There are plenty of articles comparing the two – here, here, and here (although the last one seems out of date). Both seem like good choices, but Codeception seems more flexible.


Python has a plethora of test frameworks, and many are BDD. behave and lettuce are probably the two most popular players. Feature comparison is analogous to Cucumber-JVM versus JBehave, respectively: behave is practically Gherkin compliant, while lettuce lacks a few language elements. Both have plugins for major IDEs. pytest-bdd is on the rise because it integrates with all the wonderful features of pytestradish is another framework that extends the Gherkin language to include scenario loops, scenario preconditions, and variables. All these frameworks put scenarios into separate feature files. They all also implement step definitions as functions instead of classes, which not only makes steps feel simpler and more independent, but also avoids unnecessary object construction.

Other Python frameworks exist as well. pyspecs is a spec-oriented framework. Freshen was a BDD plugin for Nose, but both Freshen and Nose are discontinued projects.


Cucumber, the gold standard for BDD frameworks, was first implemented in Ruby. Cucumber maintains the official Gherkin language standard, and all Cucumber versions are inspired by the original Ruby version. Spinach bills itself as an enhancement to Cucumber by encapsulating steps better. RSpec is a spec-oriented framework that does not use Gherkin.

Which One is Best?

There is no right answer – the best BDD framework is the one that best fits your needs. However, there are a few points to consider when weighing your options:

  • What programming language should I use for test automation?
  • Is it a popular framework that many others use?
  • Is the framework actively supported?
  • Is the spec language compliant with Gherkin?
  • What type of testing will you do with the framework?
  • What are the limitations as compared to other frameworks?

Frameworks that separate scenario text from implementation code are best for shift-left testing. Frameworks that put scenario text directly into the source code are better for white box testing, but they may look confusing to less experienced programmers.

Personally, my favorites are SpecFlow and pytest-bdd. At LexisNexis, I used SpecFlow and Cucumber-JVM. For Python, I used behave at MaxPoint, but I have since fallen in love with pytest-bdd since it piggybacks on the wonderfulness of pytest. (I can’t wait for this open ticket to add pytest-bdd support in PyCharm.) For skill transferability, I recommend Gherkin compliance, as well.

Reference Table

The table below categorizes BDD frameworks by language and type for quick reference. It also includes frameworks in languages not described above. Recommended frameworks are denoted with an asterisk (*). Inactive projects are denoted with an X (x).

Language Framework Type
C Catch In-line Spec
C++ Igloo In-line Spec
C# and .NET Concordion
NBehave x
SpecFlow *
In-line Spec
In-line Gherkin
Separated semi-Gherkin
In-line Spec
Separated Gherkin
In-line Gherkin
Golang Ginkgo In-line Spec
Java and JVM Cucumber-JVM *
JDave x
JGiven *
Separated Gherkin
Separated semi-Gherkin
In-line Spec
In-line Gherkin
In-line Spec
In-line Spec
JavaScript Cucumber.js *
Separated Gherkin
Separated semi-Gherkin
In-line Spec
In-line Spec
In-line Spec
Perl Test::BDD::Cucumber Separated Gherkin
PHP Behat
Codeception *
Separated Gherkin
Separated or In-line
Python behave *
freshen x
pytest-bdd *
Separated Gherkin
Separated Gherkin
Separated semi-Gherkin
In-line Spec
Separated semi-Gherkin
Separated Gherkin-plus
Ruby Cucumber *
Separated Gherkin
In-line Spec
Separated Gherkin
Swift / Objective C Quick In-line Spec


[4/22/2018] Update: I updated info for C# and Python frameworks.