
Book Review: pytest Quick Start Guide

tl;dr

Title: pytest Quick Start Guide (available from Packt and Amazon)
Author: Bruno Oliveira (@nicoddemus)
Publication: 2018 (Packt Publishing)
Summary: “Write better Python code with simple and maintainable tests” – a very readable guide to pytest’s main features.
Prerequisites: Intermediate-level Python programming.

Summary

pytest Quick Start Guide is a new book about using pytest for Python test automation. Bruno Oliveira explains not only how to use pytest but also why its features are useful. Even though this book is written as a “quick start” introduction, it nevertheless dives deep into pytest’s major features. It covers:

  1. An introduction to pytest
  2. Why pytest is superior to unittest
  3. Setting up pytest with pip and virtualenv
  4. Writing basic tests
  5. How assertions really work
  6. Common command line arguments
  7. Marks and parametrization
  8. Using and writing fixtures
  9. Popular plugins
  10. Converting unittest suites to pytest

Example code is hosted on GitHub at PacktPublishing/pytest-Quick-Start-Guide.
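To give a sense of the style the book teaches, here is a minimal sketch of my own (not taken from the book) showing plain assert statements and parametrization:

import pytest


def multiply(a, b):
    return a * b


# Plain asserts - pytest's assertion rewriting reports the failing values
def test_multiply_two_numbers():
    assert multiply(2, 3) == 6


# One test function covers many cases via parametrization
@pytest.mark.parametrize('a, b, product', [(0, 5, 0), (1, 9, 9), (-2, 4, -8)])
def test_multiply_cases(a, b, product):
    assert multiply(a, b) == product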

Praises

This book is an easy introduction to test automation in Python with pytest. Readers should have intermediate-level Python skills, but they do not need previous testing or automation experience. The progression of chapters makes it easy to start quickly and then go deeper. Oliveira has a very accessible writing style, too.

The unittest refactoring guide is a hidden gem. The unittest module is great because it comes bundled with Python, but it is also clunky and not very Pythonic. Not all teams know the best way to modernize their test suites. Oliveira provides many pieces of practical advice for making the change, at varying degrees of conversion. The big changes use pytest’s fixtures and assertions.
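As a rough illustration of that kind of conversion (my own sketch, not Oliveira’s example), a unittest class with setUp can become a plain function with a fixture and a bare assert:

import unittest

import pytest


# Before: unittest style with a class, setUp, and assertion methods
class CartTest(unittest.TestCase):
    def setUp(self):
        self.cart = []

    def test_add_item(self):
        self.cart.append('apple')
        self.assertEqual(len(self.cart), 1)


# After: pytest style with a fixture and a plain assert
@pytest.fixture
def cart():
    return []


def test_add_item(cart):
    cart.append('apple')
    assert len(cart) == 1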

A Comparison to Python Testing with pytest

Python Testing with pytest by Brian Okken is another popular pytest book. Both books are great resources for learning pytest, but each approaches the framework from a unique perspective. How do they compare? Here’s what I saw:

Oliveira’s book is a great introductory guide for pytest. Oliveira’s writing style makes the reader feel like the author is almost teaching in person. The book’s main theme is getting the reader to use the main features of pytest pragmatically for general testing needs. The main differences in content are the unittest refactoring guide and some of the plugins and fixtures covered. This book is probably the best choice for beginners who want to learn the basics of pytest quickly.

Okken’s book is introductory but also serves as a great manual for future reference. Okken’s writing style is direct and concise, which covers more material in fewer pages. Each chapter follows a consistent format: idea → code → output → explanation, with exercises at the end. Okken also covers how to create and share custom pytest plugins. This book is probably the best choice for people who want to master the ins and outs of pytest.

Ultimately, I recommend both books because they are both excellent.

Takeaways

Reading books on the same subject by different authors helps the reader learn the subject better. I’ve used pytest quite a lot myself, but I was able to learn new things from both pytest Quick Start Guide and Python Testing with pytest. Reading how experts use and think about the framework makes me a better engineer. Different writing styles and different opinions also challenge my own understanding. (It’s also funny that the authors of both pytest books have the same initials – “B.O.”)

pytest is really popular. There are now multiple good books on the subject. It’s becoming the de facto test automation framework for Python, outpacing unittest, nose, and others. These days, it seems more popular to write a pytest plugin than to create a new framework.

Book Review: Python Testing with pytest

tl;dr

Title: Python Testing with pytest
Author: Brian Okken (@brianokken)
Publication: 2017 (The Pragmatic Programmers)
Summary: How to use all the features of pytest for Python test automation – “simple, rapid, effective, and scalable.”
Prerequisites: Intermediate-level Python programming.

Summary

Python Testing with pytest is the book on pytest*. Brian Okken covers all the ins and outs of the framework. The book is useful both as a tutorial for learning pytest and as a reference for specific framework features. It covers:

  • Getting started with pytest
  • Writing simple tests as functions
  • Writing more interesting tests with assertions, exceptions, and parameters
  • Using all the different execution options
  • Writing fixtures to flexibly separate concerns and reuse code
  • Using built-in fixtures like tmpdir, pytestconfig, and monkeypatch
  • Using configuration files to control execution
  • Integrating pytest with other tools like pdb, tox, and Jenkins

Appendices also cover:

  • Using Python virtual environments
  • Installing packages with pip
  • An overview of popular plugins like pytest-xdist and pytest-cov
  • Packaging and distributing Python packages
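To give a sense of what the book’s examples look like, here is a small sketch of my own (not Okken’s code) using two of the built-in fixtures mentioned above, tmpdir and monkeypatch:

import os


def test_write_and_read_temp_file(tmpdir):
    # tmpdir provides a unique temporary directory as a py.path.local object
    data_file = tmpdir.join('data.txt')
    data_file.write('hello pytest')
    assert data_file.read() == 'hello pytest'


def test_reads_api_url_from_env(monkeypatch):
    # monkeypatch safely patches environment variables for the test's duration
    monkeypatch.setenv('API_URL', 'http://localhost:8080')  # hypothetical variable
    assert os.environ['API_URL'] == 'http://localhost:8080'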

[* Update on 9/27/2018: Check out my review of another excellent pytest book, pytest Quick Start Guide by Bruno Oliveira.]

Praises

This book is a comprehensive guide to pytest. It thoroughly covers the framework’s features and gives pointers to more info elsewhere. Even though pytest has excellent online documentation, I still recommend this book to anyone who wants to become a pytest master. Online docs tend to be fragmented, with each piece limited in scope, whereas books like this one are designed to be read progressively and in order for maximal understanding of the material.

I love how this book is example-driven. Each section follows a simple yet powerful outline: idea → code → output → explanation. Having real code with real output truly cements the point of each mini-lesson. New topics are carefully unfolded so that they build upon previous topics, making the book read like a collection of tutorials. Exercises at the end of every chapter challenge readers to practice what they learn. The formatting of each section also looks great.

The extra info on related topics like pip and virtualenv is also a nice touch. Python pros probably don’t need it, but beginners might get stuck without it.

The rocket ship logo on the cover is also really cool!

Takeaways

pytest is one of the best functional test frameworks currently available in any language. It breaks the clunky xUnit mold, in which class structures were awkwardly superimposed over test cases as if one size fits all. Instead, pytest is lightweight and minimalist because it relies on functions and fixtures. Scope is much easier to manage, code is more reusable, and side effects can more easily be avoided. pytest has taken over Python testing because it is so Pythonic.

Brian’s concise writing style has also inspired me to be more direct in my own writing. I tend to be rather verbose out of my desire to be descriptive. However, fewer words often leave a more powerful impression. They also make the message easier to comprehend. Python is beloved for its concise expressiveness, and as a Pythonista, it would be fitting for me to adopt that trait into my English.

If I had a wish list for a second edition, I’d love to see more info about assertions and other plugins (namely pytest-bdd). I think an appendix with comparisons to other Python test frameworks would also be valuable.

A Warning

I ordered a physical copy of this book directly from Amazon (not a third-party seller). Unfortunately, that copy was missing all the introductory content: the table of contents, the acknowledgements, and the preface. The first page after the front cover was Chapter 1. Befuddled, I reached out to Brian Okken (who I personally met at PyCon 2018). We suspected that it was either a misprint or a bootleg copy. Either way, we sent the evidence to the publisher, and Amazon graciously exchanged my defective copy for the real deal. Please look out for this sort of problem if you purchase a printed copy of this book for yourself!

If you want to learn more about pytest, please read my article Python Testing 101: pytest.

Book Review: Hands-On Enterprise Automation with Python

tl;dr

Title: Hands-On Enterprise Automation with Python (available from Amazon and Packt)
Author: Bassem Aly
Publication: 2018 (Packt Publishing)
Summary: Using Python packages for network, system, and infrastructure automation.
Prerequisites: Intermediate Python programming. Intermediate administrative skills.

Summary

Hands-On Enterprise Automation with Python is an excellent resource for learning how to automate common administrative tasks like running commands, scraping network config, and setting up systems. Python is a natural fit for such tasks with its impressive package library, its easy learning curve, and its concise syntax.

The number of Python tools and modules this book covers is stunning:

  • Developing Python code with PyCharm
  • Managing network devices with paramiko, netmiko, and telnetlib
  • Using regular expressions with re
  • Charting data with matplotlib
  • Templating YAML files with Jinja2
  • Multiprocessing with multiprocessing
  • Running local system commands with subprocess
  • Running remote system commands with fabric
  • Getting system info with platform
  • Sending emails with smtplib
  • System administration with Ansible
  • Interacting with a MySQL database using MySQLdb
  • Storing files in Amazon S3 using boto3
  • Packet sniffing and manipulating with scapy

Each new topic is introduced with background information, setup steps, and example code. Instructions are given for the reader to set up their own test environment and try things out. Later chapters also show how to use modules together to build more powerful automation. Most examples favor Python 2 but can be made compatible with Python 3.
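As a taste of that style, here is a sketch of my own (not from the book – the remote host and credentials are placeholders) showing local commands with subprocess and remote commands with paramiko:

import subprocess

import paramiko


def run_local(command):
    # Run a local system command and capture its output
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def run_remote(host, username, password, command):
    # Run a command on a remote host over SSH
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password)
    try:
        stdin, stdout, stderr = client.exec_command(command)
        return stdout.read().decode()
    finally:
        client.close()


print(run_local(['uname', '-a']))
print(run_remote('192.0.2.10', 'admin', 'secret', 'show version'))  # placeholder host and credentials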

Praises

The best thing about this book is how it covers an incredible breadth of topics in such a readable way. Rather than being dry manual pages pulled from a cryptic doc site, each chapter is a tutorial with explanations and real-world code examples. Readers can easily read through the book cover-to-cover or seek topics directly as a reference.

Another great thing is that the author always introduces new concepts before applying them. While intermediate skills with Python and administration are presumed as a prerequisite, the introductory chapters nevertheless show how to set up a full Python development environment with a network lab for testing. Before showing how to use any particular module, the author explains what the problem is and why the module should be used. This makes the material very accessible, especially for non-sysadmins.

Comments

  • The tasks showcased are network-heavy.
  • The setup primarily relies on Linux.
  • Many pages are dedicated to workbench setup.
  • Modules are covered at an introductory level and not in depth.

Takeaways

I thoroughly enjoyed reading Hands-On Enterprise Automation with Python. As a Software Engineer in Test, I almost always take the word “automation” to mean “test automation.” Reading this book was a healthy glimpse at other no-less-important types of automation oriented more toward admin tools and scripts. I could definitely leverage many of the modules covered in this book for my own work, too – several of them cross domains.

I really liked the three main reasons the author gave for using Python for automation: it is readable, it has so many libraries, and it has power in its conciseness. These reasons ring true for many applications, especially the point about modules. Since there’s a module for nearly everything, Python programming often simplifies to recipes for using those modules. Arguably, this book is a cookbook full of sysadmin recipes.

Reading this book also made me reminisce about my days working at MaxPoint, where I first learned Python to build a test automation framework. I used many of the same modules shared in this book for the same tasks. I felt comfort in my familiarity and validation in my past efforts.

Python Testing 101: behave

Warning: If you are new to BDD, then I strongly recommend reading the BDD 101 series before trying to use the behave framework.

Overview

behave is a behavior-driven development (BDD) test framework that is very similar to Cucumber, Cucumber-JVM, and SpecFlow. BDD frameworks are unique in that test cases are not written in raw programming code but rather in a plain specification language that is then “glued” to code. The “behavior specs” help to define what the behavior is, and steps can be reused by multiple test cases (or “scenarios”). This is very different from more traditional frameworks like unittest and pytest. Although behave is not an official Cucumber variant, it still uses the Gherkin language (“Given-When-Then”) for behavior specification.

Test scenarios are written in Gherkin “.feature” files. Each Given, When, and Then step is “glued” to a step definition – a Python function decorated by a matching string in a step definition module. The behave framework essentially runs feature files like test scripts. Hooks (in “environment.py”) and fixtures can also insert helper logic for test execution.

behave is officially supported for Python 2, but it seems to run just fine using Python 3.

Installation

Use pip to install the behave module.

pip install behave

Project Structure

Since behave is an opinionated framework, it has a very opinionated project structure. All code must be located under a directory named “features”. Gherkin feature files and the “environment.py” file for hooks must appear under “features”, and step definition modules must appear under “features/steps”. Configuration files can store common execution settings and even override the path to the “features” directory.

Note: Step definition module names do not need to be the same as feature file names. Any step definition can be used by any feature file within the same project.

[project root directory]
|-- [product code packages]
|-- features
|   |-- environment.py
|   |-- *.feature
|   `-- steps
|       `-- *_steps.py
`-- [behave.ini|.behaverc|tox.ini|setup.cfg]

Example Code

An example project named behavior-driven-python located in GitHub shows how to write tests using behave. This section will explain how the Web tests are designed.

The top layer in a behave project is the set of Gherkin feature files. Notice how the scenario below is concise, focused, meaningful, and declarative:

@web @duckduckgo
Feature: DuckDuckGo Web Browsing
  As a web surfer,
  I want to find information online,
  So I can learn new things and get tasks done.

  # The "@" annotations are tags
  # One feature can have multiple scenarios
  # The lines immediately after the feature title are just comments

  Scenario: Basic DuckDuckGo Search
    Given the DuckDuckGo home page is displayed
    When the user searches for "panda"
    Then results are shown for "panda"

Each scenario step is “glued” to a decorated Python function called a step definition. Step defs can use different types of step matchers and can also take parametrized inputs:

from behave import *
from selenium.webdriver.common.keys import Keys

DUCKDUCKGO_HOME = 'https://duckduckgo.com/'

@given('the DuckDuckGo home page is displayed')
def step_impl(context):
  context.browser.get(DUCKDUCKGO_HOME)

@when('the user searches for "{phrase}"')
def step_impl(context, phrase):
  search_input = context.browser.find_element_by_name('q')
  search_input.send_keys(phrase + Keys.RETURN)

@then('results are shown for "{phrase}"')
def step_impl(context, phrase):
  links_div = context.browser.find_element_by_id('links')
  assert len(links_div.find_elements_by_xpath('//div')) > 0
  search_input = context.browser.find_element_by_name('q')
  assert search_input.get_attribute('value') == phrase

The “environment.py” file can specify hooks to execute additional logic before and after steps, scenarios, features, and even the whole test suite. Hooks should handle automation concerns that should not be exposed through Gherkin. For example, Selenium WebDriver setup and cleanup should be handled by hooks instead of step definitions, because after-hooks always run even when a failure aborts the scenario, whereas the steps after an abortive failure will not run.

from selenium import webdriver

def before_scenario(context, scenario):
  if 'web' in context.tags:
    context.browser = webdriver.Firefox()
    context.browser.implicitly_wait(10)

def after_scenario(context, scenario):
  if 'web' in context.tags:
    context.browser.quit()

Test Launch

behave boasts a powerful command line with many options. Below are common use case examples when running tests from the project root directory:

# Run all scenarios in the project
behave

# Run all scenarios in a specific feature file
behave features/web.feature

# Filter tests by tag
behave --tags-help
behave --tags @duckduckgo
behave --tags ~@unit
behave --tags @basket --tags @add,@remove

# Write a JUnit report (useful for Jenkins and other CI tools)
behave --junit

# Don't print skipped scenarios
behave -k

Pros and Cons

Like all BDD test frameworks, behave is opinionated. It works best for black box testing due to its behavior focus. Web testing would be a great use case because user interactions can easily be described using plain language. Reusable steps also foster a snowball effect for automation development. However, behave would not be good for unit testing or low-level integration testing – the verbosity would become more of a hindrance than a helper.

My recommendation is to use behave for black box testing if the team has bought into BDD. I would also strongly consider pytest-bdd as an alternative BDD framework because it leverages all the goodness of pytest.
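For comparison, here is a minimal pytest-bdd sketch of the same search scenario (assuming the web.feature file above and a browser fixture defined in conftest.py):

from pytest_bdd import scenarios, given, when, then, parsers
from selenium.webdriver.common.keys import Keys

scenarios('web.feature')  # binds every scenario in the feature file to a pytest test


@given('the DuckDuckGo home page is displayed')
def ddg_home(browser):
    browser.get('https://duckduckgo.com/')


@when(parsers.parse('the user searches for "{phrase}"'))
def search_phrase(browser, phrase):
    search_input = browser.find_element_by_name('q')
    search_input.send_keys(phrase + Keys.RETURN)


@then(parsers.parse('results are shown for "{phrase}"'))
def results_shown(browser, phrase):
    links_div = browser.find_element_by_id('links')
    assert len(links_div.find_elements_by_xpath('//div')) > 0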

The Airing of Grievances: BDD

Behavior-Driven Development – one of my favorite blog topics. When done right, it’s a wonderful way to foster better collaboration and automation. When it’s not… well, let’s just say I got a lot of problems with bad BDD practices, and now you’re gonna hear about it!

PROCESS

Treating BDD as a Tool and Not as a Process

BDD is a process – it is a set of tools and practices designed to help teams deliver better software. BDD is not just a test automation framework; the framework is just one of the tools that support BDD. Heck, the word “development” is in the name!

Complaining that Gherkin is Too Technical

Really? Really!? Gherkin is basically just plain language with some buzzwords mixed in! It is specifically designed so that non-technical people can handle it! It is not a full-fledged programming language – it is essentially a simple format for behavior specification that automation frameworks can easily parse. The steps are meant to be read like plain English (or any other spoken language) so that better collaboration can happen. If Gherkin is “too technical” for you, then I’d hate to know what isn’t.

No Buy-In from All Roles

The three major roles on an Agile team, a.k.a. the “Three Amigos,” are biz, dev, and test (regardless of fancy names or assignments). For BDD to work well, all three role types must embrace it. Otherwise, collaboration will suffer. BDD is not just a QA thing – it’s for everyone. Biz gets better features in less time because requirements are communicated better. Dev wastes less time figuring out what biz wants and gets tests sooner. Test can start automating right away since test scenarios are defined in Gherkin from the start. Everybody wins if everybody contributes.

No Three Amigos Meetings

Three Amigos meetings are like dietary fiber supplements: they help a team stay regular with collaboration, or else development gets constipated as engineers start building crap instead of the intended behaviors. Then the crap gets blocked up as the team must rework it, meaning it could be another sprint before there’s a healthy flush of new features. Open conversations in regularly scheduled Three Amigos meetings would have avoided the whole obstruction.

Forcing QA to Write All Behavior Scenarios

BDD is not just a QA thing – it is for all roles. Pigeonholing the responsibility of writing behavior scenarios onto QA is not only unfair, it is anti-collaborative. The whole reason for writing scenarios in plain language with Gherkin is to let everyone contribute to feature behavior. Scenarios are primarily about capturing behavior, not writing tests. If tests were the main focus, then engineers could just write test cases using traditional automation frameworks directly in general-purpose programming languages like Java or Python. BDD offers the benefits of process efficiency and shifting left when the whole team helps to write behavior scenarios.

GHERKIN

Bad Gherkin

Only you can prevent bad Gherkin. Or I can – via rejected code reviews.

Typos, Poor Grammar, and Inconsistent Formatting

Gherkin needs to be readable. Steps with typos, poor grammar, and inconsistent formatting will still run fine for test automation, but they make it tough to understand the behaviors they describe. Sometimes, they can even make the meaning ambiguous.

No Double-Quotes Around Step Parameters

How do you know if something is a step parameter? “Double quotes” make it easy. However, Gherkin does not enforce double quotes around parameters. It is merely a programmers’ convention, but it’s a really helpful convention indeed.

No Tags

Tags make it super easy to filter scenarios at runtime. No tags? Good luck remembering long paths and names at runtime, or running related scenarios across different feature files together.

More Than 120 Characters per Line

Any longer is too much to comprehend. Either write the step more concisely, or split it apart. Plus, the line may go off the edge of the screen.

More Than 10 Steps per Scenario

Again, any longer is too much to comprehend. Scenarios should be short and sweet – they should concisely describe behavior. Too many steps means the scenario is too imperative or covers more than one behavior.

Multiple Behaviors per Scenario

Scenarios should not have multiple personality disorder: one scenario, one behavior. Don’t break the Cardinal Rule of BDD! So many people break this rule when they first start BDD because they are locked into procedure-driven thinking. Then, when tests fail, nobody knows exactly what behavior is the culprit. One scenario, one behavior.

Out-of-Order Step Types

Givens, Whens, and Thens each serve a specific, ordered purpose: Given some initial state, When actions are taken, Then verify an expected outcome. Jumbling them up ruins their meaning. Furthermore, duplicate When-Then pairs indicate multiple behaviors per scenario. And don’t just reassign step types to skirt the strict-ordering rule. Do it right – put integrity into the steps!

Gigantic Tables

Have you ever seen an Examples table with 13 columns? Or maybe 517 rows? I have. The horror, the horror! Tables that big make scenarios lose any semblance of specification-by-example. Make sure table rows and columns are actually needed. Use key-value lookups if the data is too gritty.

Being Imperative Rather Than Declarative

Given I’m logged into the app, when I click here, and I click there, and I type P, and I type L, and I type E, and I type A, and I type S, and I type E, and I type D, and I type O, and I type N, and I type T, and I type W, and I type R, and I type I, and I type T, and I type E, and I type S, and I type C, and I type E, and I type N, and I type A, and I type R, and I type I, and I type O, and I type S, and I type L, and I type I, and I type K, and I type E, and I type T, and I type H, and I type I, and I type S, then go directly to jail, and do not pass GO, and do not collect $200. Steps should focus more on what than how.

Prefixing Existing Test Procedure Steps with Gherkin Buzzwords

Let’s just take our existing test procedures from a tool like HP Quality Center or ALM and put the words “Given,” “When,” and “Then” in front of every step. Ta-da! We’re now doing BDD! …WRONG!! I kid you not, I have seen this happen. These people clearly never took BDD 101. It hurts to see.

AUTOMATION

Unorganized Step Definitions

Programmers like to throw their step definition methods anywhere. Add ’em to an unrelated existing class? Create a whole new class for only two new steps? Mix up the types? Who cares! Don’t bother to alphabetize them, either. Well, that’s how tech debt happens. That’s how duplicate steps get written, because originals can’t be found. Imagine a library without the Dewey Decimal System – that’s what an unorganized step def collection will be.

Putting Cleanup Code in Then Steps

Cleanup code belongs in After hooks, where it will run no matter what fails during the scenario. Writing Then steps to do cleanup not only breaks step type integrity, but it also means the cleanup code will not run if a previous step aborts!

Catching and Burying All Exceptions

Here’s something I see all the time in automation code (and not just for BDD):

// FYI - This is Java, but the same thing can happen in any language
@When("^do something$")
public void doSomething() throws Throwable {
  try {
    callStuff();
  }
  catch (Exception e) {
    System.out.println(e.getMessage());
  }
}

The entirety of a step (or even a whole test) is surrounded by a try-catch that catches every exception. THIS STEP CAN NEVER REGISTER A FAILURE! Even if there is a failed assertion or, worse, an exception that ought to abort the test, it will get caught and buried with not much more than a slight whimper in the log. The test will then carry on to the next step, which will probably not work, either. I’ve seen projects with this sort of exception handling around every single step definition. In modern test frameworks, the framework catches all exceptions at the highest level, registers the test as failed, and moves on safely to the next test. There is no need to catch an exception unless the test can recover from it.

Changing Steps Without Testing Affected Scenarios

Sharing steps is a wonderful thing, but changing steps without testing all scenarios that use them is a terrible thing. I’ve seen people change step text or step def code and test only their new scenarios. Meanwhile, in the continuous integration environment, a dozen other tests using those steps started failing. (Hell, I’ve seen people push code that doesn’t even compile, but that’s another grievance.)

Multiple Names for the Same Step

Just because you can do something doesn’t mean you should. Different names for the same step may be useful for readability, but please keep the name variants limited.

No Dependency Injection

Dependency injection is the best way to share objects in an automation framework. (Singletons work well, too, but DI allows more careful control of scope.) Many frameworks like Cucumber-JVM even integrate with existing DI frameworks like PicoContainer and Spring. DON’T MAKE NON-CONSTANT VARIABLES GLOBAL! DON’T BLINDLY MAKE THINGS “STATIC” JUST TO SHARE THEM! Globals (or “statics” in Java/C# like languages) are dangerous: they can be easily misused, they are a nightmare to trace, and they can break multithreaded execution. Just use the appropriate design pattern: dependency injection.

The Airing of Grievances: Agile

Agile has essentially replaced the Waterfall model as the “right” software development methodology. It’s a really great process when it’s done right, but people ruin it when they do it wrong. And, oh, how badly it can go wrong. I got a lot of problems with bad Agile practices, and now you’re gonna hear about it!

Breaking the Rules

Agile is a lot like the board game Monopoly. The rules are long and complicated, but they are designed to make the game efficient. However, for some reason, everyone insists on making up their own rules for the game, rather than following the official instructions. For example, players won’t put a property up for auction when they land on it and refuse to buy it, or they will build houses before securing a monopoly. Then, as a result, the game goes on forever and loses its fun. In Agile, every organization seems to want to do things their own special way (as many of these grievances describe), and it almost never goes well when they do. The rules are not meant to be broken, and if they are, there will be consequences.

Going Rogue

Agile is meant to keep people focused on the most important tasks. Much time is spent planning and pivoting to stay on top of priorities. Team members should not deviate from committed work. Don’t go rogue! Don’t work on uncommitted tasks! If something is absolutely pressing, then talk with the scrum master to change the commitments.

Teams that are Too Big

How big is your Agile team? If the answer has more than one digit, then the team is too darn big. The ideal size is 5-9 people because communication becomes too hard with more. Large teams just don’t scale – it’s the law of diminishing returns.

Long Meetings

Nobody wants to be stuck in a long, boring meeting. While there are many Agile ceremonies (planning, grooming, stand-up, review, and retrospective), their meetings are meant to be efficient and productive. Stand-ups should be 15 minutes tops – nobody should ever need to give more than three sentences for their status, and nobody really wants to hear anything longer anyway! People should come prepared for planning and grooming so they don’t literally take all day. Demos should be short and sweet. Keep things moving!

Putting People on More Than One Team

Nobody should be cursed to provide deliverables for more than one Agile team. That’s not fair to the individual, who must pull double duty in meetings, nor is it fair to the teams, who don’t have a dedicated resource for their work. This applies to every role: developer, tester, product owner, or scrum master. It also burns people out very quickly.

Too Many Top Priorities

I was once part of an Agile team where the product owner issued about a dozen “top priorities.” For. Every. Sprint. Our team had no clue what was really important.

Agonizing Over Story Points

Story points are meant to be sizing estimates for velocity. They don’t need to be perfectly accurate. They shouldn’t track hours. Don’t make big fights over it. Don’t go back and change values. Don’t twist planning poker into a political gambit. PLEASE!

Missing User Story Descriptions

The user story is the primary work artifact. It tells how a new feature should work from the perspective of the user… or, at least it should. If your user story contains just one line (like saying “Build the profile page”), then you just might be doing it wrong. Write user stories in the “As a ___, I want ___, so that ___” format, and provide extra descriptions to help the team understand what the story covers. Non-descriptive stories lead to poorly developed features.

Missing Acceptance Criteria

How do we know when a story is complete? If there’s no acceptance criteria, we don’t! Testers also won’t know what to check. Please write helpful acceptance criteria. A bullet list is fine, and Gherkin would be even better.

Not Including Testing and Automation in the Definition of Done

No. No. No. No. No. No. NO! A story is not complete if it is not tested. It must not be accepted without tests passing and automated. Otherwise, be prepared for an avalanche of technical debt as bugs pile up and the team can’t keep up. The premise of Agile is to deliver small, working features in iterations. Testing must be included! Don’t create separate stories for testing. Don’t push it off to the next sprint. If a team cannot get testing done, then perhaps it should increase story point sizings to include testing and/or commit to less work during a sprint.

Blaming QA for Incomplete Stories

I once heard a developer say bluntly to my automation team, “QA is the bottleneck.” Don’t shoot the messenger! Tests fail because the product under test has problems. Many times, testers don’t even receive builds until very late in the sprint. When stories don’t get done, don’t start a blame game – it’s the whole team’s fault. Try shifting left (perhaps by using BDD) or committing to less work per sprint.

Ignoring Technical Debt

Technical debt is the cost of consequences from poor development decisions. Examples may include: using single-threading when multi-threading is needed, avoiding design patterns, and even putting off building a test automation framework. Product owners don’t seem to like tech debt tasks because they don’t deliver new features. Unfortunately, tech debt will often cripple a team’s ability to deliver new features – pay now or pay later. Don’t ignore tech debt!

Confusing Agile with “Short Waterfall”

Agile is meant to be a process paradigm shift. It is not meant to be a condensed version of the Waterfall model. Sprints should be short. Responsibilities should be shared. Teams should be self-empowered. Break down silos and become truly Agile!

Using “Agile” and “Lean” Interchangeably

The Lean Startup is a methodology for starting a new business using minimal overhead and reacting quickly to lessons learned. It involves using Agile for product development, but it encompasses so much more than just Agile. Don’t use the terms interchangeably! Get on point with your buzzword bingo game.

Misusing the Term “Continuous Integration”

A nightly build is not CI. A weekly regression run is not CI. Manually-triggered tests are not CI. Manual deployments are not CI. Hand-written test reports are not CI. Don’t lie to yourself – CI is continuous integration, and everything must be automatic.

Forcing Scrum When Kanban May Be Better

Scrum is probably the most widely used Agile process, to the point where most people presume “Agile” means “Scrum.” However, Scrum is not appropriate for all teams. Kanban is a much better process when work items must be done “just in time” – like tech support tickets, build deployments, system maintenance, or emergency recoveries. Good candidates for Kanban are IT help desks and DevOps teams. I’ve used Kanban on automation tools/frameworks teams very successfully. Don’t shoehorn everyone into Scrum.

Hanging Agile Manifesto Posters on the Wall

What are you, Communist?

Complaining about Agile

Complaining doesn’t make it better! Honestly, in my experience, the worst complainers are old-school people who just don’t like change. Then, problems become a self-fulfilling prophecy. Or, they try to break rules and then gripe when things don’t work. If your complaint is about Agile in general, then go take a long, hard look in the mirror. However, if you find a problem in how your team is doing Agile, then bring it up during the retrospective – that’s Agile’s auto-correct mechanism. Complaining for complaint’s sake drags everybody down.

The Airing of Grievances: Selenium WebDriver

Selenium WebDriver is the de facto standard for Web UI automation. It’s a great tool, but like anything good, it can also be misused. And that’s where I have grievances. I got a lot of problems with Selenium WebDriver abuses, and now you’re gonna hear about it!

WebDriver “Unit Tests”

“WebDriver unit tests” are like square circles – definitionally, they are logical fallacies. In my book, a unit test must be white box, meaning it has direct access to the product code. However, Web UI tests using WebDriver are inherently black box tests because they interact with an actively running website. Thus, they must be above-unit tests by definition. Don’t call them unit tests!

Making Every Test a Web Test

NO! The Testing Pyramid is vital to a healthy overall testing strategy. Web tests are great because they test a website in the ways a user would interact with it, but they have a significant cost. As compared to lower-level tests, they are more fragile, they require more development resources, and they take much more time to run. Browser differences may also affect testing. Furthermore, problems in lower level components should be caught at those lower levels! Sure, HTTP 400s and 500s will appear at the web app layer, but they would be much faster to find and fix with service layer tests. Different layers of testing mitigate risk at their optimal returns-on-investment.

No WebDriver Cleanup

Every WebDriver instance spawns a new system process for “driving” web browser interactions. When the test automation process completes, the WebDriver process may not necessarily terminate with it. It is imperative that test automation quits the WebDriver instance once testing is complete. Make sure cleanup happens even when abortive exceptions occur! Otherwise, zombie WebDriver processes may linger on the test machine, causing any number of problems: locked files and directories, high memory usage, wasted CPU cycles, and blocked network ports. These problems can cripple a system and even break future test runs, especially on shared testing machines (like Jenkins nodes). Please, only you can stop the zombie apocalypse – always quit WebDriver instances!
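In Python, one simple way to guarantee cleanup is a pytest fixture whose teardown always runs, even when the test fails (a sketch, assuming pytest and Firefox):

import pytest
from selenium import webdriver


@pytest.fixture
def browser():
    driver = webdriver.Firefox()
    driver.implicitly_wait(10)
    yield driver
    driver.quit()  # teardown runs even if the test body raises an exception


def test_home_page_title(browser):
    browser.get('https://duckduckgo.com/')
    assert 'DuckDuckGo' in browser.title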

Using “Close” Instead of “Quit”

Regardless of programming language, the WebDriver class has both “close” and “quit” methods. “Close” will close the current browser tab or window, while “quit” will close all windows and terminate the WebDriver process. Make sure to quit during final cleanup. Doing only a close may result in zombie WebDriver processes. It’s a rookie mistake.

Not Optimizing Setup/Cleanup with Service Calls

Web tests are notoriously slow. Whenever you can speed them up, do it! Some tests can be optimized by preparing initial state with service calls. For example, let’s say a user visiting a car dealership website needs to have favorite cars pre-selected for a comparison page test. Rather than navigating to a bunch of car pages and clicking a “favorite” icon, make a setup routine that calls a service to select favorites. Not all tests can do this sort of optimization, but definitely do it for those that can!
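Here is a rough sketch of the idea – the endpoint, payload, and token below are made up purely for illustration:

import requests


def add_favorite_cars(session_token, car_ids):
    # Pre-select favorite cars through a (hypothetical) service call
    # instead of clicking the "favorite" icon on a bunch of car pages.
    response = requests.post(
        'https://dealership.example.com/api/favorites',  # made-up endpoint
        json={'car_ids': car_ids},
        headers={'Authorization': 'Bearer ' + session_token},
        timeout=10)
    response.raise_for_status()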

Web Elements with No ID

Developers, we need to talk – give every significant element a unique ID. PLEASE! WebDriver calls are so much easier to write and so much more robust to run when locator queries can use IDs instead of CSS selectors or XPaths. Let’s pick ID names during our Three Amigos meetings so that I can program the tests while you develop the features. Determining which elements are important should be easy based on our wireframes. You will save us automators so much time and frustration, since we won’t need to dig through HTML and wonder why our XPaths don’t work.

Changing Web Elements Without Warning

Hey, another thing, developers – don’t change the web page structure without telling us! WebDriver locator queries will break if you change the web elements. Even a seemingly innocuous change could wipe out hundreds of tests. Automation effort is non-trivial. Changes must be planned and sized with automation considerations in mind.

Not Using the Page Object Model

The Page Object Model is a widely-used design pattern for modeling a web page (or components on a web page) as an object in terms of its web elements and user interactions with it. It abstracts Web UI interactions into a common layer that can be reused by many different tests. (The Screenplay pattern, also good, is an evolution of the Page Object Model; tutorial here.) Not using the Page Object Model is Selenium suicide. It will result in rampant code duplication.
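A bare-bones sketch of the pattern in Python (my own example; it assumes a browser fixture like the one shown earlier):

from selenium.webdriver.common.keys import Keys


class DuckDuckGoSearchPage:
    # Page object wrapping the DuckDuckGo home page
    URL = 'https://duckduckgo.com/'

    def __init__(self, browser):
        self.browser = browser

    def load(self):
        self.browser.get(self.URL)

    def search(self, phrase):
        search_input = self.browser.find_element_by_name('q')
        search_input.send_keys(phrase + Keys.RETURN)


def test_basic_search(browser):
    page = DuckDuckGoSearchPage(browser)
    page.load()
    page.search('panda')
    assert 'panda' in browser.title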

Demonizing XPath

XPaths have long been criticized for being slower than CSS selectors. That claim is outdated baloney. In many cases, XPaths outperform CSS selectors – see here, here, and here. Another common complaint is that XPath syntax is more complicated than CSS selector syntax. Honestly, I think they’re about the same in terms of learning curve. XPaths are also more powerful than CSS selectors because they can uniquely pinpoint any element on the page.

Inefficient Web Element Access

Web element IDs make access extremely efficient. However, when IDs are not provided, other locator query types are needed. It is always better to use locator queries to pinpoint elements, rather than to get a list of elements (or even a parent/child chain) to traverse using programming code. For example, I often see code reviews in which an XPath returns a list of results with text labels, and then the programming code (C# or Java or whatever) has a for loop that iterates over each element in the list and exits when the element with the desired label is found. Just add “[text()=’desired text’]” or “[contains(text(), ‘desired text’)]” to the XPath! Use locator queries for all they’re worth.
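For example, instead of looping over elements in code, push the text match into the locator itself (the class name and label below are made up for illustration):

def find_menu_item(browser, label):
    # Inefficient: fetch every candidate element, then loop in Python
    #   items = browser.find_elements_by_xpath("//div[@class='menu-item']")
    #   return next(e for e in items if e.text == label)
    # Better: let the XPath pinpoint the element directly
    return browser.find_element_by_xpath(
        "//div[@class='menu-item' and text()='%s']" % label)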

Interacting with Web Elements Before the Page is Ready

Web UI test automation is inherently full of race conditions. Make sure the elements are ready before calling them, or else face a bunch of “element not found” exceptions. Use WebDriver waits for efficient waiting. Do not use hard sleeps (like Java’s Thread.sleep).
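A quick sketch of an explicit wait in Python, waiting for the results container used in the earlier DuckDuckGo example:

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def wait_for_results(browser, timeout=10):
    # Poll until the results container is present instead of hard-sleeping
    return WebDriverWait(browser, timeout).until(
        EC.presence_of_element_located((By.ID, 'links')))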

Untuned Timeouts

WebDriver calls need timeouts, or else they could hang forever if there is a problem. (Check the online docs for default timeout values.) Timeout values ought to be tuned appropriately for different test environments and different websites. Timeouts that are too short will unnecessarily abort tests, while timeouts that are too long will lengthen precious test runtime.

The Airing of Grievances: Test Automation Code

More grievances! Test code ought to be developed with the same high standards as product code, but often it is neglected. That bothers me so much, and it costs teams a significant amount of time and money through its bad consequences. I got a lot of problems with bad test automation code, and now you’re gonna hear about it!

Copypasta

Code duplication is code cancer. It is particularly rampant in test automation because test steps are often repetitive. That’s no excuse, however, to allow it. Use better programming practices, or face my code review rejections!

Hard-Coded Configuration Data

Automation should be able to run on any environment without hassle. Don’t hard-code URLs, usernames, passwords, and other config-specific values! Read them in as inputs or config files. Nothing is more frustrating than switching environments but – surprise! – not running with the correct values.

Re-reading Inputs for Every Single Test

Config files and input values should not change during a test suite run. So, don’t repeatedly read them! That’s inefficient. Read them once, and hold them in memory in a centrally-accessible location.
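One minimal way to do this in Python (the config file name is just an example) is to cache the parsed values the first time they are read:

import json
from functools import lru_cache


@lru_cache(maxsize=1)
def load_config(path='config.json'):
    # Read the config file once; every later call returns the cached values
    with open(path) as f:
        return json.load(f)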

Leaving Temporary Files or Settings

Clean up your mess! Don’t leave temporary files or settings after a test completes. Not doing proper cleanup can harm future tests. Files can also build up and eventually run the system out of storage space. It’s a Boy Scout principle: Leave No Trace!

Incorrect Test Results

Test results need integrity to be trusted. Watch out for false positives and false negatives. If there’s a problem during a test, handle it – don’t ignore it. Don’t skip assertion calls. Business decisions are made based on test results.

Uninformative Failure Messages

Here’s one: “Failed.” That tells me nothing. Why did the test fail? Was something missing from a web page? Was there an unexpected exception? Did a service call return 500? TELL ME! Otherwise, I need to waste time rerunning and even debugging the test. The failure may also be intermittent. Tell me what the problem is right when it happens.

Making Assertion Calls in Automation Support Code

Every automated test case must make assertions to verify goodness or badness of system conditions. But, as a best practice, those assertion calls should be written only in the test case code – not in support code like page objects or framework-level classes. Assertions are test-specific, while support code is generic. Support code should simply give system state, while test cases use assertions to validate system state. If support code has assertion calls, then it is far less reusable and traceable.

Treating Boolean Methods like Assertions

This is an all-too-common rookie mistake. Boolean methods simply return a true/false value. They do not (or, at least according to the previous grievance, should not) contain assertion calls. However, I’ve seen many code reviews where programmers call a Boolean method but do nothing with the result. Whoops!
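A tiny sketch of the mistake and the fix (the page object and its is_loaded method are hypothetical):

def test_profile_page_loads(page):  # "page" is a hypothetical page object fixture
    page.is_loaded()         # Wrong: the Boolean result is ignored, so this can never fail
    assert page.is_loaded()  # Right: assert on the returned Boolean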

Burying Exceptions

Exceptions mean problems. Test automation is meant to expose problems. Don’t blindly catch all exceptions, log a message, and continue on with a test like nothing’s wrong. Let the exception rise! Most modern test frameworks will catch all exceptions at the test case level, automatically report them as failures, and safely move on to the next test. There is no need to catch exceptions yourself if they ought to abort the test, anyway – that adds a lot of unnecessary code. Catch exceptions only if they are recoverable!

No Automatic Recovery

True automation means no manual intervention. An automation framework should have built-in ways to automatically recover from common problems. For example, if a network connection breaks, automatically attempt to reconnect. If a test fails, retry it. If too many failures happen, either end the run early or pause for some time. And make sure recovery mechanisms are built into the framework, rather than appearing as copypasta throughout the codebase. The purpose of retries (especially for tests) is not to blindly run tests until they turn from red to green, but rather to overcome unexpected interruptions and also to gather more data to better understand failure reasons. Interruptions will happen – handle them when they do. And be sure to log any failures that do happen so that the reasons may be investigated.
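A simple sketch of a recovery helper in Python (my own example, not tied to any particular framework):

import logging
import time


def retry(action, attempts=3, delay=2):
    # Retry a recoverable action, logging each failure so it can be investigated
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as e:
            logging.warning('Attempt %d failed: %s', attempt, e)
            if attempt == attempts:
                raise
            time.sleep(delay)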

No Logging

Please leave a helpful trail of log messages. Tracing through automation code can be difficult after a failure, especially when you’re not the original author. Logging helps everyone quickly get to the root cause of failures. It also really helps to create reproduction scenarios. If I can copy the log from a failing test into a bug report and be done, then AMEN for an easy triage!

Hard Waits

Ain’t nobody got time for that! Automation needs to be fast, because time is money. Forcing Thread.sleep (or other such equivalent in your language of choice) shows either laziness or desperation. It unnecessarily wastes precious runtime. It also creates race conditions: what if the wait time ends before the thing is ready? Always use “smart” waits – actively and repeatedly check the system at short intervals for the thing to be ready, and abort after a healthy timeout value.
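A generic “smart wait” can be as simple as a short polling loop (a sketch; tune the timeout and interval for your system):

import time


def wait_until(condition, timeout=30, interval=0.5):
    # Actively poll the condition at short intervals instead of hard-sleeping
    end_time = time.time() + timeout
    while time.time() < end_time:
        if condition():
            return
        time.sleep(interval)
    raise TimeoutError('Condition not met within %s seconds' % timeout)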

The Airing of Grievances: Test Automation Process

Test automation is a big deal for me. It is my chosen specialty within the broad field of software. When I see things done wrong, or when people just don’t get what it’s about, it really grinds my gears. I got a lot of problems with bad test automation processes, and now you’re gonna hear about it!

Saying “They’re Just Test Scripts”

Test automation is not just a bunch of test scripts: it is a full technology stack that requires design, integration, and expertise. Test automation development is a discipline. Saying it is just a bunch of test scripts is derogatory and demeaning. It devalues the effort test automation requires, which can lead to poor work item sizings and an “us vs. them” attitude between developers and QA.

Not Applying the Same Software Development Best Practices

Test automation is software development, and all the same best practices should thus apply. Write clean, well-designed code. Use version control with code reviews. Add comments and doc. Don’t get lazy because “they’re just test scripts” – wrong attitude!

Lip Service

Don’t say automation is important but then never dedicate time or resources to work on it. Don’t leave automation as a task to complete only if there’s time after manual testing is done. Make automation a priority, or else it will never get done! I once worked on an Agile team where automation framework stories were never included into the sprint because there weren’t “enough points to go around.” So, even though this company hired me explicitly to do test automation, I always got shunted into a manual testing scramble every sprint.

Confusing Test Automation with Deployment Automation

Test automation is the automation of test scenarios (for either functional or performance tests). Deployment automation is the automation of product build distribution and installation in a software environment. They are two different concerns. Cucumber is not Ansible.

Forcing 100% Automation

Some people think that automation will totally eliminate the need for any manual testing. That’s simply not true. Automation and manual testing are complementary. Automation should handle deterministic scenarios with a worthwhile return-on-investment to automate, while manual testing should focus on exploratory testing, user experience (UX), and tests that are too complicated to automate properly. Forcing 100% automation will make teams focus on metrics instead of quality and effectiveness.

Downsizing or Eliminating QA

Test automation doesn’t reduce or eliminate the need for testers. On the contrary, test automation requires even more advanced skills than old-school manual testing. There is still a need for testing roles, whether as a dedicated position or as shared collectively by a bunch of developers. The work done by that testing role just becomes more technical.

Saying Product Code and Test Code Must Use the Same Language

For unit tests, this is true, but for above-unit tests, it is simply false. Any general purpose programming language could be used to automate black-box tests. For example, Python tests could run against an Angular web app. A team may choose to use the same language for product and test code for simplicity, but it is not mandatory.

Not Classifying Test Types

Not all tests are the same. You can play buzzword bingo with all the different test type names: unit, integration, end-to-end, functional, performance, system, contract, exploratory, stress, limits, longevity, test-to-break, etc. Different tests need different tools or frameworks. Tests should also be written at the appropriate Testing Pyramid level.

Assuming All Tests Are Equal

Again, not all tests are the same, even within the same test type. Tests vary in development time, runtime, and maintenance time. It’s not accurate to compare individuals or teams merely on test numbers.

Not Prioritizing Tests to Automate

There’s never enough time to automate everything. Pick the most important ones to automate first – typically the core, highest-priority features. Don’t dilly-dally on unimportant tests.

Not Running Tests Regularly

Automated tests need to run at least once daily, if not continuously in Continuous Integration. Otherwise, the return-on-investment is just too low to justify the automation work. I once worked on a QA team that would run an automated test only once or twice during a 2-year release! How wasteful.

HP Quality Center / ALM

This tool f*&@$#! sucks. Don’t use it for automated tests. Pocket the money and just develop a good codebase with decent doc in the code, and rely upon other dashboards (like Jenkins or Kibana) for test reporting.

The Airing of Grievances: Version Control

Let the Airing of Grievances series continue: I got a lot of problems with version control misuse, and now you’re gonna hear about it!

Not Using Version Control

You’ve got to be some special kind of stupid to not use a version control system. Software is just too dang fragile to go without protection.

Using Outdated Version Control Systems

Still using CVS like it’s 1999? How about Rational ClearCase? If so, it’s time to upgrade. Git seems to be the go-to standard these days, though Subversion still has a place for projects where centralized control is better than distributed control. My opinion? Just use Git.

Gigantic Commits

Code changes ought to be incremental and small, and they ought to be committed in frequent intervals. However, some people like to make one giant, killer commit at a time with bajillions of lines of changes. Nobody wants to review those pull requests. Break things up into smaller pieces!

No Comments with Commits

Look back in your version control history to see how many messages look like this:

  • .
  • updated
  • fixed
  • done

Really? How is this helpful? Please give a meaningful message. It doesn’t need to be long – a one-liner is fine. But describe what makes the commit significant. Otherwise, tracing through history is dang near impossible!

Never Committing Changes

For whatever reason, some people will check out code, make changes, run locally, and NEVER commit the changes back to the repository. This happened frequently at a previous job with offshore test automation contractors. They would add new tests and fix bugs but never share them back with the team. They’d even email code back and forth, rather than commit! I just can’t even.

Committing Code that Doesn’t Even Compile

[image: Jackie Chan “WTF” meme]

Never Pushing Changes

In Git, there is a difference between “commit” (which commits the change locally) and “push” (which pushes all local commits to the remote repository). Some people never push. Then, they wonder why they can’t open pull requests. Or, they lose their code after a Blue Screen wipes out their machine. Make it a habit to push at the end of the workday, whether it’s needed or not.

No Branching

Branches are like swim lanes: every contributor (or group) can develop code without interfering with others. They also make concurrent release work possible. Using only one branch doesn’t “simplify” development – it just causes an integration mess. For example, I once worked in an organization where the QA architect insisted that test automation should not use multiple branches and, instead, have if/else conditions for differences between release branches. (I fought tooth-and-nail against it and lost.) Code duplication became rampant. Always adopt a good branching strategy, even for test automation. Gitflow is a good example workflow.

Stale Branches

Stale branches mean old code and merge conflicts. Nobody wants those. Keep your local branches up-to-date.

Not Deleting Feature Branches

In Git, feature branches are meant to have a short lifespan: you create one to develop a new feature or fix a bug or whatever, and then you delete it after the pull request is completed. However, some people don’t delete those old branches. Over time on a team, those old branches really add up and pollute the repository. Delete them as a common courtesy.

Botched Merge Conflict Resolution

Merge conflicts themselves are not the grievance. Nobody likes them, but they are inevitable on any team. However, botched merge conflict resolution is a HUGE grievance. Would you be mad if you spent a lot of time fixing a problem, only to have some other schmuck accidentally undo your code change with an overwrite? Please be careful when merging. If you aren’t sure that your merge will be good, there’s no shame in asking for help.

Not Re-compiling and Re-testing after Merging

What’s worse than messing up a merge? Committing the changes without testing them first! Merges are risky, and mistakes happen – but that’s why it is imperative to make sure everything is still good after a merge. I’ve seen people blindly post pull request updates after resolving merge conflicts, only to have the build fail with compiler warnings. Don’t do that.

Granting Everyone Full Repository Permissions

While everyone on the team should contribute, not everyone should contribute in the same ways. If there are no security policies set up, then users will do dangerous things, whether accidentally or deliberately. They could circumvent the review process. They could rename things unexpectedly. One time, I saw a guy delete the remote master branch! (Thank goodness we recovered it quickly.) Put permissions into place before bad things happen.