
Grace Hopper Bug

Writing Good Bug Reports

Bugs, bugs, bugs! Talking about software development is impossible without also talking about bugs. At first, the term “bug” may seem like strange slang for “defect.” Are there creepy-crawlies running about our code and computers? Not usually, but sometimes, yes! In 1947, Grace Hopper found a dead moth stuck in a relay in Harvard’s Mark II computer, and her “bug” report (pictured above) joked about finding a real bug behind a computer defect. Even though inventors like Thomas Edison had used the term “bug” to describe technological glitches for years beforehand, Grace Hopper’s bug cemented the terminology for computers and software.

Bugs happen. Why? Nobody is perfect, and therefore, no software is perfect. Building software of high quality requires good designs to resist bugs, good implementations to avoid bugs, and good feedback to report bugs when they inevitably appear. This article covers best practices for writing good bug reports when they do happen.

What is a bug “report”?

A “bug” is a defect, plain and simple. The term refers specifically to an issue in the software. However, a bug report (or ticket) is a written record detailing the defect. Bug reports are typically written in a project management tool like Jira. The bug and its report are two separate entities. Certainly, undetected bugs can exist in a software product without having associated reports.

When should a bug report be written?

A bug report should be written whenever a new problem that appears to be a defect is discovered. Problems can be discovered during testing activities like automated test runs or exploratory manual testing. They can also be discovered while developing new features. In the worst case, customers will find problems and submit complaints!

However, notice how I used the term “problem” and not “defect.” All problems need solutions, but not all problems are truly defects. Sometimes, the user who reported the problem doesn’t know how a feature should work. Other times, the environment in which the problem occurred is improperly configured. The team member who first discovered the problem or received the customer complaint should initially do a light investigation to make sure the problem looks like a genuine software defect. Initial investigation should be done expediently while context is fresh.

If the problem appears to be a real defect and not a misunderstanding or misconfiguration, then the investigator should search existing bug reports for the same issue. Someone else on the team might have recently discovered the same issue or a similar issue. Bugs also can reappear even after being “fixed.” Adding information to existing reports is typically a better practice than creating duplicative reports.

What if the problem is unclear? Whenever I’m not sure if a problem is a bug or another type of issue, I ask others on my team for their thoughts. I try to ask questions like, “Does this look right? What could cause this behavior? Did I do something incorrectly?” Blindly opening bug reports for every problem is akin to “the boy who cried wolf” – it can desensitize a team to warnings of real, important bugs. Doing just a bit of investigation shows good intentions and, in many cases, spares the team extra work later. Nevertheless, when in doubt, creating a report is better than not creating a report. A little churn from false positives is better than risking real problems.

Why should bug reports be written?

Whenever a real bug is discovered, a team should write a report for it. Simply talking about the bug might seem like an easier, faster approach, especially for smaller teams, but the act of writing a report for the bug is important for the integrity of the software development process. A written record is an artifact that requires resolution:

  • A report provides a feedback loop to developers.
  • A report contains all bug information in a single source.
  • A report can be tracked in a project management tool.
  • A report can be sized and prioritized with development work.
  • A report records work history for the bug.

Bug reports help make bug fixes part of the development process. They bring attention to bugs, and they cannot be ignored or overlooked easily.

What goes into a bug report?

Regardless of tool or team process, good bug reports contain the following information:

  • Problem Summary
    • A brief, one-line description of the defect
    • Clearly states what is defective
    • Should be used as the title of the report
  • Report Identifier
    • A unique identifier for the bug report
    • Typically generated automatically by the management tool (like Jira)
  • Full Description
    • A longer description of the problem
    • Explain any relevant information
    • Use clear, plain language
  • Steps to Reproduce
    • A clear procedure for manually reproducing the failure
    • Could be steps from a failing test case
    • Include actual vs. expected results
  • Occurrences
    • Cases when the defect does and does not appear
    • Share product version, code branch, environment name, system configuration, etc.
    • Does the defect appear consistently or intermittently?
  • Artifacts
    • Attach logs, screenshots, files, links, etc.
  • Impact
    • How does the defect affect the customer?
    • Does the defect block any development work?
    • What test cases will fail due to this defect?
  • Root Cause Analysis
    • If known, explain why the defect happened
    • If unknown, offer possible reasons for the defect
    • Warning: clearly denote proof vs. speculation!
  • Triage
    • Assign an owner if possible
    • Assign a severity or priority based on guidelines and common sense
    • Assign a deadline if applicable
    • Assign any other information the team needs

I use this list as a template whenever I write bug reports. For example, in a Jira bug ticket, I’ll make each item a heading in the ticket’s “Description” field. Sometimes, I might skip sections if I don’t have the information. However, I typically don’t open a bug report until I have most of this information for the defect.
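To make the template idea concrete, here is a minimal sketch in Python. It assumes the report body is plain text with one heading line per section. The section names come from the list above, but the `render_description` helper is hypothetical and not part of Jira or any other tool. The summary and identifier are left out because they typically live in the ticket's title and ID fields.

```python
# Section headings taken from the template list; everything else is illustrative.
DESCRIPTION_SECTIONS = (
    "Full Description",
    "Steps to Reproduce",
    "Occurrences",
    "Artifacts",
    "Impact",
    "Root Cause Analysis",
    "Triage",
)


def render_description(details: dict[str, str]) -> str:
    """Render the known sections as headed blocks, skipping sections with no information."""
    unknown = set(details) - set(DESCRIPTION_SECTIONS)
    if unknown:
        raise ValueError(f"Unknown sections: {sorted(unknown)}")
    blocks = []
    for section in DESCRIPTION_SECTIONS:
        text = details.get(section, "").strip()
        if text:
            blocks.append(f"{section}\n{text}")
    return "\n\n".join(blocks)
```

Calling `render_description({"Impact": "Customers cannot log in."})` would produce a body with a single "Impact" section. Sections always come out in template order, no matter how the dictionary was built.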

How should bug reports be handled?

One word: professionally. Handle bug reports professionally. What does that mean?

Provide as much information as possible in bug reports. Bug reports are a form of communication and record. Saying little more than, “It dun broke,” doesn’t help anyone fix the problem. Provide useful, accurate information so that others who didn’t discover the bug have enough context to help.

Triage bugs expediently. When you uncover a problem, investigate it. When you need a second opinion, ask for it. When someone sends a bug report to you or your team, triage it, fix it, and reply to the person who reported it. Don’t ignore problems, and don’t let them fester.

Treat bug reports as unfolding stories. Bugs are usually unexpected, tricky surprises. The information in a bug report can be incomplete or even incorrect because it represents best-guess theories about the defect. The report artifact should be treated as a living document. Information can be added or updated as work proceeds. Team members should be gracious to each other regarding available information.

Do not shame or be shamed. Bugs happen. Even the best developers make mistakes. A mature, healthy team should faithfully report bugs, quickly resolve them, and take steps to avoid similar problems in the future. Developers should not stigmatize bugs or try to censor bug counts. Testers should not brag about the number of bugs they find. Language used in bug reports should focus on software, not people. Gossiping and public shaming over bugs should not happen. Any shame associated with bugs can drive a team toward bad practices. Any recurring issues should be addressed with individuals directly or with the help of management.

Good bug reports matter

Writing bug reports well is vital for team collaboration. Organized, accurate information can save hours of time wasted on fruitless attempts to reproduce issues or attempt fixes. Give these practices a try the next time you discover a bug!

Beyond Unit Tests: End-to-End Web UI Testing

On October 4, 2019, I gave a talk entitled Beyond Unit Tests: End-to-End Web UI Testing at PyGotham 2019. Check it out below! I show how to write a concise-yet-complete test solution for Web UI test cases using Python, pytest, and Selenium WebDriver.

This talk is a condensed version of my Hands-On Web UI Testing tutorials that I delivered at DjangoCon 2019 and PyOhio 2019. If you’d like to take the full tutorial, check out https://github.com/AndyLPK247/djangocon-2019-web-ui-testing. Full instructions are in the README.

Be sure to check out the other PyGotham 2019 talks, too. My favorite was Dungeons & Dragons & Python: Epic Adventures with Prompt-Toolkit and Friends by Mike Pirnat.

Hands-On UI Testing with Python (SmartBear Webinar)

On August 14, 2019, I teamed up with SmartBear to deliver a one-hour webinar about Web UI testing with Python! It was an honor to work with Nicholas Brown, Digital Marketing Manager for CrossBrowserTesting at SmartBear Software, to make this webinar happen.

The Webinar

Source: https://crossbrowsertesting.com/resources/webinars/testing-with-python

In the webinar, I showed how to build a basic Web UI test automation solution using Python, pytest, and Selenium WebDriver. The tutorial covered automating one test, a simple DuckDuckGo search, from inception to automation. It also showed how to use CrossBrowserTesting to scale the solution so that it can run tests on any browser, any platform, and any version in the cloud as a service!

The example test project for the webinar is hosted on GitHub here: https://github.com/AndyLPK247/smartbear-hands-on-ui-testing-python

I encourage you to clone the GitHub repository and try to run the example test on your own! Make sure to get a CrossBrowserTesting trial license so you can try different browsers. You can also try to write new tests of your own. All instructions are in the README. Have fun with it!

The Q&A

After the tutorial, we took questions from the audience. Here are answers to the top questions:

How can we automate UI interactions for CAPTCHA?

CAPTCHA is a feature many websites use to determine whether or not a user is human. Most CAPTCHAs require the user to read obscured text from an image, but there are other variations. By their very nature, CAPTCHAs are designed to thwart UI automation.

When someone asked this question during the webinar, I didn’t have an answer, so I did some research afterwards. Unfortunately, it looks like there’s no easy solution. The best workarounds involve driving apps through their APIs to avoid CAPTCHAs. I also saw some services that offer to solve CAPTCHAs.

Are there any standard Page Object Pattern implementations in Python?

Not really. Mozilla maintains the PyPOM project, but I personally haven’t used it. I like to keep my page objects pretty simple, as shown in the tutorial. I also recommend the Screenplay Pattern, which handles concerns better as test automation solutions grow larger. I’m actually working on a Pythonic implementation of the Screenplay Pattern that I hope to release soon!

How can I run Python tests that use Selenium WebDriver and pytest from Jenkins?

Any major Continuous Integration tool like Jenkins can easily run Web UI tests in any major language. First, make sure the nodes are properly configured to run the tests – they’ll need Python with the appropriate packages. If you plan to use local browsers, make sure the nodes have the browsers and WebDriver executables properly installed. If you plan to use remote browsers (like with CrossBrowserTesting), make sure your CI environment can call out to the remote service. Test jobs can simply call pytest from the command line to launch the tests. I also recommend pytest’s --junitxml option to generate a JUnit-style XML test report because most CI tools can display and track test results in that format.
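As a rough sketch, a CI job could launch the tests through a small Python wrapper like the one below. The `tests` directory and the `results.xml` report name are assumptions for illustration, and both helper functions are hypothetical. The only pytest feature used is its `--junitxml` command line option.

```python
import subprocess
import sys


def build_pytest_command(report_path: str, test_dir: str = "tests") -> list[str]:
    """Build the command a CI job could run to launch pytest with a JUnit-style XML report."""
    return [sys.executable, "-m", "pytest", test_dir, f"--junitxml={report_path}"]


def run_tests(report_path: str = "results.xml") -> int:
    """Run pytest as a subprocess and return its exit code (0 means all tests passed)."""
    return subprocess.run(build_pytest_command(report_path)).returncode
```

In many Jenkins jobs, the wrapper isn't needed: a shell step that runs the same command directly works just as well. The job would then point its JUnit report step at the generated XML file.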

How can I combine API and database testing with Web UI testing?

One way to handle API and database testing is to write integration tests separate from Web UI tests. You can still use pytest, but you’d use a library like requests for APIs and SQLAlchemy for databases.

Another approach is to write “hybrid” tests that use APIs and database calls to help Web UI testing. Browsers are notoriously slow compared to direct back-end calls. For example, database calls could pre-populate data so that, upon login, the website already displays stuff to test. Hybrid tests can make tests much faster and much safer.
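Here is a minimal sketch of the pre-population idea. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real application database. The `todos` table, the `seed_todo_items` helper, and the `titles_for` helper are all hypothetical. A real hybrid test would follow the seeding step with WebDriver calls that log in and check the page.

```python
import sqlite3


def seed_todo_items(conn: sqlite3.Connection, username: str, items: list[str]) -> None:
    """Insert test data directly into the database instead of creating it through the UI."""
    conn.execute("CREATE TABLE IF NOT EXISTS todos (username TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO todos (username, title) VALUES (?, ?)",
        [(username, title) for title in items],
    )
    conn.commit()


def titles_for(conn: sqlite3.Connection, username: str) -> list[str]:
    """Stand-in for what the page would display for this user after login."""
    rows = conn.execute(
        "SELECT title FROM todos WHERE username = ? ORDER BY rowid", (username,)
    )
    return [row[0] for row in rows]
```

The setup takes a couple of direct database calls instead of several slow page interactions, so the browser is only used for the behavior actually under test.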

How can we test mobile apps and browsers using Python?

Even though our tutorial covered desktop-based browser UI interactions, the strategy for testing mobile apps and browsers is the same. Mobile tests need Appium, which is like a special version of WebDriver for mobile features. The Page Object Pattern (or Screenplay Pattern) still applies. CrossBrowserTesting provides mobile platforms, too!

Tutorial: Web Testing Made Easy with Python

Have you ever discovered a bug in a web app? Yuck! Almost everyone has. Bugs look bad, interrupt the user’s experience, and cheapen the web app’s value. Severe bugs can incur serious business costs and tarnish the provider’s reputation.

So, how can we prevent these bugs from reaching users? The best way to catch bugs is to test the web app. However, web UI testing can be difficult: it requires more effort than unit testing, and it has a bad rap for being flaky.

Never fear! Recently, I teamed up with the awesome folks at TestProject to develop a helpful tutorial that makes web UI test automation easy with the power of Python! The tutorial is named Web Testing Made Easy with Python, Pytest and Selenium WebDriver. It is available for free as a set of TestProject blog articles together with a GitHub example project.

In our tutorial, we will build a simple yet robust web UI test solution using Python, pytest, and Selenium WebDriver. We cover strategies for good test design as well as patterns for good automation code. By the end of the tutorial, you’ll be a web test automation champ! Your Python test project can be the foundation for your own test cases, too.

How can you take the tutorial? Start reading here, and follow the instructions: https://blog.testproject.io/2019/07/16/open-source-test-automation-python-pytest-selenium-webdriver/

I personally want to thank TestProject for this collaboration. TestProject provides helpful tools that can supercharge your test automation. They offer a smart test recorder, a bunch of add-ons that act like test case building blocks, an SDK that can make test automation coding easier, and beautiful analytics to see exactly what the tests are doing. Not only is TestProject a cool platform, but the people with whom I’ve worked there are great. Be sure to check it out!

Should We Rewrite Our Test Automation in Another Language?

A Twitter friend recently asked me the following question:

I work in a Microsoft shop. We have 40 developers who use .NET (C#). We also have several manual testers and 5 automation engineers who developed our test automation solution in Python. However, our leadership wants to move everything completely to C#.

Would it be better to (a) train 40 .NET developers in Python to use the existing test solution or (b) train the testers in .NET and port the tests to C#?

This is a very tough question. It’s not as simple as asking for the best test automation language because there are people, positions, and solutions already in place. Honestly, I can’t give a conclusive answer without more context, but I can offer five points of advice.

What is the state of the Python test solution?

How big and how bad is the existing Python test automation solution? Rewriting tests that already work fine has low return-on-investment. However, rewriting tests that have problems like flakiness or false positives might be worthwhile. More tests means more time, too. Please read my article, Our Test Automation Has Problems. Should We Start Over?, to learn what problems would warrant a rewrite.

Why not have two test solutions?

If the existing Python tests are fine, then rewriting them is a huge opportunity cost. Instead of rewriting existing tests, developers and testers could spend their time writing only the new tests in a new C# solution. The Python solution would be “legacy” and would not have any new tests added to it. Old tests would disappear with deprecated features, too. Eventually, the C# tests would take over. The main drawback for this possibility is the continued maintenance of a Python stack.

Do the manual testers have any programming experience?

Many manual testers do not have strong programming skills. Some may not have any programming skills at all! They will have a big learning curve when training to do test automation. Python would be a much easier language for them to learn than C# because it is concise, readable, and friendly for beginners. Likewise, Python would be fairly easy for C# developers to learn as they go.

What advantages will conformity bring?

Retraining workers and rewriting code is no small task. From a business perspective, they are investment costs. There must be significant returns that outweigh the cost of the transition. Make sure those returns are known and real.

Will developers also automate tests?

Many teams choose to write their test automation code in the same language as the product code so that developers can more easily automate tests. However, in my experience, developers typically don’t write many tests, especially when others on the team are dedicated testers. Test automation is difficult and has unique challenges. Some developers have bad attitudes about testing, too. Changing the language probably won’t change the deeper issues.

Final Thoughts

The decision to choose between C# and Python for test automation is very personal for me. I faced this choice directly when I started working at PrecisionLender. Even though I deeply love Python, we chose to use C#. It was the right choice: we were a Microsoft shop with no test solution (yet) and no Python stack in place. My team and I have no regrets.

There is nothing in test automation that either language can’t do. Both are solid choices. The best choice for a team depends more upon the team’s situation than differences between these languages.

How Do We Write Good Gherkin as Part of BDD? (Webinar + Q&A)

On July 23, 2019, I gave a webinar entitled, “How Do We Write Good Gherkin as Part of BDD?” in collaboration with Paul Merrill and his company, Beaufort Fairmont. This webinar was the follow-up to a previous webinar, What Is BDD, and How Do We Practice It? It was an honor to partner with Paul again to go further into BDD practices. (If you want to learn more about BDD, check out Beaufort Fairmont’s two-day BDD training offering, as well as their blog and other webinars.)

To see my webinar recording, register here. Definitely watch the previous webinar first.

Just like last time, attendees asked several great questions that we simply could not answer live. I categorized all questions we received and answered them below. Please note that some questions might be rephrased or combined with others.

Questions about BDD

What is BDD?

Behavior-Driven Development! Read more here.

In a typical Agile development process, who should write feature files?

The Three Amigos! Product owners, developers, and testers should all come together to figure out behaviors. I recommend doing Example Mapping to formulate examples before writing Gherkin scenarios. The green example cards should be turned into feature files. The specific person who writes the feature files is up to team preference. It could be a collaborative effort, or it could be divided-and-conquered. Any one of the Three Amigos can do it.

How can we apply BDD to SAFe (Scaled Agile Framework) teams?

BDD practices like Three Amigos meetings, Example Mapping, Behavior Specification with Gherkin, and Behavior Implementation can become part of any process. All of these practices happen at the level of the development teams. Teams could even share Gherkin steps and test frameworks wherever sharing makes sense. Check out BDD 101: Behavior-Driven Agile.

What advice can you give to teams that use BDD test frameworks solely as an automation tool and not as part of a greater BDD process?

Do the best with what you’ve got. Try to show how other BDD practices can pragmatically improve your team’s development and delivery work. See also:

Questions about Gherkin Syntax

What is the difference between a scenario and a scenario outline?

A scenario is a procedure of Given-When-Then steps that covers one example for one behavior. If there are any parameters for steps, then a scenario has exactly one combination of possible inputs. A scenario outline is a Given-When-Then procedure that can have multiple examples of one behavior provided as a table of input combos. Each input row will run the same steps once, just with different parameter inputs. See BDD 101: Gherkin by Example to see examples.
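To illustrate how each example row runs the same steps with different inputs, here is a small Python sketch that expands an outline into concrete scenarios. The `expand_outline` function and the sample steps are hypothetical. This is not how Cucumber or any other BDD framework is implemented; it only models the idea.

```python
def expand_outline(steps: list[str], examples: list[dict[str, str]]) -> list[list[str]]:
    """Return one concrete list of steps per example row."""
    scenarios = []
    for row in examples:
        concrete = []
        for step in steps:
            # Substitute each <placeholder> with the value from this example row.
            for name, value in row.items():
                step = step.replace(f"<{name}>", value)
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios


outline_steps = [
    "Given the search page is displayed",
    'When the user searches for "<phrase>"',
    'Then results for "<phrase>" are shown',
]
example_rows = [{"phrase": "panda"}, {"phrase": "python"}]
```

Calling `expand_outline(outline_steps, example_rows)` yields two scenarios, one per row. A plain scenario corresponds to the case with exactly one combination of inputs.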

What do you think about long tables in scenarios?

Long tables in Gherkin usually look terrible. They’re hard to read, and they create a wall of text. They may also include unnecessary variations. Stick to the Unique Example rule.

Are Given steps mandatory, or can scenarios start directly with When steps?

None of the step types are mandatory. It is valid to write a scenario that skips the Given and has only When-Then steps. It is also valid to write scenarios that are Given-Then or Given-When. In fact, it is syntactically valid to put steps in any order. However, I strongly recommend keeping Given-When-Then step order to properly frame behaviors.

Are quotation marks required for parameters?

No, quotation marks are not required for parameters, but they are a popular convention, and one that I recommend. Quotes make parameters easy to identify.

Questions about Gherkin Scenarios

How do we make sure each scenario focuses on an individual, independent behavior?

Do Example Mapping first as a team. Write scenarios together, or review them with others. Ask, “What makes this behavior unique?” Make sure to use strict Given-When-Then step order when defining the behavior. Rethink the scenario if it is more than 10 lines long. Look out for unnecessary complication.

What does it mean for a scenario to be “chronological”?

Scenario steps should be written as if they were on a timeline. Each step will be executed after the previous one, so its description must start where the previous one ended. Remember, steps will be automated as if they were scripts.

How do we write a very low-level scenario without having a wall of text?

Don’t write low-level scenarios! Gherkin is best for feature testing, not unit testing. Steps should focus on intention and business value. Instead of writing “type, type, click, wait,” write “log into the app.” If you absolutely must write a low-level scenario, remember that the same principles apply. Be intuitively descriptive. Focus on individual behaviors. Keep scenarios concise.

If all scenarios in a feature file have only one user, is it okay to use first-person perspective instead of third-person?

In my opinion, no. I favor third-person perspective universally. Trying to limit usage to one feature file won’t work because any step can be used by any feature file within a test project. The entire solution must be either first-person or third-person. There’s no middle ground.

Can we write Gherkin scenarios with personas?

Yes! Personas can make scenarios more meaningful and understandable. Make sure to define the personas well – they could be described under the Feature section or in a separate text file.

How do we write Gherkin scenarios that need to validate lots of information on a page?

Pick the most important pieces of information to check. You could write separate Then steps for each assertion, or you could push small-but-similar validations down to the automation level to avoid Gherkin clutter.

How do we write Gherkin scenarios for validating Web UI fields?

Typically, I treat each field validation as an independent behavior, and thus I write separate scenarios to check each field. If the scenario steps simply enter a textual value and verify a specific message, then I might make a Scenario Outline with example rows for each equivalence class of inputs.

How do we write Gherkin scenarios that have multiple inputs and setup steps? (Example: an API with ten parameters)

Gherkin allows multiple steps of the same type to be written using “And” and “But” keywords. It’s not a problem to have “Given-And-And” or “When-And-And”. If you discover that different scenarios repeat the same setup steps, then I recommend either moving those common steps to a Background section or writing a new step that covers multiple calls (for conciseness).

One example from the webinar showed searching for shoes and adding them to a shopping cart as part of one scenario. Aren’t those two different behaviors?

Here’s the scenario in question:

Scenario: Add shoes to the shopping cart
  Given the ShoeStore home page is displayed
  When the shopper searches for "red pumps"
  And the shopper adds the first result to the cart
  Then the cart has one pair of "red pumps"

We could have split this scenario into two. I just chose to define the behavior this way. This scenario is a bit more end-to-end because it covers a basic but typical workflow. It may also have leveraged existing steps, which expedites automation development. Overall, the scenario is still concise, chronological, and intuitively understandable. Remember, there is an art as well as a science to writing good Gherkin.

Questions about Automation

Do scenarios need to be independent of each other?

Yes, unequivocally. Tests that are not independent could interfere with each other and cause unexpected failures. Independence also reinforces singular behavioral focus.

How do we start a scenario “in media res” without it depending on other tests?

At the Gherkin level, write Given steps that define a new starting point for the behavior. For example, many teams develop Web apps. It’s common to think that the starting point for all tests is login. However, the starting point can be a few pages after login.

At the automation level, it may be useful to implement the Given steps by calling other steps. For example, if a Given step should start at a user’s profile page, then perhaps it could internally call the login step and the click-the-profile-link step. Test steps may repetitively do the same operations for different tests, but test case independence will be preserved, and unique failures will be reported.
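Here is a minimal Python sketch of a Given step that calls other steps. The `FakeBrowser` class is a hypothetical stand-in that only records which pages were visited, and the step function names are made up for illustration. In a real project, these functions would drive WebDriver through page objects.

```python
class FakeBrowser:
    """Hypothetical stand-in for a WebDriver-backed app; it only records visited pages."""

    def __init__(self):
        self.user = None
        self.history = []

    def visit(self, page: str) -> None:
        self.history.append(page)

    @property
    def current_page(self):
        return self.history[-1] if self.history else None


def login(browser: FakeBrowser, username: str) -> None:
    """Step: log in as the given user and land on the home page."""
    browser.visit("login")
    browser.user = username
    browser.visit("home")


def click_profile_link(browser: FakeBrowser) -> None:
    """Step: navigate from the current page to the user's profile page."""
    browser.visit("profile")


def given_profile_page_is_displayed(browser: FakeBrowser, username: str) -> None:
    """Given step that reuses smaller steps so the scenario starts at the profile page."""
    login(browser, username)
    click_profile_link(browser)
```

Every scenario that starts with this Given step repeats the login, but no scenario depends on another one having run first.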

What is the best way to handle preconditions like logging into a Web app?

The simplest way to handle preconditions is to write Given steps. If those Given steps are shared by all scenarios in a feature file, then move them to a Background section. Automation hooks can also perform common setup and cleanup actions, depending upon the test framework. Personally, I prefer to use hooks to do automatic login rather than repeat Given steps for many scenarios.

Is it better to set up and tear down new test objects for each test case, or is it better to use shared, pre-created objects?

That depends upon the object. Most objects like WebDrivers and page objects should have scenario scope, meaning they are created fresh for each scenario and then torn down when the scenario ends. The only time an object should be shared across scenarios is if it is immutable or very expensive to create. For example, configuration data could be read in once before all tests and then injected immutably into each scenario. The safe position is always to use fresh objects; justify why sharing is needed before trying it.
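The difference between shared and fresh objects can be sketched in plain Python. The `Config` and `ScenarioContext` classes, the example URL, and the `new_scenario_context` helper are hypothetical. The sketch assumes configuration is loaded once and frozen, while everything mutable is rebuilt for each scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Immutable configuration: safe to share because scenarios cannot change it."""
    base_url: str
    timeout_seconds: int


class ScenarioContext:
    """Mutable per-scenario state; a new one is created for every scenario."""

    def __init__(self, config: Config):
        self.config = config
        self.notes = []


# Read once before all tests (hard-coded here; a real suite might load a file).
SHARED_CONFIG = Config(base_url="https://example.test", timeout_seconds=15)


def new_scenario_context() -> ScenarioContext:
    """Build fresh scenario state that still points at the shared, frozen config."""
    return ScenarioContext(SHARED_CONFIG)
```

Attempting to assign to a field of `SHARED_CONFIG` raises an error, which is what makes sharing it safe. In pytest, fixture scopes play a similar role: the default function scope gives each test a fresh object, while a broader scope such as session shares one object across tests.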

I want to use Serenity for BDD and testing. Should I use Cucumber-like Gherkin feature files, or should I use Serenity’s native methods?

That’s up to you and your team. Personally, I would still use Gherkin feature files with Serenity. I like to separate my test case from my test code. Everyone can read Gherkin feature files, but not everyone can read Java or JavaScript test methods.

If a company already has a large BDD test solution that is poorly implemented, would it be better to keep it going or try to change it?

This question can be applied to all software projects, not just BDD test solutions. The answer is situational. Personally, I favor doing things right, even if it means refactoring. Please read Our Test Automation Has Problems. Should We Start Over? for a thorough answer.

Final Questions

Why do you call yourself “Pandy” and the “Automation Panda”?

Pandas are awesome. Everybody loves them. And nobody forgets my moniker. The nickname “Pandy” came about in the Python community to distinguish me from other folks named “Andy.”

Where can I get team training in BDD?

Beaufort Fairmont provides a one- or two-day course in BDD and writing Gherkin. Sign up for more information here.

WebDriver Element Existence vs. Appearance

Web UI tests with Selenium WebDriver must interact with elements on a Web page. Locating elements can be tricky because expected elements may or may not be on the page. Furthermore, WebDriver might not be able to interact with some elements that exist on the page. That may seem crazy, but let’s understand why.

Web UI interactions universally follow these steps:

  1. Wait for an element to be ready.
  2. Get the element using a locator (ID, CSS selector, XPath, etc.).
  3. Send commands (like clicking or typing) or queries (like getting text) to the element.

Clearly, an element must be “ready” before interactions can happen. As humans, we intuitively define “ready” as, “The page is loaded, and the element is visible.” Automation code is a bit more technical because there are two different ways to define readiness:

  1. Existence: the element exists in the HTML structure of the page.
  2. Appearance: the element exists and it is visible on the page.

Existence can easily be determined by WebDriver’s “find elements” method. The plural “find elements” method will return a list of all elements matching a locator query. If no elements match the locator, then an empty list is returned. The singular “find element” method, on the other hand, will return the first element matching the locator or throw an exception if no elements are found. Thus, the plural version is more convenient to use for checking existence.

Here’s an example existence method in C#:

public bool Exists(IWebDriver driver, By locator) =>
    driver.FindElements(locator).Count > 0;

Checking for existence is the most basic level of readiness. If an element doesn’t exist, interactions with it simply cannot happen. However, existence alone may not be sufficient for interactions. Selenium WebDriver requires elements to not only exist but also to be displayed for interactions like sending clicks and scraping text. Existing elements may be scrolled out of view or even deliberately hidden. WebDriver calls to such elements will yield cryptic exceptions. That’s why waiting for appearance is usually the better readiness condition.

Here’s an example appearance method in C#:

// Assume that the locator targets one element, not multiple
public bool Appears(IWebDriver driver, By locator) =>
    Exists(driver, locator) && driver.FindElement(locator).Displayed;

Existence must be checked first, or else the “Displayed” call will throw an exception whenever existence is false.

Putting it all together, here’s what a button click interaction could look like in C#:

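// Assume this is a method in a Page Object class
// Assume that "Driver" is the WebDriver instance
// WebDriverWait comes from the OpenQA.Selenium.Support.UI namespace
public void ClickThatButton()
{
    var button = By.Id("that-button");

    // Wait up to 15 seconds for the button to appear
    var wait = new WebDriverWait(Driver, System.TimeSpan.FromSeconds(15));
    wait.Until(driver => Appears(driver, button));

    // Fetch a fresh element at the moment of the click
    Driver.FindElement(button).Click();
}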

It’s good practice to make explicit waits before locating and using elements. It’s also good practice to get fresh elements for every interaction call in order to avoid pesky stale element exceptions. Calls like these should be placed in Page Object methods or Screenplay Pattern tasks and questions so that interactions are safe and thorough.

Appearance may not always be the right choice. There may be times when a test should check if an element doesn’t exist or if an element exists but is hidden. Just think before you code.

Our Test Automation Has Problems. Should We Start Over?

Test automation is the cornerstone for continuous software delivery pipelines. Automation repeatedly hits new features with a barrage of tests that could never be completed manually in time. In my experience, though, test automation code can be some of the worst code in the software industry. Teams frequently overlook its importance, its workload size, and its unique technical challenges. The resulting code can become a heapin’ mess! You might even call it “bankrupt.”

In this situation, should a team give up on their current test automation solution and just start over from scratch? Maybe, but maybe not. Don’t be quick to nuke everything and start over! No project is perfect, and some can be recovered. Starting a whole new test automation solution is not a light decision.

Red Herrings

Here are a few problems that should be addressed within the existing test automation solution instead of starting over:

  • Tests don’t add value? That’s easy to fix: just remove the low-value tests. The framework is separate from the test cases.
  • Tests are flaky? Find the root cause. Typically, flaky tests can be fixed with small updates to the test case or to an aspect of the framework. If the feature under test itself is flaky, then fix the feature or consider testing it manually instead of with automation.
  • The original authors are gone? Figure out their code before writing your own. Their code might be good once you understand it.
  • The team uses poor practices? Fix the practices before fixing the code. Otherwise, the new code won’t be any better than the old code.

Signs to Consider

Nevertheless, there are times when a team should start over. Look for these signs, and proceed thoughtfully.

  • The test frameworks and packages are deprecated. Although the tests may run today, they may not run tomorrow. Finding others to work on them will be difficult, too. For example, nose was once the Python framework of choice, but now it’s dead. Pick a more modern framework. Hopefully, parts of the old tests can be salvaged.
  • Test case independence is systemically violated. Each test case must be independent. It should not depend upon the outputs of any other tests. It should not interrupt any other tests. The litmus test for independence is that tests should be able to run successfully in any random order. Interdependent tests do not scale, are difficult to rerun, and are potentially dangerous. The only way to fix them is to completely rewrite them.
  • There is no separation between unit tests and feature tests. White-box unit tests cover code, whereas black-box feature tests cover live features. They occupy different Testing Pyramid layers and should happen at different stages in a CI/CD pipeline. An automation solution with no separation between these types of tests reveals a lack of planning and strategy, and continuous testing will be much tougher to achieve.
  • The framework lacks cohesive architecture, designs, and patterns. Good solutions are designed, not hacked. Designs scale. Hacks don’t. Building new tests on a shaky platform will yield shaky tests.
  • Critical fixes would require a majority of tests to be rewritten. Some issues are pervasive, especially when code is repeatedly duplicated. If framework problems are so widespread that tests would need to be rewritten anyway, then it might be easier to start fresh with a new solution.
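
The random-order litmus test for independence is easy to try. Here’s a minimal sketch in Python using only the standard unittest module. The CartTests class and the run_in_random_order helper are made-up names for illustration, not part of any framework:

```python
import random
import unittest


class CartTests(unittest.TestCase):
    """Hypothetical tests; each one builds its own state in setUp."""

    def setUp(self):
        # Fresh state per test: nothing leaks between test cases
        self.cart = []

    def test_add_item(self):
        self.cart.append("panda")
        self.assertEqual(self.cart, ["panda"])

    def test_cart_starts_empty(self):
        self.assertEqual(self.cart, [])

    def test_remove_item(self):
        self.cart.append("bamboo")
        self.cart.remove("bamboo")
        self.assertEqual(self.cart, [])


def run_in_random_order(test_case_class, seed):
    """Run every test method of the class in a seeded random order."""
    names = list(unittest.TestLoader().getTestCaseNames(test_case_class))
    random.Random(seed).shuffle(names)
    suite = unittest.TestSuite(test_case_class(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    for seed in range(5):
        outcome = run_in_random_order(CartTests, seed)
        print(seed, outcome.testsRun, outcome.wasSuccessful())
```

If a suite passes for one seed but fails for another, some test probably depends on state left behind by a neighbor. Keeping the seed makes the failing order reproducible.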

The Nuclear Option

If you feel like you really do need to start a new test automation solution from scratch, make sure to do it right. You won’t want to redo everything again in a few years! Here’s some advice:

  1. Define your goals. What problems do you want to solve? How can testing and automation help? What can you reasonably achieve? Are you willing to make necessary changes to achieve these goals?
  2. Treat test automation as a project. Test automation is software and requires the same rules and practices. It takes time and expertise. Make sure to allocate time, money, and resources to its development and execution.
  3. Learn how to develop test automation well. It’s not “just writing scripts” – it’s a special domain. Take courses from Test Automation University. Read more Automation Panda articles. Attend webinars and conferences. Seek consulting help if necessary.
  4. Draw a line in the sand between the old and new solutions. Commit to writing all new tests in the new solution. Meanwhile, continue to run tests from the old solution as appropriate – there’s no need to axe its test coverage. Decide if migrating old tests to the new solution is worthwhile for your team. This could also be an opportunity to clean up old tests and cut out low-value areas.

Starting a new test solution is a lot of work, but it can also be rewarding. Good luck!

Python BDD Framework Comparison

Almost every major programming language has BDD test frameworks, and Python is no exception. In fact, Python has several! So, how do they compare, and which one is best? Let’s find out.

Head-to-Head Comparison

behave

behave is one of the most popular Python BDD frameworks. Although it is not officially part of the Cucumber project, it functions very similarly to Cucumber frameworks.

Pros

  • It fully supports the Gherkin language.
  • Environmental functions and fixtures make setup and cleanup easy.
  • It has Django and Flask integrations.
  • It is popular with Python BDD practitioners.
  • Online docs and tutorials are great.
  • It has PyCharm Professional Edition support.

Cons

  • There’s no support for parallel execution.
  • It’s a standalone framework.
  • Sharing steps between feature files can be a bit of a hassle.

pytest-bdd

pytest-bdd is a plugin for pytest that lets users write tests as Gherkin feature files rather than test functions. Because it integrates with pytest, it can work with any other pytest plugins, such as pytest-html for pretty reports and pytest-xdist for parallel testing. It also uses pytest fixtures for dependency injection.

Pros

  • It is fully compatible with pytest and major pytest plugins.
  • It benefits from pytest’s community, growth, and goodness.
  • Fixtures are a great way to manage context between steps.
  • Tests can be filtered and executed together with other pytest tests.
  • Step definitions and hooks are easily shared using conftest.py.
  • Tabular data can be handled better for data-driven testing.
  • Online docs and tutorials are great.
  • It has PyCharm Professional Edition support.

Cons

  • Step definition modules must have explicit declarations for feature files (via “@scenario” or the “scenarios” function).
  • Scenario outline steps must be parsed differently.

radish

radish is a BDD framework with a twist: it adds new syntax to the Gherkin language. Language features like scenario loops, scenario preconditions, and constants make radish’s Gherkin variant more programmatic for test cases.

Pros

  • Gherkin language extensions empower testers to write better tests.
  • The website, docs, and logo are on point.
  • Feature files and step definitions come out very clean.

Cons

  • It’s a standalone framework with limited extensions.
  • BDD purists may not like the additions to the Gherkin syntax.

lettuce

lettuce is another vegetable-themed Python BDD framework that’s been around for years. However, the website and the code haven’t been updated for a while.

Pros

  • Its code is simpler.
  • It’s tried and true.

Cons

  • It lacks the feature richness of the other frameworks.
  • It doesn’t appear to have much active, ongoing support.

freshen

freshen was one of the first BDD test frameworks for Python. It was a plugin for nose. However, both freshen and nose are no longer maintained, and their doc pages explicitly tell readers to use other frameworks.

My Recommendations

None of these frameworks are perfect, but some have clear advantages. Overall, my top recommendation is pytest-bdd because it benefits from the strengths of pytest. I believe pytest is one of the best test frameworks in any language because of its conciseness, fixtures, assertions, and plugins. The 2018 Python Developers Survey showed that pytest is, by far, the most popular Python test framework, too. Even though pytest-bdd doesn’t feel as polished as behave, I think some TLC from the open source community could fix that.

Here are other recommendations:

  • Use behave if you want a robust, clean experience with the largest community.
  • Use pytest-bdd if you need to integrate with other plugins, already have a bunch of pytest tests, or want to run tests in parallel.
  • Use radish if you want more programmatic control of testing at the Gherkin layer.
  • Don’t use lettuce or freshen.

Web Element Locators for Test Automation

Do you want a full course? Check out Web Element Locator Strategies on Test Automation University!

If you do any Web UI test automation (like with Selenium WebDriver), then you probably spend a large chunk of your test development time finding elements on a page, like buttons, inputs, and divs. Finding the right elements, however, can be challenging, especially when they lack unique IDs or class names. This guide will show you how to locate any Web element like a pro.

What are Web elements?

A Web element is an individual entity rendered on a Web page. Everything a user sees on a Web page (and even some things they don’t see) are elements: title headers, okay buttons, input fields, text areas, and more. Elements are specified in HTML by tag name, attributes, and contents. They may also have child elements, such as a table containing rows. CSS may be applied to elements to style them with colors, sizes, position, etc. Programming languages typically access Web elements as nodes in the Document Object Model (DOM).

What are Web element locators?

Web elements and locators are two different things. A Web element locator is an object that finds and returns Web elements on a page using a given query. In short, locators find elements.

Why are locators needed? As human users, we interact with Web pages visually: We look, scroll, click, and type through a browser. However, test automation interacts with Web pages programmatically: it needs a coded way to find and manipulate those same elements. Traditional automation won’t “look” at the page like a human* – it will search the DOM instead.

(*Newer automation technologies enable visual testing, which will be discussed later in this article.)

Selenium WebDriver separates the concerns of element location and interaction. WebDriver calls for these two concerns are frequently written back-to-back:

// WebDriver example: typing a search phrase at www.google.com
// This code is written in C#, but other language bindings offer equivalent calls

// First, element location
IWebElement searchField = driver.FindElement(By.Name("q"));

// Second, element interaction
searchField.SendKeys("panda");

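WebDriver provides the following locator query types using “By”:

  • By.Id
  • By.Name
  • By.ClassName
  • By.CssSelector
  • By.XPath
  • By.LinkText
  • By.PartialLinkText
  • By.TagName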

Which one is best? We’ll discuss that below.

Locators may also return multiple elements, or none at all! For example:

// Get the list of results from a Google search
// Using "FindElements" will return a list of all elements found in order
// Using "FindElement" would return the first element found (or throw an exception if no elements were found)
IList<IWebElement> results = driver.FindElements(By.CssSelector("div.r"));

// The "Should()" assertion syntax comes from the FluentAssertions library
results.Count.Should().BeGreaterThan(0);

Large test frameworks often use design patterns for structuring locators and interactions. The Page Object Model organizes locators and action methods together in classes by page or component. However, I strongly recommend the Screenplay Pattern over page objects because its pieces are more reusable and scalable. Whatever the pattern, locators are needed.

How do I find elements?

Elements can be a hassle to find when writing locators for test automation. To simplify my workflow, I use Google Chrome’s Developer Tools side-by-side with my IDE.

To inspect any Web page in Chrome, simply right-click anywhere on the page and select “Inspect”.

Voila! DevTools will open. For finding Web elements, we want to use the Elements tab.

Visually pinpointing an element is easy. Click the “select” tool in the upper-left corner of the DevTools pane. (It looks like a square with a cursor on it.) The icon should turn blue.

Then, move the cursor to the desired element on the page. You will see each element highlighted in different colors as the mouse moves over. The corresponding HTML source code in the Elements tab will simultaneously be highlighted, too. Nice! Click on the desired element to set the highlighting so that it won’t disappear when you move the cursor elsewhere.

From here, you can check out the element’s tag, classes, attributes, contents, parents, and children.

How do I write good locators?

Finding the element is half the battle. Forming a unique locator query is the other half. If a locator is too broad, then it could return false positives. However, if a locator is too specific, then it could be susceptible to break whenever the DOM changes, and it could also be difficult for others to read. The best philosophy is this: Write the simplest locator query that uniquely identifies the target element(s).

My locator query type order-of-preference is:

  1. ID (if unique)
  2. Name (if unique)
  3. Class name
  4. CSS Selector
  5. XPath without text or indexing
  6. Link text / partial link text
  7. XPath with text and/or indexing

Unique IDs, names, and class names make locators super easy to write: queries are short and don’t need extra anchors. Always encourage developers on the team to use unique identifiers like class names for all elements. However, many elements do not have them, which means locators must fall back on more complicated CSS selectors and XPaths (*shiver*). Whenever this happens, here’s some advice:

  • Use parents as anchors if they have unique identifiers.
    • CSS selector example: “#some-list > li”
    • XPath example: “//ul[@id='some-list']/li”
  • Avoid XPaths that use text or indexing if possible.
    • Bad example: “//div[3]//span[text()='hello']”
    • Those tend to be the most brittle checks.
  • Use the “contains” function when checking for classes in XPath.
    • Example: “//div[contains(@class, 'some-class')]”
    • Elements frequently have more than one class.
    • “contains” will check a substring instead of the full class string.
    • However, be careful because “some-class2” would be matched!
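
To see why the substring check over-matches, here’s a small sketch in plain Python that mimics the two XPath predicates on a class attribute string. Both function names are made up for illustration. The second one mirrors a common XPath 1.0 idiom for matching a whole class token: “//div[contains(concat(' ', normalize-space(@class), ' '), ' some-class ')]”.

```python
def xpath_contains_class(class_attr, name):
    """Mimic contains(@class, name): a plain substring check."""
    return name in class_attr


def xpath_has_class_token(class_attr, name):
    """Mimic contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    # normalize-space trims the ends and collapses runs of whitespace
    padded = " " + " ".join(class_attr.split()) + " "
    return f" {name} " in padded


if __name__ == "__main__":
    attr = "btn some-class2"
    print(xpath_contains_class(attr, "some-class"))   # substring check
    print(xpath_has_class_token(attr, "some-class"))  # whole-token check
```

The padded version is wordier, so it’s probably only worth the extra characters when similar class names actually coexist on the page. A CSS selector like “div.some-class” matches whole class tokens without any of this.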

Always test locators, too. Syntax errors and false positives happen frequently. Chrome DevTools makes testing locators easy. Simply hit Ctrl-F on the Elements tab and then paste the locator query into the finder field. DevTools will highlight all the matching elements in order. Spiffy!
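
Match counts can also be checked in code. The sketch below is only an illustration under loose assumptions: it uses Python’s standard xml.etree.ElementTree module, which supports a limited subset of XPath and needs well-formed markup, so it is no substitute for testing against the live page. The PAGE snippet and the count_matches helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page snippet
PAGE = """
<html>
  <body>
    <ul id="some-list">
      <li>bamboo</li>
      <li>panda</li>
    </ul>
    <ul id="other-list">
      <li>koala</li>
    </ul>
  </body>
</html>
"""


def count_matches(markup, xpath):
    """Return how many elements the query matches in the given markup."""
    root = ET.fromstring(markup)
    return len(root.findall(xpath))


if __name__ == "__main__":
    # Too broad: picks up list items from both lists
    print(count_matches(PAGE, ".//li"))
    # Anchored on the parent's unique ID
    print(count_matches(PAGE, ".//ul[@id='some-list']/li"))
```

A count higher than expected hints at a locator that is too broad, and a count of zero usually means a typo or a wrong anchor.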

Sometimes, when I can’t figure out why a locator isn’t working for a test case, I’ll do the following:

  1. Run the test case with debugging from my IDE.
  2. Set a break point on the locator.
  3. Wait for the test case to stop at the break point.
  4. Enter DevTools on the active Chrome window.
  5. Check the DOM and test the locators on the live page.

What if my tests are flaky?

Web UI testing is roundly criticized for being “flaky” because tests often crash for unexpected reasons. However, much of the unreliability people hit with Web UI testing (and often with Selenium WebDriver itself) comes from the fact that all Web interactions inherently pose race conditions. The automation and the browser execute independently, so interactions must be synchronized with page state. Otherwise, WebDriver will throw exceptions for timeouts, stale elements, and elements not found. Many times, these issues happen intermittently, so they can be difficult to trace and resolve.

The best way to avoid race conditions is this: Always wait for an element to exist before interacting with it. This may seem basic, but it’s easy to overlook. Selenium WebDriver packages all offer some sort of WebDriverWait object that will force the driver to wait for a given condition to be true before proceeding. The easiest way to check if an element exists is to check if the list of elements returned by a FindElements (plural) call is non-empty. Adding another call for each interaction may feel burdensome, but design patterns within well-designed frameworks (like the Screenplay Pattern) can make these checks happen automatically.
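
Under the hood, a wait like this is just a polling loop. Here’s a rough sketch of the idea in plain Python. It is not Selenium’s WebDriverWait: the wait_until helper and the FakeDriver class are made-up stand-ins so that the example runs without a browser.

```python
import time


def wait_until(condition, timeout=15.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakeDriver:
    """Hypothetical stand-in for a browser: the element shows up on a later lookup."""

    def __init__(self, appears_after=3):
        self.lookups = 0
        self.appears_after = appears_after

    def find_elements(self, locator):
        # Like the plural "find elements" call: an empty list means "not there yet"
        self.lookups += 1
        return ["<button>"] if self.lookups >= self.appears_after else []


if __name__ == "__main__":
    driver = FakeDriver()
    found = wait_until(lambda: driver.find_elements("#that-button"),
                       timeout=2, poll_interval=0.01)
    print(found, driver.lookups)
```

An empty list is falsy, so the non-empty check from the plural call works directly as the wait condition. When the element never shows up, the loop fails with a timeout error instead of an intermittent “not found” somewhere later in the test.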

Another good practice is this: Always fetch fresh elements. Sometimes, automation will first get some elements and then use a second query to get more elements. Or, in the case of the Page Object Factory (which should never be used because, bluntly, its design is terrible), elements are fetched once when the page object is constructed and referenced thereafter. No matter which way, the longer a Web element object exists, the more prone it is to become stale and cause exceptions. I’ve seen elements turn stale inexplicably even when they still seem to be on the page, too. Always get an element in the moment when it is needed. That way, it can’t go stale!
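
One way to make fresh fetches the default is to store locators, never elements. The sketch below shows the idea in plain Python with a fake page whose element references go stale after every re-render. Every class name here is hypothetical, and StaleElementError merely stands in for WebDriver’s stale element exception.

```python
class StaleElementError(Exception):
    """Hypothetical stand-in for WebDriver's stale element exception."""


class FakeElement:
    def __init__(self, page, version):
        self.page = page
        self.version = version  # DOM version this reference was fetched from

    def click(self):
        # A reference from an older DOM version no longer points at a live node
        if self.version != self.page.version:
            raise StaleElementError("element reference is stale")
        return "clicked"


class FakePage:
    """Hypothetical page whose DOM is rebuilt on every re-render."""

    def __init__(self):
        self.version = 0

    def rerender(self):
        self.version += 1

    def find_element(self, locator):
        return FakeElement(self, self.version)


class SearchPage:
    """Page object that stores the locator, never the element."""

    BUTTON = "#that-button"

    def __init__(self, page):
        self.page = page

    def click_button(self):
        # Fetch a fresh element at the moment of use
        return self.page.find_element(self.BUTTON).click()


if __name__ == "__main__":
    page = FakePage()
    cached = page.find_element("#that-button")
    page.rerender()
    print(SearchPage(page).click_button())  # works: fetched after the re-render
    try:
        cached.click()
    except StaleElementError as error:
        print(error)
```

The cached reference breaks as soon as the page changes, while the page object keeps working because it holds nothing but a query string.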

Want some helpful tips for clicking tricky elements? Check out this article: Clicking Web Elements with Selenium WebDriver.

How can AI help Web UI testing?

Several new AI-based projects/products aim to improve automated Web UI testing over traditional methods:

  • Applitools extends Selenium WebDriver automation with checks for nontrivial visual differences.
  • Testim can automatically heal locators whenever they break, avoiding test flakiness due to front-end changes.
  • Mabl is an assistant that will learn and rerun tests that developers teach it without writing any code.
  • Test.ai runs common user tests like login, searching, and shopping on mobile apps based on what its AI has learned from several other apps.
  • Rainforest QA uses crowdsourcing plus AI to run manual tests specified by a team almost like they are automated.

Test Automation University also offers a free course on using AI for element selection: AI for Element Selection: Erasing the Pain of Fragile Test Scripts.

Many AI testing tools definitely add value, but keep in mind that, under the hood, locators are still used somewhere.