
The Airing of Grievances: Selenium WebDriver

Selenium WebDriver is the de facto standard for Web UI automation. It’s a great tool, but like anything good, it can also be misused. And that’s where I have grievances. I got a lot of problems with Selenium WebDriver abuses, and now you’re gonna hear about it!

WebDriver “Unit Tests”

“WebDriver unit tests” are like square circles – by definition, they are contradictions in terms. In my book, a unit test must be white box, meaning it has direct access to the product code. However, Web UI tests using WebDriver are inherently black box tests because they interact with an actively running website. Thus, they must be above-unit tests by definition. Don’t call them unit tests!

Making Every Test a Web Test

NO! The Testing Pyramid is vital to a healthy overall testing strategy. Web tests are great because they test a website in the ways a user would interact with it, but they have a significant cost. As compared to lower-level tests, they are more fragile, they require more development resources, and they take much more time to run. Browser differences may also affect testing. Furthermore, problems in lower level components should be caught at those lower levels! Sure, HTTP 400s and 500s will appear at the web app layer, but they would be much faster to find and fix with service layer tests. Different layers of testing mitigate risk at their optimal returns-on-investment.

No WebDriver Cleanup

Every WebDriver instance spawns a new system process for “driving” web browser interactions. When the test automation process completes, the WebDriver process may not necessarily terminate with it. It is imperative that test automation quits the WebDriver instance once testing is complete. Make sure cleanup happens even when abortive exceptions occur! Otherwise, zombie WebDriver processes may linger on the test machine, causing any number of problems: locked files and directories, high memory usage, wasted CPU cycles, and blocked network ports. These problems can cripple a system and even break future test runs, especially on shared testing machines (like Jenkins nodes). Please, only you can stop the zombie apocalypse – always quit WebDriver instances!
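One way to guarantee cleanup is to wrap driver creation in a context manager so that quit always runs, even when a test aborts mid-way. This is only a sketch: it uses a stand-in FakeDriver class so it can run anywhere, but in real code the factory would be something like selenium.webdriver.Chrome.

```python
from contextlib import contextmanager

class FakeDriver:
    """Stand-in for a real WebDriver (e.g., selenium.webdriver.Chrome)."""
    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True

@contextmanager
def managed_driver(factory=FakeDriver):
    """Yield a driver and guarantee quit() runs, even on abortive exceptions."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # runs whether the test passes, fails, or blows up
```

With pytest, the same guarantee is usually achieved with a fixture whose teardown calls driver.quit().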

Using “Close” Instead of “Quit”

Regardless of programming language, the WebDriver class has both “close” and “quit” methods. “Close” will close the current browser tab or window, while “quit” will close all windows and terminate the WebDriver process. Make sure to quit during final cleanup. Doing only a close may result in zombie WebDriver processes. It’s a rookie mistake.

Not Optimizing Setup/Cleanup with Service Calls

Web tests are notoriously slow. Whenever you can speed them up, do it! Some tests can be optimized by preparing initial state with service calls. For example, let’s say a user visiting a car dealership website needs to have favorite cars pre-selected for a comparison page test. Rather than navigating to a bunch of car pages and clicking a “favorite” icon, make a setup routine that calls a service to select favorites. Not all tests can do this sort of optimization, but definitely do it for those that can!
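A setup routine along these lines might look like the sketch below. The endpoint path and client are hypothetical – the point is that one service call replaces many slow UI clicks. The service argument can be any client with a post(path, json=...) method, such as a requests.Session pointed at the dealership API.

```python
class FavoritesSetupError(Exception):
    """Raised when test setup via the service call fails."""

def preselect_favorites(service, user_id, car_ids):
    """Prepare initial state via a service call instead of UI navigation.

    'service' is any client with a post(path, json=...) method.
    The /users/{id}/favorites endpoint is a hypothetical example.
    """
    response = service.post(f"/users/{user_id}/favorites",
                            json={"car_ids": car_ids})
    if response.status_code != 200:
        raise FavoritesSetupError(f"setup failed: HTTP {response.status_code}")
    return response
```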

Web Elements with No ID

Developers, we need to talk – give every significant element a unique ID. PLEASE! WebDriver calls are so much easier to write and so much more robust to run when locator queries can use IDs instead of CSS selectors or XPaths. Let’s pick ID names during our Three Amigos meetings so that I can program the tests while you develop the features. Determining which elements are important should be easy based on our wireframes. You will save us automators so much time and frustration, since we won’t need to dig through HTML and wonder why our XPaths don’t work.

Changing Web Elements Without Warning

Hey, another thing, developers – don’t change the web page structure without telling us! WebDriver locator queries will break if you change the web elements. Even a seemingly innocuous change could wipe out hundreds of tests. Automation effort is non-trivial. Changes must be planned and sized with automation considerations in mind.

Not Using the Page Object Model

The Page Object Model is a widely-used design pattern for modeling a web page (or components on a web page) as an object in terms of its web elements and user interactions with it. It abstracts Web UI interactions into a common layer that can be reused by many different tests. (The Screenplay pattern, also good, is an evolution of the Page Object Model.) Not using the Page Object Model is Selenium suicide. It will result in rampant code duplication.

Demonizing XPath

XPaths have long been criticized for being slower than CSS selectors. That claim is outdated baloney – in many cases, XPaths outperform CSS selectors. Another common complaint is that XPath syntax is more complicated than CSS selector syntax. Honestly, I think they’re about the same in terms of learning curve. XPaths are also more powerful than CSS selectors because they can uniquely pinpoint any element on the page.

Inefficient Web Element Access

Web element IDs make access extremely efficient. However, when IDs are not provided, other locator query types are needed. It is always better to use locator queries to pinpoint elements, rather than to get a list of elements (or even a parent/child chain) to traverse using programming code. For example, I often see code reviews in which an XPath returns a list of results with text labels, and then the programming code (C# or Java or whatever) has a for loop that iterates over each element in the list and exits when the element with the desired label is found. Just add "[text()='desired text']" or "[contains(text(), 'desired text')]" to the XPath! Use locator queries for all they're worth.
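The difference can be sketched without a browser at all. The example below uses Python's standard-library ElementTree on hypothetical markup; in Selenium, the direct query would be the XPath "//a[text()='Truck']", which ElementTree's limited XPath dialect spells as [.='Truck'].

```python
import xml.etree.ElementTree as ET

# A fragment like what a car listing page might render (hypothetical markup).
html = "<div><a>Sedan</a><a>Truck</a><a>Coupe</a></div>"
root = ET.fromstring(html)

# Inefficient: fetch every element, then filter in programming code.
found = None
for link in root.findall(".//a"):
    if link.text == "Truck":
        found = link
        break

# Efficient: let the locator query do the filtering itself.
direct = root.find(".//a[.='Truck']")
```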

Interacting with Web Elements Before the Page is Ready

Web UI test automation is inherently full of race conditions. Make sure the elements are ready before calling them, or else face a bunch of “element not found” exceptions. Use WebDriver waits for efficient waiting. Do not use hard sleeps (like Java’s Thread.sleep).

Untuned Timeouts

WebDriver calls need timeouts, or else they could hang forever if there is a problem. (Check online docs for default timeout values.) Timeout values ought to be tuned appropriately for different test environments and different websites. Timeouts that are too short will unnecessarily abort tests, while timeouts that are too long will lengthen precious test runtime.

The Airing of Grievances: Test Automation Code

More grievances! Test code ought to be developed with the same high standards as product code, but often it is neglected. That bothers me so much, and it costs teams a significant amount of time and money through its bad consequences. I got a lot of problems with bad test automation code, and now you’re gonna hear about it!

Copypasta

Code duplication is code cancer. It is particularly rampant in test automation because test steps are often repetitive. That’s no excuse, however, to allow it. Use better programming practices, or face my code review rejections!

Hard-Coded Configuration Data

Automation should be able to run on any environment without hassle. Don’t hard-code URLs, usernames, passwords, and other config-specific values! Read them in as inputs or config files. Nothing is more frustrating than switching environments but – surprise! – not running with the correct values.
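A minimal sketch of reading environment-specific values from a config file, assuming a hypothetical naming scheme like config.test.json and config.prod.json. The environment name comes from an input (here, an environment variable) rather than being baked into the code.

```python
import json
import os

def load_config(env=None, config_dir="."):
    """Load environment-specific values instead of hard-coding them.

    Assumes files named like config.test.json (hypothetical scheme).
    The environment can be passed in or taken from the TEST_ENV variable.
    """
    env = env or os.environ.get("TEST_ENV", "test")
    path = os.path.join(config_dir, f"config.{env}.json")
    with open(path) as f:
        return json.load(f)
```

Switching environments then means changing one input value, not editing source code.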

Re-reading Inputs for Every Single Test

Config files and input values should not change during a test suite run. So, don’t repeatedly read them! That’s inefficient. Read them once, and hold them in memory in a centrally-accessible location.

Leaving Temporary Files or Settings

Clean up your mess! Don’t leave temporary files or settings after a test completes. Not doing proper cleanup can harm future tests. Files can also build up and eventually run the system out of storage space. It’s a Boy Scout principle: Leave No Trace!

Incorrect Test Results

Test results need integrity to be trusted. Watch out for false positives and false negatives. If there’s a problem during a test, handle it – don’t ignore it. Don’t skip assertion calls. Business decisions are made based on test results.

Uninformative Failure Messages

Here’s one: “Failed.” That tells me nothing. Why did the test fail? Was something missing from a web page? Was there an unexpected exception? Did a service call return 500? TELL ME! Otherwise, I need to waste time rerunning and even debugging the test. The failure may also be intermittent. Tell me what the problem is right when it happens.

Making Assertion Calls in Automation Support Code

Every automated test case must make assertions to verify goodness or badness of system conditions. But, as a best practice, those assertion calls should be written only in the test case code – not in support code like page objects or framework-level classes. Assertions are test-specific, while support code is generic. Support code should simply give system state, while test cases use assertions to validate system state. If support code has assertion calls, then it is far less reusable and traceable.
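The separation looks like this in sketch form: the page object reports state, and the test case makes the assertion. The FakeDriver class is a stand-in so the example runs without a browser; in practice, the driver would be a real WebDriver instance.

```python
class SearchResultPage:
    """Page object sketch: it reports state but never asserts.

    'driver' can be any object with a 'title' attribute
    (a real WebDriver in practice).
    """
    def __init__(self, driver):
        self._driver = driver

    @property
    def title(self):
        return self._driver.title  # just report system state

class FakeDriver:
    """Stand-in driver so this sketch runs anywhere."""
    title = "Search Results"

def test_search_result_title():
    page = SearchResultPage(FakeDriver())
    # The assertion lives in the test case, not the page object.
    assert page.title == "Search Results"
```

Because the page object makes no assertions, any test can reuse it to check titles for any expected value.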

Treating Boolean Methods like Assertions

This is an all-too-common rookie mistake. Boolean methods simply return a true/false value. They do not (or, at least according to the previous grievance, should not) contain assertion calls. However, I’ve seen many code reviews where programmers call a Boolean method but do nothing with the result. Whoops!
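The mistake in sketch form – the support method and session data here are hypothetical:

```python
def is_logged_in(session):
    """Boolean support method: returns a value, makes no assertion."""
    return session.get("user") is not None

session = {"user": None}

# Rookie mistake: the result is computed and thrown away.
is_logged_in(session)  # no assert -- the test "passes" even though this is False!

# Correct: assert on the returned value.
logged_in = is_logged_in({"user": "pat"})
assert logged_in
```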

Burying Exceptions

Exceptions mean problems. Test automation is meant to expose problems. Don’t blindly catch all exceptions, log a message, and continue on with a test like nothing’s wrong. Let the exception rise! Most modern test frameworks will catch all exceptions at the test case level, automatically report them as failures, and safely move on to the next test. There is no need to catch exceptions yourself if they ought to abort the test, anyway – that adds a lot of unnecessary code. Catch exceptions only if they are recoverable!

No Automatic Recovery

True automation means no manual intervention. An automation framework should have built-in ways to automatically recover from common problems. For example, if a network connection breaks, automatically attempt to reconnect. If a test fails, retry it. If too many failures happen, either end the run early or pause for some time. And make sure recovery mechanisms are built into the framework, rather than appearing as copypasta throughout the codebase. The purpose of retries (especially for tests) is not to blindly run tests until they turn from red to green, but rather to overcome unexpected interruptions and also to gather more data to better understand failure reasons. Interruptions will happen – handle them when they do. And be sure to log any failures that do happen so that the reasons may be investigated.
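A framework-level retry helper might look like the sketch below: retry only known-recoverable exceptions, log every failure so it can be investigated, and re-raise once the attempt limit is reached. The recoverable exception types and delay are assumptions to tune per framework.

```python
import logging
import time

def retry(action, attempts=3, delay=0.0, recoverable=(ConnectionError,)):
    """Run 'action', retrying only recoverable failures up to 'attempts' times.

    Every failure is logged for later investigation; the final failure
    is re-raised so the test still reports the problem.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except recoverable as exc:
            logging.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            time.sleep(delay)
```

Tests and other framework code call retry(...) instead of copy-pasting their own retry loops.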

No Logging

Please leave a helpful trail of log messages. Tracing through automation code can be difficult after a failure, especially when you’re not the original author. Logging helps everyone quickly get to the root cause of failures. It also really helps to create reproduction scenarios. If I can copy the log from a failing test into a bug report and be done, then AMEN for an easy triage!

Hard Waits

Ain’t nobody got time for that! Automation needs to be fast, because time is money. Forcing Thread.sleep (or other such equivalent in your language of choice) shows either laziness or desperation. It unnecessarily wastes precious runtime. It also creates race conditions: what if the wait time ends before the thing is ready? Always use “smart” waits – actively and repeatedly check the system at short intervals for the thing to be ready, and abort after a healthy timeout value.
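A "smart" wait can be sketched as a small polling helper: check the condition at short intervals, return as soon as it holds, and abort with a timeout otherwise. (Selenium's WebDriverWait does essentially this for browser conditions; the generic version below works for anything.)

```python
import time

def wait_until(condition, timeout=10.0, interval=0.5):
    """Smart wait: poll at short intervals instead of one hard sleep.

    Returns the condition's truthy result as soon as it holds,
    or raises TimeoutError after 'timeout' seconds.
    """
    end = time.monotonic() + timeout
    while time.monotonic() < end:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout} seconds")
```

Unlike a hard sleep, this wastes at most one polling interval of runtime and never proceeds before the thing is actually ready.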

The Airing of Grievances: Test Automation Process

Test automation is a big deal for me. It is my chosen specialty within the broad field of software. When I see things done wrong, or when people just don’t get what it’s about, it really grinds my gears. I got a lot of problems with bad test automation processes, and now you’re gonna hear about it!

Saying “They’re Just Test Scripts”

Test automation is not just a bunch of test scripts: it is a full technology stack that requires design, integration, and expertise. Test automation development is a discipline. Saying it is just a bunch of test scripts is derogatory and demeaning. It devalues the effort test automation requires, which can lead to poor work item sizings and an “us vs. them” attitude between developers and QA.

Not Applying the Same Software Development Best Practices

Test automation is software development, and all the same best practices should thus apply. Write clean, well-designed code. Use version control with code reviews. Add comments and doc. Don’t get lazy because “they’re just test scripts” – wrong attitude!

Lip Service

Don’t say automation is important but then never dedicate time or resources to work on it. Don’t leave automation as a task to complete only if there’s time after manual testing is done. Make automation a priority, or else it will never get done! I once worked on an Agile team where automation framework stories were never included into the sprint because there weren’t “enough points to go around.” So, even though this company hired me explicitly to do test automation, I always got shunted into a manual testing scramble every sprint.

Confusing Test Automation with Deployment Automation

Test automation is the automation of test scenarios (for either functional or performance tests). Deployment automation is the automation of product build distribution and installation in a software environment. They are two different concerns. Cucumber is not Ansible.

Forcing 100% Automation

Some people think that automation will totally eliminate the need for any manual testing. That’s simply not true. Automation and manual testing are complementary. Automation should handle deterministic scenarios with a worthwhile return-on-investment to automate, while manual testing should focus on exploratory testing, user experience (UX), and tests that are too complicated to automate properly. Forcing 100% automation will make teams focus on metrics instead of quality and effectiveness.

Downsizing or Eliminating QA

Test automation doesn’t reduce or eliminate the need for testers. On the contrary, test automation requires even more advanced skills than old-school manual testing. There is still a need for testing roles, whether as a dedicated position or as shared collectively by a bunch of developers. The work done by that testing role just becomes more technical.

Saying Product Code and Test Code Must Use the Same Language

For unit tests, this is true, but for above-unit tests, it is simply false. Any general-purpose programming language could be used to automate black-box tests. For example, Python tests could run against an Angular web app. A team may choose to use the same language for product and test code for simplicity, but it is not mandatory.

Not Classifying Test Types

Not all tests are the same. You can play buzzword bingo with all the different test type names: unit, integration, end-to-end, functional, performance, system, contract, exploratory, stress, limits, longevity, test-to-break, etc. Different tests need different tools or frameworks. Tests should also be written at the appropriate Testing Pyramid level.

Assuming All Tests Are Equal

Again, not all tests are the same, even within the same test type. Tests vary in development time, runtime, and maintenance time. It’s not accurate to compare individuals or teams merely on test numbers.

Not Prioritizing Tests to Automate

There’s never enough time to automate everything. Pick the most important ones to automate first – typically the core, highest-priority features. Don’t dilly-dally on unimportant tests.

Not Running Tests Regularly

Automated tests need to run at least once daily, if not continuously in Continuous Integration. Otherwise, the return-on-investment is just too low to justify the automation work. I once worked on a QA team that would run an automated test only once or twice during a 2-year release! How wasteful.

HP Quality Center / ALM

This tool f*&@$#! sucks. Don’t use it for automated tests. Pocket the money and just develop a good codebase with decent doc in the code, and rely upon other dashboards (like Jenkins or Kibana) for test reporting.

The Airing of Grievances: Software Development

Let the Airing of Grievances series begin: I got a lot of problems with bad software development practices, and now you’re gonna hear about it!

Duplicate Code

Code duplication is code cancer. It spreads unmaintainable code snippets throughout the code base, making refactoring a nightmare and potentially spreading bugs. Instead, write a helper function. Add parameters to the method signature. Leverage a design pattern. I don’t care if it’s test automation or any other code – Don’t repeat yourself!

Typos

Typos are a hallmark of carelessness. We all make them from time to time, but repeated appearances devalue craftsmanship. Plus, they can be really confusing in code that’s already confusing enough! That’s why I reject code reviews for typos.

Global Non-Constant Variables

Globals are potentially okay if their values are constant. Otherwise, just no. Please no. Absolutely no for parallel or concurrent programming. The side effects! The side effects!

Using Literal Values Instead of Constants

Literal values get buried so easily under lines and lines of code. Hunting down literals when they must be changed can be a terrible pain in the neck. Just put constants in a common place.

Avoiding Dependency Management

Dependency managers like Maven, NuGet, and pip automatically download and link packages. There’s no need to download packages yourself and struggle to set your PATHs correctly. There’s also no need to copy their open source code directly into your project. I’ve seen it happen! Just use a dependable dependency manager.

Breaking Established Design Patterns

When working within a well-defined framework, it is often better to follow the established design patterns than to hack your own way of doing things into it. Hacking will be difficult and prone to breakage with framework updates. If the framework is deficient in some way, then make it better instead of hacking around it.

No Comments or Documentation

Please write something. Even “self-documenting” code can use some context. Give a one-line comment for every “paragraph” of code to convey intent. Use standard doc formats like Javadoc or docstrings that tie into doc tools. Leave a root-level README file (and write it in Markdown) for the project. Nobody will know your code the way you think they should.

Not Using Version Control

With all the risks of bugs and typos in an inherently fragile development process, you would think people would always want code protection, but I guess some just like to live on the edge. A sense of danger must give them a thrill. Or, perhaps they don’t like dwelling on things of the past. That is, until they break something and can’t figure it out, or their machine’s hard drive crashes with no backup. Give that person a Darwin Award.

No Unit Tests

If no version control wasn’t enough, let’s just never test our code and make QA deal with all the problems! Seriously, though, how can you sleep at night without unit tests? If I were the boss, I’d fire developers for not writing unit tests.

Permitting Tests to Fail

Whenever tests fail, people should be on point immediately to open bug reports, find the root cause, and fix the defect. Tests should not be allowed to fail repeatedly day after day without action. At the very least, flag failures in the test report as triaged. Complacency degrades quality.

Blue Balls

Jenkins uses blue balls because the creator is Japanese. I love Japanese culture, but my passing tests need to be green. Get the Green Balls plugin. Nobody likes blue balls.

Skipping Code Reviews

Ain’t nobody got time for that? Ain’t nobody got time when your code done BROKE cuz nobody caught the problem in review! Take the time for code reviews. Teams should also have policy for quick turnaround times.

Not Re-testing Code After Changes

People change code all the time. But people don’t always re-test code after changes are made. Why not? That’s how problems happen. I’ve seen people post updated code to a PR that wouldn’t even compile. Please, check yourself before you wreck yourself.

“It Works on My Machine”

That’s bullfeathers, and you know it! Nobody cares if it works on your machine if it doesn’t work on other machines. Do the right thing and go help the person, instead of blowing them off with this lame excuse.

Arrogance

Nobody wants to work with a condescending know-it-all. Don’t be that person.

Software Testing Lessons from Luigi’s Mansion

How can lessons from Luigi’s Mansion apply to software testing and automation?

Luigi’s Mansion is a popular Nintendo video game series. It’s basically Ghostbusters in the Super Mario universe: Luigi must use a special vacuum cleaner to rid haunted mansions of the ghosts within. Along the way, Luigi also solves puzzles, collects money, and even rescues a few friends. I played the original Luigi’s Mansion game for the Nintendo GameCube when I was a teenager, and I recently beat the sequel, Luigi’s Mansion: Dark Moon, for the Nintendo 3DS. They were both quite fun! And there are some lessons we can apply from Luigi’s Mansion to software testing and automation.

#1: Exploratory Testing is Good

The mansions are huge – Luigi must explore every nook and cranny (often in the dark) to spook ghosts out of their hiding places. There are also secrets and treasure hiding in plain sight everywhere. Players can easily miss ghosts and gold alike if they don’t take their time to explore the mansions thoroughly. The same is true with testing: engineers can easily miss bugs if they overlook details. Exploratory testing lets engineers freely explore the product under test to uncover quality issues that wouldn’t turn up through rote test procedures.

#2: Expect the Unexpected

Ghosts can pop out from anywhere to scare Luigi. They also can create quite a mess of the mansion – blocking rooms, stealing items, and even locking people into paintings! Software testing is full of unexpected problems, too. Bugs happen. Environments go down. Network connections break. Even test automation code can have bugs. Engineers must be prepared for any emergency regardless of origin. Software development and testing is about solving problems, not about blame-games.

#3: Don’t Give Up!

Getting stuck somewhere in the mansion can be frustrating. Some puzzles are small, while others may span multiple rooms. Sometimes, a player may need to backtrack through every room and vacuum every square inch to uncover a new hint. Determination nevertheless pays off when puzzles get solved. Software engineers must likewise never give up. Failures can be incredibly complex to identify, reproduce, and resolve. Test automation can become its own nightmare, too. However, there is always a solution for those tenacious (or even hardheaded) enough to find it.

 

Want to see what software testing lessons can be learned from other games? Check out Gotta Catch ’em All! for Pokémon!

Gherkin Syntax Highlighting in Chrome

Google Chrome is one of the most popular web browsers around. Recently, I discovered that Chrome can edit and display Gherkin feature files. The Chrome Web Store has two useful extensions for Gherkin: Tidy Gherkin and Pretty Gherkin, both developed by Martin Roddam. Together, these two extensions provide a convenient, lightweight way to handle feature files.

Tidy Gherkin

Tidy Gherkin is a Chrome app for editing and formatting feature files. Once it is installed, it can be reached from the Chrome Apps page (chrome://apps/). The editor appears in a separate window. Gherkin text is automatically colored as it is typed. The bottom preview pane automatically formats each line, and clicking the “TIDY!” button in the upper-left corner will format the user-entered text area as well. Feature files can be saved and opened like a regular text editor. Templates for Feature, Scenario, and Scenario Outline sections may be inserted, as well as tables, rows, and columns.

Another really nice feature of Tidy Gherkin is that the preview pane automatically generates step definition stubs for Java, Ruby, and JavaScript! The step def code is compatible with the Cucumber test frameworks. (The Java code uses the traditional step def format, not the Java 8 lambdas.) This feature is useful if you aren’t already using an IDE for automation development.

Tidy Gherkin has pros and cons when compared to other editors like Notepad++ and Atom. The main advantages are automatic formatting and step definition generation – features typically seen only in IDEs. It’s also convenient for users who already use Chrome, and it’s cross-platform. However, it lacks richer text editing features offered by other editors, it’s not extendable, and the step def gen feature may not be useful to all users. It also requires a bit of navigation to open files, whereas other editors may be a simple right-click away. Overall, Tidy Gherkin is nevertheless a nifty, niche editor.


Pretty Gherkin

Pretty Gherkin is a Chrome extension for viewing Gherkin feature files through the browser with syntax highlighting. After installing it, make sure to enable the “Allow access to the file URLs” option on the Chrome Extensions page (chrome://extensions/). Then, whenever Chrome opens a feature file, it should display pretty text. For example, try the GoogleSearch.feature file from my Cucumber-JVM example project, cucumber-jvm-java-example. Unfortunately, though, I could not get Chrome to display local feature files – every time I would try to open one, Chrome would simply download it. Nevertheless, Pretty Gherkin seems to work for online SCM sites like GitHub and BitBucket.

Since Pretty Gherkin is simply a display tool, it can’t really be compared to other editors. I’d recommend Pretty Gherkin to Chrome users who often read feature files from online code repositories.


 

Be sure to check out other Gherkin editors, too!

Unpredictable Test Data

Test data is a necessary evil for testing and automation. It is necessary because tests simply can’t run without test case values, configuration data, and ready state (as detailed in BDD 101: Test Data). It is evil because it is challenging to handle properly. Test data may be even more dastardly when it is unpredictable, but thankfully there are decent strategies for handling unpredictability.

What is Unpredictable Test Data?

Test data is unpredictable when its values are not explicitly the same every time a test runs. For example, let’s suppose we are writing tests for a financial system that must obtain stock quotes. Mocking a stock quote service with dummy predictable data would not be appropriate for true integration or end-to-end tests. However, stock quotes act like random walks: they change values in real time, often perpetually. The name “unpredictable” could also be “non-deterministic” or “uncertain.”

Below are a few types of test data unpredictability:

  • Values may be missing, mistyped, or outside of expected bounds.
  • Time-sensitive data may change rapidly.
  • Algorithms may yield non-deterministic results (like for machine learning).
  • Data formats may change with software version updates.
  • Data may have inherent randomness.

Strategies for Handling Unpredictability

Any test data is prone to be unpredictable when it comes from sources external to the automation codebase. Tests must be robust enough to handle the inherent unpredictability. Below are five strategies for safety and recovery. The main goal is test completion – pass or fail, tests should not crash and burn due to bad test data. When in doubt, skip the test and log warnings. When really in doubt, fail it as a last resort.

Automation’s main goal is to complete tests despite unpredictability in test data.

#1: Make it Predictable

Ask, is it absolutely necessary to fetch data from unpredictable sources? Or can they be avoided by using predictable, fake data? Fake data can be provided in a number of ways, like mocks or database copies. It’s a tradeoff between test reliability and test coverage. In a risk-based test strategy, the additional test coverage may not be worthwhile if all input cases can be covered with fake data. Nevertheless, unpredictable data sometimes cannot or should not be avoided.

#2: Write Defensive Assertions

When reading data, make assertions to guarantee correctness. Assertions are an easy way to abort a test immediately if any problems are found. Assertions could make sure that values are not null, contain all required pieces, and fit the expected format.
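For the stock quote example, defensive assertions might look like the sketch below (the field names are hypothetical):

```python
def validate_quote(quote):
    """Defensive assertions: abort immediately if fetched data is malformed."""
    assert quote is not None, "quote is missing"
    assert "symbol" in quote and "price" in quote, "quote is incomplete"
    assert isinstance(quote["price"], (int, float)), "price is not numeric"
    return quote
```

Any malformed quote aborts the test right away with a message naming the exact problem, instead of causing a confusing failure several steps later.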

#3: Handle Healthy Bounds

Tests using unpredictable data should be able to handle acceptable ranges of values instead of specific pinpointed values. This could mean including error margins in calculations or using regular expressions to match strings. Assertions may need to do some extra preliminary processing to handle ranges instead of singular values. Any anomalies should be reported as warnings.

For the stock quote example, the following would be ways to handle healthy bounds:

  • Abort if the quote value is non-numeric or negative.
  • Warn if the value is $0 or greater than $1M.
  • Continue for values between $0 and $1M.
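The three bounds above can be sketched as a small classifier – the dollar thresholds are the example's assumptions, not universal values:

```python
def check_quote_value(value, low=0, high=1_000_000):
    """Map a fetched quote value onto the abort/warn/continue bounds above."""
    if not isinstance(value, (int, float)) or value < 0:
        return "abort"     # non-numeric or negative: fail fast
    if value == low or value > high:
        return "warn"      # suspicious but not fatal: log and continue
    return "continue"      # healthy range
```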

#4: Scrub the Data

Sometimes, data problems can be “scrubbed” away. Formats can be fixed, missing values can be populated, and given values can be adjusted or filtered. Scrubbing data may not always be appropriate, but if possible, it can mean a test will be completed instead of aborted.
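Continuing the stock quote example, a scrubbing routine might fix formats and fill missing values like this (the field names and defaults are hypothetical):

```python
def scrub_quote(raw):
    """Scrub a raw quote record: normalize formats and fill missing values."""
    symbol = (raw.get("symbol") or "UNKNOWN").strip().upper()
    price_text = str(raw.get("price", "")).strip().lstrip("$")
    price = float(price_text) if price_text else None
    return {"symbol": symbol, "price": price}
```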

#5: Do Retries

Data may need to be fetched again if it isn’t right the first time. Retries are applicable for data that changes frequently or is random. The automation framework should have a mechanism to retry data access after a waiting period. Set retry limits and wait times appropriately – don’t waste too much time. Retries should also be done as close to the point of failure as possible. Retrying the whole test is possible but not as efficient as retrying a single service call.

Final Advice

Unpredictable test data shouldn’t be a show-stopper – it just needs special attention. Nevertheless, try to limit test automation’s dependence on external data sources.