Chrome

Web Element Locators for Test Automation

If you do any Web UI test automation (like with Selenium WebDriver), then you probably spend a large chunk of your test development time finding elements on a page, like buttons, inputs, and divs. Finding the right elements, however, can be challenging, especially when they lack unique IDs or class names. This guide will show you how to locate any Web element like a pro.

What are Web elements?

A Web element is an individual entity rendered on a Web page. Everything a user sees on a Web page (and even some things they don’t see) are elements: title headers, okay buttons, input fields, text areas, and more. Elements are specified in HTML by tag name, attributes, and contents. They may also have child elements, such as a table containing rows. CSS may be applied to elements to style them with colors, sizes, position, etc. Programming languages typically access Web elements as nodes in the Document Object Model (DOM).

What are Web element locators?

Web elements and locators are two different things. A Web element locator is an object that finds and returns Web elements on a page using a given query. In short, locators find elements.

Why are locators needed? As human users, we interact with Web pages visually: We look, scroll, click, and type through a browser. However, test automation interacts with Web pages programmatically: it needs a coded way to find and manipulate those same elements. Traditional automation won’t “look” at the page like a human* – it will search the DOM instead.

(*Newer automation technologies enable visual testing, which will be discussed later in this article.)

Selenium WebDriver separates the concerns of element location and interaction. WebDriver calls for these two concerns are frequently written back-to-back:

// WebDriver example: typing a search phrase at www.google.com
// This code is written in C#, but the calls are the same in any language

// First, element location
IWebElement searchField = driver.FindElement(By.Name("q"));

// Second, element interaction
searchField.SendKeys("panda");

WebDriver provides the following locator query types using “By”:

Which one is best? We’ll discuss that below.

Locators may also return multiple elements, or none at all! For example:

// Get the list of results from a Google search
// Using "FindElements" will return a list of all elements found in order
// Using "FindElement" would return the first element found (or throw an exception if no elements were found)
IList<IWebElement> results = driver.FindElements(By.CssSelector("div.r"));
results.Count.Should().BeGreaterThan(0);

Large test frameworks often use design patterns for structuring locators and interactions. The Page Object Model organizes locators and action methods together in classes by page or component. However, I strongly recommend the Screenplay Pattern over page objects because its pieces are more reusable and scalable. Whatever the pattern, locators are needed.

How do I find elements?

Elements can be a hassle to find when writing locators for test automation. To simplify my work flow, I use Google Chrome’s Developer Tools side-by-side with my IDE. Why choose Chrome?

To inspect any Web page in Chrome, simply right-click anywhere on the page:

Voila! DevTools will open. For finding Web elements, we want to use the Elements tab.

Visually pinpointing an element is easy. Click the “select” tool in the upper-left corner of the DevTools pane. (It looks like a square with a cursor on it.) The icon should turn blue.

Then, move the cursor to the desired element on the page. You will see each element highlighted in different colors as the mouse moves over. The corresponding HTML source code in the Elements tab will simultaneously be highlighted, too. Nice! Click on the desired element to set the highlighting so that it won’t disappear when you move the cursor elsewhere.

From here, you can check out the element’s tag, classes, attributes, contents, parents, and children.

How do I write good locators?

Finding the element is half the battle. Forming a unique locator query is the other half. If a locator is too broad, then it could return false positives. However, if a locator is too specific, then it could be susceptible to break whenever the DOM changes, and it could also be difficult for others to read. The best philosophy is this: Write the simplest locator query that uniquely identifies the target element(s).

My locator query type order-of-preference is:

  1. ID (if unique)
  2. Name (if unique)
  3. Class name
  4. CSS Selector
  5. XPath without text or indexing
  6. Link text / partial link text
  7. XPath with text and/or indexing

Unique IDs, names, and class names make locators super easy to write: queries are short and don’t need extra anchors. Always encourage developers on the team to use unique identifiers like class names for all elements. However, many elements do not have them, which means locators must fall back on more complicated CSS selectors and XPaths (*shiver*). Whenever this happens, here’s some advice:

  • Use parents as anchors if they have unique identifiers.
    • CSS selector example: “#some-list > li”
    • XPath example: “//ul[@id=’some-list’]/li”
  • Avoid XPaths that use text or indexing if possible.
    • Bad example: “//div[3]//span[text()=’hello’]”
    • Those tend to be the most brittle checks.
  • Use the “contains” function when checking for classes in XPath.
    • Example: “//div[contains(@class, ‘some-class’)]”
    • Elements frequently have more than one class.
    • “contains” will check a substring instead of the full class string.
    • However, be careful because “some-class2” would be matched!

Always test locators, too. Syntax errors and false positives happen frequently. Chrome DevTools makes testing locators easy. Simply hit Ctrl-F on the Elements tab and then paste the locator query into the finder field. DevTools will highlight all the matching elements in order. Spiffy!

Sometimes, when I can’t figure out why a locator isn’t working for a test case, I’ll do the following:

  1. Run the test case with debugging from my IDE.
  2. Set a break point on the locator.
  3. Wait for the test case to stop at the break point.
  4. Enter DevTools on the active Chrome window.
  5. Check the DOM and test the locators on the live page.

What if my tests are flaky?

Web UI testing is roundly criticized for being “flaky” because tests often crash for unexpected reasons. However, much of the unreliability people hit with Web UI testing (and often with Selenium WebDriver itself) is that all Web interactions inherently pose race conditions. The automation and the browser execute independently, so interactions must be synchronized with page state. Otherwise, WebDriver will throw exceptions for timeouts, stale elements, and elements not found. Many times, these issues happen intermittently, so they can be difficult to trace and resolve.

The best way to avoid race conditions is this: Always wait for an element to exist before interacting with it. This may seem basic, but it’s easy to overlook. Selenium WebDriver packages all offer some sort of WebDriverWait object that will force the driver to wait for a given condition to be true before proceeding. The easiest way to check if an element exists is to check if the list of elements returned by a FindElements (plural) call is non-empty. Adding another call for each interaction may feel burdensome, but design patterns within well-designed frameworks (like the Screenplay Pattern) can make these checks happen automatically.

Another good practice is this: Always fetch fresh elements. Sometimes, automation will first get some elements and then use a second query to get more elements. Or, in the case of the Page Object Factory (which should never be used because, bluntly, its design is terrible), elements are fetched once when the page object is constructed and referenced thereafter. No matter which way, the longer a Web element object exists, the more prone it is to become stale and cause exceptions. I’ve seen elements turn stale inexplicably even when they still seem to be on the page, too. Always get an element in the moment when it is needed. That way, it can’t go stale!

How can AI help Web UI testing?

Several new AI-based projects/products aim to improve automated Web UI testing over traditional methods:

  • Applitools extends Selenium WebDriver automation with checks for nontrivial visual differences.
  • Testim can automatically heal locators whenever they break, avoiding test flakiness due to front-end changes.
  • Mabl is an assistant that will learn and rerun tests that developers teach it without writing any code.
  • Test.ai runs common user tests like login, searching, and shopping on mobile apps based on what its AI has learned from several other apps.
  • Rainforest QA uses crowdsourcing plus AI to run manual tests specified by a team almost like they are automated.

Many AI testing tools definitely add value, but keep in mind, under the hood, locators are still used somewhere.

Gherkin Syntax Highlighting in Chrome

Google Chrome is one of the most popular web browsers around. Recently, I discovered that Chrome can edit and display Gherkin feature files. The Chrome Web Store has two useful extensions for Gherkin: Tidy Gherkin and Pretty Gherkin, both developed by Martin Roddam. Together, these two extensions provide a convenient, lightweight way to handle feature files.

Tidy Gherkin

Tidy Gherkin is a Chrome app for editing and formatting feature files. Once it is installed, it can be reached from the Chrome Apps page (chrome://apps/). The editor appears in a separate window. Gherkin text is automatically colored as it is typed. The bottom preview pane¬†automatically formats each line, and clicking the “TIDY!” button in the upper-left corner will format the user-entered text area as well. Feature files can be saved and opened like a regular text editor. Templates for Feature, Scenario, and Scenario Outline sections may be inserted, as well as tables, rows, and columns.

Another really nice feature of Tidy Gherkin is that the preview pane automatically generates step definition stubs for Java, Ruby, and JavaScript! The step def code is compatible with the Cucumber test frameworks. (The Java code uses the traditional step def format, not the Java 8 lambdas.) This feature is useful if you aren’t already using an IDE for automation development.

Tidy Gherkin has pros and cons when compared to other editors like Notepad++ and Atom. The main advantages are automatic formatting and step definition generation – features typically seen only in IDEs. It’s also convenient for users who already use Chrome, and it’s cross-platform. However, it lacks richer text editing features offered by other editors, it’s not extendable, and the step def gen feature may not be useful to all users. It also requires a bit of navigation to open files, whereas other editors may be a simple right-click away. Overall, Tidy Gherkin is nevertheless a nifty, niche editor.

This slideshow requires JavaScript.

Pretty Gherkin

Pretty Gherkin is a Chrome extension for viewing Gherkin feature files through the browser with syntax highlighting. After installing it, make sure to enable the “Allow access to the file URLs” option on the Chrome Extensions page (chrome://extensions/). Then, whenever Chrome opens a feature file, it should display pretty text. For example, try the GoogleSearch.feature file from my Cucumber-JVM example project,¬†cucumber-jvm-java-example. Unfortunately, though, I could not get Chrome to display local feature files – every time I would try to open one, Chrome would simply download it. Nevertheless, Pretty Gherkin seems to work for online SCM sites like GitHub and BitBucket.

Since Pretty Gherkin is simply a display tool, it can’t really be compared to other editors. I’d recommend Pretty Gherkin to Chrome users who often read feature files from online code repositories.

This slideshow requires JavaScript.

 

Be sure to check out other Gherkin editors, too!