BDD 101: Test Data

How should test data be handled in a behavior-driven test framework? This is a common question I hear from teams working on BDD test automation. A better question to ask first is: what is test data? This article will explain different types of test data and provide best practices for handling each. The strategies covered here can be applied to any BDD test framework. (Check the Automation Panda BDD page for the full table of contents.)

Types of Test Data

Personally, I hate the phrase “test data” because its meaning is so ambiguous. For functional test automation, there are three primary types of test data:

  1. Test Case Values. These are the input and expected output values for test cases. For example, when testing calculator addition “1 + 2 = 3”, “1” and “2” would be input values, and “3” would be the expected output value. Input values are often parameterized for reusability, and output values are used in assertions.
  2. Configuration Data. Config data represents the system or environment in which the tests run. Changes in config data should allow the same test procedure to run in different environments without making any other changes to the automation code. For example, a calculator service with an addition endpoint may be available in three different environments: development, test, and production. Three sets of config data would be needed to specify URLs and authentication in each environment (the config data), but 1 + 2 should always equal 3 in any environment (the test case values).
  3. Ready State. Some tests require initial state to be ready within a system. “Ready” state could be user accounts, database tables, app settings, or even cluster data. If testing makes any changes, then the data must be reverted to the ready state.

Each type of test data has different techniques for handling it.

Test Case Values

There are four main ways to specify test case values in BDD frameworks, ranging from basic to complex.

In The Specs

The most basic way to specify test case values is directly within the behavior scenarios themselves! The Gherkin language makes it easy – test case values can be written into the plain language of a step, as step parameters, or in Examples tables. Consider the following example:

Scenario Outline: Simple Google searches
  Given a web browser is on the Google page
  When the search phrase "<phrase>" is entered
  Then results for "<phrase>" are shown
  
  Examples: Animals
    | phrase   |
    | panda    |
    | elephant |
    | rhino    |

The test case value used is the search phrase. The When and Then steps both have a parameter for this phrase, which will use three different values provided by the Examples table. It is perfectly suitable to put these test case values directly into the scenario because the values are small and descriptive.

Furthermore, notice how specific result values are not specified for the Then step. Values like “Panda Express” or “Elephant man” are not hard-coded. The step wording presumes that the step definition will have some sort of programmed mechanism for checking that result links relate to the search phrase (likely through regular expression matching).
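
As a rough illustration of that kind of mechanism, here is a minimal Python sketch. The function names and the sample result titles are my own assumptions for illustration; a real step definition would first collect the result text from the page through whatever web driver the framework uses.

```python
import re


def result_matches_phrase(phrase, result_text):
    """Return True if result_text contains phrase as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return pattern.search(result_text) is not None


def all_results_match(phrase, results):
    """Return True only if there is at least one result and every result matches."""
    return bool(results) and all(result_matches_phrase(phrase, text) for text in results)


# Hypothetical result titles, standing in for text scraped from the results page.
titles = ["Panda Express", "Giant panda - Wikipedia", "Kung Fu Panda (2008)"]
assert all_results_match("panda", titles)
assert not result_matches_phrase("panda", "The Elephant Man")
```

The Then step stays declarative, and the matching rule lives in one place where it can be tightened or loosened without touching the Gherkin.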

Key-Value Lookup

Direct specification is great for small sets of simple values, but one size does not fit all needs. Key-value lookups are appropriate when test data is lengthier. For example, I’ve often seen steps like this:

Given the user navigates to "http://www.somewebsite.com/long/path/to/the/profile/page"

URLs, hexadecimal numbers, XML blocks, and comma-separated lists are all the usual suspects. While it is not incorrect to put these values directly into a step parameter, something like this would be more readable:

Given the user navigates to the "profile" page

Or even:

Given the user navigates to their profile page

The automation would store URLs in a lookup table so that these new steps could easily fetch the URL for the profile page by name. These steps are also more declarative than imperative and better resist changes in the underlying environment.
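
A lookup table like that can be very small. The sketch below is only one possible shape: the names BASE_URL, PAGE_PATHS, and url_for are hypothetical, and the "home" entry is made up to show a second key.

```python
# Hypothetical lookup table: page names used in Gherkin steps map to URL paths.
BASE_URL = "http://www.somewebsite.com"
PAGE_PATHS = {
    "home": "/",
    "profile": "/long/path/to/the/profile/page",
}


def url_for(page_name):
    """Return the full URL for a page name, failing loudly for unknown names."""
    if page_name not in PAGE_PATHS:
        raise KeyError(f"No URL registered for page {page_name!r}")
    return BASE_URL + PAGE_PATHS[page_name]


assert url_for("profile") == "http://www.somewebsite.com/long/path/to/the/profile/page"
```

In practice, the base URL would likely come from config data rather than a constant, so that the same page names work in every environment.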

Another way to use key-value lookup is to refer to a set of values by one name. Consider the following scenario for entering an address:

Scenario Outline: Address entry
  Given the profile edit page is displayed
  When the user sets the street address to "<street>"
  And the user sets the second address line to "<second>"  
  And the user sets the city to "<city>"
  And the user sets the state to "<state>"
  And the user sets the zipcode to "<zipcode>"
  And the user sets the country to "<country>"
  And the user clicks the save button
  Then ...

  Examples: Addresses
    | street | second | city | state | zipcode | country |
    ...

An address has a lot of fields. Specifying each in the scenario makes it very imperative and long. Furthermore, if the scenario is an outline, the Examples table can easily extend far to the right, off the page. This, again, is not readable. This scenario would be better written like this:

Scenario Outline: Address entry
  Given the profile edit page is displayed
  When the user enters the "<address-type>" address
  And the user clicks the save button
  Then ...

  Examples: Addresses
    | address-type |
    | basic        |
    | two-line     |
    | foreign      |

Rather than specifying all the values for different addresses, this scenario names the classifications of addresses. The step definition can be written to link the name of the address class to the desired values.
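
One way a step definition might make that link is sketched below. The Address class, the ADDRESSES table, and every address value in it are invented for illustration; only the three address-type names come from the scenario above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street: str
    second: str
    city: str
    state: str
    zipcode: str
    country: str


# Illustrative values only: each key is an address class named in the Examples table.
ADDRESSES = {
    "basic": Address("123 Main St", "", "Springfield", "IL", "62701", "USA"),
    "two-line": Address("123 Main St", "Apt 4B", "Springfield", "IL", "62701", "USA"),
    "foreign": Address("10 Example Road", "", "London", "", "SW1A 1AA", "United Kingdom"),
}


def address_for(address_type):
    """Return the address values for a named address class."""
    if address_type not in ADDRESSES:
        raise ValueError(f"Unknown address type: {address_type!r}")
    return ADDRESSES[address_type]


assert address_for("two-line").second == "Apt 4B"
```

The step definition for entering the address would call address_for with the step parameter and then fill in each field from the returned object.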

Data Files

Sometimes, test case values should be stored in data files apart from the specs or the automation code. Reasons could be:

  • The data is simply too large to reasonably write into Gherkin or into code.
  • The data files may be generated by another tool or process.
  • The values are different between environments or other circumstances.
  • The values must be selected or switched at runtime (without re-compiling code).
  • The files themselves are used as payloads (ex: REST request bodies or file upload).

Scenario steps can refer to data files using the key-value lookup mechanisms described above. Lightweight, text-based file formats like CSV, XML, or JSON work the best. They can be parsed easily and efficiently, and changes to them can easily be diff’ed. Microsoft Excel files are not recommended because they have extra bloat and cannot be easily diff’ed line-by-line. Custom text file formats are also not recommended because custom parsing is an extra automation asset requiring unnecessary development and maintenance. Personally, I like using JSON because its syntax is concise and its parsing tools seem to be the simplest in most programming languages.
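
To show how little code a JSON data file needs, here is a small Python sketch. The file name addresses.json, its contents, and the load_test_values helper are assumptions made up for the demo, which writes a throwaway file so that it can run on its own.

```python
import json
import tempfile
from pathlib import Path


def load_test_values(path, key):
    """Read a JSON data file and return the values stored under the given key."""
    with open(path, encoding="utf-8") as data_file:
        data = json.load(data_file)
    return data[key]


# Demo: write a temporary data file, then look up one named set of values.
with tempfile.TemporaryDirectory() as tmp_dir:
    data_path = Path(tmp_dir) / "addresses.json"
    data_path.write_text(
        json.dumps({"basic": {"street": "123 Main St", "city": "Springfield"}}),
        encoding="utf-8",
    )
    basic = load_test_values(data_path, "basic")

assert basic["city"] == "Springfield"
```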

External Sources

An external dependency exists when the data for test case values exists outside of the automation code base. For example, test case values could reside in a database instead of a CSV file, or they could be fetched from a REST service instead of a JSON file. This would be appropriate if the data is too large to manage as a set of files or if the data is constantly changing.

As a word of caution, external sources should be used only if absolutely necessary:

  1. External sources introduce an additional point-of-failure. If that database or service goes down, then the test automation cannot run.
  2. External sources degrade performance. It is slower to get data from a network connection than from a local machine.
  3. Test case values are harder to audit. When they are in the specs, the code, or data files, history is tracked by version control, and any changes are easy to identify in code reviews.
  4. Test case values may be unpredictable. The automation code base does not control the values. Bad values can fail tests.

External sources can be very useful, if not necessary, for performance / stress / load / limits testing, but they are not necessary for the vast majority of functional testing. It may be convenient to mock external sources either with a mocking framework like Mockito or with a dummy service.
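
Mockito is a Java library; the same idea in Python can be sketched with the standard library's unittest.mock. The client object and its get_phrases method below are hypothetical stand-ins for whatever service or database wrapper a real project would have.

```python
from unittest.mock import Mock


def fetch_search_phrases(client):
    """Fetch phrases through a client object and normalize them for use in tests."""
    return [phrase.strip().lower() for phrase in client.get_phrases()]


# Replace the real (hypothetical) client with a mock that returns fixed values.
fake_client = Mock()
fake_client.get_phrases.return_value = [" Panda ", "Elephant", "rhino"]

phrases = fetch_search_phrases(fake_client)
assert phrases == ["panda", "elephant", "rhino"]
fake_client.get_phrases.assert_called_once_with()
```

With the mock in place, the test case values are predictable again, and the tests no longer fail just because the external source is down.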

Configuration Data

Config data pertains to the test environments, not the test cases. Test automation should never contain hard-coded values for config data like URLs, usernames, or passwords. Rather, test automation should read config data when it launches tests and make references to the required values. This should be done in Before hooks and not in Gherkin steps. In this way, automated tests can run on any configuration, such as different test environments before being released to production.

Config data can be stored in data files or accessed through some other dependency. (Read the previous section for pros and cons of those approaches.) The config to use should be somehow dynamically selectable when tests run. For example, the path to the config file to use could be provided as a command line argument to the test launch command.
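
Here is one way the command line approach might look in Python with argparse. The --config option name and both helper functions are assumptions for this sketch; real BDD frameworks each have their own way of passing custom options to a test run.

```python
import argparse
import json


def parse_config_path(argv):
    """Pull the --config option out of the launch arguments, ignoring all others."""
    parser = argparse.ArgumentParser(description="Test launcher (sketch)")
    parser.add_argument("--config", required=True, help="Path to a JSON config file")
    args, _other_args = parser.parse_known_args(argv)
    return args.config


def load_config(path):
    """Read the selected JSON config file into a dictionary."""
    with open(path, encoding="utf-8") as config_file:
        return json.load(config_file)


assert parse_config_path(["--config", "config/test.json"]) == "config/test.json"
```

Switching environments then means changing one argument in the launch command, with no changes to the specs or the automation code.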

Config data can be used to select test values to use at runtime. For example, different environments may need different test value data files. Conversely, scenario tagging can control what parts of config data should be used. For example, a tag could specify a username to use for the scenario, and a Before hook could use that username to fetch the right password from the config data.
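
The tag example could be sketched as follows. The @user:<name> tag convention, the shape of the config dictionary, and the placeholder password are all assumptions of mine; a Before hook would call something like credentials_for with the scenario's tags and the loaded config.

```python
USER_TAG_PREFIX = "user:"


def credentials_for(tags, config):
    """Find a tag like '@user:admin' and return (username, password) from config data."""
    for tag in tags:
        name = tag.lstrip("@")
        if name.startswith(USER_TAG_PREFIX):
            username = name[len(USER_TAG_PREFIX):]
            return username, config["users"][username]["password"]
    raise LookupError("Scenario has no @user:<name> tag")


# Stand-in for config data that would normally be read from a file at launch.
sample_config = {"users": {"admin": {"password": "placeholder-admin-secret"}}}

username, password = credentials_for(["@smoke", "@user:admin"], sample_config)
assert (username, password) == ("admin", "placeholder-admin-secret")
```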

For efficiency, only the necessary config data should be accessed or read into memory. In many cases, fetching the config data should also be done once globally, rather than before each test case.
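
A simple way to read config data once per run is to cache the loader. This sketch uses functools.lru_cache from the Python standard library; the get_config name, the file name, and the base_url value are made up for the demo.

```python
import functools
import json
import os
import tempfile


@functools.lru_cache(maxsize=None)
def get_config(path):
    """Read and parse the config file once; later calls with the same path reuse it."""
    with open(path, encoding="utf-8") as config_file:
        return json.load(config_file)


# Demo with a throwaway config file.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "test-env.json")
    with open(config_path, "w", encoding="utf-8") as config_file:
        json.dump({"base_url": "http://localhost:8080"}, config_file)

    first = get_config(config_path)
    second = get_config(config_path)  # served from the cache, no second file read

assert first is second
```

One caveat with this design: every caller shares the same dictionary object, so tests should treat it as read-only.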

Ready State

All scenarios have a starting point, and often, that starting point involves data. Setup operations must bring the system into the ready state, and cleanup operations must return the system to the ready state. Test data should leave no trace – temporary files should be deleted and records should be reverted. Otherwise, disk space may run out or duplicate records may fail tests. Maintaining the ready state between tests is necessary for true test independence.

During the Test Run

Simple setup and cleanup operations may be done directly within the automation. For example, when testing CRUD operations, records must be created before they can be retrieved, updated, or deleted. Setup would create a record, and cleanup would guarantee the record’s deletion. If the setup is appropriate to mention as part of the behavior, then it should be written as Given steps. This is true of CRUD operations: “Given a record has been created, When it is deleted, …”. If multiple scenarios share this same setup, then those Given steps should be put into a Background section.

However, sometimes setup details are not pertinent to the behavior at hand. For example, perhaps fresh authentication tokens must be generated for those CRUD calls. Those operations should be handled in Before hooks. The automation will take care of it, while the Gherkin steps can focus exclusively on the behavior.

No matter what, After hooks must do cleanup. It is incorrect to write final Then steps to do cleanup. Then steps should verify outcomes, not take more actions. Plus, the final Then steps will not be run if the test has a failure and aborts!
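
The difference is easy to see in a toy runner. The sketch below is not how any particular BDD framework is implemented; run_scenario and the in-memory records set are inventions that mimic the guarantee After hooks provide, which is that cleanup runs even when a step fails.

```python
def run_scenario(steps, after_hook):
    """Run steps in order; always run the after hook, even if a step raises."""
    context = {}
    try:
        for step in steps:
            step(context)
    finally:
        after_hook(context)


records = set()  # stands in for records stored in the system under test


def create_record(context):
    records.add("rec-1")
    context["record_id"] = "rec-1"


def failing_check(context):
    raise AssertionError("simulated test failure")


def delete_created_record(context):
    # discard() makes cleanup safe even if the record was never created
    records.discard(context.get("record_id"))


try:
    run_scenario([create_record, failing_check], delete_created_record)
except AssertionError:
    pass  # the scenario failed, but cleanup still ran

assert records == set()
```

Had the deletion been written as one more step after failing_check, it would never have run, and the leftover record could have broken the next test.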

External Preparation

Some data simply takes too long to set up fresh for each test launch. Consider complicated user accounts or machine learning data: these are things that can be created outside of the test automation. The automation can simply presume that they exist as a precondition. These types of data require tool automation to prepare. Tool automation could involve a set of scripts to load a database, make a bunch of service calls, or navigate through a web portal to update settings. Automating this type of setup outside of the test automation enables engineers to more easily replicate it across different environments. Then, tests can run in much less time because the data is already there.

However, this external preparation must be carefully maintained. If any damage is done to the data, then test case independence is lost. For example, deleting a user account without replacing it means that subsequent test runs cannot log in! Along with setup tools, it is important to create maintenance tools to audit the data and make repairs or updates.
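
At its simplest, an audit tool compares what should exist against what does exist. The account names and the audit_ready_state helper below are hypothetical; a real tool would query the system for the existing accounts and then repair whatever the report flags.

```python
def audit_ready_state(expected_accounts, existing_accounts):
    """Compare expected and existing account names and report the differences."""
    expected = set(expected_accounts)
    existing = set(existing_accounts)
    return {
        "missing": sorted(expected - existing),
        "unexpected": sorted(existing - expected),
    }


report = audit_ready_state(
    expected_accounts=["admin", "basic-user", "locked-user"],
    existing_accounts=["admin", "locked-user", "leftover-temp-user"],
)
assert report == {"missing": ["basic-user"], "unexpected": ["leftover-temp-user"]}
```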

Advice for Any Approach

Use the minimal amount of test data necessary to test the functionality of the product under test. More test data requires more time to develop and manage. As a corollary, use the simplest approach that can pragmatically handle the test data. Avoid external dependencies as much as possible.

To minimize test data, remember that BDD is specification by example: scenarios should use descriptive values. Furthermore, variations should be reduced to input equivalence classes. For example, in the first scenario example on this page, it would probably be sufficient to test only one of those three animals, because the other two animals would not exhibit any different searching behavior.

Finally, be cautioned against randomization in test data. Functional tests are meant to be deterministic – they must always pass or fail consistently, or else test results will not be reliable. (Not only could this drive a tester crazy, but it would also break a continuous integration system.) Using equivalence classes is the better way to cover different types of inputs. Whenever values must be unique, generate them with a deterministic counting mechanism, such as an incrementing sequence number, rather than with random values.
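
For the unique-value advice above, a counter can be as simple as this Python sketch built on itertools.count. The make_unique_namer helper and the name format are my own assumptions.

```python
import itertools


def make_unique_namer(prefix):
    """Return a function that yields prefix-0001, prefix-0002, ... on successive calls."""
    sequence = itertools.count(1)

    def next_name():
        return f"{prefix}-{next(sequence):04d}"

    return next_name


next_username = make_unique_namer("testuser")
assert next_username() == "testuser-0001"
assert next_username() == "testuser-0002"
```

Because the sequence restarts with every test run, the same names are generated each time, which keeps results repeatable as long as cleanup returns the system to its ready state.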

For handling unpredictable test data, check out Unpredictable Test Data.

31 comments

  1. Hello @Automation Panda,

    Do you know if it’s possible to use multiple data tables in one scenario ?

    For example : (Jbehave syntax with external data table file)

    Scenario Outline: Address entry
    Given the profile edit page is displayed
    When the user enters the "<address-type>" address
    And the user clicks the save button
    Then …

    Examples:
    data/user.table
    data/address.table

    In my case, the user.table is used by multiple scenarios, which is why I don’t want to duplicate the user test data in each table of each scenario.

    Thanks for your answer


    1. Hi tag,

      In Cucumber’s Gherkin language standard, a Scenario Outline may have multiple examples tables, but Cucumber does not support external .table files. That goes against the principle of specification-by-example.

      I have not personally used JBehave, but based on JBehave’s grammar (http://jbehave.org/reference/latest/grammar.html), it looks like JBehave does not support multiple examples tables for individual scenario outlines. But who knows? Please give it a try and reply here.

      Sincerely,
      Andy


      1. Hi @Automation Panda !

        Sorry for the long response time! In the end, it seems that JBehave does not support multiple examples tables.

        By the way, I found some annoying behaviors like :
        – if you have 3 scenarios in 1 story file, we will need to declare the datatable for each scenario, even if the 3 scenarios are using the same table…
        – an opened issue https://jbehave.atlassian.net/browse/JBEHAVE-1006 : the Before/After in JBehave story can not be parameterized with table…


  2. Hello,

    I have some scenarios that require dates. For example, when I am on a certain tab, by default I should see a “from” date of today and a “to” date three days later. If these dates are not changed, they will populate the “from” and “to” fields of another tab. How can I use a datatable with these characteristics?

    Thank you in advance 🙂


    1. Describe it naturally! Since the initial date is meant to be today’s date, do not hard-code values for it. Instead, describe it in Gherkin as “today’s date” or “three days from today’s date”, and do some smart checking in the step definitions to make sure the dates that appear align to the system clock.


      1. Hi Andy!

        First, let me thank you for such a quick reply 🙂

        I have another doubt if you can help me. Do good practices say anything about using a datatable as “environment”?

        Let me explain better: I can access a certain area from a button on the homepage of my website, and I can also access that same area from two other pages. Is it correct to use a datatable with these 3 “sources”?

        Thank you so much!
        Filipa Rodrigues


      2. Hi Filipa,

        Yes, BUT.

        You could take the approach you mentioned – parameterizing the different sources. However, be careful to be behavioral. I would recommend writing behavior scenarios to make sure that the three sources work minimally, but then pick one to do deeper testing. Check out the “What is a Behavior?” section on this page: https://automationpanda.com/2017/01/25/bdd-101-introducing-bdd/. Notice how entries to a point can be separated from behavior at the point.

        Andy


  3. Hi Andy nice article,

    I have a question,

    Is it a good practice to use a DataTable to handle only one object?

    Let’s say, I want to create a test to add a user:

    Gherkin:
    ———–
    Given the admin is logged in
    When a new user is added
    | firstName | lastName | username | psw |
    | John      | Doe      | jdoe     | 123 |
    Then the new user should be in the list

    Java code:
    ————–
    When("a new user is added", (DataTable newUsersTable) -> {
        List<User> users = newUsersTable.asList(User.class);
        someHelperMethodToAddTheUser(users.get(0));
        // ...
    });

    In the When step, I’m using DataTables, Lists, and a lot of resources to handle only one single object. I think this is not the right way, but I’m not sure how to deal with this kind of scenario.

    Is it better to have a parameter for each field?


    1. You could use a table for one object if it has many columns, but I recommend it only if you cannot phrase it in plain language well. For example:

      When the user “jdoe” with password “123” is added for “John” “Doe”

      Here’s another aspect to consider: is it necessary to specify these values at the Gherkin level? You could just put them in the underlying step definition code. Lots of table columns typically indicates either imperative scenarios or lack of behavior focus. Plus, I’m skittish about hard-coding passwords (even if for testing purposes) into the Gherkin steps.


  4. Hello,
    nice article!
    Do you have an example of a framework where the key refers to a CSV file? I need to do this, but I haven’t found anything like it. We want to avoid using a path and instead store the file in test data.


    1. Hi Anara! Thanks for reading. I don’t have an example readily available, and I don’t know any frameworks off the top of my head that will do that sort of thing other than pytest-bdd. If you want to put data into a CSV file, then you’ll most likely need to DIY.


  5. Hello, can you please demo an example of how we can read data into the Examples table from an external source, for example Excel?


    1. In most BDD frameworks, Scenario Outline Examples tables cannot be completely replaced by Excel spreadsheet files. There would need to be an Examples table with one row per variation. The row could contain some sort of key to look up data in an Excel spreadsheet file. However, this is probably not a good practice if there are lots of rows. Gherkin frameworks are not well-suited for data-driven testing.

      Furthermore, I do not have an example for this at my disposal. Sorry!


  6. This is a great article. However I am surprised that the word Persona has not come up in this discussion.
    In my research in how to handle BDD test data or setup data I have come across the concept of grouping types of data around a Persona and passing the persona name into a step definition.

    Something like:
    Given a ‘contract admin’ is logged in
    And a ‘basic purchase contract’ exists
    When the ‘contract admin’ closes the contract
    Then the contract status is ‘CLOSED’

    To achieve that, I would build a mechanism to map persona names to either a specific set up configuration data or a setup process.

    ContractPersona.create(‘basic purchase contract’)

    Thanks


  7. Thank you for the wonderful problem. I have a particular question that I often run into while writing BDD acceptance criteria.

    Sometimes system behavior depends on 5-7 different variables, with each variable having two states that define an outcome.

    Before you say to split it into many scenarios: I want to mention that those 5-7 variables (with two states per variable) are very cohesively related to one user scenario, that is, only one When statement.

    Sure, if you insist, I can go ahead and split it into approximately 3 features with 10 scenarios per feature, but then the context is lost and objectivity is lost.

    So what I do instead is write the BDD scenario and say to follow the following business rules. Then I list the business rules in an Excel decision table.

    Is this approach right, or do you have some other suggestion? Kindly guide me, please.
