
Testing Web Services with Karate

Karate is a relatively new open source framework for testing Web services. Even though Karate is written in Java, its main value proposition is that testers don’t need to do any Java programming in order to write fully automated tests. Instead, testers use a Gherkin-like language with steps for making requests and validating responses. It’s like Cucumber with out-of-the-box Web API steps! There are a bunch of other nifty features, too.

This article is my quick-start guide for Karate. As a prerequisite, make sure you understand how Web services work (like REST APIs). Knowing BDD will also help.

My System

Since Karate is an open-source Java project, it can run almost anywhere. Here’s my system config:

  • macOS 10.13.6 (High Sierra)
  • Java 1.8.0_191
  • Apache Maven 3.6.0
  • Karate 0.9.0

Warning: I initially attempted to run my Karate project using Java 11.0.1, but I repeatedly hit SSL handshake exceptions despite trying many fixes. Downgrading to Java 8 fixed the errors. See Issue #617.

Project Setup

I created my Karate project using the Maven archetype since I am quite familiar with Maven. I named my project “firstchop”:

mvn archetype:generate \
  -DarchetypeGroupId=com.intuit.karate \
  -DarchetypeArtifactId=karate-archetype \
  -DarchetypeVersion=0.9.0 \
  -DgroupId=com.automationpanda \
  -DartifactId=firstchop

The directory layout was standard for Java Maven / Cucumber-JVM projects. (In fact, Karate was based on Cucumber-JVM until version 0.8.0.) Interestingly, the Karate docs recommend placing feature files under src/test/java instead of src/test/resources.


The Karate docs recommend using Eclipse or IntelliJ IDEA for developing Karate tests. Both IDEs offer support for JUnit and Cucumber, which Karate can leverage not only for editing but also for running tests. I’d probably use IntelliJ IDEA for serious testing.

However, for my initial exploration, I chose to use Visual Studio Code. Why?

  • It’s fast and easy.
  • It has good support for Java and Gherkin.
  • It has a file explorer and an integrated terminal.

Here’s what my Karate project looked like inside Visual Studio Code. Nice!

The Examples

The archetype project included example tests in the users.feature file. The first scenario from the file, copied below, tests getting users from the JSONPlaceholder REST API:

Feature: sample karate test script

Background:
* url 'https://jsonplaceholder.typicode.com'

Scenario: get all users and then get the first user by id

Given path 'users'
When method get
Then status 200

* def first = response[0]

Given path 'users', first.id
When method get
Then status 200

Anyone familiar with Cucumber will immediately recognize the Given/When/Then format for scenarios. However, Karate’s standard steps make the language more powerful than raw Gherkin:

  • Given steps build requests
  • When steps make request calls
  • Then steps validate responses
  • Catch-all steps (*) provide additional directives, like setting variables

All steps are quite concise. The Java implementation is essentially hidden from the tester (unless, of course, they want to plunge into the framework’s lower levels). Furthermore, this scenario shows how to use data from one response as the input for a second request.

Running Tests

Since I chose to use Maven and Visual Studio Code, the easiest way to run tests without any additional configuration was through the command line using “mvn test”. The example tests came with an ExamplesTest.java file that will run all feature files in the package when it is discovered during Maven’s “test” phase. The project defaulted to using JUnit 4, but JUnit 5 is also supported.

Running the tests will print many lines to the console. Every request and response will be printed. Below is the tail end of a successful run for users.feature:

$ mvn test
feature: classpath:examples/users/users.feature
scenarios:  2 | passed:  2 | failed:  0 | time: 1.5232
HTML report: (paste into browser to view) | Karate version: 0.9.0

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.09 sec

Results :

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  5.635 s
[INFO] Finished at: 2018-12-09T23:32:29-05:00
[INFO] ------------------------------------------------------------------------

Karate can generate helpful test reports, too. The project generates JUnit reports by default, but other report formats like Cucumber reports are also possible.


The JUnit HTML report shows a step-by-step log for each scenario.


Full requests and responses are automatically logged, which is great for debugging.


Failures are visually easy to identify. Above, I hacked the example to deliberately fail.

My Scenarios

Seeing examples is great, but I wanted to write my own scenarios with a real Web service for some hands-on experience. So, I wrote a test for Recipe Puppy, a search engine that provides links to recipes for input ingredients:

Feature: Recipe Puppy

Background:
* url 'http://www.recipepuppy.com'

Scenario Outline: Get a recipe for <ingredient>

Given path 'api'
And params {i: '<ingredient>'}
When method get
Then status 200
And match response contains {results: '#array'}
And match response.results[*] contains
    """
    {
      title: '#string',
      href: '#string',
      ingredients: '#regex <ingredient>',
      thumbnail: '#string'
    }
    """

Examples:
    | ingredient |
    | tomato     |
    | pepperoni  |
    | cheese     |


Here are a few things I learned while writing this new feature file:

  • Scenario Outlines are supported, but data-driven features are preferred.
  • JSON objects can be used as step arguments or as variables.
  • Matching syntax is simple but sophisticated.
  • Markers like “#array” support fuzzy matching when specific values are unknown.
  • Multi-line JSON expressions need block quotes.
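
To make the fuzzy-matching idea concrete, here is a toy matcher in plain Java. It is only a sketch of the concept and not Karate's implementation: the class name FuzzyMatch and its matches method are hypothetical, it supports just a few of the markers, and it assumes a #regex pattern must match the whole string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class FuzzyMatch {

    // Returns true when 'actual' satisfies 'expected'. A string starting with '#'
    // is treated as a marker; anything else is compared for exact equality.
    static boolean matches(Object actual, Object expected) {
        boolean isMarker = expected instanceof String && ((String) expected).startsWith("#");
        if (!isMarker) {
            return expected == null ? actual == null : expected.equals(actual);
        }
        String marker = (String) expected;
        if (marker.equals("#string")) return actual instanceof String;
        if (marker.equals("#number")) return actual instanceof Number;
        if (marker.equals("#array")) return actual instanceof List;
        if (marker.startsWith("#regex ")) {
            // Assumption of this sketch: the pattern must match the whole string.
            String pattern = marker.substring("#regex ".length());
            return actual instanceof String && Pattern.matches(pattern, (String) actual);
        }
        throw new IllegalArgumentException("unsupported marker: " + marker);
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("a", "b");
        System.out.println(matches(results, "#array"));
        System.out.println(matches("Tomato Soup", "#string"));
        System.out.println(matches("tomato, basil", "#regex .*tomato.*"));
        System.out.println(matches(42, "#string"));
    }
}
```

The point is that the expected value describes a shape rather than a specific value, so the same check keeps working when the service returns different data.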

This new scenario ran just as successfully as the examples!

Standalone Execution

Even though writing tests using Karate’s domain-specific language does not require Java development skills, setting up the full Karate project does. Thankfully, Karate provides a standalone JAR that can run feature files without any other dependencies or configuration. It simply takes in paths to feature files, runs them, and generates Cucumber reports. The standalone JAR would be a good option for testers who don’t have strong programming skills.


This was the Cucumber report generated by running my Recipe Puppy test with the standalone JAR.

Other Features

Despite being a fairly young project (with a GitHub creation date of February 7, 2017), Karate is full of nifty features that I have yet to try:

  • Parallel test execution
  • A mock servlet
  • A UI for visually debugging scripts
  • Reading files of many types into variables
  • Calling a feature file from another feature file
  • Calling JavaScript code
  • Calling Java code
  • Using scenarios as Gatling performance tests

Project contributors are also experimenting to extend Karate to test browser, mobile, and desktop UIs using Selenium WebDriver.


Overall, Karate is a great tool for testing Web services. It handles all of the programming implementation details so that testers can focus more on testing. Its syntax is concise, clear, and versatile. Its native support for JSON objects makes request and response handling feel natural. The GitHub docs are on-point. Anyone testing REST APIs should give it serious consideration.

Even though Karate uses Gherkin-like syntax, it is not truly a behavior-driven test framework. Instead, it simply leverages the strengths of Cucumber – readability, reusability, and tooling – to make automated Web service testing easier. I have long stated that one of the best solutions to test automation challenges is to create a domain-specific testing language that automatically handles low-level details. (See Behavior-Driven Blasphemy.) The Karate project validates my claim. Karate’s DSL acts much more like a testing language than a business-oriented specification language. Its Gherkin keywords simply provide structure and familiarity to its steps. And it’s perfectly okay that Karate isn’t “pure BDD” – just ask the project’s creator.

With that said, I would argue that a tester probably needs basic programming skills to be successful with Karate. Execution needs Java setup no matter what, and feature files are really just fancy test scripts. And Web service APIs are inherently code-y.

I look forward to seeing how the Karate project grows, especially if/when WebDriver-based steps are added to the language.



Behavior-Driven Blasphemy

This is my 100th post on Automation Panda! I’m thrilled to see how much this blog has grown and how many people it has helped. For such a monumental occasion, I have chosen to voice a rather controversial opinion about test automation.

Behavior-driven development seems to be the software testing buzzword of the decade. What started as a refinement of test-driven development by developers in Europe and the UK quickly became the big process fad of the 2010s. The Cucumber project (now 10 years old) developed or inspired Gherkin-based test automation frameworks in all the major programming languages. Companies started requiring Given-When-Then format for acceptance criteria and test scenarios. Three Amigos meetings became standard calendar fixtures during sprints. Organizations that once undertook “Agile transformations” now have similar initiatives for BDD. For better or worse, BDD exists and cannot be ignored.

The dogmatic benefits of BDD are better collaboration and automation. However, leaders frequently insist that Gherkin-style test frameworks add value only when paired with practices like Example Mapping. “BDD is a process, not a tool,” is a common mantra. “Otherwise, the Gherkin just gets in the way.” Although I wholeheartedly agree that behavior-driven practices add significant value to the development process, I nevertheless espouse a rather blasphemous opinion:

BDD test automation frameworks are better than traditional frameworks for black box functional testing even when BDD processes are not followed.

What Exactly Are You Saying?

My claim is that behavior-driven test frameworks like Cucumber, SpecFlow, and behave are significantly better than traditional xUnit-style frameworks for testing live features. For example, I would rather use SpecFlow than NUnit for testing a Web app with Selenium WebDriver, whether or not the other two Amigos are with me. The resulting automation code has better structure, readability, and reusability.

I’m not saying that teams shouldn’t do BDD practices, and I’m not saying that the Three Amigos should be separated. Collaboration is key to success, and BDD really helps. Example Mapping is one of the most useful practices a development team can do. I’m also not saying that BDD frameworks should be used for all testing purposes – they are poorly suited for unit testing and for performance testing.


I find myself very lonely in this opinion. BDD leaders repeatedly insist that BDD is not about testing and automation.

The most outspoken BDDers (mostly coalescing around the Cucumber community) have largely moved their focus to the collaboration benefits, almost forsaking the automation benefits. (This may not necessarily be true, but it appears that way based on the literature and materials floating on the Web.) That outlook is somewhat disingenuous because the main tools supporting BDD are, in fact, test frameworks.

BDD also has outspoken opponents – it’s love or hate. I’ve personally spoken with several engineers who despise Gherkin-based frameworks. “I can see how it would be valuable when a whole team embraces behavior-driven practices,” many have told me, “but otherwise, the Gherkin layer just gets in the way of automation.” I’ve heard it called “plaster” and “garbage.” Engineers just want to code their tests. And code should always be readable, right?


Testing is an inherently opinionated space. People can never seem to agree on things.

The Bigger Picture

Test automation must be developed regardless of any specific development practices, and its architecture must stand firmly in its own right. Unfortunately, both sides miss the bigger picture:

The best solution for test automation is a domain-specific language.

A domain-specific language (DSL) is a programming language with a purpose. It is designed to handle very specific needs, rather than general-purpose programming. For example:

  • SQL is a DSL for database queries.
  • XPath is a DSL for finding elements in an XML document.
  • YAML is a DSL for object serialization.

Gherkin is also a DSL – for behavior specification.

Domain-specific languages naturally suit test automation due to the clear difference between test cases and test code. Test cases are procedures that exercise product behavior. Anyone can write a test case. They are dictated or explained in plain language. Test code, however, is the software implementation of test cases. Test code handles function calls, logging, exceptions, and all those other little programming details that help run tests. A test automation DSL separates those concerns: test cases are written in a special language, and the interpreter handles repetitive, low-level details. Some type of extension can handle product-specific interactions. The purpose of a language is to effectively express intention – and the intention is to test the product.

To truly achieve an optimal solution, however, the DSL and its interpreter must be treated as part of the automation software, just like the test cases and extensions. Remember, a language’s interpreter is just another piece of software. The interpreter is part of the separation of concerns and the single responsibility principle. Concerns that would typically be handled by classes and functions in traditional test code should be moved to the interpreter. For example, the interpreter should automatically log every test case step, rather than forcing the author to write explicit logging statements.
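
As a rough illustration of that separation, here is a minimal interpreter sketch in plain Java. Everything in it is hypothetical (the MiniInterpreter class, its define and run methods, and the toy "add" steps); it is not how Cucumber, Karate, or DS work internally. It only shows test cases as plain lines of text, extensions registered as step definitions, and logging done by the interpreter so that test authors never write a log statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MiniInterpreter {

    // One step definition: a pattern for the step text plus the code that runs it.
    private static class StepDef {
        final Pattern pattern;
        final Consumer<Matcher> action;

        StepDef(Pattern pattern, Consumer<Matcher> action) {
            this.pattern = pattern;
            this.action = action;
        }
    }

    private final List<StepDef> stepDefs = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    // Extension point: product-specific steps are registered here.
    void define(String regex, Consumer<Matcher> action) {
        stepDefs.add(new StepDef(Pattern.compile(regex), action));
    }

    // Runs each line of a test case. Logging happens here, not in the test case.
    void run(List<String> testCase) {
        for (String line : testCase) {
            try {
                execute(line);
                log.add("PASS  " + line);
            } catch (RuntimeException | AssertionError e) {
                log.add("FAIL  " + line + " (" + e.getMessage() + ")");
                throw e;
            }
        }
    }

    // Finds the first definition whose pattern matches the whole line and runs it.
    private void execute(String line) {
        for (StepDef def : stepDefs) {
            Matcher m = def.pattern.matcher(line);
            if (m.matches()) {
                def.action.accept(m);
                return;
            }
        }
        throw new IllegalStateException("no step definition");
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        int[] total = {0};
        MiniInterpreter dsl = new MiniInterpreter();
        dsl.define("add (\\d+)", m -> total[0] += Integer.parseInt(m.group(1)));
        dsl.define("total should be (\\d+)", m -> {
            int expected = Integer.parseInt(m.group(1));
            if (total[0] != expected) throw new AssertionError("total was " + total[0]);
        });
        dsl.run(Arrays.asList("add 2", "add 3", "total should be 5"));
        dsl.log().forEach(System.out::println);
    }
}
```

The test case itself is just three lines of text. The step definitions are reusable and take parameters, and the pass/fail log comes for free.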

When I worked at NetApp years ago, I implemented a DSL to test platform-level features of our operating system. I called it DS, short for “Design Steps” from HP ALM (though not without an affinity for the Nintendo DS). NetApp’s entire test automation code was developed in Perl at the time, so I implemented the DS interpreter in Perl to reuse existing libraries. DS test cases were typically only a dozen lines long each, and DS expressions could call specially-written Perl modules directly for complete extensibility. During the first big release using DS, my team saved countless hours of automation development time as compared to the previous release while delivering a higher number of tests. I also did this before I had ever heard of BDD.

Unfortunately, most teams have neither the time to develop their own testing DSL nor the understanding of compiler theory to build it right. And when they are given such a language, they typically limit themselves to the provided implementation instead of taking ownership to extend the language for their needs.



Who Truly Misunderstands Gherkin?

Enter Gherkin: the world’s first major general-purpose, off-the-shelf language for test automation. It is general enough to cover any case through its plain language steps, yet specific enough to standardize tests. Users don’t need to be compiler theory experts – they just make up their own step names and provide the definition code to execute them. Early BDD projects like JBehave and Cucumber packaged an interpreter as a test framework and delivered it to a testing world still stuck on JUnit. The need for a testing DSL was there, whether or not the BDD folks meant to serve it.

Cucumber-ites frequently bemoan that their framework is misunderstood by the masses. They shudder to see teams using their framework purely for test automation. However, Cucumber effectively lowered the entry barrier for teams to make their own testing DSLs. Kodak did the same thing for film: they made it cheap and standard so anyone could be a photographer. Not everyone who uses a BDD framework misunderstands its purpose: some (like me) just see a different value proposition from the one preached by orthodox BDD practitioners. Gherkin fills a need that nobody knew existed. Its popularity validates that claim.

Benefits Apart from Process

Using a BDD framework adds much value to testing and development even without BDD processes. Below are just a handful of benefits:

  1. Focus first on good scenarios. Gherkin forces authors to think before they code.
  2. Faster automation development. Gherkin steps are reusable and parametrizable.
  3. Stronger structure. Engineers know where to put things in the framework.
  4. Test understandability. Anyone can read scenarios because they are written in plain language. Business people can help. New people can pick it up fast.
  5. Test sharing. Feature files can be shared apart from test code, which can be helpful for business partners.
  6. Test similarity. Tests all look the same. Team members can more easily help each other.
  7. Clearer failures. When a scenario fails, reports show exactly what step failed.
  8. Simpler bug reports. Use scenario steps as instructions to reproduce the failure.
  9. 2-phase test reviews. Review Gherkin first and then test code second to make sure the test cases are good before implementing the wrong things.
  10. BDD enablement. Using a BDD framework opens the door for a team to embrace better behavioral practices in the future.

I have written about these advantages before.

Case Studies

I’m also not the only one who finds value in BDD test frameworks outside of the full BDD process. Below are five case studies.


radish

radish is a Python test framework inspired by Cucumber. Its DSL syntax is a superset of Gherkin that adds preconditions, loops, variables, and expressions. These language additions indicate a bias towards automation because they enable engineers to write tests more programmatically, albeit in a Gherkin-ese way.


Karate

Karate is a test framework with a full DSL based on Gherkin with steps specifically tailored to Web service calls. Although it is implemented in Java, testers do not need to do any Java programming to write complete test cases from day one. Peter Thomas, the creator of Karate, unabashedly declares that Karate does not truly adhere to BDD but nevertheless uses Cucumber for its automation benefits. (Note: Karate is working to move completely off of Cucumber. See GitHub issue #444.)

REST Assured

REST Assured is a Java package for testing REST APIs. Unlike Karate, REST Assured provides a fluent syntax (and not a DSL) for writing service calls directly in Java code. The fluent syntax is based on Gherkin: given() a request spec is created, when() the call is made, then() verify the response. Although REST Assured is not a full testing framework, it nevertheless pulls inspiration from BDD frameworks for order and structure.
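
To show the shape of such a fluent syntax without pulling in the library, here is a toy version in plain Java. It is not REST Assured's API: the FluentSketch class, its nested types, and the fake transport function are all hypothetical stand-ins, and no real HTTP call is made. It only illustrates how given/when/then can be ordinary chained method calls rather than steps in a separate language.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class FluentSketch {

    // "given": collects the request specification.
    static class RequestSpec {
        private final Map<String, String> params = new LinkedHashMap<>();

        RequestSpec param(String name, String value) {
            params.put(name, value);
            return this;
        }

        // "when" exists purely for readability in this sketch.
        RequestSpec when() {
            return this;
        }

        // Makes the call through a caller-supplied transport function that
        // maps request parameters to a status code (a stand-in for HTTP).
        Response get(Function<Map<String, String>, Integer> transport) {
            return new Response(transport.apply(params));
        }
    }

    // "then": verifies the response.
    static class Response {
        private final int status;

        Response(int status) {
            this.status = status;
        }

        Response then() {
            return this;
        }

        Response statusCode(int expected) {
            if (status != expected) {
                throw new AssertionError("expected " + expected + " but got " + status);
            }
            return this;
        }
    }

    static RequestSpec given() {
        return new RequestSpec();
    }

    public static void main(String[] args) {
        // Fake service: succeeds only when an ingredient parameter is present.
        Function<Map<String, String>, Integer> fake = p -> p.containsKey("i") ? 200 : 400;
        given().param("i", "tomato").when().get(fake).then().statusCode(200);
        System.out.println("ok");
    }
}
```

The tradeoff versus a DSL is visible here: the test reads almost like Gherkin, but it is still Java code that must be compiled and that only programmers can extend.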


Cycle

Cycle is a BDD-focused test automation platform from Cycle Labs for testing Web, terminal, and desktop apps. Cycle is unique because it provides out-of-the-box steps for all types of supported testing so that no programming experience is required. Testers write feature files using Cycle 2.0’s slick new Electron app. Scenarios are written in CycleScript, a Gherkin-ese language with additions like variables and sub-scenario calls. Steps tend to be imperative, but that’s the tradeoff for not requiring lower-level programming.


Hexawise

Hexawise is a combinatorial testing tool designed to maximize coverage with minimal test counts by smartly joining feature variations. It helps testers write better tests with less redundancy and fewer gaps. Although Hexawise has historically assisted manual testers, it also can generate Gherkin feature files for test variations.
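
For a feel of what combinatorial generation involves, here is a small Java sketch that picks rows covering every pair of values and prints them as a Gherkin Examples table. It is not Hexawise's algorithm: it uses a simple greedy all-pairs heuristic that is only practical for small inputs, it assumes at least two parameters, and the class, method, and parameter names (PairwiseExamples, browser, os, role) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PairwiseExamples {

    // Every full combination of values (the Cartesian product).
    static List<List<String>> product(List<List<String>> params) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(new ArrayList<>());
        for (List<String> values : params) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> row : rows) {
                for (String value : values) {
                    List<String> extended = new ArrayList<>(row);
                    extended.add(value);
                    next.add(extended);
                }
            }
            rows = next;
        }
        return rows;
    }

    // The value pairs that one row covers, keyed by parameter position.
    static Set<String> pairsOf(List<String> row) {
        Set<String> pairs = new HashSet<>();
        for (int i = 0; i < row.size(); i++) {
            for (int j = i + 1; j < row.size(); j++) {
                pairs.add(i + ":" + row.get(i) + "," + j + ":" + row.get(j));
            }
        }
        return pairs;
    }

    // Greedy selection: keep taking the row that covers the most uncovered pairs.
    static List<List<String>> pairwise(List<List<String>> params) {
        List<List<String>> all = product(params);
        Set<String> uncovered = new HashSet<>();
        for (List<String> row : all) {
            uncovered.addAll(pairsOf(row));
        }
        List<List<String>> chosen = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            List<String> best = null;
            int bestGain = 0;
            for (List<String> row : all) {
                int gain = 0;
                for (String pair : pairsOf(row)) {
                    if (uncovered.contains(pair)) gain++;
                }
                if (gain > bestGain) {
                    bestGain = gain;
                    best = row;
                }
            }
            chosen.add(best);
            uncovered.removeAll(pairsOf(best));
        }
        return chosen;
    }

    // Renders the chosen rows as a Gherkin Examples table.
    static String toExamples(List<String> names, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("Examples:\n");
        sb.append("  | ").append(String.join(" | ", names)).append(" |\n");
        for (List<String> row : rows) {
            sb.append("  | ").append(String.join(" | ", row)).append(" |\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("browser", "os", "role");
        List<List<String>> params = Arrays.asList(
                Arrays.asList("chrome", "firefox"),
                Arrays.asList("mac", "windows"),
                Arrays.asList("admin", "guest"));
        System.out.print(toExamples(names, pairwise(params)));
    }
}
```

The output can be pasted under a Scenario Outline, which is the kind of handoff from test design to Gherkin automation that the paragraph above describes.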



Good Enough?

Gherkin-based test frameworks are not perfect, but they do provide good structure. They gained popularity outside of the pure BDD movement because they genuinely added value to testing and automation. Like any other tool, teams will use them in both good and bad ways. (Trust me, I’ve seen scary Gherkin.)

It’s interesting to see how groups outside the Cucumber diaspora are attempting to solve the limitations of pure Gherkin. Each case study above showed a unique path. Clearly, the test automation problem has not yet been completely solved, but current BDD frameworks are the best off-the-shelf solutions we have until a new software testing movement comes along.