YAML Comments in Gherkin Feature Files

In Gherkin-based BDD test frameworks, feature files hold behavior scenarios written as Given-When-Then steps. Features and scenarios may be categorized with tags for hooks and filtering, and comment lines may be added anywhere. However, Gherkin by itself may not be sufficient to capture all desired test metadata. Tags are great for simple classification but too crude for richer, structured information, and comments are meaningful only to the human reader.

Fraser Scott (zeroXten) came up with a nifty idea for enriching the information in Gherkin feature files while working on the OWASP Cloud Security project: write YAML comments in feature files to provide more formal documentation. As stated on the project home page, “The OWASP Cloud Security project aims to help people secure their products and services running in the cloud by providing a set of easy to use threat and control BDD stories that pool together the expertise and experience of the development, operations and security communities.” It’s a pretty cool idea: use Gherkin to model attacks for both education and automation. The team is writing YAML comments at the top of feature files to provide custom information in a clean, readable format that other tools could also parse easily. Below is an example feature file I copied from the project, with YAML comments at the top:


# Id: OCST-1.1.1
# Status: Confirmed
# Service: AWS EC2
# Components:
#   - User Data
# STRIDE:
#   - Elevation of privilege
#   - Information disclosure
# References:
#   - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
Feature: User Data contains sensitive information
  In order to obtain sensitive information about the target
  As an attacker
  I want the target to have inappropriately placed sensitive information in User Data that I can access

  Scenario Outline: Access via instance attribute
    Given an instance with sensitive information in the User Data attribute
    And a principal with the ability to read the instance attributes
    When the attacker searches the User Data for the "<data-type>"
    Then the sensitive information is returned to the attacker

    Examples: Data types
      | data-type         |
      | password          |
      | API key           |
      | X.509 private key |
      | SSH private key   |
      | Internal URL      |

  Scenario: Access via CloudFormation
    Given an instance built using CloudFormation
    And a principal with the ability to read CloudFormation templates
    When the attacker searches the CloudFormation templates
    Then the sensitive information is returned to the attacker

  Scenario: Access via AutoScaling LaunchConfiguration
    Given an instance built inside an Autoscaling group
    And a principal with the ability to read Autoscaling launch configurations
    When the attacker searches the launch configurations
    Then the sensitive information is returned to the attacker
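
To give a rough idea of how another tool might read a header like this, here is a minimal Python sketch. It is not taken from the OWASP project: the names `extract_header` and `parse_header` are hypothetical, the sample text is a shortened copy of the feature file, and the tiny parser assumes only the two shapes used in the example (`key: value` pairs and `key:` followed by `- item` lists). A real tool would more likely hand the extracted text to a full YAML library instead.

```python
# Shortened sample of a feature file with a YAML comment header.
SAMPLE_FEATURE = """\
# Id: OCST-1.1.1
# Status: Confirmed
# Components:
#   - User Data
# STRIDE:
#   - Elevation of privilege
#   - Information disclosure
Feature: User Data contains sensitive information
  # an ordinary comment further down is not part of the header
  Scenario: Access via CloudFormation
    Given an instance built using CloudFormation
"""


def extract_header(feature_text):
    """Return the leading comment block with the '#' markers removed."""
    lines = []
    for line in feature_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            break  # the header ends at the first non-comment line
        body = stripped[1:]
        # Drop the single space after '#', keep any further indentation.
        lines.append(body[1:] if body.startswith(" ") else body)
    return "\n".join(lines)


def parse_header(yaml_text):
    """Parse the flat subset assumed above: scalars and simple lists."""
    data = {}
    current_key = None  # key of the list currently being filled, if any
    for line in yaml_text.splitlines():
        item = line.strip()
        if not item:
            continue
        if item.startswith("- "):
            if current_key is None:
                raise ValueError("list item without a key: " + line)
            data[current_key].append(item[2:].strip())
            continue
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError("expected 'key: value': " + line)
        key, value = key.strip(), value.strip()
        if value:
            data[key] = value      # plain 'key: value' pair
            current_key = None
        else:
            data[key] = []         # 'key:' opens a list
            current_key = key
    return data


metadata = parse_header(extract_header(SAMPLE_FEATURE))
print(metadata["Id"], metadata["STRIDE"])
```

Reading only the comment block at the very top keeps ordinary comments elsewhere in the file out of the metadata, which matches how the project places its YAML header.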

At first, I wasn’t too thrilled by the thought of YAML comments in feature files. Gherkin should cover all specification needs, and tag classification is often needed for automation. However, the YAML comments are quite clean, and for this project, they appear to document aspects of the scenarios that shouldn’t be buried in Gherkin (such as confirmation status and reference links). YAML is a very sensible format for formalized comments, too.

Take this idea as food for thought: YAML comments can be an effective way to add metadata to Gherkin feature files. Just make sure to capture all behavior specification using Gherkin and to still use tags for automation.
