Tutoring: A Lifelong Impact

On Saturday, February 17, 2018, I delivered the keynote address at RIT TutorCon 2018 at the Rochester Institute of Technology in Rochester, NY. I was a student tutor at RIT from 2007-2010. The Academic Support Center asked me to speak about my experiences. Below is the transcript of my speech.

It’s good to be back in Ra-cha-cha! Happy Presidents’ Day weekend, and also Happy Chinese New Year! Let me get a good look at our tutors: If you are a tutor, please stand up.

[Wait for tutors to stand up.]

Great! It’s awesome to see so many of you here today. Is anyone in Computer Science?

Now, remain standing if you have been a tutor for at least one year.

[Wait for people to sit down.]

Not bad. What about two years?

[Wait for more people to sit down.]

Three? [Wait.] Four? [Wait.] Five? [Wait.]

What about ten years? Ten years of tutoring? [Give anyone who remains standing a round of applause, and then ask them to sit down.]

Ten years is a long time! A lot can happen; a lot can change. Here’s a question for you today, though: Will your tutoring make an impact in ten years? [Repeat the question for emphasis.]

Ten years ago, I was one of you. I was in my second year at RIT studying computer science, and I worked for the Academic Support Center and TRIO as a tutor for math, physics, and basically anything that was needed. I would have been sitting in your chair if we had these fancy tutoring conferences back then. Things were quite different a decade ago. Let me drop some knowledge bombs on you for the world in February 2008:

  • We were still on the iPhone 1. iPads did not exist yet.
  • Barack Obama was still seen as a surprise challenger to Hillary Clinton in the 2008 Democratic primaries.
  • The Great Recession was looming but had not yet hit.
  • The Summer Olympics were going to be held in Beijing, China. (Michael Phelps & Usain Bolt)
  • Lady Gaga had not yet released her debut album.

Now, let me contextualize this for RIT:

  • Bill Destler was still in his first year as university president.
  • RIT was still on the quarter system.
  • Park Point was being built.
  • The Simon Center (a.k.a. the “Toilet Bowl”) was being built.
  • The main drop-in study center was the “Math Lab” in Building 1, not Bates.

One thing that looks like it hasn’t changed, though, is Gracie’s. [Assume the audience will laugh.]

By the way, have they knocked down Riverknoll yet? I lived at 232 Kimball Drive. [Assume the audience will laugh or somehow respond.]

A lot happens in ten years. But, will your tutoring have an impact in ten years? Will the tutoring you do today benefit your students years from now? It should.

As college students, life is typically fast-paced. You have classes, you have papers, you have projects; quarters – excuse me, semesters – fly by; and it’s all over after about four years. And, for you, tutoring is just a part of that overall experience. It’s just a part-time job. As we saw earlier, most of you will spend only a few years tutoring before entering your career fields. Personally, I haven’t done any tutoring since 2010. It’s tempting to think that the time you spent tutoring doesn’t matter. So what if you help people finish their homework problems a few times a week? Students come and go anyway. It’s no big deal, right?

Well, if you’re here today at this tutoring conference, I’m pretty sure that tutoring is a big deal to you. You know it’s important. I’d be willing to bet that many of you would do tutoring even if you didn’t get paid – although, the pay is certainly deserved! I want you all to understand that what you do as a tutor will impact your students and will also impact you for the rest of your lives. Tutoring is a vector: I want you to see the line and not just the dot.

Your students come with a myriad of different circumstances. Some are just looking for a healthy environment for doing their homework. Maybe they’re stuck on a tricky physics brain-buster. Others struggle. Some really struggle – and may be one more failure away from academic suspension. But all students have one thing in common: they come to you because they want to do better. Whoever they are, they look to you as tutors to help them succeed. And every question you answer – or rather, every guiding question you turn back to them – puts them further down their paths to success. Today’s practice problems become tomorrow’s degrees. With you, they’ll learn not just the course material but, more importantly, they will learn how to learn. They will learn what questions to ask themselves. They will learn how to find answers using their resources. They will learn to teach themselves. Plutarch once said, “The mind is not a vessel to be filled but a fire to be ignited.”

With my perspective of the line, I want to give you three big ways you can make your tutoring today leave an impact for a lifetime.

First, own your role. As tutors, you have a very unique role with your students: you are peers; you are not professors. That’s a big difference! Professors are experts in their fields with years of experience and dozens of publications. You, as tutors, are students yourselves, just a few more years ahead. You can relate to your students on much more common ground. You’ve taken the same courses. You’ve taken the same tests. You’ve probably even done the very same problems. One of the tutoring tricks is to always work with a student at their level – if they sit at the table, you sit; if they stand at the board, you stand; and unless you’re making a really good example, don’t stand on the table! The equal-level principle also applies to your role as a peer tutor. There’s camaraderie. There’s energy. There’s less embarrassment to ask “stupid” questions. There’s a sense that they can do it because you can do it. So own your role as a peer tutor.

Second, focus on the student and not the problem. The problem is the dot; the student is the line. Tutors aren’t there to solve the world’s problems! Nobody comes to a tutoring center to watch a tutor show off with how much they know or how fast they can solve problems. “Look at how smart I am” – NO! Let’s be real, here: the solution to any given practice problem doesn’t really matter. What does matter is how the student learned to handle problems. Did they make an attempt? Did they look at their formulas? Did they write out their work? Did they persevere when they got stuck? Let me ask you a question: Do you think that I remember specific details to any homework assignments from ten years ago? [Wait for audience response.] Nope! But, I remember that a derivative is a rate of change. And, if I had to solve a derivative again, I’d know exactly where to look in my books to figure it out. That’s how you want your students to be in ten years. Cultivate your students to become independent.

Third, build camaraderie. Your students are already your peers – make them your friends. I don’t have any fancy statistics to share, but I know anecdotally that most students become “repeat customers.” You’ll see them again, and again, and again. Whether intended or not, you will forge relationships with your students. As your tutoring shifts become part of your everyday life, so, too, do the students who show up. Treat every single one of them the way you’d want to be treated. Work to form good relationships. Work to form trust. Be honest when you don’t know something. And furthermore, build camaraderie with your fellow tutors as well! Tutors are a team – each one brings fresh eyes and unique expertise. My specialty? Discrete math and differential equations – what a combo! We, as tutors, are trained in common techniques and share the common burdens to help our students. It’s almost like we have a special, unspoken club. I still keep up with my students and my tutors. I dined with a former student on top of the Space Needle. I partied with another on New Year’s Eve. I’m attending another student’s wedding this summer. A fellow tutor came to mine. So build camaraderie with your students and your fellow tutors.

As I close, I’d like to remind you that you are all in tutoring together. For some of you, this might just be the best job you ever have. I challenge all of you today to make your tutoring count: for now, for ten years from now, and for a lifetime. Tutors don’t make bad students good – tutors make students learn to teach themselves. That is how your tutoring will make a lifelong impact. Thank you.

Django Projects in Visual Studio Code

Visual Studio Code is a free source code editor developed by Microsoft. It feels much more lightweight than traditional IDEs, yet its extensions make it versatile enough to handle just about any type of development work, including Python and the Django web framework. This guide shows how to use Visual Studio Code for Django projects.


Make sure the latest version of Visual Studio Code is installed. Then, install the following (free) extensions:

Reload Visual Studio Code after installation.


Editing Code

The VS Code Python editor is really first-class. The syntax highlighting is on point, and the shortcuts are mostly what you’d expect from an IDE. Django template files also show syntax highlighting. The Explorer, which shows the project directory structure on the left, may be toggled on and off using the top-left file icon. Check out Python with Visual Studio Code for more features.


Virtual Environments

Virtual environments with venv or virtualenv make it easy to manage Python versions and packages locally rather than globally (system-wide). A common best practice is to create a virtual environment for each Python project and install only the packages the project needs via pip. Different environments make it possible to develop projects with different version requirements on the same machine.

Visual Studio Code allows users to configure Python environments. Navigate to File > Preferences > Settings and set the python.pythonPath setting to the path of the desired Python executable. Set it as a Workspace Setting instead of a User Setting if the virtual environment will be specific to the project.


Python virtual environment setup is shown as a Workspace Setting. The terminal window shows the creation and activation of the virtual environment, too.

Helpful Settings

Visual Studio Code settings can be configured to automatically lint and format code, which is especially helpful for Python. As shown on Ruddra’s Blog, install the following packages:

$ pip install pep8
$ pip install autopep8
$ pip install pylint

And then add the following settings:

    {
        "team.showWelcomeMessage": false,
        "editor.formatOnSave": true,
        "python.linting.pep8Enabled": true,
        "python.linting.pylintPath": "/path/to/pylint",
        "python.linting.pylintArgs": [
            ...
        ],
        "python.linting.pylintEnabled": true
    }

Editor settings may also be language-specific. For example, to limit automatic formatting to Python files only:

    "[python]": {
        "editor.formatOnSave": true
    }

Make sure to set the pylintPath setting to the real path value. Keep in mind that these settings are optional.


Full settings for automatically formatting and linting the Python code.

Running Django Commands

Django development relies heavily on its command-line utility. Django commands can be run from a system terminal, but Visual Studio Code provides an Integrated Terminal within the app. The Integrated Terminal is convenient because it opens right to the project’s root directory. Plus, it’s in the same window as the code. The terminal can be opened from View > Integrated Terminal or using the “Ctrl-`” shortcut.


Running Django commands from within the editor is delightfully convenient.


Debugging

Debugging is another way Visual Studio Code’s Django support shines. The extensions already provide the launch configuration for debugging Django apps! As a bonus, it should already be set to use the Python path given by the python.pythonPath setting (for virtual environments). Simply switch to the Debug view and run the Django configuration. The config can be edited if necessary. Then, set breakpoints at the desired lines of code. The debugger will stop at any breakpoints as the Django app runs while the user interacts with the site.


The Django extensions provide a default debug launch config. Simply set breakpoints and then run the “Django” config to debug!

Version Control

Version control in Visual Studio Code is simple and seamless. Git has become the dominant tool in the industry, but VS Code supports other tools as well. The Source Control view shows all changes and provides options for all actions (like commits, pushes, and pulls). Clicking changed files also opens a diff. For Git, there’s no need to use the command line!


The Source Control view with a diff for a changed file.

Visual Studio Code creates a hidden “.vscode” directory in the project root directory for settings and launch configurations. Typically, these settings are specific to a user’s preferences and should be kept to the local workspace only. Remember to exclude them from the Git repository by adding the “.vscode” directory to the .gitignore file.


.gitignore setting for the .vscode directory

Editor Comparisons

JetBrains PyCharm is one of the most popular Python IDEs available today. Its Python and Django development features are top-notch: full code completion, template linking and debugging, a manage.py console, and more. PyCharm also includes support for other Python web frameworks, JavaScript frameworks, and database connections. Django features, however, are available only in the (paid) licensed Professional Edition. It is possible to develop Django apps in the free Community Edition, as detailed in Django Projects in PyCharm Community Edition, but the missing features are a significant limitation. Plus, being a full IDE, PyCharm can feel heavy with its load time and myriad of options.

PyCharm is one of the best overall Python IDEs/editors, but there are other good ones out there. PyDev is an Eclipse-based IDE that provides Django support for free. Sublime Text and Atom also have plugins for Django. Visual Studio Code is nevertheless a viable option. It feels fast and simple yet powerful. Here’s my recommended decision table:

What’s Going On | What You Should Do
Do you already have a PyCharm license? | Just use PyCharm Professional Edition.
Will you work on a large-scale Django project? | Strongly consider buying the license.
Do you need something fast, simple, and with basic Django support for free? | Use Visual Studio Code, Atom, or Sublime Text.
Do you really want to stick to a full IDE for free? | Pick PyDev if you like Eclipse, or follow the guide for Django Projects in PyCharm Community Edition.

Starting a Django Project in an Existing Directory

Django is a wonderful Python web framework, and its command line utility is indispensable when developing Django sites. However, the command to start new projects is a bit tricky. The official tutorial shows the basic case – how to start a new project from scratch using the command:

$ django-admin startproject [projectname]

This command will create a new directory using the given project name and generate the basic Django files within it. However, project names have strict rules: they may contain only letters, numbers, and underscores. So, the following project name would fail:

$ django-admin startproject my-new-django-project
CommandError: 'my-new-django-project' is not a valid project name.
Please make sure the name is a valid identifier.

Another problem is initializing a new Django project inside an existing directory:

$ mkdir myproject
$ django-admin startproject myproject
CommandError: '/path/to/myproject' already exists

These two problems commonly happen when using Git (or other source control systems). The repository may already exist, and its name may have illegal project name characters. The project could be created as a sub-directory within the repository root, but this is not ideal.

Thankfully, there’s a simple solution. The “django-admin startproject” command takes an optional argument after the project name for the project path. This argument sidesteps both problems. The project root directory and the Django project file directory can have different names. The example below shows how to change into the desired root directory and start the project from within it using “.”:

$ cd my-django-git
$ django-admin startproject myproject .
$ ls
manage.py myproject

This can be a stumbling block because it is not documented in Django’s official tutorial. The “django-admin help startproject” command does document the optional directory argument but does not explain when this option is useful. Hopefully, this article makes its use case more intuitive!

Are Gherkin Scenarios with Multiple When-Then Pairs Okay?

Don’t know about Behavior-Driven Development or Gherkin? Start here!

Writing Gherkin is easy, but writing good Gherkin is hard. My post BDD 101: Writing Good Gherkin covers many aspects of good behavior specification, including titles, phrasing, and data. One of the major points I make anytime I discuss good Gherkin is what I call the “Cardinal Rule of BDD.”

The Cardinal Rule of BDD: One Scenario, One Behavior!

A behavior scenario specification should focus on one individual behavior. This is the essence of the BDD mindset – a product’s features can be specified in terms of its behaviors, and the specs should be written as examples of those behaviors in action. Identifying individual behaviors brings clarity to design, development, and testing. Combining behaviors into a single scenario causes ambiguity, miscommunication, and test gaps. Test failure triage also becomes more difficult and time consuming because the root causes for failures are less clear – the culprit could be one of multiple behaviors. There is also a high risk of duplication when scenarios repeat the same sequence of steps instead of isolating behaviors.

One of the dead giveaways to violations of the Cardinal Rule of BDD is when a Gherkin scenario has multiple When-Then pairs, like this:

Feature: Google Searching

  Scenario: Google Image search shows pictures
    Given the user opens a web browser
    And the user navigates to "https://www.google.com/"
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page
    When the user clicks on the "Images" link at the top of the results page
    Then images related to "panda" are shown on the results page

A When-Then pair denotes a unique behavior. In this example, the behaviors of performing a search and changing the search to images could and should clearly be separated into two scenarios, like this:

Feature: Google Searching

  Scenario: Search from the search bar
    Given a web browser is at the Google home page
    When the user enters "panda" into the search bar
    Then links related to "panda" are shown on the results page

  Scenario: Image search
    Given Google search results for "panda" are shown
    When the user clicks on the "Images" link at the top of the results page
    Then images related to "panda" are shown on the results page

Despite being so central to BDD philosophy, the Cardinal Rule is the one thing people always try to sidestep. Nobody ever doubts the usefulness of step parameters or the need for good grammar, but people frequently show me scenarios with multiple When-Then pairs and basically ask for an exception from the rule. My gut reaction is always, “NO! Rules don’t change.”


I must first admit that the Cardinal Rule of BDD is “opinionated” – it is the way that I have found BDD to work best for collaboration and automation. Adherence forces people to adopt a behavior-driven mindset, and strictness keeps feature and test quality high. Other experts are more permissive of multiple When-Then pairs, though. Most examples I could find from leading sources such as The Cucumber Book exhibit strict Given-When-Then order for Gherkin scenarios, but other sources such as the online JBehave documentation show scenarios with multiple When-Then pairs boldly on the front page.

I must also begrudgingly admit that there are times when it is simply more convenient for a single scenario to have multiple behaviors (and thus multiple When-Then pairs). This is by no means a best practice but rather a pragmatic alternative for specification dilemmas. (See Purist vs. Pragmatist.) Below are situations in which multiple When-Then pairs may be acceptable.

Lengthy End-to-End Scenarios

End-to-end tests verify execution paths through a live system with all of its parts. Web UI tests frequently fall into this category: Selenium WebDriver interacts with a page in a browser, which then triggers calls to a backend service layer or database. Despite the name, end-to-end tests may still focus on one individual behavior. The example scenarios above, though short, technically count as end-to-end tests.

However, many people use the term “end-to-end” to refer to tests that cover sequences of behaviors. Such a scenario could violate the Cardinal Rule of BDD if it is not handled carefully. My article BDD 101: Unit, Integration, and End-to-End Tests gives strategies for handling lengthy end-to-end scenarios. One strategy is to simply turn a blind eye to multiple When-Then pairs. Ideally, each behavior would already have its own individual scenario, but then a new scenario would explicitly combine the behaviors together to get that full, end-to-end path. The new scenario would be easy to write because the steps could be reused. This isn’t the only strategy, so please be sure to consider the others before writing the tests.


Software system audits frequently require lengthy end-to-end scenarios. They are quite common in highly regulated domains. For example, a bank may need to prove that a loan is prepared correctly or that a transaction puts money into the right accounts. Auditors typically require tests to run through entire system paths (i.e., multiple behaviors) using the same records, such as one loan application or one payment. Auditees must not only provide test results for past runs but must also repeat tests on demand. Separating each individual behavior into its own scenario makes each test independent, so during test execution, there will be no guaranteed order and no shared test data, and auditors would not have the end-to-end verification that they require. The simplest way to give the auditors what they need is to write one lengthy scenario with multiple When-Then pairs.

Service Calls

Service call testing is another case for which multiple When-Then pairs may be pragmatically justified. REST and SOAP (whose interfaces are typically described with WSDL) are examples of service call types. Service layer development is more engineering-centric than business-centric, but many teams nevertheless choose to test service calls with Gherkin-based frameworks like Cucumber. Due to the programmatic nature of services, Gherkin scenarios for service calls tend to be quite imperative: specify a request, make the call, and verify parts of the response. This isn’t so bad for independent service calls, but it becomes problematic when one request needs another call’s response.

One solution is the classic “pure” scenario split: put any necessary setup, including initial requests to get required response parts, into custom Given steps. This abides by the Cardinal Rule and avoids duplicate When-Then pairs. But, it introduces an unsavory form of code duplication. Many service calls end up being written twice: once as a Gherkin scenario for testing, and once in the underlying automation code to be called by Given steps. This violates the DRY principle.

The alternative “pragmatic” solution is to write scenarios that specify multiple service calls in the Gherkin steps. The Karate project advocates this approach, as shown in their “Hello World” example:

Take Caution!

There may be other cases when When-Then repetition is useful. Feel free to leave suggestions in the comments below. My examples are meant to be descriptive, not prescriptive. Another aspect to consider is that allowing multiple When-Then pairs per scenario indicates that a team sees more value in BDD’s test framework than in its collaborative spec process. (Refer to ‑‑BDD; Automation without Collaboration and BDD‑‑; Collaboration without Automation.)

Ultimately, you must decide what practices are best for your project. The main reason I uphold the Cardinal Rule of BDD so strongly is that it makes for good specs and good tests. I’ve seen engineers write extremely long, intensive test procedures (and I mean, dozens of duplicate behaviors per test) that are alright for manual testing but do not transition well into automation because they are too fragile and they don’t yield useful information upon failure. The Cardinal Rule is a way to break out of the procedure-driven mindset, and banning multiple When-Then pairs per Gherkin scenario is an effective rule for enforcing it.

Good Gherkin Scenario Titles

Don’t know about Behavior-Driven Development or Gherkin? Start here!

The Golden Gherkin Rule states:

Treat other readers as you would want to be treated. Write Gherkin so that people who don’t know the feature will understand it.

Part of writing good Gherkin (or any other specification-by-example language) includes writing good behavior scenario titles. The title is the face of the scenario: it summarizes what the behavior is all about. Good titles make collaboration and test triage a breeze, whereas bad titles make it tougher. But what makes a title “good”? Below are some helpful pointers.


Good titles should be short one-liners. One simple statement should be sufficient to concisely capture the intended behavior. Anything longer likely means that either the author doesn’t truly understand the behavior in focus, or that the scenario does not focus on one main behavior. Extra comments may be added to supplement the scenario’s description if necessary to avoid lengthy titles. Also, most BDD test automation frameworks will print scenario titles to logs for traceability.

Bad Example | Good Example
The user can log into the app, navigate to the profile page, and see their full name, address, phone number, email, and username | The profile page displays the user’s personal info

Conjunction Disjunction

Watch out for conjunction words like “and,” “or,” and “but.” Conjunctions typically imply that more than one thing will be done, which for scenario titles implies that more than one behavior will be covered. Or, it indicates that a Scenario Outline may be appropriate. Don’t break the Cardinal Rule of BDD! Keep each scenario focused on one main behavior.

Avoid other conjunctions like “because,” “since,” and “so” as well. Phrases starting with those words often give an explanation for why the scenario exists. However, for conciseness, scenario titles should focus on what the behavior is. The why can either be deduced from the steps or made plain with comments.

Bad Example | Good Example
The user can request an insurance quote from the big “Get-A-Quote” button on the home page or from the “Insurance Policies” page | Two Scenarios: The user requests an insurance quote from the “Get-A-Quote” button on the home page / The user requests an insurance quote from the “Insurance Policies” page; or Scenario Outline: The user requests an insurance quote
The last five search phrases are saved so that the user can rerun them from the history page | The history page saves the last five search phrases

Avoid Assertion Language

Don’t use the words “verify,” “assert,” or “should” in scenario titles. They put the scenario’s emphasis on the assertion rather than the behavior. Assertions are merely a facet of behavior testing – they verify that something exists or that two values are equal. Behavior scenarios, however, are full software specifications. BDD is a development practice for making better software products – it’s not just a test tool. Don’t reduce the behavior-driven mindset to a test-only mindset.

Furthermore, leading every scenario title with “verify” or “assert” becomes very repetitive. The words just don’t enhance the meaningfulness of the title. They also thwart alphabetical order.

Bad Example | Good Example
Verify the user can change their address on the profile page | Profile page address change
Assert that a stock quote is displayed in green text when its value is higher than its previous closing value | A stock quote has green text when its value is higher than its previous closing value
The goodbye page should be displayed after a successful logout | Logout displays the goodbye page


Do you have any more suggestions? Put them in the comments below!

JavaScript Testing with Jasmine

Table of Contents

  1. Introduction
  2. Setup and Installation
  3. Project Structure
  4. Unit Tests for Functions
  5. Unit Tests for Classes
  6. Unit Tests with Mocks
  7. Integration Tests for REST APIs
  8. End-to-End Tests for Web UIs
  9. Basic Test Execution
  10. Advanced Test Execution with Karma
  11. Angular Testing


Introduction

Jasmine is one of the most popular JavaScript test frameworks available. Its tests are intuitively recognizable by their describe/it format. Jasmine is inspired by Behavior-Driven Development and comes with many basic features out-of-the-box. While Jasmine is renowned for its Node.js support, it also supports Python and Ruby. Jasmine also works with JavaScript-based languages like TypeScript and CoffeeScript.

This guide shows how to write tests in JavaScript on Node.js using Jasmine. It uses the jasmine-node-js-example project (hosted on GitHub). Content includes:

  • Basic white-box unit tests
  • REST API integration tests with frisby
  • Web UI end-to-end tests with Protractor
  • Spying with sinon
  • Monkeypatching with rewire
  • Handling config data with JSON files
  • Advanced execution features with Karma
  • Special considerations for Angular projects

The Jasmine API Reference is also indispensable when writing tests.

Setup and Installation

The official Jasmine Node.js Setup Guide explains how to set up and install Jasmine. Jasmine tests may be added to an existing project or to an entirely new project. As a prerequisite, Node.js must already be installed. Use the following commands to set things up.

# Initialize a new project (if necessary)
# This will create the package.json file
$ mkdir [project-name]
$ cd [project-name]
$ npm init

# Install Jasmine locally for the project and globally for the CLI
$ npm install jasmine
$ npm install -g jasmine

# Create a spec directory with configuration file for Jasmine
$ jasmine init

# Optional: Install official Jasmine examples
# Do this only for self-education in a separate project
$ jasmine examples

The code used by this guide is available in GitHub at jasmine-node-js-example. Feel free to clone this repository to try things out yourself!

Recommended editors and IDEs include Visual Studio Code with the Jasmine Snippets extensions, Atom, and JetBrains WebStorm.

Project Structure

Jasmine does not require the project to have a specific directory layout, but it does use a configuration file to specify where to find tests. The default, conventional project structure created by “jasmine init” puts all Jasmine code into a “spec” directory, which contains “*spec.js” files for tests, helpers that run before specs, and a support directory for config. The JASMINE_CONFIG_PATH environment variable can be set to change the config file used. (The default config file is spec/support/jasmine.json.)

|-- [product source code]
|-- spec
|   |-- [spec sub-directory]
|   |   `-- *spec.js
|   |-- helpers
|   |   `-- [helper sub-directory]
|   `-- support
|       `-- jasmine.json
`-- package.json

This structure may be changed using the “spec_dir”, “spec_files”, and “helpers” properties in the config file. For example, it may be useful to add more than one level of directories to the hierarchy. However, it is typically best to leave the conventional directory layout in place. The default config values as of Jasmine 2.8 are below.

{
  "spec_dir": "spec",
  "spec_files": ["**/*[sS]pec.js"],
  "helpers": ["helpers/**/*.js"],
  "stopSpecOnExpectationFailure": false,
  "random": false
}

It is also a best practice to separate tests between different levels of the Testing Pyramid. The example project has spec subdirectories for unit, integration, and end-to-end tests. Directory-level organization makes it easy to filter tests by level when executed.

Unit Tests for Functions

The most basic unit of code to be tested in JavaScript is a function. The “lib/calculator.functions.js” module contains some basic math functions for easy testing.

// --------------------------------------------------
// lib/calculator.functions.js
// --------------------------------------------------

// Calculator Functions

function add(a, b) {
    return a + b;
}

function subtract(a, b) {
    return a - b;
}

function multiply(a, b) {
    return a * b;
}

function divide(a, b) {
    let value = a * 1.0 / b;
    // Non-finite results (Infinity, NaN) mean the divisor was zero
    if (!isFinite(value)) {
        throw new RangeError('Divide-by-zero');
    }
    return value;
}

function maximum(a, b) {
    return (a >= b) ? a : b;
}

function minimum(a, b) {
    return (a <= b) ? a : b;
}

// Module Exports

module.exports = {
    add: add,
    subtract: subtract,
    multiply: multiply,
    divide: divide,
    maximum: maximum,
    minimum: minimum,
};

Its tests are in “spec/unit/calculator.function.spec.js”. Below is a snippet showing simple tests for the “add” function. A describe block groups a “suite” of specs together. Each it block is an individual spec (or test). Titles for specs are often written as what the spec should do. Describe blocks may be nested for hierarchical grouping, but it blocks (being bottom-level) may not. Assertions are made using Jasmine’s fluent-like expect and matcher methods. Since the functions are stateless, no setup or cleanup is needed. Tests for other math functions are similar.

// --------------------------------------------------
// spec/unit/calculator.function.spec.js
// --------------------------------------------------

const calc = require('../../lib/calculator.functions');

describe("Calculator Functions", function() {

  describe("add", function() {

    it("should add two positive numbers", function() {
      let value = calc.add(3, 2);
      expect(value).toBe(5);
    });

    it("should add a positive and a negative number", function() {
      let value = calc.add(3, -2);
      expect(value).toBe(1);
    });

    it("should give the same value when adding zero", function() {
      let value = calc.add(3, 0);
      expect(value).toBe(3);
    });

  });

  // ...

});

The divide-by-zero test for the “divide” function is special because it must verify that an exception is thrown. The divide call is wrapped in a function so that it may be passed into the expect call.

  describe("divide", function() {

    // ...

    it("should throw an exception when dividing by zero", function() {
      let divideByZero = function() { calc.divide(3, 0); };
      expect(divideByZero).toThrowError(RangeError, 'Divide-by-zero');
    });

    // ...

  });

The “maximum” and “minimum” functions have parametrized tests using the Array class’s forEach method. This is a nifty trick for hitting multiple input sets without duplicating code or combining specs. Note that the spec titles are also parametrized. Tests for “maximum” are shown below.

  describe("maximum", function() {

    [
      [1, 2, 2],
      [2, 1, 2],
      [2, 2, 2],
    ].forEach(([a, b, expected]) => {
      it(`should return ${expected} when given ${a} and ${b}`, () => {
        let value = calc.maximum(a, b);
        expect(value).toBe(expected);
      });
    });

  });

Unit Tests for Classes

Jasmine can also test classes. When testing classes, setup and cleanup routines become more helpful. The Calculator class in the “lib/calculator.class.js” module calls the math functions and caches the last answer.

// --------------------------------------------------
// lib/calculator.class.js
// --------------------------------------------------

// Imports

const calcFunc = require('./calculator.functions');

// Calculator Class

class Calculator {

  constructor() {
      this.last_answer = 0;
  }

  // Runs the given math function and caches its result
  do_math(a, b, func) {
      return (this.last_answer = func(a, b));
  }

  add(a, b) {
      return this.do_math(a, b, calcFunc.add);
  }

  subtract(a, b) {
      return this.do_math(a, b, calcFunc.subtract);
  }

  multiply(a, b) {
      return this.do_math(a, b, calcFunc.multiply);
  }

  divide(a, b) {
      return this.do_math(a, b, calcFunc.divide);
  }

  maximum(a, b) {
      return this.do_math(a, b, calcFunc.maximum);
  }

  minimum(a, b) {
      return this.do_math(a, b, calcFunc.minimum);
  }

}

// Module Exports

module.exports = {
  Calculator: Calculator,
};

The Jasmine specs in “spec/unit/calculator.class.spec.js” are very similar but now call the beforeEach method to construct the Calculator object before each scenario. (Jasmine also has methods for afterEach, beforeAll, and afterAll.) The verifyAnswer helper function also makes assertions easier. The addition tests are shown below.

// --------------------------------------------------
// spec/unit/calculator.class.spec.js
// --------------------------------------------------

const calc = require('../../lib/calculator.class');

describe("Calculator Class", function() {

  let calculator;

  beforeEach(function() {
    calculator = new calc.Calculator();
  });

  // Checks both the returned value and the cached last answer
  function verifyAnswer(actual, expected) {
    expect(actual).toBe(expected);
    expect(calculator.last_answer).toBe(expected);
  }

  describe("add", function() {

    it("should add two positive numbers", function() {
      verifyAnswer(calculator.add(3, 2), 5);
    });

    it("should add a positive and a negative number", function() {
      verifyAnswer(calculator.add(3, -2), 1);
    });

    it("should give the same value when adding zero", function() {
      verifyAnswer(calculator.add(3, 0), 3);
    });

  });

  // ...

});

Unit Tests with Mocks

Mocks help to keep unit tests focused narrowly upon the unit under test. They are essential when units of code depend upon other callable entities. For example, mocks can be used to provide dummy test values for REST APIs instead of calling the real endpoints so that receiving code can be tested independently.

Jasmine’s out-of-the-box spies can do some mocking and spying, but they are not very powerful. For example, they don’t work when members of one module call members of another, or even when members of the same module call each other (unless they are within the same class). It is better to use rewire for monkey-patching (mocking via member substitution) and sinon for stubbing and spying.
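The limitation with same-module calls can be seen in plain JavaScript. The sketch below is only an illustration: makeModule is a hypothetical stand-in for a module whose “outer” function calls “inner” directly. Replacing the member on the exported object (which is roughly what a property-based spy does) changes what outside callers see, but not what “outer” calls internally.

```javascript
// Hypothetical stand-in for a module: "outer" references "inner" directly
function makeModule() {
  function inner() { return 'real'; }
  function outer() { return inner(); }  // bound to the local function, not to the exported member
  return { inner, outer };
}

const mod = makeModule();

// Replace the exported member, as a property-based spy would
mod.inner = () => 'fake';

console.log(mod.inner());  // fake
console.log(mod.outer());  // real
```

This is why a tool like rewire, which substitutes the variable inside the module itself, is needed for this kind of dependency.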

The “lib/weather.js” module shows how mocking can be done with member dependencies. The WeatherCaller class’s “getForecast” method calls the “callForecast” function, which is meant to represent a service call to get live weather forecasts. The “callForecast” function returns an empty object, but the specs will “rewire” it to return dummy test values that can be used by the WeatherCaller class. Rewiring will work even though “callForecast” is not exported!

// --------------------------------------------------
// lib/weather.js
// --------------------------------------------------

function callForecast(month, day, year, zipcode) {
  return {};
}

class WeatherCaller {

  constructor() {
    this.forecasts = {};
  }

  getForecast(month, day, year, zipcode) {
    let key = `${month}/${day}/${year} for ${zipcode}`;
    // Call the service only when the forecast is not already cached
    if (!(key in this.forecasts)) {
      this.forecasts[key] = callForecast(month, day, year, zipcode);
    }
    return this.forecasts[key];
  }

}

module.exports = {
  WeatherCaller: WeatherCaller,
};

The tests in “spec/unit/weather.mock.spec.js” monkey-patch the “callForecast” function with a sinon stub in the beforeEach call so that each test has a fresh spy count. Note that the weather module is imported using “rewire” instead of “require” so that it can be monkey-patched. Even though the original function returns an empty object, the tests pass because the mock returns the dummy test value.

// --------------------------------------------------
// spec/unit/weather.mock.spec.js
// --------------------------------------------------

// Imports

const rewire = require('rewire');
const sinon = require('sinon');

// Rewirings

const weather = rewire('../../lib/weather');

// WeatherCaller Specs
describe("WeatherCaller Class", function() {

  // Test constants
  const dummyForecast = {"high": 42, "low": 26};

  // Test variables
  let callForecastMock;
  let weatherModuleRestore;
  let weatherCaller;

  beforeEach(function() {
    // Mock the inner function's return value using sinon
    // Do this for each test to avoid side effects of call count
    callForecastMock = sinon.stub().returns(dummyForecast);
    weatherModuleRestore = weather.__set__("callForecast", callForecastMock);

    // Construct the main caller object
    weatherCaller = new weather.WeatherCaller();
  });

  it("should be empty upon construction", function() {
    // No mocks required here
    expect(Object.keys(weatherCaller.forecasts).length).toBe(0);
  });

  it("should get a forecast for a date and a zipcode", function() {
    // This simply verifies that the return value is correct
    let forecast = weatherCaller.getForecast(12, 25, 2017, 21047);
    expect(forecast).toEqual(dummyForecast);
  });

  it("should get a fresh forecast the first time", function() {
    // The inner function should be called and the value should be cached
    // Note the sequence of assertions, which guarantee safety
    let forecast = weatherCaller.getForecast(12, 25, 2017, 21047);
    const forecastKey = "12/25/2017 for 21047";
    expect(callForecastMock.calledOnce).toBeTruthy();
    expect(forecastKey in weatherCaller.forecasts).toBeTruthy();
    expect(weatherCaller.forecasts[forecastKey]).toEqual(forecast);
  });

  it("should get a cached forecast the second time", function() {
    // The inner function should be called only once
    // The same object should be returned by both method calls
    let forecast1 = weatherCaller.getForecast(12, 25, 2017, 21047);
    let forecast2 = weatherCaller.getForecast(12, 25, 2017, 21047);
    expect(callForecastMock.calledOnce).toBeTruthy();
    expect(forecast1).toBe(forecast2);
  });

  it("should get and cache multiple forecasts", function() {
    // The other tests verify the mechanics of individual calls
    // This test verifies that the caller can handle multiple forecasts

    // Initial forecasts
    let forecast1 = weatherCaller.getForecast(12, 25, 2017, 27518);
    let forecast2 = weatherCaller.getForecast(12, 25, 2017, 27518);
    let forecast3 = weatherCaller.getForecast(12, 25, 2017, 21047);

    // Change forecast value
    const newForecast = {"high": 39, "low": 18};
    callForecastMock = sinon.stub().returns(newForecast);
    weatherModuleRestore = weather.__set__("callForecast", callForecastMock);

    // More forecasts
    let forecast4 = weatherCaller.getForecast(12, 26, 2017, 21047);
    let forecast5 = weatherCaller.getForecast(12, 27, 2017, 21047);

    // Assertions
    expect("12/25/2017 for 27518" in weatherCaller.forecasts).toBeTruthy();
    expect("12/25/2017 for 21047" in weatherCaller.forecasts).toBeTruthy();
    expect("12/26/2017 for 21047" in weatherCaller.forecasts).toBeTruthy();
    expect("12/27/2017 for 21047" in weatherCaller.forecasts).toBeTruthy();
  });

  afterEach(function() {
    // Undo the monkeypatching
    weatherModuleRestore();
  });

});

Integration Tests for REST APIs

Jasmine can do black-box tests just as well as it can do white-box tests. Tests for REST API service calls are some of the most common integration-level tests. There are many REST request packages for Node.js, but frisby is designed specifically for testing. Frisby even has its own expect methods (though the standard Jasmine expect and matchers may still be used).

A best practice for black-box tests is to put config data for external dependencies into a config file. Config data for REST API calls could be URLs, usernames, and passwords. Never hard-code config data into test automation. JavaScript config files are super simple: just write a JSON file and read it during test setup using the “require” function, just like any module. The config data will be automatically parsed as a JavaScript object!
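As a minimal sketch of that loading step, the snippet below writes a small JSON file to a temporary directory and then reads it with “require”. The file location, the “serviceBaseUrl” key, and the example.com URL are hypothetical placeholders; a real project would keep its config file in the repository instead of generating it.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway config file (hypothetical content, for illustration only)
const configDir = fs.mkdtempSync(path.join(os.tmpdir(), 'config-demo-'));
const configPath = path.join(configDir, 'env.json');
fs.writeFileSync(configPath, JSON.stringify({
  integration: { serviceBaseUrl: 'https://example.com/api' }
}));

// require parses the JSON file and returns a plain JavaScript object
const ENV = require(configPath);
const baseUrl = ENV.integration.serviceBaseUrl;

console.log(baseUrl);
```

Because “require” caches what it loads, the file is read and parsed only once per process, no matter how many specs import it.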

Below is an example test for calling Wikipedia’s REST API. It reads the base URL from a config file and uses it in the frisby call. The config file:

// --------------------------------------------------
// spec/support/env.json
// --------------------------------------------------
{
  "integration" : {
    "wikipediaServiceBaseUrl": "https://en.wikipedia.org/api/rest_v1"
  }
}

And the spec:

// --------------------------------------------------
// spec/integration/wikipedia.service.spec.js
// --------------------------------------------------

const frisby = require('frisby');

describe("English Wikipedia REST API", function() {

  const ENV = require("../support/env.json");
  const BASE_URL = ENV.integration.wikipediaServiceBaseUrl;

  describe("GET /page/summary/{title}", function() {

    it("should return the summary for the given page title", function(done) {
      frisby
        .get(BASE_URL + "/page/summary/Pikachu")
        .then(function(response) {
          // Representative assertions on the response status and body
          expect(response.status).toBe(200);
          expect(response.json.title).toBe("Pikachu");
        })
        .done(done);
    });

  });

  // ...

});

End-to-End Tests for Web UIs

Jasmine can also be used for end-to-end Web UI tests. One of the most popular packages for web browser automation is Selenium WebDriver, which uses programming calls to interact with a browser like a real user. Selenium releases a WebDriver package for JavaScript for Node.js, but it is typically a better practice to use Protractor.

Protractor integrates WebDriver with JavaScript test frameworks to make it easier to use. Jasmine is the default framework for Protractor, but Mocha, Cucumber, and any other JavaScript framework could be used. One of the biggest advantages Protractor has over WebDriver by itself is that Protractor does automatic waiting: explicit calls to wait for page elements are not necessary. This is a wonderful feature that eliminates a lot of repetitive automation code. Protractor also provides tools to easily set up the Selenium Server and browsers (including mobile browsers). Even though Protractor is designed for Angular apps, it can nevertheless be used for non-Angular front-ends.

Web UI tests can be quite complicated because they cover many layers and require extra configuration. Web page interactions frequently need to be reused, too. It is a best practice to use a pattern like the Page Object Model to handle web interactions in one reusable layer. Page objects pull WebDriver locators and actions out of test fixtures (like describe/it functions) so that they may be updated more easily when changes are developed for the actual web pages. (In fact, some teams choose to co-locate page object classes with product source code for the web app so that both are updated simultaneously.) The Page Object Model is a great way to manage the inherently complicated Web automation design.

This guide does not provide a custom example for Protractor with Jasmine because the Protractor documentation is pretty good. It contains a decent tutorial, setup and config instructions, framework integrations, and a full reference. Furthermore, proper Protractor setup requires careful local setup with a live site to test. Please refer to the official doc for more information. Most of the examples in the doc use Jasmine.

Basic Test Execution

The simplest way to run Jasmine tests is to use the “jasmine” command. Make sure you are in the project’s root directory when running tests. Below are example invocations.

# Run all specs in the project (according to the Jasmine config)
$ jasmine

# Run a specific spec by file path
$ jasmine spec/integration/wikipedia.service.spec.js

# Run all specs that match a path pattern
# Warning: this call is NOT recursive and will not search sub-directories!
$ jasmine spec/unit/*

# Run all specs whose titles match a regex filter
# This searches both "describe" and "it" titles
$ jasmine --filter="Calculator"

# Stop testing after the first failure happens
$ jasmine --stop-on-failure=true

# Run tests in a random order
# Optionally include a seed value
$ jasmine --random=true --seed=4321

Test execution options may also be set in the Jasmine config file.

Advanced Test Execution with Karma

Karma is a self-described “spectacular test runner for JavaScript.” Its main value is that it runs JavaScript tests in live web browsers (rather than merely on Node.js), testing actual browser compatibility. In fact, developers can keep Karma running while they develop code so they can see test results in real time as they make changes. Karma integrates with many test tools (including Istanbul for code coverage) and frameworks (including Jasmine). Karma itself runs on Node.js and is distributed as a number of packages for different browsers and frameworks. Check out this Google Testing Blog article to learn the original impetus behind developing Karma, originally called “Testacular.”

Karma and Protractor are similar in that they run tests against real web browsers, but they serve different purposes. Karma is meant for running unit tests against JavaScript code, whereas Protractor is meant for running end-to-end tests against a full, live site like a user. Karma tests go through a “back door” to exercise pieces of a site. Karma and Protractor are not meant to be used together for the same tests (see Protractor Issue #9 on GitHub). However, one project can use both tools at their appropriate test layers, as done for standard Angular testing.

This guide does not provide a custom example for Karma with Jasmine because it requires local setup with the right packages and browser versions. Karma packages are distributed through npm. Karma with Jasmine requires the main karma package, the karma-jasmine package, and a launcher package for each desired browser (like karma-chrome-launcher). There are also plenty of decent examples online. Please refer to the official Karma documentation for more info.

Running Jasmine tests with Karma is not without its difficulties, however. One challenge is handling modules and imports. ECMAScript 6 (ES6) has a totally new syntax for modules and imports that is incompatible with the CommonJS module system with require used by Node.js. Node.js is working on ES6-style module support, but at the time this article was written, full support was not yet available. Module imports are troublesome for Karma because Karma is launched from Node.js (requiring require) but runs in a browser (which doesn’t support require). There are a few workarounds:

  • Use RequireJS to load modules.
  • Use Browserify to make require work in browsers.
  • Use rollup.js to bundle all modules into one to sidestep imports.
  • Use Angular with TypeScript, which builds and links everything automatically.

Angular Testing

Angular is a very popular front-end Web framework. It is a complete rewrite of AngularJS and is seen as an alternative to React. One of Angular’s perks is its excellent support for testing. Out of the box, new Angular projects come with config for unit testing with Jasmine/Karma and end-to-end testing with Jasmine/Protractor. It’s easy to integrate other automation tools like Istanbul code coverage or HTML reporting. Standard Angular projects using TypeScript also don’t suffer from the module import problem: imports are linked properly when TypeScript is compiled into JavaScript.

Angular unit tests are written just like any other Jasmine unit tests except for one main difference: the Angular testing utilities. These extra packages create a test environment (a “TestBed”) for testing each part of the Angular app internally and independently. Dependencies can be easily stubbed and mocked using Jasmine’s spies, with no need for sinon since everything binds. NGRX also provides extended test utilities. The Angular testing utilities can seem overwhelming at first, but together with Jasmine, they make it easy to write laser-precise unit tests.

Another interesting best practice for Angular unit tests is to co-locate them with the modules they cover. For every *.js/*.ts file, there should be a *.spec.js/*.spec.ts file with the covering describe/it tests. This is not common practice for unit tests, but the Angular doc notes many advantages: tests are easy to find, coverage is roughly visual, and updates are less likely to be forgotten. The automatically-generated test config has settings to search the whole project for spec files.

Angular end-to-end tests are treated differently from unit tests, however. Since they test the app as a whole, they don’t use the Angular testing utilities, and they should be located in their own directory (usually named “e2e”). Thus, Angular end-to-end tests are really no different than any other Web UI tests that use Protractor. Jasmine is the default test framework, but it may be advantageous to switch to Cucumber.js for all the advantages of BDD.

This guide does not provide Angular testing examples because the official Angular documentation is stellar. It contains a tutorial, a whole page on testing, and live examples of tests (linked from the testing page).

To Infinity and Beyond: A Guide to Parallel Testing

Are your automated tests running in parallel? If not, then they probably should be. Together with continuous integration, parallel testing is the best way to fail fast during software development and ultimately enforce higher software quality. Switching tests from serial to parallel execution, however, is not a simple task. Tests themselves must be designed to run concurrently without colliding, and extra tools and systems are needed to handle the extra stress. This article is a high-level guide to good parallel testing practices.

What is Parallel Testing?

Parallel testing means running multiple automated tests simultaneously to shorten the overall start-to-end runtime of a test suite. For example, if 10 tests take a total of 10 minutes to run, then 2 parallel processes could execute 5 tests each and cut the total runtime down to 5 minutes. Even better, 10 processes could execute 1 test each to shrink runtime to 1 minute. Parallel testing is usually managed by either a test framework or a continuous integration tool. It also requires more compute resources than serial testing.
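The arithmetic above can be sketched in a few lines. This is an idealized estimate under stated assumptions: every test takes the same time, tests divide evenly across processes, and there is no startup overhead. The function name parallelRuntime is made up for illustration.

```javascript
// Estimate start-to-end runtime when tests are split evenly across processes.
// Assumes uniform test duration and no per-process overhead.
function parallelRuntime(testCount, minutesPerTest, processCount) {
  const testsPerProcess = Math.ceil(testCount / processCount);
  return testsPerProcess * minutesPerTest;
}

console.log(parallelRuntime(10, 1, 1));   // 10
console.log(parallelRuntime(10, 1, 2));   // 5
console.log(parallelRuntime(10, 1, 10));  // 1
```

Real suites rarely divide this cleanly, so the slowest process determines the actual runtime.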

Why Go Parallel?

Running automated tests in parallel does require more effort (and potentially cost) than running tests serially. So, why go through the trouble?

The answer is simple: time. It is well documented that software bugs cost more when they are discovered later. That’s why current development practices like Agile and BDD strive to avoid problems from the start through small iterations and healthy collaboration (“shift left”), while CI/CD defensively catches regressions as soon as they happen (“fail fast”). Reducing the time to discover a problem after it has been introduced means higher quality and higher productivity.

Ideally, a developer should be told if a code change is good or bad immediately after committing it. The change should automatically trigger a new build that runs all tests. Unfortunately, tests are not instantaneous – they could take minutes, hours, or even days to complete. A test automation strategy based on the Testing Pyramid will certainly shorten start-to-end execution time but likely still require parallelization. Consider the layers of the Testing Pyramid and their tests’ average runtimes, the Testing Pyramid Rule of 1’s:

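[Figure: The Testing Pyramid with Times. Unit tests take about 1 millisecond each, integration tests about 1 second, and end-to-end tests about 1 minute.]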

Each layer is listed above with the rough runtime of a typical test. Though actual runtimes will vary, the Rule of 1’s focuses on orders of magnitude. Unit tests typically run in milliseconds because they often exercise product code in memory. Integration tests exercise live products but are limited in scope and often cover low-level areas (like REST service calls). End-to-end tests, however, cover full paths through a live system, which requires extra setup and waiting (like Selenium WebDriver interaction).

Now, consider how many tests from each layer could be run within given time limits, if the tests are run serially:

Test Layer    1 Minute   10 Minutes (Coffee Break)   1 Hour (There Goes Today)
Unit          60,000     600,000                     3,600,000
Integration   60         600                         3,600
End-to-End    1          10                          60
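The serial capacities in the table follow directly from the Rule of 1’s. The short sketch below recomputes them, assuming exactly 1 millisecond, 1 second, and 1 minute per test for the three layers. The constant and function names are made up for illustration.

```javascript
// Rule of 1's: assumed runtime of one test per layer, in milliseconds
const MS_PER_TEST = { 'Unit': 1, 'Integration': 1000, 'End-to-End': 60000 };

// Time limits from the table, in milliseconds
const LIMITS_MS = { '1 Minute': 60000, '10 Minutes': 600000, '1 Hour': 3600000 };

// How many tests of a layer fit into a time limit when run one after another
function serialCapacity(layer, limit) {
  return Math.floor(LIMITS_MS[limit] / MS_PER_TEST[layer]);
}

for (const layer of Object.keys(MS_PER_TEST)) {
  const row = Object.keys(LIMITS_MS).map(limit => serialCapacity(layer, limit));
  console.log(layer, row.join(' '));
}
```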

Unit test numbers look pretty good, though keep in mind 1 millisecond is often the best-case runtime for a unit test. Integration and end-to-end runtimes, however, pose a more pressing problem. It is not uncommon for a project to have thousands of above-unit tests, yet not even a hundred end-to-end tests could complete within an hour, nor could a thousand integration tests complete within 10 minutes. Now, consider two more facts: (1) tests often run as different phases in a CI pipeline, so total runtimes are stacked, and (2) multiple commits would trigger multiple builds, which could cause a serious backup. Serial test execution would starve engineering feedback in any continuous integration system of scale. A team would need to drastically shrink test coverage or give up on being truly “continuous” in favor of running tests daily or weekly. Neither alternative is acceptable these days. CI needs parallel testing to be truly continuous.

The Danger of Collisions

The biggest danger for parallel testing is collision – when tests interfere with each other, causing invalid test failures. Collisions may happen in the product under test if product state is manipulated by more than one test at a time, or they may happen in the automation code itself if the code is not thread-safe. Collisions are also inherently intermittent, which makes them all the more difficult to diagnose. As a design principle, automated tests must avoid collisions for correct parallel execution.

Making tests run in parallel is not as simple as flipping a switch or adding a new config file. Automated tests must be specifically designed to run in parallel. A team may need to significantly redevelop their automation code to make parallel execution work right.


A train collision in Iran in November 2016. Don’t let this happen to your tests!

Handling Product-Level Collisions

Product-level collisions essentially reduce to how environments are set up and handled.

Separate Environments

The most basic way to avoid product-level collisions would be to run each test thread or process against its own instance of the product in an exclusive environment. (In the most extreme case, every single test could have its own product instance.) No collisions would happen in the product because each product instance would be touched by only one test instance at a time. Separate environments are possible to implement using various configuration and deployment tools. Docker containers are quick and easy to spin up. VMs with Vagrant, Puppet, Chef, and/or Ansible can also get it done.

However, it may not always be sensible to make separate environments for each test thread/process:

  • Creating a new environment is inefficient – it takes extra time to set up that may cancel out any time saved from parallel execution.
  • Many projects simply don’t have the money or the compute resources to handle a massive scale-out.
  • Some tests may not cause collisions and therefore may not need total isolation.
  • Some product environments are extremely large and complicated and would not be practical to replicate for each test individually.

Shared Environments

Environments with a shared product instance are quite common. One could be a common environment that everyone on a team shares, or one could be freshly created during a CI run and accessed by multiple test threads/processes. Either way, product-level collisions are possible, and tests must be designed to avoid clashing product states. Any test covering a persistent state is vulnerable; usually, this is the vast majority of tests. Consider web app testing as an example. Tests to load a page and do some basic interactions can probably run in parallel without extra protection, but tests that use a login to enter data or change settings could certainly collide. In this case, collisions could be avoided by using different logins for each simultaneous test instance – by using either a pool of logins, a unique login per test case, or a unique login per thread/process. Each product is different and will require different strategies for avoiding collisions.
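The pool-of-logins idea could be sketched as follows. This is not taken from any particular framework: LoginPool and runWithLogin are hypothetical names, the logins are placeholders, and the sketch assumes all tests run in one Node.js process. Tests spread across several processes or machines would need a shared service or a fixed login per process instead.

```javascript
// A minimal login pool: each parallel test borrows a login and returns it afterwards
class LoginPool {
  constructor(logins) {
    this.available = [...logins];
    this.waiting = [];
  }

  // Resolves with a login as soon as one is free
  acquire() {
    if (this.available.length > 0) {
      return Promise.resolve(this.available.pop());
    }
    return new Promise(resolve => this.waiting.push(resolve));
  }

  // Hands the login to a waiting test, or puts it back into the pool
  release(login) {
    const next = this.waiting.shift();
    if (next) {
      next(login);
    } else {
      this.available.push(login);
    }
  }
}

// Runs a test body with an exclusive login and always returns the login
async function runWithLogin(pool, testBody) {
  const login = await pool.acquire();
  try {
    return await testBody(login);
  } finally {
    pool.release(login);
  }
}

// Demo: four simulated tests share two placeholder logins
const pool = new LoginPool(['user1', 'user2']);
const inUse = new Set();
let overlapDetected = false;

const demo = Promise.all([1, 2, 3, 4].map(() => runWithLogin(pool, async login => {
  if (inUse.has(login)) overlapDetected = true;
  inUse.add(login);
  await new Promise(resolve => setTimeout(resolve, 10));  // simulated test work
  inUse.delete(login);
  return login;
})));

demo.then(results => console.log(results.length, overlapDetected));
```

The try/finally block matters: a failing test must still return its login, or later tests would wait forever.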


We all share certain environments. Take care of them when you do. (Photo: The Blue Marble, taken by the Apollo 17 crew on Dec 7, 1972)

Handling Automation-Level Collisions

Automation-level collisions can happen when automation code is not thread-safe, which could mean more than simply locks and semaphores.

#1: Test Independence

Test cases must be completely independent of each other. One test must not require another test to run before it for the sake of setup. A test case should be able to run by itself without any others. A test suite should be able to run successfully in random order.

#2: Proper Variable Scope

If parallel tests will be run in the same memory address space, then it is imperative to properly scope all variables. Global or static mutable variables (i.e., “non-constants”) must not be allowed because they could be changed unexpectedly. The best pattern for handling scope is dependency injection. Thread-safe singletons would be a second choice. (Typically, global or static variables are used to subvert design patterns, so they may reveal further necessary automation rework when discovered.)
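As a rough illustration of the scoping advice, the sketch below passes a per-test context object into each step instead of sharing a module-level variable. The names createTestContext and loginStep are hypothetical, and this is a simple form of injection rather than a full dependency injection framework.

```javascript
// Risky alternative (not used here): one mutable variable shared by every test
// let currentUser = null;

// Safer: each test gets its own context object
function createTestContext(testName) {
  return { testName, currentUser: null, notes: [] };
}

// Steps receive the context they should work on, so they cannot touch another test's state
function loginStep(context, user) {
  context.currentUser = user;
  context.notes.push(`${context.testName} logged in as ${user}`);
  return context;
}

const contextA = loginStep(createTestContext('test A'), 'alice');
const contextB = loginStep(createTestContext('test B'), 'bob');

console.log(contextA.currentUser, contextB.currentUser);
```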

#3: External Resources

Automation may sometimes interact with external resources, such as test config files or test result databases/services. Make sure no external interactions collide. For example, make sure test run updates don’t overwrite each other.

#4: Logging

Logs are very difficult to trace when multiple tests print to the same file simultaneously. The best practice is to generate separate log files for each test case, thread, or process to make them readable.
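One possible way to keep logs apart is to derive the log file path from the worker and the test title. The directory layout and the function name logFilePath below are assumptions for illustration, not a convention of any specific test runner.

```javascript
const path = require('path');

// Build a separate log file path for each worker and test title
function logFilePath(logDir, workerId, testTitle) {
  const safeTitle = testTitle
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')   // collapse anything unusual into dashes
    .replace(/^-|-$/g, '');        // trim leading and trailing dashes
  return path.join(logDir, `worker-${workerId}`, `${safeTitle}.log`);
}

console.log(logFilePath('logs', 2, 'Calculator Class add: should add two positive numbers'));
```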

#5: Result Aggregation

A test suite is a unified collection of tests, no matter how many threads/processes are used to run its tests in parallel. Make sure test results are aggregated together into one report. Some frameworks will do this automatically, while others will require custom post-processing.
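Custom post-processing can be as simple as merging per-worker result lists into one summary. The result shape used below (a worker id plus a list of titles and statuses) is a hypothetical format; real frameworks emit their own report formats, such as JUnit XML.

```javascript
// Merge per-worker results into a single summary for the whole suite
function aggregateResults(workerResults) {
  const summary = { passed: 0, failed: 0, failures: [] };
  for (const { worker, results } of workerResults) {
    for (const result of results) {
      if (result.status === 'passed') {
        summary.passed += 1;
      } else {
        summary.failed += 1;
        summary.failures.push(`${result.title} (worker ${worker})`);
      }
    }
  }
  summary.total = summary.passed + summary.failed;
  return summary;
}

const report = aggregateResults([
  { worker: 1, results: [
    { title: 'adds numbers', status: 'passed' },
    { title: 'divides by zero', status: 'failed' },
  ] },
  { worker: 2, results: [
    { title: 'caches forecasts', status: 'passed' },
  ] },
]);

console.log(report);
```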

#6: Test Filtering

One strategy to avoid collisions may be to run non-colliding partitions (subsets) of tests in parallel. Test tagging and filtering would make this possible. For example, tests that require a special login could be tagged as such and run together on one thread.
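Tag-based partitioning might look like the sketch below. The test list, the “admin-login” tag, and the partitionByTag function are all made up for illustration; most frameworks offer their own tagging or filtering mechanisms that would take the place of this code.

```javascript
// Hypothetical test metadata with tags
const tests = [
  { title: 'loads the home page', tags: [] },
  { title: 'changes account settings', tags: ['admin-login'] },
  { title: 'searches the catalog', tags: [] },
  { title: 'creates a new user', tags: ['admin-login'] },
];

// Split tests into those that share a resource and those that do not
function partitionByTag(allTests, tag) {
  const tagged = allTests.filter(test => test.tags.includes(tag));
  const untagged = allTests.filter(test => !test.tags.includes(tag));
  return { tagged, untagged };
}

// Tagged tests run together on one thread; the rest can be spread out freely
const partitions = partitionByTag(tests, 'admin-login');
console.log(partitions.tagged.map(test => test.title));
console.log(partitions.untagged.map(test => test.title));
```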

Test Scalability

The previous section on collisions discussed how to handle product environments. It is also important to consider how to handle the test automation environment. These are two different things: the product environment contains the live product under test, while the test environment contains the automation software and resources that run tests against the product. The test environment is where the parallel tests will be executed, and, as such, it must be scalable to handle the parallelization. A common example of a test environment could be a Jenkins master with a few agents for running build pipelines. There are two primary ways to scale the test environment: scale-up and scale-out.

Parallel Scale-Up

Scale-up is when one machine is configured to handle more tests in parallel. For example, scale-up would be when a machine switches from one (serial) thread to two, three, or even more in parallel. Many popular test runners support this type of scale-up by spawning and joining threads in a common memory address space or by forking processes. (For example, the SpecFlow+ Runner lets you choose.)

Scale-up is a simple way to squeeze as much utility out of an existing machine as possible. If tests are designed to handle collisions, and the test runner has out-of-the-box support, then it’s usually pretty easy to add more test threads/processes. However, parallel test scale-up is inherently limited by the machine’s capacity. Each additional test process succumbs to the law of diminishing returns as more memory and processor cycles are used. Eventually, adding more threads will actually slow down test execution because the processor(s) will waste time constantly switching between tests. (Anecdotally, I found the optimal test-thread-to-processor ratio to be 2-to-1 for running C#/SpecFlow/Selenium-WebDriver tests on Amazon EC2 M4 instances.) A machine itself could be upgraded with more threads and processors, but nevertheless, there are limits to a single machine’s maximum capacity. Weird problems like TCP/IP port exhaustion may also arise.

Scale Up

Scale-up adds more threads to one machine.

Parallel Scale-Out

Scale-out is when multiple machines are configured to run tests in parallel. Whereas scale-up had one machine running multiple tests, scale-out has multiple machines each running tests. Scale-out can be achieved in a number of ways. A few examples are:

  • One master test execution machine launches multiple Web UI tests that each use a remote Selenium WebDriver with a service like Selenium Grid, Sauce Labs, or BrowserStack.
  • A Jenkins pipeline launches tests across ten agents in parallel, in which each agent executes a tenth of the tests independently.

Scale-out is a better long-term solution than scale-up because it is not bounded by any single machine’s capacity: more machines can always be added for parallel testing. The limiting factor with scale-out is not the maximum capacity of the hardware but rather the cost of running more machines. However, scale-out is much harder to implement than scale-up. It requires tests to be evenly divided with some sort of balancer and filter. It also requires some sort of test result aggregation for joint reporting – people won’t want to piece together a bunch of separate reports to get an overall snapshot of quality. Plus, the test environment is more complicated to build and maintain (though tools like CloudBees Jenkins Enterprise or Amazon EC2 can make it easier).
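The two extra pieces that scale-out needs, dividing the tests and joining the results, can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: `shard` and `aggregate` are hypothetical helpers, each agent's report is modeled as a plain dictionary of test name to status, and the agents are faked in-process instead of being real machines.

```python
def shard(tests, agent_count):
    """Round-robin split so that shard sizes differ by at most one test."""
    return [tests[i::agent_count] for i in range(agent_count)]


def aggregate(reports):
    """Merge per-agent reports (test name -> status) into one summary."""
    merged = {}
    for report in reports:
        merged.update(report)
    failed = sorted(name for name, status in merged.items() if status != "passed")
    return {"total": len(merged), "failed": failed}


tests = [f"test_{i:02d}" for i in range(25)]
shards = shard(tests, agent_count=10)

# Pretend each agent ran its shard and everything passed.
reports = [{name: "passed" for name in agent_tests} for agent_tests in shards]
summary = aggregate(reports)
```

Note that round-robin balances the number of tests per agent, not their duration. If some tests run much longer than others, dividing by recorded runtime would give a more even split.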

Scale Out

Scale-out distributes tests across multiple machines.

Upwards and Outwards

Of course, scale-up and scale-out are not mutually exclusive. Scaled-out nodes could individually be scaled up. Consider a test environment with 10 powerful VMs that could each handle 10 tests in parallel – that means 100 tests could run simultaneously. Using the Rule of 1’s, it would take only about a minute to run 100 Web UI tests, which serially would have taken over an hour and a half! Use both strategies to shorten start-to-end runtime as much as possible.
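The arithmetic behind that estimate is easy to check. The Python sketch below assumes roughly one minute per Web UI test, which is what the numbers above imply, and it is idealized: it ignores startup time, uneven test durations, and any balancing overhead. The function name `estimated_minutes` is made up for this example.

```python
import math


def estimated_minutes(test_count, machines=1, threads_per_machine=1, minutes_per_test=1.0):
    """Idealized start-to-end runtime: tests run in full 'waves' across all slots."""
    slots = machines * threads_per_machine
    waves = math.ceil(test_count / slots)
    return waves * minutes_per_test


serial = estimated_minutes(100)
combined = estimated_minutes(100, machines=10, threads_per_machine=10)
```

Here `serial` works out to 100 minutes (an hour and forty minutes) and `combined` to 1 minute. Real runs will land somewhere above the idealized figure.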


Parallel testing is a worthwhile endeavor. When done properly, it will not only reduce development time but also improve the development experience. For readers who want to start doing parallel testing, I recommend researching the tools and frameworks you want to use. Many popular test frameworks support parallel execution, and even if the one you choose doesn’t, you can always invoke tests in parallel from the command line. Do well!