
Surviving Without Python


Python is such a popular language for good reason: Its principles are strong. However, if Python is “the second-best language for everything”… that means the first-best is often chosen instead. Oh no! How can Pythonistas survive a project or workplace without our favorite language?

Personally, even though I love Python, I don’t use it daily at my full time job. Nevertheless, Pythonic thinking guides my whole approach to software. I will talk about how the things that make Python great can be applied to non-Python places in three primary ways:

  1. Principles from the Zen of Python
  2. Projects that partially use Python
  3. People who build strong, healthy community

Check out my talk, Surviving Without Python, from PyOhio 2019! It was one of the most meaningful talks I’ve ever given.

Frameworks: All-in-One or Piece-by-Piece?

Software frameworks are great because they apply the principle of Separation of Concerns. A framework’s tools and code handle a specific need in a standard way for developers to write other code more easily. For example:

  • Web frameworks support receiving requests and sending responses.
  • Test frameworks include test case structure, runners, and reporting mechanisms.
  • Logging frameworks control how messages are gathered and stored.
  • Dependency injection frameworks create and manage object instances.

Recently, a question hit me: How far should a framework go to separate concerns? Should a framework try to do everything all-in-one, or should it behave more like a library that focuses on doing one thing well?

Let’s look at Python Web frameworks as an example. Django, the “Web framework for perfectionists with deadlines,” provides everything a developer could want out of the box. Flask, on the other hand, is a “microframework” that prides itself on minimalism: any extras must be handled by extensions or other packages. The differences between the two become clear when comparing some of their features:

Feature | Django | Flask
HTTP Requests and Routing | Included | Werkzeug (bundled)
Templates | Included | Jinja2 (bundled)
Forms | Included | None (Flask-WTF)
Object-Relational Mapping (ORM) | Included | None (SQLAlchemy)
Security | Included | None (Flask-Security)
Localization | Included | None (Flask-Babel)
Admin Interface | Included | None
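
The contrast shows up even in a “hello world.” Below is a minimal Flask app – a sketch to illustrate the microframework style (a comparable Django page needs a generated project with settings, URL routes, and views):

# app.py - a minimal Flask app (assumes "pip install flask")
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, World!"

if __name__ == "__main__":
    app.run()  # serves on http://127.0.0.1:5000/ by default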

Clearly, Django is all-in-one, while Flask is piece-by-piece. To make a serious Flask app, developers must pull in many extra pieces. Many other frameworks show the same all-in-one vs. piece-by-piece rivalry:

  • JavaScript testing: Jasmine vs. Mocha
  • JavaScript development: Angular vs. React
  • Java BDD testing: Serenity vs. Cucumber-JVM

I think each approach has its merits. All-in-one frameworks are more convenient to use, especially for beginners. For developers who are new to a domain or just need to get something up fast, all-in-ones are the better choice. They come with all units already integrated together, and they often have good documentation. However, developing an all-in-one framework takes much more work because it covers multiple concerns. Developers may also feel shoehorned into the framework’s way of doing things. All-in-ones typically dictate what they believe to be the “best” solution.

Piece-by-piece frameworks require more expertise but offer greater flexibility. Developers can pick and choose the pieces they need, and they can change the packages used by the solution more easily. Found a better ORM? Not a problem. Need to localize the site in Chinese? Add it! Solutions can avoid excess weight and stay nimble for the future. The big challenge is successful integration. Furthermore, a library or framework for a singular concern tends to solve the concern in better ways simply because project contributors give it exclusive focus. The more I learn about a space, the more I lean towards a piece-by-piece approach.

As always, pick frameworks based on the needs at hand. For example, I like to use Django to make websites for my wife’s small businesses because the admin interface is just so convenient for her, even though I could get away with Flask. However, I’ll probably pick Mocha (piece-by-piece) over Jasmine (all-in-one) whenever I return to JavaScript testing.

Speaking Pythonese

The Python community, like many groups, has its own language – and I don’t mean just Python. There are many words and phrases thrown around that may confuse people new to Python. I originally shared some terms in my article, Which Version of Python Should I Use?, but below are some more of those colloquialisms for quick reference:

Anaconda
  • Open-source implementation of Python (and R)
  • Meant for data scientists
  • Uses the conda package manager
Benevolent Dictator for Life (BDFL)
  • Guido van Rossum
  • The inventor of Python
  • Resigned in July 2018 but remains BDFL Emeritus
The CheeseShop
  • A fun code name for the Python Package Index
Class
  • A programming definition for creating objects
  • Combines attributes (variables) and behaviors (methods)
  • Useful for reusing code
Conda
  • Package manager for Python and other languages
  • Part of the Anaconda project
Core Developer
  • A developer who has commit privilege to the CPython codebase
  • Very few Pythonistas are core developers
CPython
  • The default and most widely used implementation of the Python language
  • Implemented in C
Django
  • A batteries-included Python Web framework for perfectionists with deadlines
  • Offers many features out of the box
  • The most popular Python framework in 2017
  • Size: Flask < Pyramid < Django
Flask
  • A microframework for Python Web development
  • Uses Werkzeug and Jinja2
  • Super-minimalist
  • Size: Flask < Pyramid < Django
Function
  • A definition for a callable subroutine
  • May take inputs
  • May return outputs
  • Great for code reuse
  • Functions are first-class values
The Hitchhiker’s Guide to Python
  • A popular online guide to Python
  • Opinionated
  • Shares many best practices
Jinja2
  • A Python template engine
  • Inspired by Django’s templates
Jupyter
  • A project for interactive Python development
  • “Jupyter Notebooks” allow programmers to dynamically rewrite and rerun Python code, and then share code easily with others
  • Popular with data scientists
Module
  • A Python source code file containing definitions and statements
  • Every .py file is a module
  • May be imported by other modules to reuse code
NumPy
  • A popular Python package for scientific computing
Pandas
  • A popular Python package for data analysis
pip
  • The PyPA-recommended tool for installing Python packages
  • Command: “pip install <package>”
  • Recursive acronym for “pip installs packages”
PyBites
  • A community of Pythoneers who improve their skills through code challenges
PyCharm
  • A popular Python IDE developed by JetBrains
  • Offers great development features
  • Has a free Community Edition and a paid Professional Edition
PyCon
  • The annual Python conference held in North America
  • GO – it will change your life!
  • Several other conferences are held worldwide
PyPy
  • An alternative Python implementation
  • Known for speed, memory usage, and compatibility
  • Good alternative to CPython for high performance workloads
Pyramid
  • A Python Web framework
  • Start small, finish big, stay finished
  • Provides many parts but not everything (such as ORM)
  • Size: Flask < Pyramid < Django
pytest
  • A lightweight-yet-powerful Python test framework (and arguably the best)
Python 2
  • The old version of Python
  • Will reach end-of-life in 2020
  • Final version will be 2.7.x
  • Please upgrade to version 3
Python 3
  • The current version of Python
  • Most packages now support Python 3
  • Has incompatibilities with Python 2
  • Please don’t use Python 2
Python Enhancement Proposal (PEP)
  • Official proposals for enhancing the Python language
Python Package Index (PyPI)
  • The official repository for Python packages
  • Also known as “the CheeseShop”
The Python Software Foundation (PSF)
  • Non-profit organization
  • Keeps Python going strong
  • Support them!
Pythoneer
  • A programmer who uses Python to solve problems
  • Styled off the word “engineer”
Pythonic
  • Describes idiomatic code for Python
  • Closely related to conciseness, readability, and elegance
  • Highly recommended
  • Follow style guidelines to learn how to be Pythonic
Pythonista
  • Someone who loves the Python language
  • Often an advanced Python programmer
Sphinx
  • A popular Python tool to generate documentation
virtualenv
  • Tool to create isolated Python environments
  • Enables programmers to use different versions of Python and packages for different projects
  • Also see venv and pipenv
Web Server Gateway Interface (WSGI)
  • A specification for how Web servers forward requests to Web applications and frameworks
  • A core piece of Python Web development
  • See PEP-333 and PEP-3333
The Zen of Python
  • The list of guiding principles for Python’s design
  • Run “import this” to see them
  • See PEP-20
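
To see a few of these terms in action, here’s a tiny, self-contained sketch (the names are made up for illustration):

# zen_demo.py - this file is a module.

class Greeter:
    """A class bundles attributes (variables) and behaviors (methods)."""
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}!"

def shout(text):
    """A function is a callable subroutine - and a first-class value."""
    return text.upper()

if __name__ == "__main__":
    print(shout(Greeter("PyCon").greet()))  # HELLO, PYCON!
    import this  # prints the Zen of Python (PEP 20)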

The Software Engineer in Test

I am a Software Engineer in Test (SET). Many people don’t know quite what that means, though. Developers frequently refer to me as a “tester” or “QA,” and a former director once thought I did DevOps. While my work covers these areas, they aren’t the main focus. Let’s clarify what it means to be a SET.

What is a Software Engineer in Test?

A “Software Engineer in Test” (a.k.a. “Software Development Engineer in Test”) is a software developer who develops software for testing: tools, frameworks, and automated tests. SETs focus primarily on automation for running tests quickly and repeatedly. Test automation is a software product: just as front-end developers write web pages and back-end developers write microservices, SETs write automated tests. The same practices and coding skills apply. I frequently say that a Software Engineer in Test must have the heart of a developer.
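
For example, even a trivial automated test is genuine code with real structure. Here’s a minimal pytest-style sketch, where the login function is a hypothetical stand-in for product code:

# test_login.py - a minimal automated test, pytest style.

def login(username, password):
    # Hypothetical stand-in for a call to the product's real auth API.
    return username == "panda" and password == "bamboo"

def test_valid_login():
    assert login("panda", "bamboo")

def test_invalid_login():
    assert not login("panda", "wrong")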

So they just write test scripts?

No. Never say this to a SET. Test automation involves much more than just “writing test scripts.” A serious testing solution requires serious design and effort. The top-level automation of a test case is usually just a small piece. SETs are responsible for:

  • Collaborating with developers and product owners
    • Contributing to planning and design
    • Reviewing product code
    • Formulating test scenarios
  • Developing test automation frameworks
  • Automating test scenarios using the frameworks
  • Knowing and using design patterns where appropriate
  • Setting up the infrastructure to run tests
    • Running tests in CI/CD
    • Running tests in parallel
    • Running tests with the appropriate test data
  • Setting up dashboards for reporting test results in real time
  • Teaching others good quality and testing practices
  • Developing tools to assist manual and exploratory testing

Saying that testing is “just scripting” belittles the role of the SET and underestimates the workload it requires.

How did this role start?

Software “tester” and “QA” roles have existed for decades, but the SET role first became distinct in the 2000s when large-scale test automation became both feasible and necessary. According to Wikipedia, Microsoft coined the title “Software Development Engineer in Test” (SDET) in 2005, and others like Amazon and Apple quickly adopted it. Google coined the name “Software Engineer in Test” for the same type of role. I personally prefer the SET title over the SDET title simply because it is more concise.

How is it different from “QA” or “testing” roles?

To me, the SET role is distinct from the traditional “QA” or “testing” role. Software testers historically focused on manual testing and thus didn’t need strong programming skills. Their fortes were product domain knowledge, intuition, out-of-the-box thinking, test planning, and test system setup. And these certainly are important, indispensable skills! SETs, however, live in both the development and testing worlds. They use developer skills to provide software solutions for testing problems. Automation is much more central to the SET role. These days, almost all “testing” job openings are SET roles; manual-only test roles have been largely deprecated. Nevertheless, there will always be a need for manual testing because there are some problems a human can catch much more easily than a script (or an AI agent).

Personally, I avoid using the titles “QA” and “software tester” for myself because they don’t accurately describe all that I do. I also avoid the title “automation engineer” because, again, it is reductionist. I tackle software testing with the heart of a developer, and I set up test automation solutions from the ground up. I’m proud to be a software engineer who specializes in testing.

 

As a bonus, check out Test and Code episode 47, in which Brian Okken and I discuss what it means to be a Software Engineer in Test (among other topics).

 

Ignoring Files with Git

Git is one of the most popular version control systems (VCS) available, especially thanks to hosting vendors like GitHub. It keeps code safe and shareable. Sometimes, however, certain files should not be shared, like local settings or temporary configs. Git provides a few ways to make sure those files are ignored.

.gitignore

The easiest and most common way to ignore files is to use a gitignore file. Simply create a file named .gitignore in the repository’s root directory. Then, add names and patterns for any files and directories that should not be added to the repository. Use the asterisk (“*”) as a wildcard. For example, “*.class” will ignore all files that have the “.class” extension. Remember to add the .gitignore file to the repository so that it can be shared. As a bonus, Git hosting vendors like GitHub usually provide standard .gitignore templates for popular languages.

Any files covered by the .gitignore file will not be added to the repository. This approach is ideal for local IDE settings like .idea or .vscode, compiler output files like *.class or *.pyc, and test reports. For example, here’s GitHub’s .gitignore template for Java:
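
# (Representative version of GitHub's Java template)

# Compiled class file
*.class

# Log file
*.log

# BlueJ files
*.ctxt

# Mobile Tools for Java (J2ME)
.mtj.tmp/

# Package Files
*.jar
*.war
*.nar
*.ear
*.zip
*.tar.gz
*.rar

# Virtual machine crash logs
hs_err_pid*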

.git/info/exclude

As a best practice, .gitignore should be committed to the repository, which means all team members will share the same set of ignored files. However, some files should be ignored locally and not globally. Those files could be added to .gitignore, but large .gitignore files become cryptic and more likely to break other people’s setups. Thankfully, Git provides a local-only solution: the .git/info/exclude file (under the repository’s hidden .git directory). Simply open it with a text editor and add new entries using the same file pattern format as .gitignore.

# Append a new file to ignore locally
echo "my_private_file" >> .git/info/exclude

skip-worktree

The .gitignore file prevents a file from being added to a repository, but what about preventing changes from being committed to an existing file? For example, developers may want to safely override settings in a shared config file for local testing. That’s where skip-worktree comes in: it allows a developer to “skip” any local changes made to a given file. Changes will not appear under “git status” and thus will not be committed.

Use the following commands:

# Ignore local changes to an existing file
git update-index --skip-worktree path/to/file

# Stop ignoring local changes
git update-index --no-skip-worktree path/to/file

Warning: The skip-worktree setting applies only to the local repository. It is not applied globally! Each developer will need to run the skip-worktree command in their local repository.

assume-unchanged

Another option for ignoring files is assume-unchanged. Like skip-worktree, it makes Git ignore changes to files. However, whereas skip-worktree assumes that the user intends to change the file, assume-unchanged assumes that the user will not change the file. The intention is different. Large projects using slow file systems may gain significant performance optimizations by marking unused directories as assume-unchanged. This option also works with other update-index options like --really-refresh.

Use the following commands:

# Assume a file will be unchanged
git update-index --assume-unchanged path/to/file

# Undo that assumption
git update-index --no-assume-unchanged path/to/file

Again, this setting applies only to the local repository – it is not applied globally.

Comparison Table

Which is the best way to ignore files?

Method | Description | Best Use Cases | Scope
.gitignore file | Prevents files from being added to the repository. | Local settings, compiler output, test results, etc. | Global
.git/info/exclude file | Prevents local files from being added to the repository. | Local settings, compiler output, test results, etc. | Local
skip-worktree setting | Prevents local changes from being committed to an existing file. | Shared files that will have local overwrites, like config files. | Local
assume-unchanged setting | Allows Git to skip files that won’t be changed, as a performance optimization. | Files and folders that a developer won’t touch. | Local


NuGet Quick Reference

What is NuGet?

NuGet is a package manager for Microsoft .NET. It installs packages and manages dependencies for .NET projects. It is like Maven (Java) or pip (Python). The NuGet Gallery hosts thousands of popular packages like Json.NET, NUnit, and jQuery. If you develop .NET applications (like in C#), then you probably need to use NuGet.

Installing Packages

The easiest way to use NuGet is through Visual Studio, which includes NuGet features by default. Packages are managed per project. Right-click on a project in Solution Explorer and select “Manage NuGet Packages…” to open the project’s package manager page.

  • The Browse tab lets you search and install new packages.
  • The Installed tab shows which packages are installed and lets you uninstall them.
  • The Updates tab lets you update packages to their latest versions.

[Image: The NuGet Package Manager page for a project in Visual Studio]

When packages are installed or updated, NuGet also pulls any dependencies they require, and Visual Studio records all of a project’s dependencies in a packages.config file. Then, just build and run!
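
For reference, packages.config is plain XML that lists each dependency (the package IDs and versions below are illustrative):

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <!-- Each entry pins a package ID and version -->
  <package id="Newtonsoft.Json" version="11.0.2" targetFramework="net472" />
  <package id="NUnit" version="3.10.1" targetFramework="net472" />
</packages>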

NuGet Configuration

NuGet can be configured using a NuGet.Config file. This file can be placed under a project directory, a solution directory, or a system-wide location. One of the most common settings is the package sources: NuGet uses the public nuget.org repository by default, but others (like private company repos) can also be added. Check the nuget.config reference online for docs on all options. (Package sources can also be configured through Visual Studio under Tools > NuGet Package Manager > Package Manager Settings.)
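
As a sketch, a NuGet.Config that adds a private package source could look like this (the company feed URL is hypothetical):

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- The public gallery (used by default) -->
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <!-- A hypothetical private company feed -->
    <add key="MyCompany" value="https://nuget.mycompany.example/v3/index.json" />
  </packageSources>
</configuration>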

NuGet Package Manager Console

Sometimes, it’s helpful to control NuGet directly through the Package Manager Console. From the menu bar: Tools > NuGet Package Manager > Package Manager Console. For example, when packages get messed up, I’ll run “Update-Package -Reinstall” to reinstall everything. (Right-clicking the solution and selecting “Restore NuGet Packages” never seems to work for me.) Check the help command or the official guide for more info.

[Image: The NuGet Package Manager Console in Visual Studio]

NuGet CLI

The NuGet CLI nuget.exe provides the full extent of NuGet features, including the ability to make packages. It is more powerful than the Package Manager Console. It must be installed independently – it does not come with Visual Studio. Check the NuGet CLI reference online for full details. The .NET Core CLI dotnet.exe can also be used for managing packages. See the feature comparison for the differences.

[Image: The NuGet CLI]

Creating a NuGet Package

A NuGet package is basically a ZIP file with a .nupkg extension. It typically contains an assembly DLL and maybe other related files. Creating a NuGet package is pretty easy:

  1. Install the NuGet CLI.
  2. Create a .nuspec file for the project.
  3. Add appropriate settings to the .nuspec file.
  4. Run the “nuget pack” command to create the .nupkg file.
  5. Publish the .nupkg file to the desired destination.

The .nuspec file can be created by running the “nuget spec” command in the project’s directory. The generated <project-name>.nuspec file will contain replacement tokens that will be substituted with values from the project’s AssemblyInfo when the package is built. Make sure to set AssemblyInfo values appropriately for the substitution. The version is especially important, and the automatic version format may be useful for guaranteeing uniqueness. Be sure to add any packages upon which the project depends as dependencies, too. (The .nuspec file can also be created manually.) Refer to the .nuspec reference for full details.

The standard package creation command is “nuget pack <project-name>.nuspec”. However, if the .nuspec file contains replacement tokens, then use “nuget pack <project-name>.csproj” instead. Once the package is created, it can be published publicly to nuget.org or to a private NuGet feed.
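
The full flow looks roughly like this (the project name is hypothetical):

# Generate a .nuspec file (run in the project directory)
nuget spec

# Pack using the project file so replacement tokens are substituted
nuget pack MyProject.csproj

# Or, for a .nuspec without replacement tokens:
nuget pack MyProject.nuspec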

Below is an example .nuspec file with replacement tokens:
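
<?xml version="1.0"?>
<package>
  <metadata>
    <!-- Replacement tokens ($...$) are filled from AssemblyInfo at pack time -->
    <id>$id$</id>
    <version>$version$</version>
    <title>$title$</title>
    <authors>$author$</authors>
    <description>$description$</description>
    <requireLicenseAcceptance>false</requireLicenseAcceptance>
    <dependencies>
      <!-- The dependency below is illustrative -->
      <dependency id="Newtonsoft.Json" version="11.0.2" />
    </dependencies>
  </metadata>
</package>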


 

The Panda’s Dozen: Top PyCon 2018 Talks

There were tons of great talks at PyCon 2018 – more than I could attend in person – that are now available on the PyCon 2018 YouTube channel. This post has links to my favorites. Enjoy!

Check out PyCon 2018 Reflections to read my personal reflections, and watch my talk, Behavior-Driven Python, while you’re at it!

By the Numbers: Python Community Trends in 2017/2018 (Dmitry Filippov, Ewa Jodlowska) – At the end of 2017, the Python Software Foundation teamed up with JetBrains to conduct an official Python Developers Survey. Data science is taking Python by storm, and Python 3 now has majority adoption. There are tons of other cool statistics, too!

How Netflix does failovers in 7 minutes flat (Amjith Ramanujam) – That speed at that scale is mind-blowing. This is a fascinating talk, even for non-engineers!

Solve Your Problem With Sloppy Python (Larry Hastings) – “If you ever start writing a shell script, delete it and write a Python script instead.” This talk is a jovial reminder that Python is a powerful tool, even for hack-n-slash jobs.

The AST and Me (Emily Morehouse-Valcarcel) – Emily gives a great overview of the inner workings of the Python language. This talk is a must-see for anyone into compiler theory.

Dataclasses: The code generator to end all code generators (Raymond Hettinger) – Dataclasses are a new Python feature for generating classes from simple field specifications.

Pipenv: the Future of Python Dependency Management (Kenneth Reitz) – Pipenv is a new tool that makes pip, Pipfile, and virtualenv easier to use together. Kenneth gives a good overview of Python packaging and why pipenv is awesome.

Type-checked Python in the real world (Carl Meyer) – Sometimes, I wish Python had static typing. Now, it can! Facebook has done some innovative things to make it possible.

Beyond Unit Tests: Taking Your Testing to the Next Level (Hillel Wayne) – Property tests + contracts = integration tests. Hillel gives a fantastic strategy for making tests smarter.

“WHAT IS THIS MESS?” – Writing tests for pre-existing code bases (Justin Crown) – This is a pragmatic guide to adding new tests to old code. Now, you’ll never procrastinate on your tests again!

Demystifying the Patch Function (Lisa Roach) – Mocking is a great practice for limiting scope in unit tests. Whether using unittest.mock or other patching/mocking packages, this is a great talk for learning why and how to do mocking in unit tests!

Automating Code Quality (Kyle Knapp) – One of Python’s most beloved traits is its elegance. Maintaining high standards of code quality can be challenging for large projects, though. Kyle shows how to use existing tools to drive higher quality.

Keynote (Ying Li) – [35:07] – Ying is a software security engineer at Docker. In her keynote, she urged all people involved in technology to learn the basics of security. Definitely watch the video recording – she used a fun story to illustrate her points.

Keynote: The People and Python (Qumisha Goss) – [1:07:35] – “Q” is a librarian at the Detroit Public Library who taught herself Python so she could teach coding classes to kids. She shared the highs and lows of her experiences, especially in light of many disadvantages her students had. My favorite takeaway was to “cultivate greatness in others.”

Pipenv: Python Packagement for Champions!

While recently deploying a new Python Django app to Heroku, I noticed the documentation mentioned a tool I hadn’t known before: pipenv. I thought to myself, “Great, now I need to learn a new tool. What was so bad about pip and virtualenv?” So, I did my research, and BOOM! Yes. Mind blown. Life changed. This.

What It Is

Pipenv is the Python packaging and environments tool for champions.

  • It unites pip, Pipfile, and virtualenv into a sophisticated workflow with simple commands.
  • It automatically creates virtual environments for projects.
  • It automatically updates package dependencies (and their dependencies).
  • It locks versions for deterministic builds.

Despite some controversy and limitations, I strongly recommend using pipenv for most new Python projects. The Python Packaging Authority recommends it, too.

What It’s About

Packages and environments (“packagement”) are essential to Python development. Typically, Pythoneers create a virtual environment for each project and install dependent packages into it locally using pip. They then “freeze” the dependencies into a requirements.txt file so that others can easily recreate the environment. Virtual environments thus enable different projects to use different package versions without global conflict.
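
In commands, that traditional workflow looks roughly like this:

# Create and activate a virtual environment
virtualenv venv
source venv/bin/activate

# Install packages locally
pip install requests

# "Freeze" dependencies so others can recreate the environment
pip freeze > requirements.txt
pip install -r requirements.txt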

Unfortunately, this traditional workflow has some problems:

  • It uses multiple tools instead of one and requires many commands.
  • Different projects can do the workflow differently, which can be confusing.
  • The requirements.txt file must be manually generated and can easily fall out of date.
  • Dev-only dependencies are a hassle to separate.
  • Uninstalling packages will not remove sub-packages.
  • Dependencies with version ranges instead of fixed versions cause nondeterministic builds.

Pipenv solves these problems by combining pip, Pipfile, and virtualenv into a standard workflow that automatically handles and locks package updates.

How to Use It

See how simple it is to use pipenv with a Python project:

# Install pipenv
pip install pipenv

# Create a new project directory
mkdir panda_project
cd panda_project
echo "print('hello')" > main.py

# Init pipenv:
# Creates a virtual environment
# Then creates Pipfile and Pipfile.lock files
pipenv install

# Install a package:
# Updates the Pipfiles
pipenv install requests

# Install a dev-only package:
# Updates the Pipfiles
pipenv install --dev pytest

# Run commands in the environment
pipenv run python --version
pipenv run python main.py

More Info

There’s no need for me to repeat what other people have already said:

 

 

[GIF: Me, after using pipenv for the first time.]

 

[8/24/2018 Update: Mentioned some of the controversy and limitations of pipenv.]

Quality Metrics 101: Product Quality

New to the series? Start from the beginning!

Product quality metrics measure the excellence of a product and its features. They measure the “goodness” inherent in the product, apart from how the product was developed. High-quality processes and tests contribute to, but do not alone guarantee, high-quality products. That’s why quality must be built into the product from the start and checked throughout all phases of development. Below are metrics for assuring quality in the delivered products.


Functionality

Quality Aspect: Does the product work correctly?
Desired State: True – Features either work, or they don’t.
Metrics:

Test Failure Rate – The whole purpose of functional testing is to determine which features work and which don’t. Assuming test quality is high, the test failure rate is the single best indicator of product functionality. Higher failure rates mean more broken features. Teams should target low-to-zero test failures. It may be useful to keep a failure history for each test. For large products, it may also be useful to break down failure rates by feature area.

It is imperative to recognize, however, that the test failure rate is meaningful only if test quality is high – meaning that tests have good coverage and reliability. Poor-quality tests will give untrustworthy results. For example, weak coverage could mean that failure rate is low because functionality is not truly exercised, and poor reliability could mean that failure rate is high because tests always crash. Be sure to back up any reporting on test failure rate with assurance that test quality is high (using test quality metrics).


Stability

Quality Aspect: Does the product work reliably?
Desired State: High – Product functionality should be consistently good and available.
Metrics:

Build Failure Rate – The build failure rate is the proportion of builds that have failed for whatever reason over a given period of time. While process metrics focus on response times to fix broken builds, the build failure rate itself indicates the health of the product while it is being developed. It does not track how badly a build failed like test failure rate does, but instead it impartially tracks ultimate success or failure. Make sure to limit the history of builds included in the calculation to keep it relevant (such as the last 30 days or so). Occasional build failures are acceptable as long as they are fixed quickly. High build failure rates indicate product instability, which could be due to design flaws, weak pre-check-in testing, tricky bugs, or even pipeline faults.

Uptime – Uptime refers to the total time a system is usable. For example, consider a website that must go down for a one-hour service window every week – its uptime would be 167/168 = 99.4%. Not all downtime is planned, however. A bad deployment during maintenance could knock that website offline for an additional 3 hours – dragging uptime down to 97.6% for the week. This may not seem bad at first, but it’s quite terrible when considering that (a) lost time is lost money and (b) the goal of Six Sigma is 99.99966%. A product should have near-perfect availability. System monitoring tools can easily measure uptime. Low uptime indicates either poor design or lack of failover redundancy.


Performance

Quality Aspect: Does the product work optimally?
Desired State: Optimal – Performance should be at its best in all areas.
Metrics:

There are four classic software performance metrics. They may be applied in various ways to aspects of product behavior. Ultimately, software products should have a minimal impact on the system while providing a maximal capacity for work.

Processor Usage – Processor cycles should not be needlessly wasted. Make sure algorithms are efficient in terms of computational complexity (big O) and implementation details.

Memory Usage – Watch out for both memory bloat (when features take up a lot of memory unnecessarily) and memory leaks (when memory is not freed after it is no longer needed).

Response Time – Response time, or latency, measures the turnaround time from when an action is taken to when the actor receives feedback that the action is completed. Common examples of response time are web page loading, REST API call responses, and database queries. Response time should be as short as possible.

Throughput – Throughput measures how much load a system can handle. It could refer to data I/O bandwidth, transactions per time unit, number of concurrent users, etc. Typically, higher stress on a system will cause other performance metrics to degrade. The “sweet spot” to find is the maximum throughput value that does not unacceptably impact other performance aspects.
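
For example, here’s a quick-and-dirty way to sample response time in Python (the URL is illustrative; real performance testing needs proper tooling and many samples):

# Measure the response time of a single HTTP request (rough sketch)
import time
import urllib.request

start = time.perf_counter()
with urllib.request.urlopen("https://example.com/") as response:
    response.read()
elapsed = time.perf_counter() - start
print(f"Response time: {elapsed:.3f} seconds")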


 

Complexity

Quality Aspect: Is the software code unnecessarily complicated?
Desired State: Minimal – Simple is better than complex. Complex is better than complicated. (See The Zen of Python.)
Metrics:

There are a number of code metrics that indicate complexity in various ways.

Lines of Code – One of the most rudimentary metrics is to count the lines of code. All things equal, line count indicates the magnitude of the software product, with the assumption that fewer lines will be easier to maintain. Any modern IDE (or, worst case, shell scripting) can yield line counts. However, all things are not equal, and line count alone does not indicate quality or efficiency.

Cyclomatic Complexity – Cyclomatic complexity measures the number of different execution paths the code can take. It is more meaningful than counting sheer lines of code because it indicates the magnitude of testing needed for full coverage. Lower values are better. Cyclomatic complexity is a popular code metric, and many modern analysis tools can measure it.
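
For intuition, consider this tiny (contrived) Python function:

# Cyclomatic complexity = decision points + 1.
# This function has two decision points (if, elif), so its complexity is 3:
# three test cases are needed to cover every execution path.
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"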

Depth of Inheritance – For object-oriented languages, the depth of inheritance measures the maximum length of a class inheritance tree from child class to its ultimate root. For example, in the class inheritance tree of Tiger > Cat > Animal > Object, Tiger would have an inheritance depth of 3. Lower values are desirable because they make classes easier to understand.

There are countless other code metrics available. For example, Microsoft Visual Studio calculates the metrics above plus a maintainability index and class coupling. Halstead metrics are another way to measure complexity.


Satisfaction

Quality Aspect: Does the product satisfy the end user?
Desired State: High – The product should meet the end user’s needs, and the end user should like using it.
Metrics:

Customer satisfaction is inherently subjective, so trying to measure it is difficult. Ultimately, the end users must find compelling value in the product over other alternatives, or else they won’t use it or buy it. There are many ways to attempt to gauge customer satisfaction: surveys, interviews, A/B testing, etc. Statistics and psychology also play a part. Check out articles here, here, and here to get some ideas.

Quality Metrics 101: Process Quality

New to the series? Start from the beginning!

Process quality metrics make sure that software development practices build good, high-quality features. Healthy software processes identify and resolve issues as early as possible because later bug discovery means higher cost to fix. Quality starts at inception, when features are first brainstormed, and it carries through design, implementation, and testing. Every step in the development process should have quality checkpoints: acceptance criteria for planning, reviews for design and implementation, and reports for testing. Process quality metrics primarily focus on delivery speed or the effectiveness of feedback loops to make sure a team is responding appropriately to change.

Note: Standard software development methodologies often come with canned metrics. For example, Agile Scrum focuses heavily on velocity for determining a team’s capacity for work, while Agile Kanban focuses heavily on lead time and cycle time for measuring how fast work gets done. This article will not cover methodology-specific metrics – please refer to external resources to learn more about them. Instead, this article will cover generic aspects of process quality.


Delivery Speed

Quality Aspect: How fast are new features with high quality delivered to the end user?
Desired State: ASAP – Deliver them fast without compromising quality.
Metrics:

People are impatient – they always want things as soon as possible. Fast delivery speed is thus crucial for businesses to meet client expectations and respond quickly to change. However, delivery speed is not the sole metric for success: it must be counterbalanced with safety measures. Delivery speed could be absolutely minimized by committing changes directly to production, but that’s a terrible practice because the damage risk is too high. The best strategy is to pursue the fastest speed without sacrificing too much coverage.

Time to Production – Time to production focuses on the time it takes for a developer’s checked-in code to become useful to end users. It’s a decent way to judge from a business perspective how quickly new stuff gets out the door. Measure the total time for each code check-in from when it is first committed to when it is deployed to production. Source control logs and deployment histories can be pieced together to measure the total time. It may be beneficial to split check-ins by feature area and to review distributions rather than averages. Short, consistent times are desirable. Long times reveal delays in testing, fixing, and deploying changes.

Pipeline Speed – Pipeline speed is a DevOps-y metric. Measure the total start-to-end time from triggering the build pipeline to the final deployment, and measure the time taken by each stage. This will give insights into bottlenecks, such as: system resource exhaustion, network delays, being stuck in job queues, tests that are too long, etc. Knowing each stage will indicate where the greatest optimizations can occur. For example, parallel test execution can significantly reduce total pipeline time. Use pipeline speed metrics to find efficiencies, not to justify cutting vital stages. Most modern continuous integration systems should provide time metrics.

Test Coverage per Time Period – There is always a tradeoff between test coverage and delivery speed. Assuming tests have optimally efficient execution times, higher coverage means slower delivery. Whenever time periods are fixed (such as CI pipeline limits or release deadlines), the best strategy is to maximize test coverage during the available time. For this purpose, coverage should be heuristically scored in terms of feature coverage priority (or the importance of the behaviors under test), not so much in terms of numerical code coverage. Then, for each test, divide the coverage score by the execution time. Sort tests by this ratio, and select the tests with the greatest scores until the total test execution time reaches the time limit. This approach guarantees that maximal test coverage will be achieved in the given period. It may also be advantageous to determine a threshold score for minimal coverage – if the maximum score for a given time period is below the minimal coverage threshold, then the time period should be increased. This metric is compelling if, for example, a CI pipeline needs more time for tests but managers are hesitant to slow down delivery.
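
Here’s a sketch of that selection strategy in Python (the scores and times are made up):

# Greedily select tests to maximize coverage score within a time budget.
tests = [
    ("login", 9.0, 30),     # (name, coverage score, execution seconds)
    ("checkout", 8.0, 60),
    ("search", 5.0, 10),
    ("profile", 3.0, 45),
]

def select_tests(tests, time_limit):
    # Rank tests by coverage score per second of execution time.
    ranked = sorted(tests, key=lambda t: t[1] / t[2], reverse=True)
    selected, total_time = [], 0
    for name, score, seconds in ranked:
        if total_time + seconds <= time_limit:
            selected.append(name)
            total_time += seconds
    return selected

print(select_tests(tests, time_limit=90))  # ['search', 'login', 'profile']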

Note: The metrics here cover speed after code is checked in, focusing on operational excellence. Metrics covering speed before code is checked in are important but are typically already covered by standard processes (like Scrum’s velocity). There are several ways to measure speed before code check-in: development time, backlog age, story completion rate, etc. Slow times before check-in indicate that a team is overloaded with work, lacks focus on priorities, or is being disrupted too frequently. However, one major caution for these metrics is that they are difficult to accurately measure, and they presume artifacts are logged precisely at event times. For example, if a story ticket is not created until a week after a new feature was first inspired, then the actual times measured will be inaccurate.


Feedback Notification

Quality Aspect: How quickly does a team identify problems?
Desired State: Fast – Fast feedback helps teams resolve issues quickly before they become more costly.
Metrics:

Software development is the poster child for Murphy’s Law: anything that can go wrong will. Problems will happen. Metrics targeting perfection (such as 100% pass rates or 0-bug counts) are foolishly impossible and hopelessly destructive. Instead, metrics should gauge feedback loops – how well a team handles problems as they arise. Feedback has two parts: (1) notification time to discover and report problems, and (2) response time to fix problems. Ultimately, the sum should be minimal, but separating the parts identifies bottlenecks. This section covers notification.

Code Review Effectiveness – Code reviews are often the second line of defense against bugs (the first line being the author themselves). They grant an opportunity for other experts to inspect code for problems before fully committing changes. However, measuring the effectiveness of code reviews can be tricky. A few metrics to consider are:

  • Percentage of code check-ins that undergo review, if the team notoriously skips reviews
  • Average review turnaround time, if reviews are ignored
  • Code change size in terms of line number or another similar unit, if reviews are too large for teams to handle effectively
  • Issues caught, whenever a review successfully identifies and resolves an issue

Issue Discovery Time – The sooner issues are discovered, the less costly they are to resolve. “Issues” typically mean defects in the product (e.g., “bugs”), but they could include problems with the environment, deployment, or tests. The simplest form of issue discovery time is the measurement from when a pipeline starts to the time the issue is discovered. More advanced measurements can track time back to the root cause, such as when code containing a bug was committed, but these may be difficult to gather or may be less accurate. Issue types should be analyzed as separate distributions. Look specifically for blocking issues that appear late in the pipeline, such as critical services being down, and add checks early in the pipeline to discover them ASAP.

Bugs per Phase – Raw bug counts, like test counts, are not helpful beyond soundbites, but the proportions of bug counts per phase are useful for determining test effectiveness. A well-engineered pipeline should have meaningful phases (or “stages” or “steps”) with feedback after each one. A typical pipeline could have phases for build, unit tests, integration tests, end-to-end tests, and production deployment. Ideally, bugs should be caught in the shortest time, at the lowest level, and in the earliest phase. For example, if the majority of bugs are caught by end-to-end tests or (gasp!) in production, then the lower-level tests might need stronger coverage.


Feedback Response

Quality Aspect: How quickly does a team resolve problems once they are found?
Desired State: Fast – Again, resolve issues quickly before they become more costly.
Metrics:

Time to Fix a Broken Build – Build health is vital for successful software development, especially in continuous integration. After a build is broken, it must be fixed ASAP so that it does not block progress. “Fixing” a build means that the pipeline can run to completion with an acceptable test passing rate. Fixing a build may mean:

  • Fixing a bug in the product
  • Fixing a problem in the environment, deployment, or tests
  • Reverting a code check-in that caused a bug
  • Updating tests to somehow flag the failure

Subverting safety checks (like removing tests or skipping phases) is not acceptable because it doesn’t truly fix the build’s underlying problems.

Measure the time it takes from when a pipeline reports a broken build to when the pipeline produces the first subsequent working build. The distribution of these times will reveal the team’s dedication to build stability. Clearly, shorter times are better. When broken builds are caused by code changes, the author should favor reverting check-ins over attempting fixes for faster recovery speed.

Time to Resolve Bugs – While the time to fix a broken build focuses on immediate product stability, the time to resolve bugs focuses instead on ultimate correctness. Just because a build is fixed does not mean a bug is necessarily fixed – tests may mark it as an acceptable failure, or the code containing the bug may simply be reverted. The time to resolve a bug is the total time from when the bug was first discovered to when it is fixed or otherwise closed (such as being marked as invalid or won’t fix). Bug tracker tools should easily provide this data. Bugs should be separated by severity when analyzing resolution times. Bugs should be resolved quickly, with priority given to higher-severity bugs. Resolution time metrics indicate if bugs are addressed adequately and in the proper order. Long resolution times may indicate overloaded teams, tolerance of low quality, or the need for redesign/refactoring.