Mask of the Red Build

Do we always have to be 100% green, all tests passing? Can we decide to live for a while with a red build?

Let me tell you a story.

A while ago, we had a few performance tests running as part of our tests for Isolator’s mocking framework. These tests were there to let us know if an operation took more time than it should. We put these tests in knowing they were a bit problematic.

Tests should pinpoint problems. Yet those problems are in the eye of the beholder, and in this case, the eyes of the developers, looking at the test report.

So although we had tests for an operation’s duration, we were not so worried when they failed.

Why? Because we could always blame the computer running the tests: it must have been busy with disk access at that moment, and that made the operation take longer. After all, it’s a reasonable explanation for a test that fails once in a while.

So if the build failed, and the number of failures was around 3 (which was the number of performance tests), we let it go. We didn’t even look at the failing tests, because we knew which ones had failed. And of course, because we were so sure, we missed other, real failures.

Imagine now that our build has 20 or so intermittently failing tests. Even 200. It’s red most of the time. We know why these tests are failing, so we don’t look at the build report at all. We know that in a few days (or weeks) we’ll get back on track, because we know how to fix those pesky tests.

Yet we continue to push code into the system, without looking at the results, because we know what the result will be. And we miss other failures.

There are a few lessons to be learned:

  • Only write tests for something that has a real meaning when it fails. If you can blame the failure on the environment, it is not a good test.
  • A failure we can explain now and plan to tend to later, even if it’s a real bug, puts the rest of our code at risk: code we know works, with green tests, may also break, and we won’t even notice.
  • Even one failure (let alone many) can mask real problems. If we’re ready to live with a red build for a while, we still need to look at the build result details, and make sure the tests fail for the reasons we think they do. But…
  • We are human. We’re very good at finding good explanations for real problems (rather than investigating them). We’re sure of our competence to work in uncertain situations, but we’re fooling ourselves.
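
If you do decide to live with a red build for a while, one way to keep yourself honest is to let a script do the looking. Here is a minimal sketch in Python, not what we actually ran. The test names and the `triage` helper are made up for illustration, and it assumes your build tool can hand you the list of failed test names.

```python
# Hypothetical names for the three performance tests we "knew" could fail.
KNOWN_FLAKY = {"perf_create_fake", "perf_verify_call", "perf_setup_behavior"}


def triage(failed_tests, known_flaky=KNOWN_FLAKY):
    """Split the failed tests into the ones we expected and the ones we did not."""
    failed = set(failed_tests)
    return {
        "expected": sorted(failed & known_flaky),
        "unexpected": sorted(failed - known_flaky),
    }


# Three failures again, so the count looks like the usual story.
report = triage(
    ["perf_create_fake", "perf_verify_call", "verify_throws_on_missing_call"]
)
print(report["unexpected"])  # ['verify_throws_on_missing_call']
```

The number of failures is still 3, but one of them is not a performance test. Counting failures hides that; comparing names does not. It is still a crutch, though, and the list of "known" failures tends to grow.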

Eventually we removed those tests from our suite. We understood that when we see red, it should signal a real problem. So we cut down the suite to everything we trust, and then went from there.

Testing is about handling uncertainty and risk. We decided to leave out additional, yet uncertain, information, in order to be certain of a smaller set.

This is a better bet.
