Structuring tests using Given-When-Then

Introduction

Automated tests are a central part of software quality assurance. They are written at various levels, such as unit, integration and end-to-end tests. Readability and maintainability matter as much for tests as for production code. For example, suppose you modify code and a test breaks. Did you introduce a bug, or did the design change so that the test needs to be updated? To decide on the correct course of action, you need to understand the structure and intent of the test.

Here we discuss tactics for writing tests using a given-when-then structure, and suggest refactoring patterns to make tests conform to the structure. Given-when-then is often associated with behavioral tests, but it can be used at any testing level.

Given-When-Then-Finally

A test case is a linear execution of code sections labeled given, when, then and finally. Only when and then are mandatory. A test should not contain any other code, and the sections must not be reordered or interleaved.

The purpose of each section is:

Given: establish a known state in the system
When: execute an action using the system
Then: verify (assert) that the system responded according to the specification
Finally: restore the state that existed before the given section. This section is executed even if when or then fails.
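
As a minimal sketch of this execution order, the four sections can be modeled as plain functions. The helper names runSections and traceFailingRun are hypothetical and not part of Jest or any other framework; the sketch only assumes that a failing then section throws an error.

```typescript
// Hypothetical helper for illustration; not part of Jest or any other framework.
function runSections<State, Result>(
    given: () => State,
    when: (state: State) => Result,
    then: (result: Result) => void,
    finallySection: (state: State) => void,
): void {
    // given: establish a known state
    const state = given();
    try {
        // when: execute the action
        const result = when(state);
        // then: verify the outcome (assumed to throw on failure)
        then(result);
    } finally {
        // finally: runs even if `when` or `then` throws
        finallySection(state);
    }
}

// Records the order in which the sections run for a deliberately failing test.
function traceFailingRun(): string[] {
    const trace: string[] = [];
    try {
        runSections<string[], number>(
            () => { trace.push("given"); return []; },
            (array) => { trace.push("when"); array.push("item"); return array.length; },
            (length) => {
                trace.push("then");
                // Deliberately wrong expectation, so that the then section fails.
                if (length !== 2) throw new Error(`expected 2, got ${length}`);
            },
            () => { trace.push("finally"); },
        );
    } catch {
        trace.push("failed");
    }
    return trace;
}
```

In this sketch the finally section is recorded before the failure reaches the caller, which is the behavior that test framework teardown hooks are typically expected to provide.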

Hoare triples

In parallel to the code-centric formulation, we can think of the test structure in terms of logic. Hoare triples (Hoare, 1969) capture the intent of the given-when-then sections. A Hoare triple has the form <precondition, call, postcondition>, where the pre- and postcondition are logical propositions and the call is the when code block. A triple is valid if, whenever the precondition is true before the call is executed, the postcondition is true after it. A passing test confirms this for the particular state that the given section establishes.

The code and logic aspects are related: the given section makes the precondition true, and the then section asserts that the postcondition is true.
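
A minimal sketch of this relationship, assuming the pre- and postcondition can be written as boolean functions of a state. The helper holdsFor is hypothetical, and a plain string array stands in for the system under test. It checks one triple against one concrete state, which is what a single test does.

```typescript
// Hypothetical helper: checks one Hoare triple against one concrete state.
function holdsFor<S>(
    state: S,
    precondition: (state: S) => boolean,
    call: (state: S) => void,
    postcondition: (state: S) => boolean,
): boolean {
    if (!precondition(state)) {
        // The triple makes no claim about states that violate the precondition.
        return true;
    }
    call(state);
    return postcondition(state);
}

// <array is empty, push one item, array has length 1>
const pushOntoEmpty = holdsFor<string[]>(
    [],                                   // state set up by the given section
    (array) => array.length === 0,        // precondition
    (array) => { array.push("item"); },   // call (the when section)
    (array) => array.length === 1,        // postcondition (asserted in then)
);

// The same call with a wrong postcondition does not hold.
const wrongPostcondition = holdsFor<string[]>(
    [],
    (array) => array.length === 0,
    (array) => { array.push("item"); },
    (array) => array.length === 2,
);
```

Here pushOntoEmpty is true and wrongPostcondition is false. A single check like this shows the triple holds for one state only; it does not prove validity for every state that satisfies the precondition.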

Mapping to test frameworks

Some test frameworks, such as Cucumber, are built around the given-when-then structure, but others use a different design. However, the structure can be adapted to any test framework, as the important aspects are:

All test code belongs to one of the four code sections
The sections form a linear sequence and are not mixed

Minimal: mark sections

Minimally, we can just mark the sections using comments when no explicit syntax is available. In the following example, we use Jest to test a custom array implementation. In such a simple test, it may seem pedantic to mark the sections, but it ensures that all tests are structurally sound.

test('an array with one element should have length 1', () => {
    // given
    const array = new CustomArray();

    // when
    array.push("item");

    // then
    expect(array.length).toBe(1);
});

Detailed: describe sections

For complex tests, the purpose of the test is clearer when the sections are described using natural language. When describing the given and then sections, it is useful to think in terms of the pre- and postconditions (logical state) instead of the actual code.

The description for the above test could be:

Given an empty array
When an item is pushed
Then the length of the array is one

This could be embedded as code comments or function documentation, or one can use a more structured method, such as the describe function from Jest:

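describe('GIVEN an empty array', () => {
    let array: CustomArray;

    // Setup goes in beforeEach so that every test gets a fresh array;
    // code placed directly in a describe body runs once, at collection time.
    beforeEach(() => {
        array = new CustomArray();
    });

    describe('WHEN an item is pushed', () => {
        beforeEach(() => {
            array.push("item");
        });

        test('THEN the length of the array is one', () => {
            expect(array.length).toBe(1);
        });
    });
});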

Next, we look at two examples.

Example: unit testing a pure function

Unit tests can use the given section to create the arguments for calling the unit under test. The following example tests a timestamp parser:

test('should parse an ISO format timestamp with UTC timezone', () => {
    // given
    const timestamp = "2021-10-18T10:28:15Z";

    // when
    const parsed = myTimestampParser.parse(timestamp);

    // then
    expect(parsed).toBe(...);
});

If the arguments are trivial, they can be embedded into the when section and the given section omitted.

Example: functional API testing

In functional testing, state setup and teardown are often done using helper facilities from the test framework. This means that the given-when-then-finally sections are split across the test file. Comments can identify which code belongs to which section.

In the following example, we insert an object using an API with a fixed object ID, and verify that the object can be found. Finally, we remove the object to clear the state for the next test.

const testObjectId = "12345678-1234";

beforeEach(() => {
    // given
    return api.insertObject({id: testObjectId});
});

afterEach(() => {
    // finally
    return api.deleteObject(testObjectId);
});

test('should find an object that has been inserted', async () => {
    // when
    const result = await api.queryObject(testObjectId);

    // then
    expect(result.id).toBe(testObjectId);
});

Next, we suggest how to refactor tests to conform to the given-when-then structure.

Refactoring: parameterized tests

It is common to test a function with multiple inputs. The following antipattern breaks the given-when-then structure, because the when and then sections are mixed. It is also unclear whether we have one test or three.

test('should compute the square of numbers', () => {
    expect(myMathModule.square(1)).toBe(1); // when, then
    expect(myMathModule.square(2)).toBe(4); // when, then
    expect(myMathModule.square(3)).toBe(9); // when, then
});

Such patterns can be refactored using parameterized tests. They are supported by many test frameworks, including Jest (test.each), JUnit (@ParameterizedTest) and pytest (@pytest.mark.parametrize). Example using Jest:

test.each([
    [1, 1],
    [2, 4],
    [3, 9],
])('should compute square of %d', (input: number, expected: number) => {
    // when
    const squared = myMathModule.square(input);

    // then
    expect(squared).toBe(expected);
});

Refactoring: decomposing test sequences

In functional testing, it is common to test sequences of operations. It is tempting to combine the whole sequence into one test, but such large tests are hard to understand:

test('length should tell how many elements the array has', () => {
    // GIVEN an empty array
    const array = new CustomArray();

    // WHEN pushing an element to the array
    array.push("item1");

    // THEN the array has one element
    expect(array.length).toBe(1);

    // WHEN pushing a second element to the array
    array.push("item2");

    // THEN the array has two elements
    expect(array.length).toBe(2);
});

This sequence can be decomposed into multiple tests. This is done by splitting the sequence after the first then, and using the postcondition of the first test (array has one element) as the precondition (given) of the next.

test('an array with one element should have length 1', () => {
    // GIVEN an empty array
    const array = new CustomArray();

    // WHEN pushing an element to the array
    array.push("item1");

    // THEN the array has one element
    expect(array.length).toBe(1);
});

test('an array with two elements should have length 2', () => {
    // GIVEN an array with one element
    const array = new CustomArray();
    array.push("item1");

    // WHEN pushing a second element to the array
    array.push("item2");

    // THEN the array has two elements
    expect(array.length).toBe(2);
});
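
When the postcondition of one test becomes the precondition of the next, the shared setup can be given a name. The following is a sketch only: a plain string array stands in for CustomArray, and the helper and function names are hypothetical. Each test body is written as a function that returns whether its then check holds, so that the sketch does not depend on a test framework.

```typescript
// Assumption: a plain string[] stands in for the article's CustomArray.
type ArrayUnderTest = string[];

// Given-helper for the first test's precondition.
function givenEmptyArray(): ArrayUnderTest {
    return [];
}

// Given-helper for the second test's precondition. It reproduces the state
// that the first test's then section verified: an array with one element.
function givenArrayWithOneElement(): ArrayUnderTest {
    const array = givenEmptyArray();
    array.push("item1");
    return array;
}

// Each function below is one test body with its own given, when and then.
function firstPushGivesLengthOne(): boolean {
    const array = givenEmptyArray();          // given
    array.push("item1");                      // when
    return array.length === 1;                // then
}

function secondPushGivesLengthTwo(): boolean {
    const array = givenArrayWithOneElement(); // given
    array.push("item2");                      // when
    return array.length === 2;                // then
}
```

Naming the setup after the state it produces keeps the given section of the second test to one line and makes the link to the first test's postcondition explicit.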

Conclusions

Systematically organizing tests into given-when-then-finally sections ensures that the tests are structurally sound. Sections can be marked using lightweight comments, or with direct support from the test framework.

References

Hoare, C. A. R. (1969). An axiomatic basis for computer programming. Communications of the ACM, 12(10).

This article is written by Senior Software Architect Kristian Ovaska.

November 19, 2021

Henry Lehto
