Patterns for effective Acceptance Criteria

26 03 2011

We should all know about the standard “Given…When…Then” form for Acceptance Criteria (and hopefully use it). But how do we know if the Acceptance Criteria that we write are good? Can we assess them?

As with other roles in the development cycle, I think there are some patterns that mark out good acceptance criteria. In this post I’m going to explore the patterns that I have noticed. Before doing that, I want to look quickly at the reasons why we write acceptance criteria, because understanding them has helped me in the way that I write them.

Acceptance Criteria are a way of capturing the expected behaviour of the system, written in such a way that we can easily test to see if they have been met. So we have a couple of main motivations:

1) Capturing expected behaviour

Because the acceptance criteria are (largely) written before development starts and capture expected system behaviour, they should form part of the business sign-off of the story. With this in mind, they should be written in such a way that a business person can read and understand them.

2) Enable testing

The Acceptance Criteria of the story will ultimately be used to determine when the story is done. For the story to be complete, all the behaviours documented in the criteria must be met. The criteria should therefore be clear enough to explain to a user what steps to take to check that they are met, and unambiguous enough that anyone testing them can clearly see whether the test succeeds or fails. They should facilitate not only manual testing but automated testing too. Automated testing normally follows a model something like this:

  • Setup
  • Perform the required action
  • Validate the results
  • Teardown to facilitate other tests

The standard form for Acceptance Criteria fits nicely into this model:

  • Given tells us the prerequisites (what needs to be set up / done before we start this test?) – this will be the setup phase of automated testing
  • When tells us the action that needs to happen in order to trigger the outcomes that we are testing for – this will be the action that the test needs to perform
  • Then tells us what to expect when the action is performed – this is the validation step of the automated test
  • Teardown will then clean up any persistent setup to ensure that the test can be run repeatedly without any adverse impacts
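As a rough sketch of how this mapping can look in an automated test, here is a small example using Python's built-in unittest module. The FakePhone class and the phone number are hypothetical stand-ins invented for illustration; a real test would drive the actual system under test.

```python
import unittest


class FakePhone:
    """Hypothetical stand-in for the system under test."""

    def __init__(self):
        self.switched_on = False
        self.has_signal = False
        self.connected_to = None
        self.divert_incoming_to_voicemail = False

    def dial(self, number):
        # A call only connects when the phone is on and has signal.
        if self.switched_on and self.has_signal:
            self.connected_to = number
            self.divert_incoming_to_voicemail = True


class MakeACallTest(unittest.TestCase):
    def setUp(self):
        # Given: my mobile phone is switched on
        # And: I have sufficient signal to make a call
        self.phone = FakePhone()
        self.phone.switched_on = True
        self.phone.has_signal = True

    def tearDown(self):
        # Teardown: discard state so that other tests start clean.
        self.phone = None

    def test_dialling_connects_the_call(self):
        # When: I dial a number
        self.phone.dial("01234 567890")

        # Then: I am connected to the person I want to talk to
        self.assertEqual(self.phone.connected_to, "01234 567890")
        # And: incoming calls are diverted to my voicemail
        self.assertTrue(self.phone.divert_incoming_to_voicemail)
```

Saved in a file, this can be run with python -m unittest. The point of the sketch is only that each clause of the criteria has an obvious home in the test.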

It could be argued that one of the reasons that we write acceptance criteria is to facilitate developers’ understanding of the functionality to be developed. I’m working on the assumption here that we are doing TDD. On that assumption, any acceptance criteria that enable testing will also facilitate developers’ understanding.

Now that we know why we write the acceptance criteria, let’s look at some patterns that can help.

1) Readable

“Do these acceptance criteria read well? Are they clearly understandable?”

We want the business to review and correct the acceptance criteria and then to sign them off. If the functionality is hidden behind obscurely written criteria then there is little chance that we will get useful feedback from the business. It is for this reason that I prefer paragraph-based acceptance criteria to table-based ones.

Paragraph:

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
When: I dial a number
Then: I am connected to the person I want to talk to
And: incoming calls are diverted to my voicemail

Table:

Given | When | Then
That I have entered a telephone number into my mobile phone and I have sufficient signal to make a call | I dial the number | I am connected to the person I want to talk to
 | | Incoming calls are diverted to my voicemail

Personally I find the first one easier to read and understand: I’m used to reading paragraphs of text and I can follow the format. That’s not to say that table-based acceptance criteria can’t be readable; I just think that it requires more work. Even with paragraph-based criteria it is important to read them as a whole, not just as a set of fragments, to make sure that they make sense as a single paragraph. Here is an example of the anti-pattern:

Given: logged in to the system
When: perform search
Then: results all contain word from the search criteria

This can be re-written to make it more readable:

Given: a user is logged in to the system
When: they perform a search
Then: all the results displayed must contain the word from the search criteria

2) Testable

“Can I easily test the results laid out in the acceptance criteria?”

This may seem obvious but it is often overlooked. There are 2 common anti-patterns that I have noticed that make the criteria impossible to test:

The first is the use of vague statements such as:

Given: that I have the search page loaded
When: I perform a search
Then: the search results come back within a reasonable period of time

Who is to say what is reasonable? This would be better with an absolute definition of what is considered reasonable:

Given: that I have the search page loaded
When: I perform a search
Then: the search results come back within 5ms
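To illustrate why the absolute figure matters, here is a minimal sketch in Python of a timing check. The perform_search function and its catalogue are hypothetical placeholders; a real test would call the actual search and would probably need to allow for variation between runs.

```python
import time

MAX_SEARCH_SECONDS = 0.005  # 5ms, the figure from the criteria


def perform_search(term):
    """Hypothetical search; a real test would call the actual system."""
    catalogue = ["red phone", "blue phone", "phone case"]
    return [item for item in catalogue if term in item]


def timed_search(term):
    """Run the search and return (results, elapsed seconds)."""
    started = time.perf_counter()
    results = perform_search(term)
    elapsed = time.perf_counter() - started
    return results, elapsed


def meets_time_limit(elapsed, limit=MAX_SEARCH_SECONDS):
    # An absolute limit gives a clear pass or fail; "reasonable" does not.
    return elapsed <= limit
```

With “a reasonable period of time” there is nothing to put on the right-hand side of that comparison, which is exactly why the vague version cannot be tested.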

The second is the use of a non-system outcome such as:

Given: that I am on the home page
And: I am logged in
When: I navigate to account preferences
Then: I can see my account preferences

This looks and reads like an acceptable criterion until it comes to automating a test for it, because I cannot write an assert statement about what the user is seeing. If the user closes their eyes, does the test fail? This is better rewritten with a system outcome:

Given: that I am on the home page
And: I am logged in
When: I navigate to account preferences
Then: my account preferences are displayed

I can write an assert statement to check that the account preferences are displayed so I can write an automated test to ensure that this requirement is met.
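A minimal sketch of such an assert, written in Python, might look like the following. FakeSite and its methods are hypothetical names made up for this example; in practice the check would go through whatever driver or page object the project uses.

```python
class FakeSite:
    """Hypothetical stand-in for the application under test."""

    def __init__(self):
        self.logged_in = False
        self.current_page = "home"
        self.displayed_sections = set()

    def log_in(self):
        self.logged_in = True

    def navigate_to(self, page):
        self.current_page = page
        # Only a logged-in user gets the preferences rendered.
        if page == "account preferences" and self.logged_in:
            self.displayed_sections = {"account preferences"}
        else:
            self.displayed_sections = set()

    def is_displayed(self, section):
        return section in self.displayed_sections


def check_account_preferences_are_displayed():
    # Given: that I am on the home page
    # And: I am logged in
    site = FakeSite()
    site.log_in()
    # When: I navigate to account preferences
    site.navigate_to("account preferences")
    # Then: my account preferences are displayed
    assert site.is_displayed("account preferences")
    return site
```

The assertion is about what the system displays, which the test can observe, rather than about what the user sees, which it cannot.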

3) Implementation Agnostic

“Do the acceptance criteria drive the developers down a particular implementation route?”

If the acceptance criteria specify implementation detail then rewrite them to remove the implementation and make them agnostic. Do this by focusing on the functionality rather than the form of the outcome.

Given: that I am on the home page
And: I am logged in
When: I navigate to advanced search
Then: the advanced search web page must be displayed
And: a text box labelled “Name” is displayed
And: a text box labelled “Description” is displayed
And: a command button named “Search” is displayed

This clearly specifies some implementation detail: the advanced search must be a web page, and text boxes and command buttons must be used. We can rewrite it to make it more agnostic:

Given: that I am on the home page
And: I am logged in
When: I navigate to advanced search
Then: the advanced search is displayed
And: an option to search by name is displayed
And: an option to search by description is displayed
And: the advanced search is displayed in accordance with the attached wireframe

This way if the wireframe changes then I don’t need to change any of the acceptance criteria – I only need to attach the updated wireframe.
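The same idea carries over to the automated test. As a sketch only, with a hypothetical AdvancedSearch class invented for illustration, a test written in Python against the agnostic criteria checks which search options are offered and says nothing about text boxes or buttons:

```python
class AdvancedSearch:
    """Hypothetical advanced search; rendering details are deliberately hidden."""

    def __init__(self):
        self.displayed = False
        self._options = []

    def display(self):
        self.displayed = True
        self._options = ["name", "description"]

    def search_options(self):
        # Report what the user can search by, not which widgets are used.
        return list(self._options)


def check_advanced_search_offers_name_and_description():
    search = AdvancedSearch()
    # When: I navigate to advanced search
    search.display()
    # Then: the advanced search is displayed
    assert search.displayed
    # And: options to search by name and by description are displayed
    assert "name" in search.search_options()
    assert "description" in search.search_options()
    return search
```

If the text boxes later become drop-downs, a test in this style should not need to change.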

Which brings up another topic which I’ll touch on here quickly. Rather than specifying detailed UI or implementation requirements in Acceptance Criteria, I would rather put them into Adornments. Adornments can be referenced in the Acceptance Criteria and can include colour palettes, CSS sheets, button images, fonts, wireframes etc.

The level to which stories are implementation agnostic will depend to some extent on the maturity of the project. During the early stages of a project it is most important to stay agnostic. Later in a project, where stories may only be minor tweaks to the User Interface, this becomes less of an issue.

4) Actionable When Statement

“Can I automate the when statement?”

The symptom of this anti-pattern is that the real action sits in the Given statement instead of in the When clause. It might read something like this:

Given: that I am on the homepage
And: I have navigated to the search
When: I look at the page
Then: the search options are displayed

This breaks the relationship that we established in the beginning between the automated test model and the standard form of the acceptance test because I can no longer automate it. The When statement refers to something that is not a system action and the actual When has been hidden in the Given. Re-writing it to make it more actionable produces this:

Given: that I am on the homepage
When: I navigate to the search
Then: the search options are displayed

5) Strong Verb Usage

“Is the language used deterministic?”

Avoid weak verb forms like should and could; use absolute statements instead. Replace “the system should show x” with “the system displays x”.

This small change makes a big difference to the strength of the Acceptance Criteria, changing something like this:

Given: that I am logged into the system
When: I navigate to the search page
Then: an option to search based on the “Name” field should be displayed

into something like this:

Given: that I am logged into the system
When: I navigate to the search page
Then: an option to search based on the “Name” field is displayed

6) Keep the criteria specific to the story

This often manifests itself as really long acceptance criteria which are specifying behaviours, functions or look and feel that were implemented in previous stories. Try to ensure that the When portions of the criteria are specific to the story that is being developed. Only specify changes from previous functionality (i.e. where you expect it to be different from previously specified).

7) Tell a story

“Do the Acceptance Criteria walk the user through the scenario?”

For complex stories where there are multiple acceptance criteria and exception cases, I like to structure my acceptance criteria in such a way that they walk the user through the scenarios in a realistic order with exception cases at the end.

Call Available number

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
And: the person I am calling is available on the network
When: I dial the number
Then: the screen display indicates that the phone is ringing
And: my incoming calls are diverted to my voicemail

Person answers call

Given: that I have dialled an available phone
And: the phone I am calling is ringing
When: they answer the call
Then: the screen display shows that I am connected

I end the call

Given: that I have dialled an available phone
And: the person I called has answered the call
When: I terminate the call
Then: the screen displays the fact that the call has ended
And: my incoming calls are no longer diverted to my voicemail

Call Unavailable number

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
And: the person I am calling is not available on the network
When: I dial the number
Then: the screen displays a message to tell me that the phone is unavailable
And: my incoming calls are diverted to my voicemail

The other thing that I like about writing my acceptance criteria in paragraphs is that I can give each one a heading. Having a heading instead of a number for the acceptance criteria makes them more readable and means that I can have meaningful conversations with people. Instead of “We’re having a little difficulty with acceptance criteria number 3a” we can say “I’m having a bit of difficulty with Call Unavailable number”. Everyone can understand what the conversation is about without having to look up what acceptance criteria 3a is about.
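Headings can carry through to automated tests as well. The following is only a sketch in Python's unittest, with empty placeholder test bodies and a hypothetical scenario_headings helper, showing the headings used as test names and descriptions so that a test report talks about “Call Unavailable number” rather than a number:

```python
import unittest


class MakingACall(unittest.TestCase):
    # Hypothetical placeholders: each body would drive the real phone system.

    def test_call_available_number(self):
        """Call Available number"""

    def test_person_answers_call(self):
        """Person answers call"""

    def test_i_end_the_call(self):
        """I end the call"""

    def test_call_unavailable_number(self):
        """Call Unavailable number"""


def scenario_headings(test_case_class):
    """Return the heading (first docstring line) of each scenario."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(test_case_class)
    return [test.shortDescription() for test in suite]
```

Note that unittest's default loader sorts test methods by name, so the walk-through order of the story lives in the written criteria rather than in the order the tests run.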

If I tried to write the above criteria as a table I might end up with something like this (you can make your own mind up as to which is most useful):

Table:

Given | When | Then
That my mobile phone is switched on and I have sufficient signal to make a call and the person I am calling is available on the network | I dial the number | the screen display indicates that the phone is ringing
 | | my incoming calls are diverted to my voicemail
That I have dialled an available phone | they answer the call | the screen display shows that I am connected
That I have dialled an available phone and the person I called has answered the call | I terminate the call | the screen displays the fact that the call has ended
 | | my incoming calls are no longer diverted to my voicemail
That my mobile phone is switched on and I have sufficient signal to make a call and the person I am calling is not available on the network | I dial the number | the screen displays a message to tell me that the phone is unavailable
 | | my incoming calls are diverted to my voicemail

4 responses

3 05 2011
Aaron Corcoran

Thanks for the write up. This was a very useful post.

9 05 2013
Neil Craven

I like the write up and I’m definitely not a fan of tabulated Given/When/Then.

A lot of people use this syntax to describe the UI, which makes me sad 😦 and even worse, over-keen coders/testers will automate anything that’s written in that syntax even if it’s not worth the effort, leading to flaky and brittle tests that take an age to run.

The true power of this syntax is not to define ‘acceptance criteria’, though. Acceptance criteria include cross-functional stuff like performance, UI stuff like colours and fonts (brand identity etc.) Often, these can’t be described in Gherkin. What Gherkin is great at is clearly describing examples of what the expected behaviour of the system is once the story is done. The functional specs, basically.

Importantly that means it should be written and described at system-level. I would encourage people to write these kind of tests directly into the code base structured by feature, and not bury them story-by-story (you don’t do that with the code it’s testing, after all). This is particularly important if you follow the philosophy that once a story is done, it doesn’t exist as a discrete unit any more (you can’t reach into a system and pull out one story that you played 6 weeks ago any more than you can reach into a clay statue and pull out that lump of clay you added 6 weeks ago).

10 05 2013
brettansley

Hi Neil

Thanks for taking the time to comment.

I agree wholeheartedly with embedding functional tests of this kind in code to facilitate the ‘living’ documentation of code. The problem I have then faced is having too many tests embedded in code slowing down the build cycle. At some point I think it starts becoming important to recognise what functionality is important to the overall (and therefore long-lasting) functionality of the application and what is more short-lived or particularly associated to the implementation of this story.

My ideal is to build my automated tests structured by user journeys (as opposed to feature) where each story contributes to the overall user journey – judiciously done this requires that the automated tests are regularly pruned to ensure that they make sense from the overall journey’s objectives. This provides context for the tests themselves so that when we come to replace / supersede a piece of functionality we understand the context in which it is being tested (i.e. the overall user journey). This helps to solve the issue of implementing automated tests to cover all the functional requirements on a story-by-story basis and then being left with large numbers of automated tests and not being aware of the context and therefore which tests can be changed / superseded.

10 05 2013
Neil Craven

I’ve had the same experiences 🙂

Rather than sacrifice a lot of effort in pruning, I’d suggest one can be even more conservative (lazy?) about what they automate, at least in this type of automation. It’s very hard to persuade a nervous tester that deleting tests is OK.

For tests that only intend to prove the story was done and that you know won’t really live very long, there are other ways that might be more efficient. Some TWers in Oz wrote a nice blog about it here http://cromulent-testing.com/2011/07/05/disposable-automation.html. Delete most of these tests at the same time you decide the story is accepted, and make sure the highest risk stuff is nicely covered by unit tests or one or two functional tests for catching regression.

People usually get scared when I say that, but often I find in projects with slow and unwieldy test suites, regressions still aren’t caught by the high-level tests, and we spend aaages maintaining flaky or brittle tests. Usually scenarios that test through the UI using badly-implemented Selenium are the culprit. Unit testing the JS and having contract-based tests to ensure the integration between front-end and logic gives you most of the confidence in a much more maintainable and faster suite.

That’s why I discourage G/W/T to describe that fine level of Acceptance Criteria. On anything but a small highly functioning team you’re going to get keen people who don’t have enough business context to work out which scenarios are the risky ones. They’ll take time to automate them all in Selenium, driven by rSpec or JBehave or whatever, that takes 40 seconds to test whether static text appears on your website, or whether the label on your button is ok (but doesn’t even test that when you press the button, your data is stored correctly, or even whether hitting it twice stores the data twice…) and then you change the CSS and the tests can’t find the button any more.

As for organising these things as user journeys rather than as features, I’m curious: How do you avoid repeatedly testing the parts of the system that are common across user journeys, and how do you ensure your tests are independent?

Cheers!
Neil
