Review of Systems Thinking in the Public Sector by John Seddon – what can commercial organisations learn from public sector failings

22 11 2015

I’ve heard John Seddon speak a number of times and have watched some of his videos online. I’ve also been able to apply many of the ideas I’ve learned in my various roles over the past 4 years and have seen what an impact they can have, so I was keen to find out more about his thoughts on the functioning of the civil service here in the UK, especially since I am currently working in the public sector.

Now, having read the book, I can see how many of the truths in this book apply not only to the public sector but also to many commercial organisations. I’ll explore my thoughts on this at the end of this post but first a brief description of the book.

John starts out by describing the way services are managed within the public sector and how New Labour’s vision has impacted these services along with a good description of Systems Thinking and how radically different this approach is from the way services are currently managed.

John sums this up by saying “A central idea in systems thinking is the relationship between purposes, measures and methods. By imposing targets the regime creates a purpose (to make the numbers) and constrains the method (the way services are designed). By contrast, if you derive measures from purpose from the customer’s point of view, then you liberate method: innovation and improvement flow.”

That lays the foundation for John to explore multiple examples of failure within the current system driven by the government’s relentless demand for services to meet specific targets while at the same time constraining the methods they use. Time after time John is able to demonstrate how this targets approach has been detrimental to the delivery of services. The book exposes an underlying tension between politicians and civil servants – that politicians believe that the civil service is profligate and incapable of designing and managing effective and efficient service delivery.

Historically the government’s policies have attempted to force the civil service to become more efficient by constraining the methods within which it operates (not trusting it to do this itself) while at the same time forcing it to meet certain targets (not trusting its desire to do the job properly). The book provides example after example of how these policies have failed.

There is however some hope as John also discusses examples of how some departments, freed from the constraints of policy, have been able to deliver outstanding services that far exceed even the most demanding government targets. It is unfortunate that these services are chastised by central government because they fail to implement the methods laid out in policy but instead aim to deliver high quality services.

Despite the obvious failings of these policies over the last 30 years, it appears that government is still fixated on targets as the way to drive change through the public sector. In fact the targets are only driving inefficiency, as everyone from nurses to policemen, care workers to benefits agents, has to spend the majority of their time and effort on bureaucratic exercises to demonstrate policy compliance and target adherence, rather than on delivering and improving services to the public.

This is the biggest waste in the public sector – not the supposed profligacy and inefficiency of its operations.

The book ends with a damning review of the process by which policy is created with nobody taking ultimate responsibility for the long-term effects of policies and little or no incentive to change this broken process.

Reading the book I recognised many of these same patterns in commercial organisations – with executives often too far removed from the day-to-day operations to have a real understanding but at the same time feeling a need to impose clear processes and targets in order to feel “in control” of what is happening. There is often a lack of trust between senior management and the employees as managers strive to improve efficiency through the imposition of processes and exert control through the imposition of targets.

These processes and targets mean that employees lack any incentive to innovate and improve the way they perform their roles. Instead we find employees who are frustrated by the inefficiencies they see in the companies they work for but feel powerless to make any kind of meaningful change.

Organisations in this position have a wonderful opportunity to liberate people to explore different methods of delivering the work they do – encouraging innovation and improvement driven by purpose as opposed to targets.

The keys to doing this are understanding the customer’s real purpose when interacting with the organisation, and structuring the organisation to deliver that purpose.

Organisations need to ensure that everyone is clear on the purpose; provide people with the safety and structure that allows them to suggest and try different methods to achieve that purpose; and constantly encourage the continuous process of learning and improvement – even if this challenges current structures and processes.

Implementing this is challenging as it requires a different operating model from the traditional management style but it has great potential as it creates engaged employees who have a clear interest in improving the effectiveness of the organisation irrespective of their role.





The Estonian government is asking UK citizens what services they should build

29 10 2015

I was at a meetup last night at a small nondescript location just off Shoreditch. The main speakers were Taavi Kotka (CIO of Estonia), Kaspar Korjus (MD of eResidency in Estonia) and Robin Walker (from the GOV.UK identity assurance programme).

The real purpose of the evening was for the Estonian government to find out what services global citizens wanted from the Estonian government. It was almost surreal – a foreign government was asking me how they could help to make my life easier and better. They have started a programme called eResidency and are running it like a lean startup with all the inherent risks of failure.

Taavi started by giving us a brief overview of why the Estonian government was introducing eResidency – the equation is pretty simple:

Estonia is a small country (1.2M people) and they wanted to figure out how to increase the wealth and security of their citizens. If they were a private company they would have customers: if you want more revenue, you need more customers. For a government the customer is a citizen, so they have 3 ways of increasing the number of citizens:

  1. Through birth rate (but this would be slow and cause other problems)
  2. Migration (but nobody wants to migrate to Estonia – they have a quota for a maximum of 1200 immigrants a year and have never even come close to hitting it)
  3. Open residency up to people outside Estonia

Option 3 seemed like the only reasonable one. The Estonian government had already built a digital citizenship platform. Estonian citizens can interact with their government over this secure platform from anywhere in the world. For example, voting is all done over the internet – polls are open for 5 days and you can change your vote at any time during those 5 days, with your final choice being the one that counts.

So why not open this up to people outside Estonia? They would get all the benefits of an Estonian citizen (except the right to vote). But why would people want to become an eResident of Estonia?

By 2020 40% of the US workforce will be freelancers and an increasing number of people are keen to provide their services globally (according to Taavi).

Estonia are keen to be a responsible government and have no desire to become a tax haven, so what else could they offer? What if you could run your company without ever having to do any reporting? Or having to fill in any tax returns? Citizens of Estonia can pay their taxes with 3 clicks over the internet (and they could eliminate that if they chose!). Seamless bureaucracy.

The underlying infrastructure has been working for Estonian citizens for the past 17 years. In January the government approved the project to start investigating eResidency; by April they had a team of 7 working on it; they opened up registration in May and have since seen exponential growth (starting with just 4 registrations a week, they are now seeing 200 people register every week).

One of the key benefits of becoming an eResident currently is the ability to sign documents using an electronic signature. Global contracts can be exchanged in minutes rather than weeks or days. People can verify the identity of other parties via the trusted third party validation performed by Estonia (every eResident is background checked and passport and biometric data is stored).

The objective is not to raise taxes from eResidents (they don’t use any of the infrastructure provided by Estonia). Instead they want to make it seamless for eResidents to open bank accounts and register companies in Estonia. By doing this they aim to increase the amount of capital within the Estonian economy. They also aim to be responsible global citizens and so will pay your taxes for you in whatever country you are physically resident by using existing reciprocal tax agreements.

Estonia has strong laws regarding privacy. The details of all citizens are available online (to other citizens) but each citizen can see who else has looked at their details. If someone has looked at you without a valid reason then you can press criminal charges.

After the overview we were asked what services we thought this platform could support. We broke into groups and discussed possible ideas before presenting them back – everything from validating Eurovision votes to providing online legal proceedings for contractual disputes.

What was very refreshing was the approach of a government that was truly interested in people’s use cases for its infrastructure (beyond just its own citizens), that openly acknowledged it might fail (a brave step for a government), and the speed of implementation – 5 months from idea to reality and less than 12 months for the legislative framework to be approved.

Will it succeed? I don’t know, but it is refreshing to see a government actively engaging users rather than focusing inwardly, and willing to take the risk of failure. If more governments followed this model we might see a revolution in the way government is delivered.




Facilitating workshops

29 08 2015

I get to facilitate workshops quite a lot – something that I love doing. The other day I was invited to facilitate a week-long workshop in Geneva. It was with a team that I had done some work with a couple of years back and it was great fun to catch-up and learn about how they are doing now.

As I was packing I realised that packing for these sessions has become a habit, so I thought I’d share what I take in case it is useful for others – and I’d love to find out what other people find useful when they are prepping for sessions like this.

Facilitation tools

I take:

  • a couple of boxes of Sharpies (to help get as many people involved as possible)
  • Two big packs of the 3×3 post-it notes (I prefer people to use these in ideation because you can’t write an essay!)
  • Two packs of 3×5 post-it notes (I use these for grouping and headings – I can write bigger and it makes the grouping clearer to have a big one stuck over it)
  • A pack of POSCA pens (these are great for preparing the boards and for general layout – they are very clear and can be seen across even a large room)
  • Blu tack (because lots of places have walls that are not very post-it note friendly)
  • Roll of brown kraft paper – 750mm × 20m (I prefer to use this as opposed to flipcharts because it is bigger, the colour is more neutral and it is easier to pack)

I can fit all this kit into my hand luggage along with all the clothes etc. that I need for a week away so no need for checked luggage.

On this most recent trip I did get stopped by security at Heathrow airport, who asked if I had any liquids in my luggage – apparently the POSCA pens show up as liquid, but they were happy to let me through once they saw what they were.





Measuring and predicting delivery

29 08 2015

I’ve now spent 7 or 8 years working with agile teams of one flavour or another. I’ve always struggled with the way that I was taught to report progress in an agile team and have never been comfortable with predicting our ability to deliver future features.

I have found that conversations about current progress in a team always needed to be preceded by a conversation about the terms being used, what they mean, and what inferences the recipient can draw from the numbers.

This always felt like a bad pattern to me. While discussing the design of software I always try to model it using business-relevant terms so that we can share a single lexicon across business and technology. But when I report progress it’s suddenly OK to use terms that the business (at best) only vaguely understands and at worst has no clue about.

Here I’m talking about the use of things like “story points”, “velocity”, “scrum” and “burn-up”.

These opaque terms seem far away from the open, transparent communication that I am striving for.

Then it comes to planning for the future and it all devolves into a little bit of black magic – I’ve done it more times than I can count – sizing stories to plot a scope line (both optimistic and pessimistic) and then trending “best case” and “worst case” velocity lines to give that quadrant of possible delivery.

It always felt to me to be too heuristic – a little too much black magic.

About a year ago I read Daniel Vacanti’s book “Actionable Agile Metrics for Predictability”. I’ve tried using Monte Carlo simulations for project planning before, but very much based on heuristic values, so not much better than the magic quadrant really. After reading Daniel’s book I downloaded Troy Magennis’s “Focused Objectives” software and started to model different software projects – with the right input and measurement I was far more confident in my ability to predict a team’s capacity. I had finally found a way to use the data generated while measuring progress to apply a more rigorous and structured planning model.

The big realisation for me using these tools is that project delivery can be modelled pretty well stochastically, but I’d always been taught to do forecasting deterministically. In other words, there are complex interactions in software delivery that I simply don’t account for very well using traditional models of forecasting. Capturing the right measures and using them to build up a statistical model of the delivery has worked much better. Big shout out to Troy – his tool has been invaluable and his support when things go wrong is first class – I’d highly recommend downloading it and giving it a try (use the link above).
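
To make the stochastic idea concrete, here is a minimal sketch of a Monte Carlo forecast – not Troy’s tool, just an illustration with made-up throughput numbers – that resamples a team’s own weekly throughput history to estimate how many weeks a remaining backlog might take:

```python
import random

# Hypothetical weekly throughput history (stories finished per week) - made-up numbers
throughput_history = [3, 5, 2, 6, 4, 5, 3, 7, 4, 5]

remaining_stories = 40   # size of the remaining backlog (also made up)
simulations = 10_000     # number of simulated futures

weeks_needed = []
for _ in range(simulations):
    done, weeks = 0, 0
    while done < remaining_stories:
        # resample a week at random from our own history (bootstrapping)
        done += random.choice(throughput_history)
        weeks += 1
    weeks_needed.append(weeks)

weeks_needed.sort()
# How many weeks would we need to be 50%, 85% and 95% confident of finishing?
for pct in (50, 85, 95):
    idx = int(len(weeks_needed) * pct / 100) - 1
    print(f"{pct}th percentile: {weeks_needed[idx]} weeks")
```

The output is a distribution rather than a single date – you pick the confidence level you need rather than pretending one number is certain.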

I now measure as much as I can about the team – every state change of every story is timestamped. I measure the overall cycle time from when a story is first prioritised to when it is complete, and the time it takes to go through each stage, with scatter plots and distribution curves for each of these measures. I monitor the average age of stories on our backlog and the trend across average age and average cycle time. I actively intervene in the team’s processes to try to optimise these measures and ensure that we remain as predictable as possible – and it has made a big difference. It feels like the team has a constant delivery focus and rhythm, and the analysis bears this out.
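
As an illustration of the kind of measures I mean (a minimal sketch with made-up timestamps, not my actual tooling), cycle time and story age fall straight out of the state-change data:

```python
from datetime import date

# Hypothetical timestamps: when each story was first prioritised and when it was completed
stories = [
    {"prioritised": date(2015, 7, 1), "completed": date(2015, 7, 9)},
    {"prioritised": date(2015, 7, 3), "completed": date(2015, 7, 10)},
    {"prioritised": date(2015, 7, 6), "completed": None},  # still in progress
]

today = date(2015, 8, 29)

# Cycle time for finished stories, age for stories still on the board
cycle_times = [(s["completed"] - s["prioritised"]).days
               for s in stories if s["completed"] is not None]
ages = [(today - s["prioritised"]).days
        for s in stories if s["completed"] is None]

print("average cycle time (days):", sum(cycle_times) / len(cycle_times))
print("average age of open stories (days):", sum(ages) / len(ages))
```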

That means I am far more confident about my forecasts for how much work we can get through and when we will finish certain items that we are currently working on.





Some of the more influential books I have read in the last few years

8 08 2015

Here’s a list of some of the most influential books for me over the last few years – I’ve written a little bit about each book but feel free to ask me about them if you want more details. I read them in pretty much this order (although not back-to-back) which was useful for me in my journey.

Crucial Conversations: Tools for Talking When Stakes are High
Kerry Patterson, Joseph Grenny, Ron McMillan, Al Switzler, Stephen R. Covey
Good look at how the mind works in stressful situations and how to lower the stress barrier to encourage good communication

The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity (2nd Edition)
Alan Cooper
Plenty of flaws in the text but the basic premise about how interacting with technology makes people feel is pretty interesting

The God Instinct: The Psychology of Souls, Destiny and the Meaning of Life
Jesse Bering
An exploration of the way our brains evolved – looking at the current psychology literature to understand why we think the way we do. I found this book very interesting – basically a manual for the human brain. It helped me understand the way my mind works and to question some assumptions about others.

Community: The Structure of Belonging: Restoring the Possible
Peter Block
A look at how communities could work and what people are looking for in their communities. It’s a little dry and takes some wading through, but an interesting insight nonetheless.

The 7 Laws of Magical Thinking: How Irrationality Makes us Happy, Healthy, and Sane
Matthew Hutson
Similar to Thinking, Fast and Slow – explores cognitive biases in the way our brains work and why they have evolved with these biases.

Consumerology: The Market Research Myth, the Truth about Consumers and the Psychology of Shopping
Philip Graves
If you want to understand how people behave in commercial spaces then this book is great – it gives a real insight into why products that test well fail from the perspective of how our brains work and how we respond to stimulus.

The Joy of Sin
Simon Laham
A cheeky inclusion – this is a quick tour of the “7 deadly sins” and why some of them are not so deadly – in fact some might be necessary for our success as a species.

Being Wrong: Adventures in the Margin of Error: The Meaning of Error in an Age of Certainty
Kathryn Schulz
This is a great read – I found it incredibly challenging. It really made me think about the way I respond to situations.

How Many Friends Does One Person Need?: Dunbar’s Number and Other Evolutionary Quirks
Robin Dunbar
Lighter than some of the others on this list – it’s a more concise overview of concepts that are covered in more depth in other books on this list.

Everything is Obvious
Duncan J. Watts
Again an interesting and challenging read – it really made me think about the assumptions I make; what is obvious to some might not be obvious to others.

The Epigenetics Revolution: How Modern Biology is Rewriting our Understanding of Genetics, Disease and Inheritance
Nessa Carey
This is pretty technical but after reading about how our thought patterns evolved I was keen to learn more about how the brain is formed. This is an interesting read and helps to understand why there is so much variety in humans. Be prepared to learn about proteins and the way they function in our bodies, how cells are formed and replicated etc.

Pieces of Light: The new science of memory
Charles Fernyhough
A truly fascinating book about how we remember (or not!). This challenged my understanding of how we define right and wrong and the culpability of human action.

Social: Why our brains are wired to connect
Matthew D. Lieberman
Not as detailed maybe as some of the others on the list but rounds off some of the missing pieces left by others.

Running with the Pack: Thoughts From the Road on Meaning and Mortality
Mark Rowlands
Way more philosophical than the other books on the list. Having explored the science of the mind, it was interesting to read something on the practical application of it.





Accounting for Agile projects

1 04 2015

I’m often asked how to account for Agile development projects. In traditional projects we get developers to fill in timesheets at the end of each week, take this information and multiply it by their pay / day rate, sum up the values for all contributors to a particular project, and we know how much it cost to deliver the software. Or do we?

I have 3 main issues with timesheets:

  1. Timesheets are notoriously inaccurate – especially when people are assigned to multiple projects at the same time. There are also meetings to account for, and all the small issues and bugs that often never find their way onto the timesheet and just get rolled up into whatever other project is being worked on (even if it is unrelated).
  2. People dislike timesheets – they are demotivating and make your developers feel like they are in an Orwellian dystopia.
  3. They focus on the individual – they pretend that delivering software can be measured by looking at what a group of individuals are doing. Timesheets completely miss the fact that it takes a team to deliver quality software. The focus on individuals has led to all manner of crazy decision-making like measuring “output” across grade with the idea of using “cheaper resources” to deliver a feature in order to drive down the delivery cost.

Agile offers us a chance to make this far easier, far more accurate and more focused on the true unit of delivery – the team.

How does it work?

I’m going to start with the basics and then layer complexity as we go.

We know how much it costs to run a particular team based on the sum of their day-rates / salaries. So we can easily calculate a cost per iteration / sprint.

If the team is all working to deliver a single feature, it takes them a certain number of iterations / sprints (say i), and it costs c per iteration to run the team, then Feature Cost = i*c.

Easy – no timesheets, team-based and reasonably accurate (as much as timesheets anyway).
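
For example (with made-up figures), if it costs £20,000 per sprint to run the team and a feature takes 3 sprints to deliver, the Feature Cost is 3 * £20,000 = £60,000.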

What if there is more than one project?

Unfortunately teams don’t always work on a single project. What if they worked on 2 different projects during the iteration / sprint? Maybe one was a small development and the other is the start of a larger piece of work.

Well, we know the velocity of the team during the iteration / sprint (v) and we know the number of points (effort e) that the team expended on each of the features. For example, let’s say the team completed 20 points during the iteration / sprint: 12 of these were for Feature 1 and 8 for Feature 2.

We can easily then determine the percentage of the effort that the team expended during the iteration / sprint on each feature: Feature effort = e1/v.

It is then trivial to calculate the Feature Cost for this iteration / sprint: Iteration Feature Cost = (e1/v)*c. The Total Feature Cost is the sum of all the Iteration Feature Costs.

We should know beforehand for each feature whether or not we can capitalise the Feature Cost so we can easily calculate our CapEx and OpEx.
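
Putting that calculation into a small sketch (the team cost and point counts are made-up illustrative figures):

```python
# A minimal sketch of the iteration cost split described above.

team_cost_per_iteration = 20_000  # cost (c) to run the team for one iteration / sprint

# points completed this iteration, per feature
points_completed = {"Feature 1": 12, "Feature 2": 8}

velocity = sum(points_completed.values())  # v = 20 points in total

for feature, points in points_completed.items():
    effort_share = points / velocity  # e / v
    iteration_feature_cost = effort_share * team_cost_per_iteration  # (e / v) * c
    print(f"{feature}: {effort_share:.0%} of effort, "
          f"cost {iteration_feature_cost:,.0f} this iteration")
```

With these figures Feature 1 carries 60% of the iteration cost and Feature 2 the remaining 40%.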

What about dealing with operational support and production fixes?

There are a couple of ways that I’ve done this (I’m sure there are more that others have tried – that’s what the comments are for!).

First way is to have a reserved capacity within the team to deal with issues and support. The PM / SM records a team composition that details, on a daily basis, how the team is split across support and development, and the team is responsible for rotating through support to ensure that everyone does both. The team cost calculation is then a little trickier as it needs to take into account the part of the team doing development and the part doing support: support is OpEx, while development will depend on how it is categorised.

Second way is to size support issues relative to development and to include these in the velocity number. Support issues are tough to size upfront, but a retrospective sizing can be done (i.e. it took 2 days, and the last story we played that took that long was 5 points, so I’ll give this 5 points).

What about holidays / sick leave?

By keeping a regular team composition sheet (a simple spreadsheet that shows the team composition by day) we can record times when the team is depleted for any given reason. Taking these absences into account when calculating the team cost will give the most accurate Total Feature Cost calculation.

Regular reporting

The great thing about this method of calculation is that it allows for a clear separation of concerns. Each team can have their own definition of points, and points and velocity need never be reported outside the team. In fact I would strongly encourage the idea that points and velocity are an internal team measure and should never be reported outside the team. Instead a team is responsible for providing Finance with two things at the end of each iteration / sprint / week:

  1. Team Composition sheet
  2. Iteration Feature Effort for all features that team worked on during the iteration / sprint

With these two artifacts the Finance team can use their knowledge of the day-rates / salaries to perform the Iteration Feature Cost calculation and also to calculate the cost of effort spent on support.

This same approach can be used to calculate the estimated cost of a feature when we are evaluating whether we should build it. The difference when performing a forward projection is that velocity should be expressed as a range (velocity is not a fixed figure but normally fluctuates between an upper and lower control limit). The relative size of the piece of work is also not known well enough up front to provide a single fixed number of points, so it too should be expressed as a range. Performing the maths above using the smallest size and highest velocity, and then again using the largest size and slowest velocity, will give us a range for the predicted cost of the feature. We can then track actuals against our forecast to see how good the forecast was, and future forecasts can be adjusted based on the feedback.
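
As a quick sketch of that range calculation (all figures are made up):

```python
# Range forecast for a candidate feature, using size and velocity ranges.

cost_per_iteration = 20_000   # cost to run the team per iteration / sprint
size_range = (30, 50)         # estimated feature size in points (low, high)
velocity_range = (15, 25)     # observed velocity range in points (low, high)

# Best case: smallest size delivered at the highest velocity
best_case_iterations = size_range[0] / velocity_range[1]
# Worst case: largest size delivered at the lowest velocity
worst_case_iterations = size_range[1] / velocity_range[0]

print(f"forecast cost: {best_case_iterations * cost_per_iteration:,.0f} "
      f"to {worst_case_iterations * cost_per_iteration:,.0f}")
```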

It is important to understand that the forecast is a forecast (not a prediction) and has implicit uncertainty. We can refine the way we forecast based on feedback from actuals but should not try to shoe-horn the actuals to make them fit the forecast (either by berating a late-running team or adjusting our points / velocity). Adjusting actuals retrospectively simply to “meet budget” decreases the accuracy of forecasting and should not be encouraged. Instead it is always better to review how forecasting and consistency can be improved.





The changing paradigm for higher education

26 03 2015

I was having a conversation with my brother about the changing world of higher education and have tried to summarise our thoughts here. We observed 2 trends that will directly impact the way that higher education is delivered in the future:

Trend 1 – decreasing cost of storing and disseminating information
Historically the storage and dissemination of information has been incredibly expensive. Up until Gutenberg, the only 2 mechanisms for this were memory and hand-written books. Education in this period was often tutor to pupil – learning directly from masters to become a master and teach others. Learning was bound by the number of masters.

With the advent of the printing press, information suddenly became a lot more readily available and the cost of dissemination plummeted. The disadvantage was that the information contained in books and other printed material was static: keeping it up to date required multiple volumes, printed on a regular basis, so the cost of storing information increased as the volume of information increased. This coupling meant that there was a finite limit to the amount of information accessible to any single individual. Learning changed so that teachers no longer had to be masters of all topics but could point pupils to rich archives of books containing more information than any teacher could memorise. This massively increased the number of people who were able to access education, but it meant that the quality of education was directly correlated with the amount of money available to store a volume of books. The more books you have access to, the better your education.

Computers started to change this in significant ways – the cost of storing information in digital format is significantly lower than in books, and the time needed to retrieve that information is also reduced. But the cost of serving information was still high – it required web developers, hosted servers, operations teams etc., meaning that the total cost of ownership of computers remained high.

Then Amazon introduced EC2. Suddenly the cost of storing and serving information became negligible, and a plethora of services have become available that take advantage of the cheap availability of storing and serving information (some more useful than others!).

Trend 2: The freeing of data to allow mash-ups
As the cost of storing and disseminating data has dropped, more information sets have been made publicly available (or available at very small cost). There is a growing trend of services where published information from multiple different sources is mashed up into a new, useful product by a third party who doesn’t own any of the data. Ownership of data is no longer the bargaining chip that it used to be. Historically individuals and institutions have traded on the information that they hold, and with information becoming more freely available this bargaining position has been weakened. (Greeks used to pay large sums for new books to add to their libraries and would guard them tightly; rich students would pay large sums to learn from the best tutors; universities competed on the knowledge of their professors and academic works.)

The holding of knowledge is no longer good enough – information is too freely available. Universities continue to charge huge tuition fees, but the curriculum being taught is streamed free of charge over the internet – this is an anomaly that cannot persist indefinitely. The traditional model of education will need to change to embrace the underlying change in the way we perceive and interact with information.




The risk of change

17 07 2011

Many of the clients I work with find themselves in a situation where a lot of the technology they use is outdated and the cost of changing the codebase to support the latest version is prohibitively high. The common excuse for this is that the risk of taking on a new version of their chosen platform is simply too high and they would prefer to stick with what they know works – “better the beast you know than the one you don’t”.

On the surface this seems like a reasonable and conservative assessment that minimises corporate risk – but is it really? What do upgrades offer? New features? Improved performance? What if the system already performs just fine? Do you really need those new extra features? Probably not – these are just enticements to upgrade. The real underlying reasons are more subtle: as a long-term strategy, upgrading regularly is lower risk than not upgrading at all.

This seems counterintuitive – taking on more work and an extra set of unknowns should increase risk rather than decrease it, and in the short term it probably does. But what happens in the longer term? Skipping one version change means that when the next one comes out, the leap needed to make that change is far greater and therefore far riskier – so best stick with what you have; the security updates would be nice, but not at the cost of upgrading all the code.

The problem with this strategy is that at some point in the future you will face the choice of either upgrading or losing any support from the platform vendor, and at that point the differences between the platforms are so great that upgrading will often mean other projects have to be put on hold. So the strategy that looked the least risky actually increases the overall corporate exposure in the long term. I’ve worked at companies that are crippled by their choice not to upgrade: large portions of support costs go into maintaining legacy codebases; developers become indispensable because of their knowledge of the legacy code (“the only person who can fix this”); and development of new features is slowed down because all the best and most senior staff are maintaining the older code rather than cutting new code.

A better strategy would be to plan for change. Start writing software knowing that it’s going to need to change. But how can you possibly know what changes will be introduced in the next version? The point is you don’t need to know. Code that is well designed, decoupled and has comprehensive automated tests is code that is ready for change. There will inevitably be some additional work that needs doing, but the tests should ensure that the changes are safe and the design should ensure that the changes are contained.

Does this mean organisations should always take the latest version? Not at all – waiting to make sure that the latest version is stable is prudent but that wait should not be so long that another version is already out before upgrading.

What about all that legacy code that is not tested or decoupled? Untested legacy code should be perceived as a risk to the organisation. It limits the organisation’s ability to respond quickly to changes in the marketplace, increases the cost of change and increases the dependency on key individuals within the organisation. The best long-term risk mitigation strategy would be to write automated tests to cover untested code and to gradually refactor legacy code out of the codebase as new features are written and basic maintenance is performed.





Theory of Constraints

28 03 2011

Played a game developed by Christian Blunden today to highlight why queues are a bad thing. The incredible thing that this game illustrates is that working only to the speed of the bottleneck (i.e. slowing down the activities that lead up to the bottleneck) actually improves overall throughput for the system.

It’s a game I have watched Christian run and I was keen to run it myself. The game is run in 3 rounds. Round 1 has the most activity, but the group today did not manage to finish any items in their 5 minutes. We changed things around a bit for the second round and there was a little less frantic activity and a few finished articles, but lots of rejects because of poor quality. The third round had the least activity and was by far the calmest. People were able to focus on quality not quantity, and despite the apparently relaxed attitude the group finished more items, with a lead time to finish a single item 5 times shorter than in the first round and 3 times shorter than in the second.

The discussion that followed was great, as people started to realise what impact the queues have on throughput and began figuring out the best way to structure the team to prevent this from happening.

I’ll check with Christian to see if he is OK with me posting the details of the game before publishing it.





Patterns for effective Acceptance Criteria

26 03 2011

We should all know about the standard “Given…When…Then” form for Acceptance Criteria (and hopefully use it). But how do we know if the acceptance criteria that we write are good? Can we assess them?

Just as with other practices in the development cycle, I think there are some patterns that mark out good acceptance criteria. In this post I’m going to explore some of the patterns I have noticed, but before doing that I want to quickly explore the reasons why we write acceptance criteria, because this has helped me in the way that I write them.

Acceptance Criteria are a way of capturing the expected behaviour of the system, written in such a way that we can easily test to see if they have been met. So we have a couple of main motivations:

1) Capturing expected behaviour

Given that the acceptance criteria are (largely) written before development starts, and because they capture expected system behaviour, they should form part of the business sign-off of the story. With this in mind they should be written in such a way that a business person can read and understand them.

2) Enable testing

The Acceptance Criteria of the story will ultimately be used to determine when the story is done. In order for the story to be complete, all the behaviours documented in the criteria must be met, so the criteria should be clear enough to explain to a user what steps to take to check that they are met, and unambiguous enough that anyone testing them can clearly see whether the test succeeds or fails. Not only do they facilitate manual testing, they should also be written to facilitate automated testing. Automated testing normally follows a model something like this:

  • Setup
  • Perform the required action
  • Validate the results
  • Teardown to facilitate other tests

The standard form for Acceptance Criteria fits nicely into this model (a sketch of the mapping in code follows the list below):

  • Given tells us the prerequisites (what needs to be set up / done before we start this test?) – this will be the setup phase of automated testing
  • When tells us the action that needs to happen in order to trigger the outcomes that we are testing for – this will be the action that the test needs to perform
  • Then tells us what to expect when the action is performed – this is the validation step of the automated test
  • Teardown will then clean up any persistent setup to ensure that the test can be run repeatedly without any adverse impacts
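
Here is a minimal sketch, in Python with entirely hypothetical names, of how a Given / When / Then criterion maps onto that setup / action / validation / teardown model:

```python
# Hypothetical stand-in for whatever drives the real system under test.
class SearchPage:
    def __init__(self):
        self.logged_in = False
        self.results = []

    def log_in(self):
        self.logged_in = True

    def search(self, term):
        # Pretend the system returns some matching results
        self.results = [f"{term} result 1", f"{term} result 2"]

    def log_out(self):
        self.logged_in = False


def test_search_results_contain_search_term():
    # Given: a user is logged in to the system  (setup)
    page = SearchPage()
    page.log_in()

    # When: they perform a search  (action)
    page.search("kittens")

    # Then: all the results displayed contain the search term  (validation)
    assert all("kittens" in result for result in page.results)

    # Teardown: leave the system clean so the test can be run repeatedly
    page.log_out()
```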

It could be argued that one of the reasons we write acceptance criteria is to facilitate developers’ understanding of the functionality to be developed. I’m working on the assumption here that we are doing TDD; based on this assumption, any acceptance criteria that enable testing will also perform the role of facilitating developers’ understanding.

Now that we know why we write acceptance criteria, let’s look at some patterns that can help.

1) Readable

“Does this acceptance criteria read well? Is it clearly understandable?”

We want the business to review and correct the acceptance criteria and then to sign them off. If the functionality is hidden behind obscurely written criteria then there is little chance that we will get useful feedback from the business. It is for this reason that I prefer paragraph based acceptance criteria rather than table based.

Paragraph:

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
When: I dial a number
Then: I am connected to the person I want to talk to
And: incoming calls are diverted to my voicemail

Table:

| Given | When | Then |
| That I have entered a telephone number into my mobile phone and I have sufficient signal to make a call | I dial the number | I am connected to the person I want to talk to |
| | | Incoming calls are diverted to my voicemail |

Personally I find the first one easier to read and understand – I’m used to reading paragraphs of text and I can follow the format. That’s not to say that table-based acceptance criteria can’t be readable; I just think it requires more work. Even with paragraph-based criteria it is important to read them as a whole, not just as a set of fragments, to make sure they make sense as a single paragraph. Here is an example of the anti-pattern:

Given: logged in to the system
When: perform search
Then: results all contain word from the search criteria

This can be re-written to make it more readable:

Given: a user is logged in to the system
When: they perform a search
Then: all the results displayed must contain the word from the search criteria

2) Testable

“Can I easily test the results laid out in the acceptance criteria?”

This may seem obvious but it is often overlooked. There are 2 common anti-patterns that I have noticed that make the criteria impossible to test:

The first is the use of vague statements such as:

Given: that I have the search page loaded
When: I perform a search
Then: the search results come back within a reasonable period of time

Who is to say what is reasonable? This would be better with an absolute definition of what is considered reasonable:

Given: that I have the search page loaded
When: I perform a search
Then: the search results come back within 5ms

The second is the use of a non-system outcome such as:

Given: that I am on the home page
And: I am logged in
When: I navigate to account preferences
Then: I can see my account preferences

This looks and reads like an acceptable criteria until it comes to automating a test for it, because I cannot write an assert statement about what the user is seeing. If the user closes their eyes, does the test fail? This is better re-written with a system outcome:

Given: that I am on the home page
And: I am logged in
When: I navigate to account preferences
Then: my account preferences are displayed

I can write an assert statement to check that the account preferences are displayed so I can write an automated test to ensure that this requirement is met.
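
As a tiny illustration (again with hypothetical names), the system outcome gives the automated test something concrete to assert on, whereas "I can see..." does not:

```python
# Hypothetical page object for the account preferences screen.
class AccountPreferencesPage:
    def __init__(self, displayed):
        self.displayed = displayed

    def is_displayed(self):
        return self.displayed


def test_account_preferences_are_displayed():
    page = AccountPreferencesPage(displayed=True)
    # "My account preferences are displayed" maps directly onto an assert...
    assert page.is_displayed()
    # ...whereas "I can see my account preferences" gives the test nothing to check.
```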

3) Implementation Agnostic

“Does the acceptance criteria drive the developers down a particular implementation route?”

If the acceptance criteria specifies implementation detail then re-write it to remove the implementation and make it agnostic. Do this by focusing on the functionality rather than the form of the outcome.

Given: that I am on the home page
And: I am logged in
When: I navigate to advanced search
Then: the advanced search web page must be displayed
And: a text box labelled “Name” is displayed
And: a text box labelled “Description” is displayed
And: a command button named “Search” is displayed

This clearly specifies some implementation detail: the advanced search must be a web page, and text boxes and command buttons must be used. We can re-write it to make it more agnostic, as such:

Given: that I am on the home page
And: I am logged in
When: I navigate to advanced search
Then: the advanced search is displayed
And: an option to search by name is displayed
And: an option to search by description is displayed
And: the advanced search is displayed in accordance with the attached wireframe

This way if the wireframe changes then I don’t need to change any of the acceptance criteria – I only need to attach the updated wireframe.

Which brings up another topic which I’ll touch on here quickly. Rather than specifying detailed UI or implementation requirements in Acceptance Criteria, I would rather put them into Adornments. Adornments can be referenced in the Acceptance Criteria and can include colour palettes, CSS sheets, button images, fonts, wireframes etc.

The level to which stories are implementation agnostic will depend to some extent on the maturity of the project. During the early stages of a project it is most important to stay agnostic. Later in a project, where stories may only be minor tweaks to the user interface, this becomes less of an issue.

4) Actionable When Statement

“Can I automate the when statement?”

The symptom of this anti-pattern is that the real action is placed in the Given statement instead of the When clause. So it might read something like this:

Given: that I am on the homepage
And: I have navigated to the search
When: I look at the page
Then: the search options are displayed

This breaks the relationship that we established at the beginning between the automated test model and the standard form of the acceptance criteria, because I can no longer automate it. The When statement refers to something that is not a system action, and the actual When has been hidden in the Given. Re-writing it to make it more actionable produces this:

Given: that I am on the homepage
When: I navigate to the search
Then: the search options are displayed

5) Strong Verb Usage

“Is the language used deterministic?”

Avoid weak verb forms like “should” and “could”; instead use absolute statements. Replace “the system should show x” with “the system displays x”.

This small change makes a big difference to the strength of the Acceptance Criteria, changing something like this:

Given: that I am logged into the system
When: I navigate to the search page
Then: an option to search based on the “Name” field should be displayed

into something like this:

Given: that I am logged into the system
When: I navigate to the search page
Then: an option to search based on the “Name” field is displayed

6) Keep the criteria specific to the story

This often manifests itself as really long acceptance criteria that specify behaviours, functions or look and feel that were implemented in previous stories. Try to ensure that the When portions of the criteria are specific to the story that is being developed. Only specify changes from previous functionality (i.e. only where you expect the behaviour to differ from what was previously specified).

7) Tell a story

“Do the Acceptance Criteria walk the user through the scenario?”

For complex stories where there are multiple acceptance criteria and exception cases, I like to structure my acceptance criteria in such a way that they walk the user through the scenarios in a realistic order with exception cases at the end.

Call Available number

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
And: the person I am calling is available on the network
When: I dial the number
Then: the screen display indicates that the phone is ringing
And: my incoming calls are diverted to my voicemail

Person answers call

Given: that I have dialled an available phone
And: the phone I am calling is ringing
When: they answer the call
Then: the screen display shows that I am connected

I end the call

Given: that I have dialled an available phone
And: the person I called has answered the call
When: I terminate the call
Then: the screen displays the fact that the call has ended
And: my incoming calls are no longer diverted to my voicemail

Call Unavailable number

Given: that my mobile phone is switched on
And: I have sufficient signal to make a call
And: the person I am calling is not available on the network
When: I dial the number
Then: the screen displays a message to tell me that the phone is unavailable
And: my incoming calls are diverted to my voicemail

The other thing that I like about writing my acceptance criteria in paragraphs is that I can give each one a heading. Having a heading instead of a number for the acceptance criteria makes them more readable and means that I can have meaningful conversations with people. Instead of “We’re having a little difficulty with acceptance criteria number 3a” we can talk about “I’m having a bit of difficulty with Call Unavailable number”. This makes for more meaningful conversations because everyone can understand what the conversation is about without having to look up what acceptance criteria 3a is about.

If I tried to write the above criteria as a table I might end up with something like this (you can make your own mind up as to which is most useful):

Table:

| Given | When | Then |
| That my mobile phone is switched on and I have sufficient signal to make a call and the person I am calling is available on the network | I dial the number | the screen display indicates that the phone is ringing |
| | | my incoming calls are diverted to my voicemail |
| That I have dialled an available phone | they answer the call | the screen display shows that I am connected |
| That I have dialled an available phone and the person I called has answered the call | I terminate the call | the screen displays the fact that the call has ended |
| | | my incoming calls are no longer diverted to my voicemail |
| That my mobile phone is switched on and I have sufficient signal to make a call and the person I am calling is not available on the network | I dial the number | the screen displays a message to tell me that the phone is unavailable |
| | | my incoming calls are diverted to my voicemail |