Human Error
We’ve covered failures before, but this week we’re focusing on errors. Failures can be preventable, complex, or “intelligent” – such as those resulting from experiments where we try something, intentionally, that might fail. However, errors, in this context, refer to the unwanted results of action or inaction.
We often tend to put mistakes, slips, lapses, errors and violations all in the same pot of “human error”. In fact, different types of error require very different approaches to prevent and address them.
Human error should be the start of investigations, not the end.
‘Human error’ is often stated to be the cause of incidents, problems, or disasters – it sometimes tends to be the end of an investigation – the smoking gun or “root cause”. In reality, human error should be where investigations really begin. Human error is a symptom, an indicator, of deeper issues within the system (by which we mean the sociotechnical system that includes people and things). Just this week, the FAA analysis of a US airline’s systems outage blamed “personnel who failed to follow procedures” and stated “The system is functioning properly”. Arguably, if the system can be brought down completely by a single person with one corrupt file, this indicates a bigger problem with the system itself.
Human error as a cause of incidents is a manifestation of the reductionist belief that the system as designed is perfect, and it’s people that are the problem. In this line of reasoning, it’s people that are fallible, and they need to be controlled – turned into more predictable machines, just like the system itself. It often coincides with a stance that everything would be just peachy if people simply complied with rules and procedures.
Systems are safer because of the people within them, not in spite of them.
In practice, we should ask: do people merely use and interact with a system, or are they part of the system? New ways of thinking, such as Resilience Engineering and Safety II, take the latter stance, and indeed suggest that systems are safer and more resilient *because* of the people within them, not in spite of them.
By taking a traditional human error perspective, we focus on what people failed to do and what they should have done instead. We might even try to find the “bad apple” who lost “situational awareness” or wasn’t paying enough attention. We might assume that everyone has “Fingerspitzengefühl” or the “fingertips feeling” of someone with perfect situational awareness (hat tip to John Boyd for the term). This is “counterfactual” reasoning or hindsight bias: “if only they’d done X”, or “they should have done Y”, and of course, as the name suggests, this can only be applied in hindsight.
“He should have just dived to his right.”
Counterfactual reasoning:
Nurses are only meant to take bloods from a patient under the direct instruction from a doctor. A patient in an acute care hospital ward takes a sudden turn for the worse, and a nurse calls for a doctor. The nurse, knowing time is of the essence, immediately takes blood samples from the patient. They deviate from the rule but make the assumption that the doctor will ask for them urgently anyway, and they have the patient’s best interests at heart. In one possible outcome, the patient survives because the nurse acted quickly and deviated intentionally from the rulebook. In an alternative possible outcome, complications that the nurse was unaware of meant that the patient suffered harm as a result of the blood test.
Did the nurse make an error? The fact that we can only decide after the incident whether the nurse acted in error is because we rely on counterfactual arguments to make that call. In the patient harm scenario, the nurse should have waited for the doctor. But in the other scenario, had the nurse waited for the doctor, the patient may have died as a result of the delay. We cannot properly determine the correct decision until we know the result of it.
From A Tale of Two Stories: Contrasting Views of Patient Safety. Cook, Woods and Miller, 1998.
Types of Human Error
Probably the simplest categorisation of human error would be to split errors into those of omission or commission (Kern, 1998).
- Omission – Errors of omission occur when people fail to carry out a required task.
- Commission – Errors of commission occur when people carry out a task incorrectly or do something that is not required.
Researchers have since further differentiated human error as follows (Strauch, 2004; Reason, 1990):
- Slips and Lapses – However conscientious and expert we are, we will never execute something perfectly every time.
- Mistakes – When presented with incomplete information, insufficient experience or knowledge, we may make a decision that turns out, after the fact, to have been incorrect.
- Violations – We may take actions or make decisions that conflict with the way we “should” do things. That might be a violation of policy, process, or the expected norms and rules of the group (which also includes unacceptable behaviour). We may do this unintentionally, or intentionally, with positive intent. Too much of this can lead to the normalisation of deviance.
Of course, there should be consequences for intentional violations, and we should hold ourselves and others accountable to high standards of professionalism. But one thing we know for sure is that if we punish people for slips, lapses, and mistakes rather than trying to improve the system around them, we don’t end up with fewer errors – we just stop hearing about them until they’re too big to hide.
Another thought experiment about human error:
A competent and experienced member of your team makes a relatively serious lapse, and neglects to connect a safety mechanism to a machine before use. Do you sack them? Most of us would probably say no, we all need a second chance.
Twice? Probably not, but maybe we’d give them a warning, right?
Five times? We might be considering sacking them now?
Ten times? We’re surely in the realm of sacking someone now, but something else is niggling at you… This seems like a lot of errors to make.
Now, what about a hundred times, and other people are making the same slip? Hmm, something’s definitely up. What’s causing them to all make this error? There must be something going on that makes this safety mechanism difficult or unclear. Also, if we sack all these people we’re hardly going to have anyone left!
(Thanks to John Willis for the story behind this thought experiment.)
Responding to ‘human error’
Typically in many organisations and industries, we try to reduce and mitigate human error by telling the people closest (proximal in time and space) to the incident to:
- Be more vigilant!
- Pay more attention!
- Put in more effort and try harder!
- Comply with these rules!
- Follow this standardised procedure!
By focusing on berating the people, we miss a real opportunity to improve. Instead, we should try to find out why the people at the sharp end made the error. We make errors all the time, even with great expertise and experience. Radiologists misread X-rays, pilots misinterpret Air Traffic Control communications, drivers momentarily exceed the speed limit. Instead of blaming people, we can choose to see errors as an opportunity to learn – to learn about our systems, procedures, tools, training, practices and structures – the “blunt end”. Why was someone able to make such an error? Why was that error not caught before it resulted in an unwanted outcome?
The Local Rationality Principle
Just as we covered the Agile Prime Directive last week, this week we consider the Local Rationality Principle – that people do what makes sense to them at the time. People do reasonable things given their goals, knowledge, understanding of the situation and focus of attention at a particular moment. This principle provides a scaffold for investigation – what were this person’s goals? How deep is their knowledge? Was their attention split during this task, and if so, why?
Nobody is perfect, and designing a system for the perfect operator is unwise. We all make mistakes, we all lapse, slip, and sometimes deviate from the “proper” process, even (maybe especially) experts. If we design a system that relies on people always doing the right thing, and doing it right every time, we’re designing a system that is guaranteed to fail. As Erik Hollnagel says:
“‘Human error’ is not a meaningful category and we therefore should stop using it.”
Erik Hollnagel
We should design systems that expect people to lapse and make mistakes.
Instead, we should design a system that expects people to make mistakes, slips and lapses. We should create systems that can gracefully fail (where if a component fails or one step is missed or made incorrectly, only that part fails, not the whole system). This is as true of technological systems as it is of healthcare systems, financial systems, and business systems. Part of how we adopt this systemic approach is by fostering a culture of psychological safety and trusting that nobody comes to work to do a bad job.
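As a concrete illustration of graceful failure in software, here is a minimal sketch (the names `process_batch` and `parse_reading` are hypothetical, chosen just for this example): a fallible step is isolated per item, so one bad input degrades only its own result rather than crashing the whole run, and the failure is surfaced so we can learn from it.

```python
def parse_reading(raw):
    """A fallible component: converts a raw string to a number."""
    return float(raw)  # raises ValueError on malformed input

def process_batch(raw_readings):
    """Run every item; isolate failures instead of letting one abort the batch."""
    results, failures = [], []
    for index, raw in enumerate(raw_readings):
        try:
            results.append(parse_reading(raw))
        except ValueError as err:
            # The error is recorded and surfaced for investigation,
            # but the rest of the system keeps working.
            failures.append((index, raw, str(err)))
    return results, failures

values, errors = process_batch(["1.5", "oops", "2.0"])
# The malformed item at index 1 fails alone; 1.5 and 2.0 are still processed.
```

The design choice mirrors the point above: the failure isn’t hidden or punished out of existence – it’s captured as data, which is exactly what makes learning from it possible.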
If you’d like to dive deeper into Human Error, Sidney Dekker’s “Field Guide to Understanding Human Error” is a brilliant place to start.
References:
Cook, R.I., Woods, D.D. and Miller, C. (1998). A Tale of Two Stories: Contrasting Views of Patient Safety. National Patient Safety Foundation.
Dekker, S. (2006). The Field Guide to Understanding Human Error. Ashgate Publishing Company, USA.
Kern, T. (1998). Flight Discipline. New York: McGraw-Hill.
Marx, D. (2017). Human Error is NEVER the Root Cause – REVISITED. Outcome Engenuity. Available at: https://outcome-eng.com/human-error-never-root-cause-revisited/
Reason, J. (1990). Human Error. New York: Cambridge University Press.
Strauch, B. (2004). Investigating Human Error: Incidents, Accidents and Complex Systems. Aldershot, England: Ashgate Publishing Ltd.
This newsletter is sponsored by Conflux
Conflux is the leading business consultancy worldwide helping organizations to navigate fast flow in software. We help organizations to adopt and sustain proven, modern practices for delivering software rapidly and safely.
‘The Fearless Organization’ by Amy Edmondson is considered by many to be essential reading on the topic of psychological safety. In this article, Sophie Weston, Principal at Conflux, has put together some key takeaways from the book.
Psychological Safety in the Workplace
I love this piece from Jason Cox on cultures of candor. And I like how he demonstrates his self-awareness enough to state, “I recognize that I’m over indexed on optimism, but I also believe that at any given point, something is going wrong.”
This episode of Cautionary Tales with Tim Harford is excellent. The first half is about epidemiology in the online World of Warcraft game, which I found fascinating, but the second half will definitely appeal to newsletter readers. It’s an account of two signalmen running a busy stretch of railroad on the Scottish borders who had to adhere to strict rules to prevent crashes – but those regulations failed to take into account the human factors.
This is a good piece from MIT Sloan on “intellectual honesty” and the duty we have to speak up if we can do so. I like this 2×2 (I’m a sucker for 2x2s) on psychological safety and intellectual honesty and how the two elements combine in an organisation.
Here’s a good opinion piece in the BMJ on Reverse Mentoring – where senior folks are mentored by people in the organisation who are more junior. It addresses the reluctance of many people to take part because it challenges traditional hierarchies – but it’s a valuable and powerful approach.
This is a very good piece in the Metro about code switching at work: I lie to my colleagues about who I am – I won’t reveal anything about myself
This week’s poem:
Snow, by Louis MacNeice
The room was suddenly rich and the great bay-window was
Spawning snow and pink roses against it
Soundlessly collateral and incompatible:
World is suddener than we fancy it.
World is crazier and more of it than we think,
Incorrigibly plural. I peel and portion
A tangerine and spit the pips and feel
The drunkenness of things being various.
And the fire flames with a bubbling sound for world
Is more spiteful and gay than one supposes—
On the tongue on the eyes on the ears in the palms of one’s hands—
There is more than glass between the snow and the huge roses.
errors · learning · Local Rationality · psychological safety