High Reliability Fundamentals: A New Way of Thinking
PRINCIPLES OF HIGH RELIABILITY
An important concept in the human performance literature is captured in the performance improvement formula: Re + Mc → ØE. In plain English, reducing errors (Re) coupled with managing controls (Mc) leads to zero significant events (ØE) (see the DOE Human Performance Improvement Handbook, Vol. 1, pg. 1-16). This simple, common-sense formula is appealing and useful to most people. To the mathematically inclined, however, even simple mathematical symbols imply a precision that does not exist here, and the formula is therefore confusing. Instead of facilitating learning as intended, it becomes a barrier.
The formula has two characteristics that give the mathematician trouble. First, adding two terms implies they independently contribute to the result. In this case, if one succeeded in driving errors
to zero, there would still be many significant events if Mc were some positive number. Instead,
these terms intuitively interact. That is, errors combine with inadequate controls to produce
events. In mathematics, interacting terms are multiplied (or divided): they give each other leverage
or act as catalysts.
Second, the term “managing controls” is counterintuitive. Intuition says that more or better
managing is good, but the current formula implies that it would lead to more events, not fewer! And how
should one think about Mc if it is small? Slightly altering the meaning of the Mc term to
“maximizing control effectiveness” makes a large Mc very good.
Combining the two changes gives an alternative performance improvement formula that may appeal to
the mathematician and non-mathematician alike: Lim Re/Mc → ØE. That is, one approaches zero
events by reducing errors and maximizing control effectiveness. This recognizes that errors will
never be zero and that control effectiveness can never be infinite, but by making one small and the
other very large, one can get very close to zero (many 9s in quality-speak).
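The contrast between the additive and the ratio form can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the numeric values for Re and Mc are assumptions made up for the example, not measurements, and the formula itself remains a heuristic rather than a calculation.

```python
def additive_form(re, mc):
    """Literal reading of Re + Mc: the two terms contribute independently."""
    return re + mc


def ratio_form(re, mc):
    """Alternative reading Re/Mc: errors scaled down by control effectiveness."""
    if mc <= 0:
        raise ValueError("control effectiveness Mc must be positive")
    return re / mc


# Hypothetical values: errors driven to zero, Mc left at some positive number.
print(additive_form(0.0, 5.0))  # the sum is still 5.0 even with no errors
print(ratio_form(0.0, 5.0))     # the ratio is 0.0

# Shrinking Re while growing Mc drives the ratio toward zero without reaching it.
for re, mc in [(1.0, 1.0), (0.1, 10.0), (0.01, 100.0)]:
    print(re, mc, ratio_form(re, mc))
```

In the additive reading a positive Mc keeps the result away from zero no matter how small Re becomes, which is the mathematician's objection; in the ratio reading a small Re and a large Mc reinforce each other.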
Finally, it is important to communicate that Re and Mc are actually complex, non-linear functions of multiple variables. Indeed, humans are in the loop, and there are no known closed-form formulas that describe human behavior. The best one can do is use heuristic tools designed to help identify and correct undesirable situations.
Extending the Concept to High Reliability Organizational (HRO) Theory
The performance improvement formula may be further enhanced using High Reliability
Organization (HRO) concepts. Research on resilient performance has documented differences between
ideal and actual work practice, a notion referred to as a “work perception gap”:
basically the difference between work as imagined and work as it is actually done.1
When the two are not aligned, the resulting gap can further complicate error reduction and
control management efforts, thus leading to more significant events. This is a simple but very
powerful concept.
For this discussion, define Wi = work as imagined and Wd = work as done (each > 0); assume Wd
can never be better than Wi (Wd ≤ Wi), so that the ratio Wd/Wi is always less than or equal to 1.
Ideally, Wd/Wi would equal 1, but in the real world the ratio is going to be less than 1.
To be a good scaling factor in the performance improvement formula, the new parameter
must approach zero as it improves, so that it contributes to achieving zero events. A suitable parameter
is then (1 – Wd/Wi). This parameter approaches zero as work as done approaches work as
imagined. Some may recognize the equivalent form (Wi – Wd)/Wi, the difference between
work as imagined and work as done scaled by the larger of the two, Wi.
This leads to proposing a new work gap scaling parameter using the form "delta W" for the work perception gap:
ΔW = (1 – Wd/Wi) = (Wi – Wd)/Wi.
The new HRO form of the performance improvement formula then becomes:
Lim (Re/Mc) ΔW → ØE.
Minimizing errors and the work perception gap while maximizing control effectiveness will
significantly lower the probability of significant events.
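As a rough numerical illustration of the work gap parameter and the HRO form, the sketch below computes ΔW and (Re/Mc) ΔW. The function names `work_gap` and `hro_index` and all input values are hypothetical, chosen only to make the arithmetic visible; nothing here suggests that Wd, Wi, Re, or Mc can actually be measured on these scales.

```python
def work_gap(wd, wi):
    """Return delta W = 1 - Wd/Wi, assuming 0 < Wd <= Wi as in the text."""
    if not (0 < wd <= wi):
        raise ValueError("expected 0 < Wd <= Wi")
    return 1 - wd / wi


def hro_index(re, mc, wd, wi):
    """Return (Re/Mc) * delta W, the HRO form of the formula."""
    if mc <= 0:
        raise ValueError("control effectiveness Mc must be positive")
    return (re / mc) * work_gap(wd, wi)


# Hypothetical scores: work as done reaches half of work as imagined.
print(work_gap(5, 10))         # 0.5
print(hro_index(1, 4, 5, 10))  # 0.125

# When work as done matches work as imagined, the gap and the index vanish.
print(work_gap(10, 10))         # 0.0
print(hro_index(1, 4, 10, 10))  # 0.0
```

Because ΔW multiplies Re/Mc, closing the work perception gap lowers the result even when Re and Mc are left unchanged.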
There are two more distinctions from high reliability research that shape a formula for a strategic
reliability approach: risk management and resilience. The Re expression was initially focused on the
idea that with proper work design, considering expertise and performance support tools, the
incidence of error could be minimized. While that remains a desirable goal, the reality is that
conditions are never ideal and the map is not the territory; thus there will always be a positive ΔW. The
ability to notice the gap is a function of what researchers term ‘mindfulness,’ and the ability to notice
and understand it is termed ‘sensemaking’. So the function of the Re term can be modified to reflect
noticing and sensemaking.
The second distinction, resilience, is the ability to respond to the unexpected. In complex systems,
circumstances sometimes come together in never-before-seen ways, ways that are computationally
impossible to predict in design. Emergency preparedness is a classic recognition that such
combinations do occur, that absolute prevention may not be possible, thus mitigation of harm
becomes the driving objective. Highly reliable organizations are characterized by building the
capability of resilience to detect, respond, improvise, mitigate and recover without serious harm to
people, the environment or the mission. Thus a new term is added to the equation, Br for
Bolstering Resilience. Bolstering is used in the sense of cushioning, supporting or adding to. As ΔW
becomes small and the ability to respond to the unexpected becomes large, we may then express
the strategic equation for reliability as:
(Re/Mc) (ΔW/Br) → ØE.
* Re − increase mindfulness to notice & mitigate risk
* Mc − maximizing defense effectiveness
* ΔW − work as imagined vs. work as done
* Br − bolstering resilience
* ØE − no consequential events
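The way the four terms of the strategic equation pull together can be sketched numerically. The function name `strategic_index` and every input value below are assumptions made for illustration; the equation is a thinking aid, and the sketch only shows the direction in which each term moves the result.

```python
def strategic_index(re, mc, delta_w, br):
    """Return (Re/Mc) * (delta W / Br) for positive Mc and Br."""
    if mc <= 0 or br <= 0:
        raise ValueError("Mc and Br must be positive")
    if not (0 <= delta_w <= 1):
        raise ValueError("delta W is expected to lie between 0 and 1")
    return (re / mc) * (delta_w / br)


# Hypothetical baseline values.
baseline = strategic_index(re=1, mc=4, delta_w=0.5, br=2)
print(baseline)  # 0.0625

# Bolstering resilience (larger Br) lowers the result with other terms fixed.
print(strategic_index(re=1, mc=4, delta_w=0.5, br=4))  # 0.03125

# Narrowing the work gap (smaller delta W) has the same direction of effect.
print(strategic_index(re=1, mc=4, delta_w=0.25, br=2))  # 0.03125
```

Re and ΔW sit in the numerators and Mc and Br in the denominators, so reducing the former or increasing the latter each moves the result toward zero.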
(Editor's concluding comments) Jens Rasmussen, James Reason, Erik Hollnagel and others discuss "error" in cognitive terms relating to human information processing. It is in that sense that we use the term Re. Heuristics are essential in operationalizing high reliability concepts, and indeed a large number of tools have been developed to support highly reliable performance (see, for example, the DOE Human Performance Handbook, Volume II).
1 (Ed. note: the term “work as imagined versus work as done” was introduced into the lexicon by Sidney Dekker in the first Resilience Engineering book.)
Contact Us: W. Earl Carnes or phone: 301-903-5255 with your questions and comments.