The goal of this document is to provide a reference list of criteria which can be used to compare different methods of root-causal analysis of accidents and incidents, and a handbook for applying these criteria. Each individual criterion is to have a

Name
Explanation of its meaning
Guidance for applying it
Example of its application
Maintainer (one or two persons)

The name, an explanation (which might be brief) and the maintainer are guaranteed to be present for each criterion. For each criterion, its name appears first, in boldface, and its maintainer appears last, in italics in parentheses. Since this document evolves from the initial discussion on December 18, 2002, characteristics other than these three might be missing for any given criterion. They will be added as they become available.

Since this document is evolutionary, and is intended to be for general use, we invite contributions and commentary. We acknowledge contributors at the bottom of the document.

List of Criteria

Target User Group
Is the method targeted towards the "sophisticated user" (does it require use of techniques such as theorem proving which require specific expertise)? Is it suitable for use by domain experts? The ideal criterion is that the method should be easy to use (with a minimum amount of training, preferably requiring no proprietary tool) by the average engineer.
(Jens Braband jens.braband@siemens.com)
Tool Support
This criterion is self-explanatory. What is there in the way of tool support?
(Luke Emmet loe@adelard.com)
Scalability
Is the method scalable? Can the method be used cost-effectively for minor incidents as well as major accidents? Can you apply a subset of the method to small, or to less-significant, problems and the "full monty" to large, or to significant, problems?

In inquiring about the scalability of a method, one should keep in mind that there are objective differences in problem and analysis complexity, that is, differences which depend on characteristics of the incident to be analysed and not on properties of the analysis alone. For example, one might claim that the 1988 A320 accident at Habsheim is an easier accident to analyse than the train crash at Ladbroke Grove outside Paddington, despite a decade of public disagreement over faked data recorders and recordings concerning Habsheim. Here is the argument.

According to the official report, whose analysis was verified by a Why-Because Analysis, the accident at Habsheim can be put down to: bad flight planning and preparation on the part of the cockpit crew (CRW), bad management oversight (Air France broke its own rules and allowed CRW to plan to break French regulations governing airshow demonstration flights), lack of skill (the pilot flying had no experience in airshow demonstration flights or in test flying), bad execution (the pilot flying failed to follow his flight plan, and improvised, including breaking his own altitude limits), the ground layout (there were trees at the end of a runway), and, maybe, characteristics of the aircraft. That comprises six relatively clear factors, of which some are obvious without much investigation (for example, the ground layout). Five of these are factors on which all agree, despite the public disagreements; only the aircraft characteristics are in question. The report is easy to read and, apart from some dissension about possibly missing data and perhaps slightly different aircraft characteristics, it is definitive.

Compare the situation at Ladbroke Grove. There are at least nine different domains in which factors appear, according to the Why-Because Analysis performed by Ernesto de Stefano, and the official (Cullen) report focused on only one of them. First of all, that is, objectively, a difference in complexity, due to subsystem constitution, if you like. Second, the disagreement between the official analysis and the technical WB-Analysis shows that the factors, whatever they might be, are not so easy or obvious to grasp that everyone can do it. And, finally, it is hard to see any way in which de Stefano's results can adequately be summarised so as to fit into a WB-graph the size of the Habsheim WB-graph while still retaining its explanatory force. One may thus reasonably conclude that Ladbroke Grove is a more complex accident than Habsheim to analyse, although there was essentially no public disagreement over the analysis.

So the question of scalability asks whether the complexity of analysis using the method scales with the complexity of the problem. For the WB-Analyses of the Habsheim and Ladbroke Grove accidents, one would consider whether the approximately twenty nodes of the former against the approximately ninety nodes of the latter constituted an appropriate measure of the objective problem complexity.
(Peter Ladkin ladkin@rvs.uni-bielefeld.de)
Graphical Representation (see also this page)
What is the nature of the method's graphical representation?

The motivating principle is that a picture is better than a thousand words. It is often more comprehensible to display results of an analysis method as an image, a graph, or other form of illustration, than as purely written text.

We argue that there is a reason for this phenomenon. Besides containing descriptions of the facts surrounding an incident, say written in a language L, a written text must also indicate the causal interrelations of those facts, which necessitates some use of specific technical language, say T, along with a method to distinguish this language from the language in which the facts of the incident are expressed. This method would usually be the syntax: T would use keywords, and a certain sentence structure, for example, in which the factual sentences in language L would be inserted. This embedding of L in T, as well as the necessity to distinguish the syntax of T from that in use already in L, leads either to more complicated sentence structures, or to oversimplification of the causal assertions. Neither of these features is desirable: the first leads to cognitive complexity, and the second leads to information loss.

However, a non-textual representation of the causal interrelations and other identified structure, such as taxonomies of failures, facilitates the cognitive distinction of this structure from that of the facts themselves. The causal and taxonomic structure is perceived visually, and the incident facts linguistically, and these two cognitive capabilities are largely independent. The complexity of the understanding of the facts remains at the level of the language L; the visual structure is at the level of the language T (with the embedded L-expressed facts treated as atomic), and the cognitive complexity of the representation remains at the level of L added to T, rather than L embedded in T, as with a purely text-based representation.

Anecdotal experience with text-based and graphical representations of the results of causal analyses has indicated that a representation of causal information as a directed graph allows both experts and non-experts alike to interpret the results more easily, as well as to see structural features that could well be hidden in a textual representation. Examples of such renderings are WB-Graphs (see the WBA Home Page at http://www.rvs.uni-bielefeld.de) and the causal graphs of Pearl (Judea Pearl, Causality: Models, Reasoning and Inference, Cambridge University Press, 2000).

The desirable properties of a graphical representation are:
- to display clearly the semantics of causality (including denotation of causal factors, and taxonomic classification of factors), and
- to be cognitively (relatively) easily surveyable by a single person.
In pursuit of the second property, it is desirable to display the results of a causal analysis as a unit, say a page in some standard format. The practical limit of page size is A0 in the international standard sizes. If the causal relations can be displayed graphically on a page of size less than or equal to A0 while remaining readable, then there is good reason so to do. If this cannot be accomplished, some method of factoring the results must be applied to render them over multiple pages. A factoring method is desirable even if results may be rendered on one page, for different page sizes are used for different purposes: an A0 rendering of a graph is adequate for comprehension and discussion, but is inappropriate for a report, for which A4 is the standard size.

One method of factoring which has proved relatively helpful for causal graphs in particular is graph-theoretic factoring, in which graphs are separated at their places of least width and each "chunk" is rendered separately, with the cut-set nodes identified in each chunk through color. The entire graph is rendered small, to exhibit the overall graph structure including the (colored) cut-set nodes while rendering any text unreadable. The individual chunks are rendered on separate pages. An example is the WBA of the 1993 A320 Warsaw accident (Ladkin, Höhl), available through the Publications page at http://www.rvs.uni-bielefeld.de. This factoring, however, has limited application to situations in which the width of the graph is large (say, greater than five nodes), or in which a large proportion of the graph lies in one chunk. An example in which this factoring would seem to be poor is the WBA of the 1995 Cali accident, also at http://www.rvs.uni-bielefeld.de. This type of factoring obviously does not apply to representations of the results in forms other than that of a graph.
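
The graph-theoretic factoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the published WB-graphs: the graph is assumed to be acyclic, "width" is taken to mean the number of nodes in a layer, every edge is assumed to join adjacent layers, and the node labels and the function names layers_of and factor_at_narrowest_layer are hypothetical.

```python
from collections import defaultdict


def layers_of(edges):
    """Give each node a layer: the length of the longest path reaching it.

    Edges are (cause, effect) pairs; the graph is assumed to be acyclic.
    """
    nodes = {n for edge in edges for n in edge}
    preds = defaultdict(set)
    for cause, effect in edges:
        preds[effect].add(cause)

    layer = {}

    def depth(node):
        if node not in layer:
            # Nodes without predecessors end up in layer 0.
            layer[node] = 1 + max((depth(p) for p in preds[node]), default=-1)
        return layer[node]

    for n in nodes:
        depth(n)
    return layer


def factor_at_narrowest_layer(edges):
    """Split the graph into two chunks at the interior layer with fewest nodes.

    Returns (upper_chunk, lower_chunk, cut_set). The cut-set nodes appear in
    both chunks, so that they can be highlighted in each rendering.
    """
    layer = layers_of(edges)
    by_layer = defaultdict(set)
    for node, k in layer.items():
        by_layer[k].add(node)

    last = max(by_layer)
    if last < 2:
        raise ValueError("graph has no interior layer at which to cut")

    # Only interior layers are candidate places for the cut.
    cut_layer = min(range(1, last), key=lambda k: len(by_layer[k]))
    cut_set = set(by_layer[cut_layer])
    upper = {n for n, k in layer.items() if k <= cut_layer}
    lower = {n for n, k in layer.items() if k >= cut_layer}
    return upper, lower, cut_set


# Hypothetical six-node graph that narrows to a single node "c".
example_edges = [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e"),
                 ("d", "f"), ("e", "f")]
upper_chunk, lower_chunk, cut_nodes = factor_at_narrowest_layer(example_edges)
```

In this toy graph the only narrow place is the single node "c", which therefore appears in both chunks. A graph with no narrow layer, or with most nodes on one side of the cut, would factor poorly in just the way described above.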

Another method of factoring consists in identifying subsystem involvement and rendering factors which concern an individual subsystem as one unit, with different units for different subsystems. Such a method was used for the WBA of the Ladbroke Grove accident (de Stefano) to factor a WB-graph whose minimal readable rendering is likely A2 as separate A4 units. (Contact the author, Ernesto de Stefano, for a copy of his report, in German.)

Some representational difficulties may be caused by orthogonal features of causal explanations, such as identifying causal factor relations on the one hand, and classifying them (say, according to latency/immediacy features; according to time of occurrence, which may well involve long intervals or recurrency for latent factors; or according to some taxonomy of human error or of organisational behavior). Attempts to represent all these features in one unit may lead to visual clutter and thereby to cognitive complexity. Some form of factoring is required in such cases.

Ideally, a graphical representation could also display the history of the analysis. For example, the method SOL and its tools (contact Dr. Babette Fahlbruch) infer the existence of causal factors in the structure of the organisations involved in an incident (called "indirect causes") from the ostensive facts concerning the progression of the incident (a "situational description") using a checklist/questionnaire style approach based on phenomenological and structural taxonomies, designed to control for the "heuristics" which lead to bias in using such "checklist" approaches to causal information gathering. (SOL is to our knowledge unique among causal analysis methods in attempting to control for heuristics.) Not all heuristics are known, and not all known heuristics have accepted controls. It might be judged that a visual rendering of the history of the analysis would allow one more easily to identify and control for heuristics whose features may not have been fully accounted for in the version of the method being used.
(Claire Blackett claire.blackett@ucd.ie)
Modularity
How modular is the method, and in what ways is it modular? Can its modularity be made to mirror the organisational division of labour and domain expertise in the user group?
(Peter Ladkin, I Made Wiryana ladkin@rvs.uni-bielefeld.de mwiryana@rvs.uni-bielefeld.de)
Reproducibility
Are the results of the method reproducible? Do different people using it independently obtain similar results for the same tasks?
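
One way to put a number on "similar results" is to compare the sets of causal factors that two analysts identify independently. The sketch below uses the Jaccard index (size of the intersection divided by size of the union). That choice of measure is our assumption rather than something this criterion prescribes, and the factor labels are purely hypothetical.

```python
def jaccard(factors_a, factors_b):
    """Overlap between two analysts' factor sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(factors_a), set(factors_b)
    if not a and not b:
        # Two empty analyses agree trivially.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical factor labels from two independent analyses of one incident.
analyst_1 = {"planning", "oversight", "execution"}
analyst_2 = {"planning", "execution", "terrain"}
agreement = jaccard(analyst_1, analyst_2)  # 2 shared factors out of 4 distinct
```

Such a figure presupposes that the analysts' factors have been mapped to a common vocabulary, which is itself a judgement call.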
(Fergus Toolan fergus.toolan@ucd.ie)
Plausibility Checks
Do there exist reasonable, quick plausibility checks on the results obtained which are independent of the tool? What ways are there of checking the "correctness" of the results?
(Luke Emmet loe@adelard.com)
Rigor
How rigorous is the method? Rigor has two relevant aspects:
- Does the method have a rigorous meaning, a formal semantics, for the key notions of cause, causal factor, root cause? Is the semantics easy to apply?
- Are the results of the method amenable to formal (mathematical) verification? To what extent is an application of the method so amenable?
(Peter Ladkin ladkin@rvs.uni-bielefeld.de)
Factor Detection
Does the method provide guidance on identifying additional causes which have not been identified on initial investigation?
(Oliver Lemke o.lemke@tu-bs.de)
Improvement Factor
What is the improvement factor, in terms of quality of analysis and expressiveness, from what could be done using other methods, or using a "naive" approach? There are at least two aspects to judging the improvement factor:
- Quality improvement. Does the method find mistakes in previous analyses? Does it find all such mistakes? Does it show definitively why these are mistakes?
- Expressive improvement. Does the method lead to a finer analysis of subdomains of causal factors, which improves targeting of prophylactic measures?
(Peter Ladkin ladkin@rvs.uni-bielefeld.de)
Evolutionary compatibility
How well does the method mesh with the "standard" methods already in use in one's organisation? How well does the method mesh with industry "best practice" to date?
(Peter Ladkin ladkin@rvs.uni-bielefeld.de)
Adaptability
How adaptable is the method to individual requirements? How well does the method accommodate changes (usually called "improvements") in subdomain characterisation? That is expressed somewhat abstractly. More concretely, by example: suppose someone comes up with a new taxonomy for management factors, say reengineers the business, or implements Professor X's taxonomy for human factors in engineering processes. How easily can the method accommodate this new taxonomy?
(Peter Ladkin ladkin@rvs.uni-bielefeld.de)
Coverage
Webster's dictionary explains coverage as "the extent or degree to which something is observed, analyzed and reported." In the case of root-cause analysis, the focus of the analysis is the detection of error/fault sources which have contributed to the accident under consideration. The term "coverage" describes in this context the extent to which the analysis method used can provide this service. Two situations can be distinguished:
1. We have only one analysis method under consideration. Then we need a list of error/fault sources to evaluate which of the error/fault sources from the list can be identified by this method.
2. We have two or more analysis methods under consideration and compare the error/fault sources which can be identified by each one.
The following questions arise in connection with determining coverage:
- Does the method identify all, or a high proportion of, root causes (completeness)?
- Does the method support, and to what extent does it support, the detection of all types of error/fault sources (e.g. technical, managerial, organisational errors/faults)?
- If the method does not support the detection of all the error/fault sources, can other, complementary, methods be used?
- If two methods support the detection of the same error/fault type, are they equally capable?
- Does the method provide hints towards finding related error/fault sources to those already found?
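
The two situations distinguished above can be sketched with plain set operations. This is a minimal illustration under assumptions: the function names coverage and compare_methods are hypothetical, the error/fault source labels are invented for the example, and treating coverage as a simple proportion of a reference list is only one possible reading.

```python
def coverage(identified, reference):
    """Situation 1: fraction of a reference list of error/fault sources
    that a single method identifies."""
    reference = set(reference)
    if not reference:
        raise ValueError("reference list must not be empty")
    return len(set(identified) & reference) / len(reference)


def compare_methods(method_a, method_b):
    """Situation 2: which error/fault sources are found by both methods,
    and which by only one of them."""
    a, b = set(method_a), set(method_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical reference list and hypothetical findings of two methods.
reference_sources = {"technical", "managerial", "organisational", "human"}
found_by_a = {"technical", "human"}
found_by_b = {"technical", "managerial"}

share_a = coverage(found_by_a, reference_sources)
comparison = compare_methods(found_by_a, found_by_b)
```

The "only_a" and "only_b" sets bear on the question of complementary methods: sources missed by one method but found by the other.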
(Jan-Tecker Gayen j.gayen@tu-bs.de)
Viewpoint Support
Does the method provide support for different viewpoints, for example
- identifying design faults
- identifying human error
(Timm Grams Timm.Grams@et.fh-fulda.de)
Documentation
How available, and of what quality, is documentation concerning the method?
(Jens Braband jens.braband@siemens.com)

Contributors

Participants of the First Bieleschweig Workshop