On February 12, 2009, about 2217 eastern standard time, a Colgan Air, Inc., Bombardier DHC-8-400, N200WQ, operating as Continental Connection flight 3407, was on an instrument approach to Buffalo-Niagara International Airport, Buffalo, New York, when it crashed into a residence in Clarence Center, New York, about 5 nautical miles northeast of the airport. The 2 pilots, 2 flight attendants, and 45 passengers aboard the airplane were killed, one person on the ground was killed, and the airplane was destroyed by impact forces and a postcrash fire. The flight was operating under the provisions of 14 Code of Federal Regulations Part 121. Night visual meteorological conditions prevailed at the time of the accident.
The National Transportation Safety Board determines that the probable cause of this accident was the captain’s inappropriate response to the activation of the stick shaker, which led to an aerodynamic stall from which the airplane did not recover. Contributing to the accident were (1) the flight crew’s failure to monitor airspeed in relation to the rising position of the low-speed cue, (2) the flight crew’s failure to adhere to sterile cockpit procedures, (3) the captain’s failure to effectively manage the flight, and (4) Colgan Air’s inadequate procedures for airspeed selection and management during approaches in icing conditions.
The safety issues discussed in the final report focus on strategies to prevent flight crew monitoring failures, pilot professionalism, fatigue, remedial training, pilot training records, airspeed selection procedures, stall training, Federal Aviation Administration (FAA) oversight, flight operational quality assurance programs, use of personal portable electronic devices on the flight deck, the FAA’s use of safety alerts for operators to transmit safety-critical information, and weather information provided to pilots. Safety recommendations concerning these issues are addressed to the FAA.
Photo: The wreckage of Continental flight 3407 (Photo credit AP)
The Organizational Influences
“Accidents come in many sizes, shapes and forms and are the result of a sequence of events or a serial development. It is now broadly recognized that accidents in complex systems occur through the concatenation of multiple factors, where each may be necessary but none alone sufficient, they are only jointly sufficient to produce the accident.
All complex systems contain such potentially multi-causal conditions, but only rarely do they arise thereby creating a possible trajectory for an accident. Often these vulnerabilities are “latent”, i.e. present in the organization long before a specific incident is triggered. Furthermore, most of them are a product of the organization itself, as a result of its design (e.g. staffing, training policy, communication patterns, hierarchical relationship,) or as a result of managerial decisions.”
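The "necessary but none alone sufficient" logic in the passage above can be made concrete with a minimal Python sketch. The factor names are hypothetical labels chosen for illustration, not taken from any accident report, and modelling an accident as a plain conjunction of factors is a deliberate simplification.

```python
# Hypothetical contributing factors, for illustration only.
REQUIRED_FACTORS = frozenset({
    "latent organizational condition",
    "local workplace precondition",
    "active failure at the sharp end",
})


def accident_occurs(present_factors):
    """Simplified model: the accident happens only when every factor is present."""
    return REQUIRED_FACTORS <= set(present_factors)


def is_sufficient_alone(factor):
    """A factor is sufficient alone if it produces the accident by itself."""
    return accident_occurs({factor})


def is_necessary(factor):
    """A factor is necessary if removing it from the full set prevents the accident."""
    return not accident_occurs(REQUIRED_FACTORS - {factor})


if __name__ == "__main__":
    for factor in sorted(REQUIRED_FACTORS):
        print(factor,
              "| necessary:", is_necessary(factor),
              "| sufficient alone:", is_sufficient_alone(factor))
    print("jointly sufficient:", accident_occurs(REQUIRED_FACTORS))
```

In this toy model every factor is necessary, none is sufficient on its own, and only the full set produces the accident, which is the pattern the quoted passage describes.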
THE ORGANIZATIONAL ACCIDENTS (1)
“Major accidents occur in complex productive systems, had been extensively investigated and the reports had made it clear that the performance of those at the sharp end (who may or may not have made errors, but mostly did) was shaped by local workplace conditions and upstream organizational factors. It became obvious that one could not give an adequate account of human error without considering these contextual system issues.”
“Short-term breaches may be created by the errors and violations of front-line operators, however, latent conditions – longer-lasting and more dangerous gaps- are created by the decisions of designers, builders, procedure writers, top-level managers and maintainers. A condition is not a cause, but it is necessary for a causal factor to have an impact. All top-level decisions seed pathogens into the system, and they need not be mistaken. The existence of latent conditions is a universal in all organizations, regardless of their accident record.”
“No one failure, human or technical, is sufficient to cause an accident. Rather, it involves the unlikely and often unforeseeable conjunction of several contributing factors arising from different levels of the system. The concurrent failure of several defenses, facilitated, and in some way prepared, by suboptimal features of the organization design, is what defines an organizational accident.”
“Organizations, whether they are a result of natural evolution or design, generally function in a hierarchical fashion. This means that actions, decisions and directives made at a higher level are passed on to a lower level, where they either are implemented directly, or interpreted in some way before they are passed on to the next level below, etc. The basic principle of organizational control is simply that higher levels control what happens at lower levels, although control more often is in terms of goals (objectives) and criteria than instructions that must be carried out to the letter.”
“The very basis for the principle used to explain accidents as failures at any one of these stages is that management decisions propagate downwards and progressively turn into productive activity; bad management decisions propagate downwards and progressively turn into unsafe activity, and possibly accidents.
However, we have known, at least since the days of David Hume (1711-1776), that causes must be prior to effects, e.g., that A must happen before B. But we also know that the temporal orderliness of two events does not mean that A necessarily is the cause of B. Such a conclusion is logically invalid and furthermore disregards the role of coincidences.”
“Accidents are due to a combination of specific events and the failure of one or more barriers – or of all barriers if they are serial rather than parallel- that should have prevented a hazard from resulting in a loss. The failed barriers can be found at any level of the organization or – what is essentially the same thing – at any stage of the developments that led to the accident. This is consistent with the view that “everybody’s blunt end is somebody else’s sharp end”.”
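The distinction between serial and parallel barriers in the passage above can be sketched in a few lines of Python. This is one reading of the quote, stated as an assumption: serial barriers stand one behind another on the same path, while parallel barriers each guard a separate path. The function names are hypothetical.

```python
def loss_with_serial_barriers(barrier_failed):
    """Serial barriers stand one behind another on the same path,
    so the hazard gets through only if every one of them fails."""
    return all(barrier_failed)


def loss_with_parallel_barriers(barrier_failed):
    """Parallel barriers each guard a separate path,
    so a single failed barrier is enough to let the hazard through."""
    return any(barrier_failed)


if __name__ == "__main__":
    # True means "this barrier failed".
    state = [True, True, False]
    print("serial:", loss_with_serial_barriers(state))
    print("parallel:", loss_with_parallel_barriers(state))
```

With the same three barriers, two of which have failed, the serial arrangement still holds while the parallel arrangement results in a loss.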
“The understanding of how accidents occur has during the last eighty years or so undergone a rather dramatic development. The initial view of accidents as the natural culmination of a series of events or circumstances, which invariably occur in a fixed and logical order (Heinrich, 1931), has in stages been replaced by a systemic view according to which accidents result from an alignment of conditions and occurrences each of which is necessary, but none alone sufficient (e.g., Bogner, 2002).
Indeed, it may even be argued that the adaptability and flexibility of human performance is the reason both for its efficiency and for the failures that occur, although it is rarely the cause of the failures. In that sense even serious accidents may sometimes happen even though nothing failed as such.
Adopting this view clearly defeats conventional accident models, according to which accidents are due to certain (plausible) combinations of failures. This is the logic of functions as represented, e.g., by the fault tree. But the fault tree only shows representative accidents. The more unusual accidents cannot be captured by a fault tree, one reason being that there are too many conjunctive conditions. What we see in accidents is that confluences occur, and predictive accident models must therefore not only recognize that confluences occur but also provide a plausible explanation of why they happen. If we relax the requirement that every accident must involve the failure of one or more barriers, the inescapable conclusion is that we need accident analysis methods that look equally to individual as to organizational influences. In other words, models of “human error” and organizational failures must be complemented by something that could be called socio-technical or systemic accident models.
This line of thinking corresponds to the Swedish MTO model- Människa (Man) – Teknik (Technology) – Organisation. MTO considers accidents are due to a combination of human, technological and organizational factors giving the three groups equal importance. It promotes a view of accidents as due to a combination of the three groups related to performance variability. Performance variability management accepts the fact that accidents cannot be explained in simplistic cause-effect terms, but that instead, they represent the outcome of complex interactions and coincidences which are due to the normal performance variability of the system, rather than actual failures of components or functions. (One may, of course, consider actual failures as an extreme form of performance variability, i.e., the tail end of a distribution.) To prevent accidents there is therefore, a need to be able to describe the characteristic performance variability of a system, how such coincidences may build up, and how they can be detected. This reflects the practical lesson that simply finding one or more “root” causes in order to eliminate or encapsulate it is inadequate to prevent future accidents. Even in relatively simple systems, new cases continue to appear, despite the best efforts to the contrary.”
WHY DO AIRCRAFT CRASH? (2)
“The annals of aviation history are littered with accidents and tragic losses. Since the late 1950s, however, the drive to reduce the accident rate has yielded unprecedented levels of safety to a point where it is now safer to fly in a commercial airliner than to drive a car or even walk across a busy New York city street. Still, while the aviation accident rate has declined tremendously since the first flights nearly a century ago, the cost of aviation accidents in both lives and dollars has steadily risen. As a result, the effort to reduce the accident rate still further has taken on new meaning within both military and civilian aviation.
Even with all the innovations and improvements realized in the last several decades, one fundamental question remains generally unanswered: “Why do aircraft crash?” The answer may not be as straightforward as one might think. In the early years of aviation, it could reasonably be said that, more often than not, the aircraft killed the pilot. That is, the aircraft were intrinsically unforgiving and, relative to their modern counterparts, mechanically unsafe. However, the modern era of aviation has witnessed an ironic reversal of sorts. It now appears to some that the aircrew themselves are more deadly than the aircraft they fly (Mason, 1993; cited in Murray, 1997). In fact, estimates in the literature indicate that between 70 and 80 percent of aviation accidents can be attributed, at least in part, to human error (Shappell & Wiegmann, 1996). Still, to off-handedly attribute accidents solely to aircrew error is like telling patients they are simply “sick” without examining the underlying causes or further defining the illness.
So what really constitutes that 70-80% of human error repeatedly referred to in the literature? Some would have us believe that human error and “pilot” error are synonymous. Yet, simply writing off aviation accidents merely to pilot error is an overly simplistic, if not naive, approach to accident causation. After all, it is well established that accidents cannot be attributed to a single cause, or in most instances, even a single individual (Heinrich, Petersen, and Roos, 1980). In fact, even the identification of a “primary” cause is fraught with problems. Rather, aviation accidents are the end result of a number of causes, only the last of which are the unsafe acts of the aircrew (Reason, 1990; Shappell & Wiegmann, 1997a; Heinrich, Petersen, & Roos, 1980; Bird, 1974).”
The Human Factors Analysis and Classification System (HFACS) describes four levels of failure: 1) Unsafe Acts, 2) Preconditions for Unsafe Acts, 3) Unsafe Supervision, and 4) Organizational Influences.
What some call the root cause is NEVER in the airman; it is in the organization.
The Organizational Influences leading to the Unsafe Supervision behind the Preconditions for Unsafe Acts of Air Crew.
Photo: The Asiana Airlines Boeing 777 plane after it crashed while landing in San Francisco. Photograph: Jed Jacobsohn/Reuters
“Fallible decisions of upper-level management directly affect supervisory practices, as well as the conditions and actions of operators. Unfortunately, these organizational errors often go unnoticed. Generally speaking, the most elusive of latent failures revolve around issues related to resource management, organizational climate, and operational processes.
1. Resource Management. This category encompasses the realm of corporate-level decision making regarding the allocation and maintenance of organizational assets such as human resources (personnel), monetary assets, and equipment/facilities. Generally, corporate decisions about how such resources should be managed center around two distinct objectives – the goal of safety and the goal of on-time, cost effective operations. In times of prosperity, both objectives can be easily balanced and satisfied in full. However, there may also be times of fiscal austerity that demand some give and take between the two. Unfortunately, history tells us that safety is often the loser in such battles and, as some can attest to very well, safety and training are often the first to be cut in organizations having financial difficulties. If cutbacks in such areas are too severe, flight proficiency may suffer, and the best pilots may leave the organization for greener pastures.
Excessive cost-cutting could also result in reduced funding for new equipment or may lead to the purchase of equipment that is suboptimal and inadequately designed for the type of operations flown by the company. Other trickle-down effects include poorly maintained equipment and workspaces, and the failure to correct known design flaws in existing equipment. The result is a scenario involving unseasoned, less-skilled pilots flying old and poorly maintained aircraft under the least desirable conditions and schedules. The ramifications for aviation safety are not hard to imagine.
2. Organizational Climate refers to a broad class of organizational variables that influence worker performance. Formally, it was defined as the “situationally based consistencies in the organization’s treatment of individuals” (Jones, 1988). In general, however, organizational climate can be viewed as the working atmosphere within the organization.
One telltale sign of an organization’s climate is its structure, as reflected in the chain-of-command, delegation of authority and responsibility, communication channels, and formal accountability for actions. Just like in the cockpit, communication and coordination are vital within an organization. If management and staff within an organization are not communicating, or if no one knows who is in charge, organizational safety clearly suffers and accidents do happen (Muchinsky, 1997).
An organization’s policies and culture are also good indicators of its climate. Policies are official guidelines that direct management’s decisions about such things as hiring and firing, promotion, retention, raises, sick leave, drugs and alcohol, overtime, accident investigations, and the use of safety equipment. Culture, on the other hand, refers to the unofficial or unspoken rules, values, attitudes, beliefs, and customs of an organization. Culture is “the way things really get done around here.”
When policies are ill-defined, adversarial, or conflicting, or when they are supplanted by unofficial rules and values, confusion abounds within the organization. [...] However, the Third Law of Thermodynamics tells us that “order and harmony cannot be produced by such chaos and disharmony”. Safety is bound to suffer under such conditions.
3. Operational Process. This category refers to corporate decisions and rules that govern the everyday activities within an organization, including the establishment and use of standardized operating procedures and formal methods for maintaining checks and balances (oversight) between the workforce and management. For example, such factors as operational tempo, time pressures, incentive systems, and work schedules are all factors that can adversely affect safety. As stated earlier, there may be instances when those within the upper echelon of an organization determine that it is necessary to increase the operational tempo to a point that overextends a supervisor’s staffing capabilities.
Therefore, a supervisor may resort to the use of inadequate scheduling procedures that jeopardize crew rest and produce sub-optimal crew pairings, putting aircrew at an increased risk of a mishap. However, organizations should have official procedures in place to address such contingencies as well as oversight programs to monitor such risks.
Regrettably, not all organizations have these procedures nor do they engage in an active process of monitoring aircrew errors and human factor problems via anonymous reporting systems and safety audits. As such, supervisors and managers are often unaware of the problems before an accident occurs. Indeed, it has been said that “an accident is one incident too many” (Reinhart, 1996). It is incumbent upon any organization to fervently seek out the operational dangers and risks and plug them up before they create a window of opportunity for catastrophe to strike.
The Unsafe Supervision behind the Preconditions for Unsafe Acts of Air Crew
Recall that in addition to those causal factors associated with the pilot/operator, Reason (1990) traced the causal chain of events back up the supervisory chain of command. As such, we have identified four categories of unsafe supervision: inadequate supervision, planned inappropriate operations, failure to correct a known problem, and supervisory violations.
1. Inadequate Supervision. The role of any supervisor is to provide the opportunity to succeed. To do this, the supervisor, no matter at what level of operation, must provide guidance, training opportunities, leadership, and motivation, as well as the proper role model to be emulated. Unfortunately, this is not always the case. Sound professional guidance and oversight is an essential ingredient of any successful organization. While empowering individuals to make decisions and function independently is certainly essential, this does not divorce the supervisor from accountability. The lack of guidance and oversight has proven to be the breeding ground for many of the violations that have crept into the cockpit.
Some examples of inadequate supervision are (not limited to):
- Failed to provide guidance
- Failed to provide operational doctrine
- Failed to provide oversight
- Failed to provide training
- Failed to track qualifications
- Failed to track performance
2. Planned Inappropriate Operations. Occasionally, the operational tempo and/or the scheduling of aircrew is such that individuals are put at unacceptable risk, crew rest is jeopardized, and ultimately performance is adversely affected. Such operations, though arguably unavoidable during emergencies, are unacceptable during normal operations. Therefore, the second category of unsafe supervision, planned inappropriate operations, was created to account for these failures.
Some examples of inappropriate planned operations are (not limited to):
- Failed to provide correct data
- Failed to provide adequate brief time
- Improper manning
- Mission not in accordance with rules/regulations
- Provided inadequate opportunity for crew rest
3. Failure to Correct a Known Problem. The third category of known unsafe supervision, Failed to Correct a Known Problem, refers to those instances when deficiencies among individuals, equipment, training or other related safety areas are “known” to the supervisor, yet are allowed to continue unabated. The failure to correct the behavior, either through remedial training or, if necessary, removal from flight status, and the failure to consistently correct or discipline inappropriate behavior certainly foster an unsafe atmosphere and promote the violation of rules.
Some examples of failure to correct a known problem are (not limited to):
- Failed to correct document in error
- Failed to identify an at-risk aviator
- Failed to initiate corrective action
- Failed to report unsafe tendencies
4. Supervisory Violations. Supervisory violations, on the other hand, are reserved for those instances when existing rules and regulations are willfully disregarded by supervisors. Supervisors have been known occasionally to violate the rules and doctrine when managing their assets. For instance, there have been occasions when individuals were permitted to operate an aircraft without current qualifications or license. Likewise, it can be argued that failing to enforce existing rules and regulations or flaunting authority are also violations at the supervisory level. While rare and possibly difficult to cull out, such practices are a flagrant violation of the rules and invariably set the stage for the tragic sequence of events that predictably follow.
Some examples of supervisory violations are (not limited to):
- Authorized unnecessary hazard
- Failed to enforce rules and regulations
- Authorized unqualified crew for flight
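The two upper HFACS levels described above can be laid out as a small lookup table. The following Python sketch is illustrative only: the level and category names come from the text, the example lists are abbreviated, and the `classify` helper and its exact-match rule are hypothetical conveniences, not part of HFACS itself.

```python
# Levels and categories as named in the text; example lists are abbreviated.
HFACS_UPPER_LEVELS = {
    "Organizational Influences": {
        "Resource Management": [],
        "Organizational Climate": [],
        "Operational Process": [],
    },
    "Unsafe Supervision": {
        "Inadequate Supervision": [
            "Failed to provide guidance",
            "Failed to provide training",
        ],
        "Planned Inappropriate Operations": [
            "Improper manning",
            "Provided inadequate opportunity for crew rest",
        ],
        "Failure to Correct a Known Problem": [
            "Failed to identify an at-risk aviator",
            "Failed to initiate corrective action",
        ],
        "Supervisory Violations": [
            "Authorized unnecessary hazard",
            "Authorized unqualified crew for flight",
        ],
    },
}


def classify(finding):
    """Return (level, category) for a finding that matches a listed example, else None."""
    wanted = finding.strip().lower()
    for level, categories in HFACS_UPPER_LEVELS.items():
        for category, examples in categories.items():
            if wanted in (example.lower() for example in examples):
                return level, category
    return None


if __name__ == "__main__":
    print(classify("Authorized unqualified crew for flight"))
    print(classify("Provided inadequate opportunity for crew rest"))
```

A real investigation would not match findings by exact wording; the point of the sketch is only that each finding is traced upward to a supervisory or organizational category rather than stopping at the aircrew.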
No one thing “causes” accidents. Accidents are produced by the confluence of multiple events, task demands, actions taken or not taken, and environmental factors. Each accident has unique surface features and combinations of factors. What some call the root cause is NEVER in the airman; it is in the organization.
To be continued on Normalization of Deviance: when non-compliance becomes the “new normal”
- Revisiting the «Swiss Cheese» Model of Accidents. J. Reason, E. Hollnagel, J. Paries. European Organisation for the Safety of Air Navigation (EUROCONTROL), Eurocontrol Experimental Centre. EEC Note No. 13/06, Project Safbuild. October 2006.
- The Human Factors Analysis and Classification System (HFACS). Scott A. Shappell, Douglas A. Wiegmann. U.S. Department of Transportation, Federal Aviation Administration. DOT/FAA/AM-00/7. February 2000.
- Loss of Control on Approach, Colgan Air, Inc., Operating as Continental Connection Flight 3407, Bombardier DHC-8-400, N200WQ, Clarence Center, New York, February 12, 2009. Accident Report NTSB/AAR-10/01, PB2010-910401. National Transportation Safety Board.
Not long ago I was saying to a dear friend that serious incidents and accidents do not occur overnight; they are just the tip of the iceberg. They are the result of decisions that create conditions and situations that remain dormant in the environment for a long time, waiting for someone to put the last link in the chain. Every day our flight crews keep this chain from being completed, until one day some of them will not be able to.
There is no doubt that flight crews should be responsible for their actions, and we expect them to do their work with professionalism, to study a lot, to know their aircraft deeply, to be disciplined, to adhere to the standard operating procedures, to take care of themselves, to sleep well and long enough, and not to self-medicate. But one cannot simply decide that they must bear the blame for the mistakes and failures of an entire system.
The root cause is NEVER in the airman; it is in the organization.
By Laura Duque-Arrubla, a medical doctor with postgraduate studies in Aviation Medicine, Human Factors and Aviation Safety. In the aviation field since 1988, Human Factors instructor since 1994. Follow me on Facebook (Living Safely with Human Error) and Twitter (@dralaurita) for Human Factors information almost every day.