When the automated system fails and no human with sufficient competence exists to intervene, the failure is catastrophic by definition.
This paper does not argue from theory. It argues from the record. The case studies documented here share a single structural feature: an automated system failed outside its design parameters, and no human with sufficient competence existed to intervene effectively. The failures were catastrophic not because the automated systems were poorly designed, but because the human redundancy that would have contained the failure had been eliminated.
This is Stage 3 of the collapse gradient. Stages 0 and 1 (HC-020) describe how human capability erodes under automation. Stage 2 (HC-021) describes how tacit knowledge fails to transfer. Stage 3 is where the consequence arrives: the automated system encounters a condition outside its training distribution, and the human backup that was assumed to exist does not.
Human redundancy has been eliminated. The automated system handles all normal operations. When the system encounters conditions outside its training distribution — novel inputs, cascading failures, adversarial conditions — no humans with sufficient competence to intervene remain in the loop.
The critical distinction: the humans may still be present. They may have formal authority to intervene. They may have override controls available. What they lack is the practiced competence to use those controls effectively under the time pressure and cognitive load of a system failure. Authority without capability is not redundancy. It is theater.
The SEC/CFTC joint report, “Findings Regarding the Market Events of May 6, 2010,” documents what was then the most rapid large-scale disruption in U.S. market history. In approximately 36 minutes, the Dow Jones Industrial Average fell nearly 1,000 points — roughly 9% — and recovered most of the loss. Individual securities traded at prices ranging from one penny to over $100,000. Approximately $1 trillion in market value temporarily vanished.
The Flash Crash was a Stage 3 event in miniature. Automated trading systems interacted in ways that no individual system designer anticipated. The speed of the cascade exceeded human reaction time. The human traders who remained on the floor could observe the collapse but could not intervene at the speed required to arrest it. The eventual stabilization required automated circuit breakers — more automation to contain the failure of automation, because the human layer was too slow to function as a backup.
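To make the containment mechanism concrete, here is a minimal sketch of a price-band circuit breaker: trading halts automatically when price moves beyond a band within a short lookback window. The class, parameters, and thresholds below are illustrative assumptions, not an exchange's actual rule set; real rules, such as the limit-up/limit-down bands adopted after 2010, differ in detail.

```python
# Illustrative sketch of a price-band circuit breaker. All parameters are
# invented for demonstration; real exchange rules differ.

from collections import deque


class CircuitBreaker:
    def __init__(self, band_pct: float = 10.0, window_s: float = 5.0):
        self.band_pct = band_pct    # maximum allowed move within the window
        self.window_s = window_s    # lookback window in seconds
        self.prices = deque()       # (timestamp, price) pairs
        self.halted = False

    def on_trade(self, price: float, now: float) -> None:
        self.prices.append((now, price))
        # Drop trades that have aged out of the lookback window.
        while self.prices and now - self.prices[0][0] > self.window_s:
            self.prices.popleft()
        reference = self.prices[0][1]
        move_pct = abs(price - reference) / reference * 100
        if move_pct > self.band_pct:
            # The halt fires in the same code path as the trade itself,
            # at machine speed; no human decision is in the loop.
            self.halted = True


breaker = CircuitBreaker()
breaker.on_trade(100.0, now=0.0)
breaker.on_trade(88.0, now=1.5)   # a 12% move in 1.5 seconds trips the halt
print(breaker.halted)             # True
```

The design point is the one the Flash Crash forced: the check runs inside the trading path itself, because any backstop that waits on a human decision operates too slowly to matter.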
SEC Release No. 34-70694 (October 16, 2013) documents the Knight Capital incident in regulatory detail. A software deployment error activated obsolete trading code on Knight’s automated systems. In 45 minutes, the system executed erroneous trades that produced $440 million in losses — exceeding the firm’s total capital. Knight Capital was effectively destroyed as an independent entity and was acquired by Getco LLC.
The structural lesson: Knight Capital had human operators monitoring the system. Those operators detected the anomaly within minutes. But the automated system was executing thousands of trades per second. The time required for a human to diagnose the problem, determine the correct response, and implement it exceeded the time in which the damage became fatal. The humans were present, aware, and acting — and it was not enough, because the system operated at a speed that made human intervention structurally insufficient.
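The structural insufficiency can be shown with arithmetic alone. Using only the aggregate figures cited above ($440 million over 45 minutes), the sketch below computes the average loss rate and the losses accumulated across a hypothetical human response timeline; the stage durations are assumptions for illustration, not figures from the SEC release.

```python
# Back-of-the-envelope sketch using the aggregate figures cited above:
# $440M lost over 45 minutes. The per-second rate is an average (actual
# losses were not uniform), and the response-stage durations below are
# assumed for illustration, not taken from the SEC release.

TOTAL_LOSS_USD = 440e6          # SEC Release No. 34-70694
INCIDENT_DURATION_S = 45 * 60   # 45 minutes

avg_loss_per_second = TOTAL_LOSS_USD / INCIDENT_DURATION_S  # ~$163,000/s

# Hypothetical human intervention timeline (assumed durations):
response_stages = [
    ("detect the anomaly", 2 * 60),
    ("diagnose the root cause", 10 * 60),
    ("decide on a response", 3 * 60),
    ("implement the shutdown", 2 * 60),
]

elapsed_s = 0
for stage, duration_s in response_stages:
    elapsed_s += duration_s
    loss_m = elapsed_s * avg_loss_per_second / 1e6
    print(f"{stage:<26} t = {elapsed_s / 60:>2.0f} min, "
          f"cumulative loss ~ ${loss_m:,.0f}M")
```

Under these assumptions, even a fast and correct 17-minute human response leaves losses on the order of $165 million. The point is not the specific numbers but the structure: at machine speeds, minute-scale human response is not a meaningful backstop.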
The Boeing 737 MAX crashes — Lion Air Flight 610 (October 2018) and Ethiopian Airlines Flight 302 (March 2019) — killed 346 people. The NTSB and FAA investigations documented a specific Stage 3 mechanism: the Maneuvering Characteristics Augmentation System (MCAS) received erroneous angle-of-attack data and repeatedly pushed the aircraft nose down. The pilots had formal authority to override MCAS. They had physical access to the override controls. What they lacked was practiced familiarity with MCAS failure modes: the training needed to diagnose and correct the problem under the extreme time pressure and cognitive load of a nose-down emergency at low altitude.
The 737 MAX case is the clearest illustration of the distinction between authority and capability. Boeing’s design assumed that pilots could override MCAS. The assumption was formally correct — the override procedure existed. But the assumption that pilots would have the practiced competence to execute the procedure under emergency conditions was not validated. The human backup was assumed, not ensured.
The 737 MAX investigation revealed a pattern that generalizes beyond aviation: system designers assume human override capability that the training system does not produce. The pilots were trained on the 737 MAX with minimal MCAS-specific instruction, because MCAS was designed to operate transparently. When it failed non-transparently, the pilots encountered a system behavior they had not practiced responding to, under conditions that left no time to learn.
The Northeast Blackout of 2003 affected approximately 55 million people across the northeastern and midwestern United States and Ontario, Canada. The US-Canada Power System Outage Task Force investigation documented a cascading failure that began with software bugs in the alarm system at FirstEnergy Corporation in Ohio. Operators were unaware that their monitoring systems had failed. Without accurate real-time data, they could not detect the developing cascade until it was beyond manual intervention.
The blackout illustrates a variant of Stage 3: the human operators were competent but blind. The automated monitoring system that they depended on for situational awareness failed silently. By the time the operators realized the monitoring system was not functioning, the grid had already entered a cascading failure state that exceeded the capacity of manual intervention. The human backup existed but could not function because it depended on the automated system it was supposed to back up.
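The design lesson generalizes: a backup must not depend on the system it backs up. Below is a minimal sketch of an independent heartbeat watchdog, the kind of check that would have revealed a silently failed alarm system; the interface and the 30-second threshold are illustrative assumptions, not details from the task force report.

```python
# Sketch of an independent heartbeat watchdog, assuming the alarm system can
# be instrumented to check in on every processing cycle. The interface and
# threshold are illustrative assumptions.

import time


class AlarmWatchdog:
    """Runs separately from the alarm system it monitors."""

    def __init__(self, max_silence_s: float = 30.0):
        self.max_silence_s = max_silence_s
        self.last_heartbeat = time.monotonic()

    def heartbeat(self) -> None:
        # Called by the alarm system each time it completes a cycle.
        self.last_heartbeat = time.monotonic()

    def alarm_system_alive(self) -> bool:
        # Silence beyond the threshold is itself treated as an alarm:
        # the watchdog does not depend on the system it monitors to
        # report its own failure.
        return time.monotonic() - self.last_heartbeat < self.max_silence_s
```

The choice that matters is independence: the watchdog signals operators from outside the failed system, restoring the situational awareness the FirstEnergy operators lacked.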
These four cases share a structural pattern that defines Stage 3:
1. The automated system fails outside its design envelope. Interacting algorithms in the Flash Crash. Obsolete code activation at Knight Capital. Erroneous sensor data in MCAS. Silent alarm failure at FirstEnergy. In each case, the failure mode was not one the system was designed to handle gracefully.
2. Human override exists formally but not functionally. Traders could theoretically stop trading. Operators could theoretically kill the process. Pilots could theoretically trim manually. Grid operators could theoretically re-route power. In each case, the practical conditions — speed, information, practiced competence — made the formal override insufficient.
3. The result is catastrophic. $1 trillion in temporary market disruption. $440 million in losses destroying a firm. 346 deaths. 55 million people without power. The failures are not proportional to the triggering error. They are proportional to the absence of effective human redundancy.
The human was in the loop. The human had authority. The human could not act effectively. This is what single-point fragility looks like: not the absence of a human, but the absence of a competent one.
Stage 3 can be detected before catastrophic failure through the following indicators:
Post-failure recovery time. Increasing time required for human operators to restore normal function after automated system failures. If recovery time is growing, human competence relative to system complexity is shrinking. (A minimal tracking sketch follows this list.)
Incident report language. “No human available with sufficient expertise” or equivalent language in post-incident analyses. This language signals that the authority-capability gap documented in the 737 MAX case is present.
Explainability decline. Declining ability of human operators to explain why the automated system made specific decisions. If practitioners cannot explain the system’s behavior under normal conditions, they cannot diagnose its behavior under failure conditions.
Override practice frequency. How often human operators practice manual override of automated systems. Aviation recognized this indicator and issued formal guidance calling for manual flight practice (FAA AC 120-111). Most other domains have not.
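As promised above, here is a minimal sketch of how the first indicator could be tracked, assuming incident logs that record recovery times in chronological order; the data and the zero-slope threshold are invented for illustration.

```python
# Sketch of tracking the post-failure recovery time indicator. Incident data
# and the threshold are invented for illustration.
# (statistics.linear_regression requires Python 3.10+.)

from statistics import linear_regression

# (incident number, minutes to restore normal function after the failure)
incidents = [(1, 18), (2, 22), (3, 21), (4, 30), (5, 34), (6, 41)]

x = [number for number, _ in incidents]
y = [minutes for _, minutes in incidents]

slope, intercept = linear_regression(x, y)

# A persistently positive slope is the Stage 3 signal: operators take
# longer to restore normal function after each automated-system failure.
if slope > 0:
    print(f"Recovery time growing by ~{slope:.1f} min per incident: "
          "human competence relative to system complexity is shrinking.")
```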
Stage 3 produces the evidence that Stages 0–2 were operating — but the evidence arrives as catastrophe. The question for HC-023 (The Common Faculty Problem) is why the current wave of AI automation creates a risk at Stage 4 that prior automation waves did not: not fragility in one domain, but fragility across the cognitive substrate that all domains share.
Internal: This paper is part of The Collaboration (HC series), Saga XI. It draws on and contributes to the argument documented across 31 papers in 2 series.
External references for this paper are in development. The Institute’s reference program is adding formal academic citations across the corpus. Priority papers (P0/P1) have complete references sections.