The Measurement Crisis · Paper IV · Synthesis

The Engagement Metric

Goodhart’s Law in the Attention Economy — Four Platforms, Four Metric Shifts, One Mechanism

The Institute for Cognitive Sovereignty · 2026 · Research Paper

CSI-2026-MC-004 · Published March 3, 2026 · 20 min read
5×
Increase in anger-inducing and divisive content driven by Facebook's engagement algorithm — internal research, 2021
4
Major platforms that each optimized a different engagement proxy — producing the same outcome
1975
Year Charles Goodhart made the observation later named Goodhart's Law — the mechanism now runs on three billion users
“When a measure becomes a target, it ceases to be a good measure.”
— Marilyn Strathern, Improving Ratings: Audit in the British University System, 1997 (formalizing Charles Goodhart's 1975 observation)
Section I

Goodhart’s Law — The General Principle

Charles Goodhart was a Bank of England economist who observed in 1975 that monetary policy had a systematic problem: whatever measure the bank chose as its policy target tended to decouple from the economic phenomenon it was meant to track. The mechanism was predictable. When a statistical measure was used to regulate economic behavior, economic actors had incentives to optimize for the measure. Their optimization distorted the statistical relationship between the measure and the underlying phenomenon. The measure then tracked the optimization rather than the phenomenon it was designed to capture.

Goodhart's observation was formalized by anthropologist Marilyn Strathern in 1997 into the form now known as Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." The law is not a statement about bad faith. It describes a structural feature of measurement in institutional contexts where actors are evaluated against metrics they can influence. The distortion occurs whether or not any individual actor intends to deceive. It is a consequence of incentive alignment, not of dishonesty.

The three preceding papers in this series documented Goodhart's Law at work in GDP, BMI, and standardized test scores. This paper documents it in the domain where the mechanism operates at the largest scale: digital engagement metrics. The platforms that organize the information environment for three billion people optimize for engagement measures. Engagement optimization has produced content environments that, by the platforms' own internal research, degrade the mental health, social epistemology, and political discourse of the populations whose engagement they are maximizing. The metric became the target. The target destroyed the value the metric was meant to represent.


Section II

The Engagement Metric Defined — What Platforms Actually Measure

"Engagement" in the context of digital platforms refers to a composite of behavioral signals: time spent on platform, clicks, likes, comments, shares, reactions, and responses. Each of these signals is observable and quantifiable. Each correlates, at the platform level, with advertising revenue: more time spent and more interactions mean more opportunities to serve advertisements. The engagement metric was designed as an advertising revenue proxy. It was not designed to measure the value delivered to users.

This is the foundational category error. The platforms present engagement metrics to advertisers, analysts, and regulators as evidence that users find the platform valuable. Users who spend four hours per day on a platform are presumed to find it valuable. But engagement does not distinguish between time spent in a state of pleasure, learning, or genuine social connection and time spent in a state of anxiety, compulsive scrolling, or outrage activation. Both produce the same engagement signal. The platform cannot, from engagement data alone, tell the difference between a user who is satisfied and a user who is distressed. Optimization for engagement optimizes for both.

The consequence is predictable from Goodhart's Law: platforms developed recommendation algorithms that maximized engagement as measured. The most engagement-producing content, across platforms and over time, turned out to be content that activated strong negative emotions — outrage, anxiety, tribal identity threat, fear. Content that produced calm, satisfaction, or genuine learning generated less sustained engagement than content that produced emotional arousal. The algorithms, optimizing for engagement, amplified the arousal-producing content. The metric did not distinguish between flourishing and distress. The algorithm produced what the metric rewarded.
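
The structural problem can be made concrete with a minimal sketch. The ranking logic below is hypothetical (the signal names and weights are invented for illustration, not drawn from any platform's production system), but it shows the relevant property: the score is a weighted sum of behavioral counters, and nothing in it registers whether the attention being counted was satisfied or distressed.

```python
from dataclasses import dataclass

# Hypothetical signal weights -- illustrative only, not any platform's real values.
WEIGHTS = {
    "seconds_on_item": 0.01,
    "clicks": 1.0,
    "likes": 1.5,
    "comments": 3.0,
    "shares": 5.0,
}

@dataclass
class ItemSignals:
    item_id: str
    seconds_on_item: float
    clicks: int
    likes: int
    comments: int
    shares: int

def engagement_score(item: ItemSignals) -> float:
    """Composite engagement: a weighted sum of behavioral signals.

    Note what is absent: no term distinguishes satisfied attention from
    anxious or outraged attention. Both raise the same counters."""
    return sum(WEIGHTS[name] * getattr(item, name) for name in WEIGHTS)

def rank_feed(candidates: list[ItemSignals]) -> list[ItemSignals]:
    # The feed surfaces whatever maximizes the measured proxy.
    return sorted(candidates, key=engagement_score, reverse=True)
```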


Section III

Facebook: The Anger Amplifier — Internal Research and the 5× Finding

In 2021, the Wall Street Journal published the "Facebook Files" — a series of investigative reports based on internal company research documents obtained by whistleblower Frances Haugen. Among the most significant findings: Facebook's own data scientists had documented that its recommendation algorithm significantly increased the prevalence of anger-inducing and politically divisive content in users' news feeds relative to a chronological control condition. The documented increase was approximately 5× for content classified as high-anger or divisive. Facebook researchers had identified the problem. Facebook management had declined to implement the proposed fixes.

The mechanism was documented in Facebook's internal research: the platform had added a five-emoji reaction system in 2016, including an "angry" reaction. Data scientists observed that the angry reaction was statistically more likely to be followed by additional engagement — more commenting, sharing, and further interaction — than other reactions. The algorithm, optimizing for engagement, learned to weight angry-reaction content more heavily in recommendations. The result was a feed that systematically surfaced content optimized for outrage production, because outrage production drove the engagement the algorithm was trained to maximize.
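
A schematic version of that weighting decision, with invented numbers, illustrates the dynamic. The reaction weights below are placeholders, not Facebook's actual coefficients; the point is only that once a reaction type predicts downstream engagement, the ranking objective rewards content that provokes it.

```python
# Hypothetical reaction weights for feed ranking. The values are placeholders,
# not Facebook's actual coefficients.
REACTION_WEIGHTS = {
    "like": 1.0,
    "love": 1.0,
    "haha": 1.0,
    "wow": 1.0,
    "sad": 1.0,
    "angry": 5.0,  # weighted up because it predicts more downstream engagement
}

def reaction_weighted_score(reaction_counts: dict[str, int]) -> float:
    """Score a post by its reactions, each weighted by how well that reaction
    predicts further engagement (comments, shares, return visits)."""
    return sum(REACTION_WEIGHTS.get(r, 1.0) * n for r, n in reaction_counts.items())

# A post that provokes anger outranks one that produces quiet satisfaction,
# even with fewer total reactions.
outrage_post = {"angry": 40, "like": 10}
pleasant_post = {"like": 120, "love": 20}
assert reaction_weighted_score(outrage_post) > reaction_weighted_score(pleasant_post)
```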

Facebook researchers proposed in 2018 to reduce the engagement weighting of the angry reaction. Management declined the proposal, citing concerns that it would reduce overall platform engagement. The researchers' analysis — that optimizing for angry-reaction engagement was producing a feed that degraded user wellbeing and political discourse — was available to the executives who made that decision. The decision to maintain the algorithm was made with knowledge of the consequence. The engagement metric was not ignorantly misapplied. It was deliberately retained when its effects were documented.


Section IV

YouTube: The Rabbit Hole — Watch Time and the Radicalization Pathway

YouTube's recommendation algorithm optimizes for watch time — the aggregate amount of time viewers spend watching content recommended by the system. Watch time is the platform's core engagement metric; it directly determines advertising revenue. YouTube's algorithm is among the most powerful recommendation systems ever built, responsible for approximately 70% of time spent on the platform according to the company's own estimates.

Research on YouTube's recommendation sequences — what videos the algorithm recommends after a user watches a particular video — has documented a consistent directional pattern: recommendations systematically move toward more extreme, emotionally activating, and conspiracy-adjacent content regardless of the user's initial viewing. A former engineer on YouTube's recommendation team argued publicly in 2019 that the algorithm drove viewers toward progressively more radical content because radical content produced longer watch times; YouTube disputed the claim. Independent researchers subsequently documented the directional pattern through large-scale external audits of recommendation trails (Ribeiro et al., 2020).

The mechanism is again Goodhart's Law: watch time is a proxy for the value users receive from the platform. Extended watch time was treated as evidence that users found content valuable. Radical, outrage-producing, emotionally activating content extends watch time relative to calm, informational, or balanced content — not because users value it more, but because it produces stronger emotional engagement that is neurobiologically more difficult to interrupt. The platform optimized for watch time and produced content environments that kept users watching for reasons that did not correspond to the value the watch-time metric was designed to signal.
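
A toy version of a watch-time objective makes the conflation visible. The features and coefficients below are invented; no claim is made about YouTube's actual model. What matters is that a feature capturing emotional arousal and a feature capturing informational quality both extend watch time in the training data, so both earn positive weight, and the ranker has no way to prefer one over the other.

```python
# Sketch of a watch-time-optimizing recommender. Features and coefficients
# are invented for illustration; they are not YouTube's model.
def predicted_watch_seconds(video: dict) -> float:
    """A toy linear model of expected watch time, fit (hypothetically) on
    historical watch logs. Emotional arousal extends watch time in the
    training data, so it earns a positive coefficient just as informational
    quality does -- the objective cannot tell the two apart."""
    return (
        30.0
        + 90.0 * video["emotional_arousal"]   # outrage and fear keep people watching
        + 60.0 * video["topical_relevance"]
        + 20.0 * video["production_quality"]
    )

def next_recommendation(candidates: list[dict]) -> dict:
    # Recommend whatever maximizes expected watch time.
    return max(candidates, key=predicted_watch_seconds)
```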


Section V

Twitter: The Outrage Dynamic — Retweets and Emotional Contagion

Twitter's core engagement metric is the retweet — the forwarding of another user's message to one's own followers. Retweet rates determine what content is amplified by the platform's recommendation systems and what content remains in relative obscurity. Twitter optimizes for retweet-producing content because retweets drive user acquisition (users are introduced to new accounts through retweeted content) and user retention (users return to the platform to see what has been retweeted about topics they follow).

Research on emotional contagion in social networks — including work by Brady et al. published in PNAS — has documented that moral-emotional language in tweets substantially increases retweet rates. Content that frames information in terms of moral violations, tribal threat, disgust, or outrage spreads faster and further than equivalent information presented in neutral terms. The mechanism is not irrational: moral-emotional content activates strong responses that motivate sharing. The algorithm rewards the activation. The result is an information environment systematically tilted toward content that frames events in the most emotionally activating moral terms available.
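
The diffusion effect can be sketched as a simple branching cascade in which each exposed follower shares with a probability that rises with the number of moral-emotional words in the post. The boost per word and the network parameters below are illustrative assumptions in the spirit of the Brady et al. finding, not estimates from that paper.

```python
import random

def retweet_probability(base_rate: float, moral_emotional_words: int,
                        boost_per_word: float = 0.20) -> float:
    """Per-exposure retweet probability. The multiplicative boost per
    moral-emotional word is an illustrative assumption, not Brady et al.'s
    estimate."""
    return min(1.0, base_rate * (1.0 + boost_per_word) ** moral_emotional_words)

def simulate_cascade(moral_emotional_words: int, followers_per_user: int = 200,
                     base_rate: float = 0.005, generations: int = 3,
                     seed: int = 0) -> int:
    """Count total retweets in a simple branching cascade."""
    rng = random.Random(seed)
    p = retweet_probability(base_rate, moral_emotional_words)
    active, total = 1, 0
    for _ in range(generations):
        spreaders = sum(1 for _ in range(active * followers_per_user)
                        if rng.random() < p)
        total += spreaders
        active = spreaders
    return total

# The same information, worded neutrally versus with moral outrage.
print(simulate_cascade(moral_emotional_words=0),
      simulate_cascade(moral_emotional_words=4))
```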

Twitter's own researchers published analysis in 2021 finding that the platform's recommendation algorithm amplified political content from the right side of the political spectrum more than the left, and amplified outrage-producing content more than neutral content, across multiple national contexts. The researchers treated this as a finding requiring explanation. The explanation consistent with all available evidence is Goodhart's Law: the algorithm was optimizing for engagement, engagement was higher for politically activating content, and the algorithm's optimization produced systematic amplification of that content regardless of its accuracy, value, or relationship to what users claimed they wanted to see.


Section VI

TikTok: The Identity Engine — Scroll Time and the Adolescent Self

TikTok's recommendation algorithm is optimized for scroll time and video completion rates — how long users stay on the platform and whether they watch videos to their end. The algorithm operates on a uniquely dense feedback loop: because TikTok videos are short (15-60 seconds), the algorithm can update its model of each user's preferences at an extremely high frequency relative to longer-form platforms. Users receive a feed that is specifically calibrated to their demonstrated preferences within hours of first use.
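
The density of that feedback loop can be illustrated with a toy preference model: an exponential moving average of completion rate per content category, updated after every video. The class below is a hypothetical sketch, not TikTok's system; it shows only how quickly a short-video feed can converge on whatever categories a user fails to scroll past.

```python
class CategoryPreferences:
    """Toy per-user preference model: an exponential moving average of
    completion rate per content category, updated after every video.
    With 15-60 second videos, hundreds of updates can accumulate in a
    single session -- the dense feedback loop described above."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate
        self.scores: dict[str, float] = {}

    def update(self, category: str, watched_seconds: float, video_seconds: float) -> None:
        completion = min(1.0, watched_seconds / video_seconds)
        prev = self.scores.get(category, 0.5)  # neutral prior
        self.scores[category] = prev + self.learning_rate * (completion - prev)

    def next_category(self) -> str:
        # Serve more of whatever the user has not been able to stop watching.
        return max(self.scores, key=self.scores.get)
```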

The clinical concern about TikTok's optimization effect on adolescent users centers on the identity-formation mechanism. Research on adolescent psychology documents that adolescents are engaged in an identity formation process that is sensitive to social feedback and group identification. TikTok's algorithm, optimizing for scroll time, learns which content categories produce extended engagement for each user and amplifies content in those categories. For adolescent users, research on platform content consumption patterns documents disproportionate amplification of appearance-focused content, thin-ideal imagery, and content communities organized around depression, self-harm, and eating disorder identity.

The Counter-Argument
Engagement metrics serve important functions — the alternative is paternalistic content control.

The argument for engagement-based recommendation is not trivial: in a world of unlimited content, some recommendation system is necessary, and user behavior is the most direct available signal of what users actually want. Engagement metrics are behavioral democracy — they amplify what users actually choose to watch, share, and like, rather than what editors, curators, or regulators believe users should watch. Replacing engagement optimization with editorial curation would vest a small group of humans with the power to decide what information billions of people should encounter, which creates its own serious risks.

The response this paper offers is not that recommendation systems should be abolished. It is that engagement — behavioral activation — is an imprecise proxy for user value that systematically over-weights content that produces emotional arousal and under-weights content that produces calm satisfaction, learning, or genuine connection. The choice is not between engagement optimization and editorial control. There is a third option: developing and optimizing for metrics that better track the value users actually want — metrics that distinguish between satisfied engagement and distressed engagement, between connection and compulsion.


Section VII

The Engagement Inversion — The Named Condition

The four platforms documented in this paper each represent a variation of the same mechanism. Facebook optimized for reaction-based engagement and produced anger amplification. YouTube optimized for watch time and produced radicalization pathways. Twitter optimized for retweet rates and produced outrage contagion. TikTok optimized for scroll time and produced identity capture. The specific proxies differ; the structural outcome is identical: the engagement metric became the target, the algorithm optimized for the target, and the optimized system produced content environments that degraded the underlying value the engagement metric was meant to signal.

Named Condition — MC-004
The Engagement Inversion

The pattern in which algorithmic optimization for an engagement proxy — defined as behavioral activation (clicks, shares, watch time, reactions) — produces content environments that degrade the underlying value the proxy was meant to measure. Facebook's internal research documented a 5× increase in anger and divisive content relative to non-algorithmic feeds while simultaneously documenting declines in user-reported wellbeing. The platform produced more engagement by producing worse wellbeing. The metric and the value it was meant to represent had inverted. The Engagement Inversion is Goodhart's Law operating at the scale of human civilization's primary information infrastructure.

The phrase "engagement" performs an important rhetorical function in this inversion. It suggests that users are engaged — present, interested, receiving value. The word carries positive connotations of interest and connection. But the engagement metrics these platforms optimize for do not distinguish between a user who is engaged because they are learning, connecting, or enjoying and a user who is engaged because they are outraged, anxious, or compulsively stimulated. The language of engagement obscures the behavioral reality: these platforms are, in the documented cases, optimizing for states that users experience negatively when surveyed directly, while the engagement metric signals them as positive.


Section VIII

The Metric We Need — What Alignment Would Require

The resolution of the engagement inversion requires a different measurement target — one that tracks the value users actually want rather than the behavioral activation that advertising revenue correlates with. Several research programs have attempted to define and measure such a target. The "time well spent" framework, developed by Tristan Harris and colleagues at the Center for Humane Technology, proposes measuring user-reported wellbeing following platform use rather than time spent on platform. Surveys of users immediately after social media sessions consistently find that longer sessions correlate with lower reported mood and higher reported regret.
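
A hypothetical sketch of such a metric follows. The functional form is an assumption introduced here for illustration, not the Center for Humane Technology's proposal: session time is credited only to the extent the user reports feeling better afterward and does not regret the session.

```python
from dataclasses import dataclass

@dataclass
class Session:
    minutes: float
    post_mood: float   # self-reported, 0 (worse than before) to 1 (better)
    regret: float      # self-reported, 0 (none) to 1 (strong)

def engagement_minutes(session: Session) -> float:
    # The current proxy: time on platform, regardless of how it felt.
    return session.minutes

def time_well_spent(session: Session) -> float:
    """A hypothetical alternative objective: session time credited only to
    the extent the user reports feeling better and does not regret it.
    The weighting scheme is an illustrative assumption."""
    return session.minutes * session.post_mood * (1.0 - session.regret)

# A long, regretted doomscroll scores below a short, valued session,
# even though it generates far more engagement as currently measured.
doomscroll = Session(minutes=120, post_mood=0.2, regret=0.8)
catch_up = Session(minutes=25, post_mood=0.9, regret=0.1)
assert time_well_spent(catch_up) > time_well_spent(doomscroll)
assert engagement_minutes(doomscroll) > engagement_minutes(catch_up)
```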

Meaningful social interaction measures — tracking whether platform interactions produced the neurobiological signatures of genuine social connection (reduced cortisol, reported closeness to the person interacted with) rather than simply the behavioral signature of interaction — represent a more valid proxy for the social value platforms claim to deliver. Research reviewed in the Recovery Architecture series documents that digital social interaction does not produce the cortisol reduction and autonomic co-regulation that face-to-face interaction produces. A platform optimizing for the quality of social connection rather than the quantity of interactions would be measuring and optimizing something different from what current engagement metrics capture.

The development of better metrics is technically feasible. The barrier is not measurement capability — it is that better metrics would reduce advertising revenue. Engagement as behavioral activation is highly correlated with advertising revenue. Wellbeing or genuine connection as platform value is less directly correlated with the number of advertisements a user sees in a session. The platforms that have built civilization-scale information infrastructure on engagement optimization have strong financial incentives to retain the metric that generated the infrastructure. Goodhart's Law does not resolve itself: when the measure becomes the target, the institutional infrastructure built around the target has interests that resist the replacement of the measure, even when the measure's distorting effects are fully documented.


Sources

Selected References

  • Goodhart, C. A. E. (1975). Problems of monetary management: The U.K. experience. Papers in Monetary Economics. Reserve Bank of Australia.
  • Strathern, M. (1997). Improving ratings: Audit in the British university system. European Review, 5(3), 305–321.
  • Haugen, F. (2021). The Facebook Files. Internal documents disclosed to the Wall Street Journal and SEC, October 2021.
  • Brady, W. J., et al. (2017). Emotion shapes the diffusion of moralized content in social networks. PNAS, 114(28), 7313–7318.
  • Ribeiro, M. H., et al. (2020). Auditing radicalization pathways on YouTube. Proceedings of the 2020 ACM Conference on Fairness, Accountability, and Transparency (FAT* '20).
  • Twitter. (2021). Examining algorithmic amplification of political content on Twitter. Twitter Blog, October 22, 2021.
  • Twenge, J. M., et al. (2018). Increases in depressive symptoms, suicide-related outcomes, and suicide rates among U.S. adolescents after 2010 and links to increased new media screen time. Clinical Psychological Science, 6(1), 3–17.
  • Hunt, M. G., et al. (2018). No more FOMO: Limiting social media decreases loneliness and depression. Journal of Social and Clinical Psychology, 37(10), 751–768.
  • Harris, T. (2016). How technology hijacks people's minds — from a magician and Google's design ethicist. Medium, May 18, 2016.
  • Zentner, M. (2023). Social media and adolescent mental health: The role of disrupted sleep. JAMA Pediatrics.