“The welfare of a nation can scarcely be inferred from a measurement of national income.”
— Simon Kuznets, Report to the U.S. Senate, 1934
The Design Brief — What Kuznets Was Asked to Build
The Great Depression presented Congress with a problem it could not solve because it could not measure its extent. In 1932, no systematic accounting of the American economy's total output existed. Herbert Hoover's advisers, and Franklin Roosevelt's after them, could estimate unemployment, observe breadlines, and read the stock market ticker, but they lacked a single number that captured the scope of economic contraction. Congress commissioned Simon Kuznets, a 31-year-old economist at the National Bureau of Economic Research, to build one.
Kuznets delivered his report in 1934. It introduced what would become the Gross Domestic Product — a measure of the total value of goods and services produced in the American economy in a given period. The measure accomplished what Congress asked: it provided a single, comparable, annually updated number that could track whether the economy was expanding or contracting. It also came with an explicit warning. Kuznets wrote: “The welfare of a nation can scarcely be inferred from a measurement of national income.” He listed what GDP did not measure: the distribution of income, the quality of goods produced, the social value of the activities counted. He was precise about the instrument's purpose and its limits.
The warning was published in the same report that introduced the metric. It was available to everyone who read the report. It was not abstract or qualified — it was a direct statement by the metric's designer about what the metric could and could not do. For a decade, it was largely heeded: GDP was used to measure economic activity, not national welfare. Then the context changed, and the warning was overridden.
Displacement, as used here, is the process by which a proxy measure supplants the underlying value it was designed to track, until institutional decisions optimize for the proxy rather than the value. GDP was designed to measure economic activity; it was not designed to measure welfare. When it became the primary indicator of policy success, policy optimized for economic activity, including activity that actively degrades welfare. Kuznets named the displacement risk in 1934. The displacement was under way by 1944.
The 1944 Turn — How GDP Became the Policy Metric
The Bretton Woods conference of 1944, which established the post-war international monetary and economic order, adopted GDP as the standard measure for international economic comparisons. The logic was compelling: World War II had made international economic coordination essential, and an internationally standardized, comparable measure of economic output was necessary for that coordination. GDP provided what was needed. Its adoption as the global standard was rational given the institutional context.
What the Bretton Woods architects did not address was Kuznets's warning. GDP was adopted as the international standard for economic activity. The question of whether it should also be the standard for policy success — for whether a country was doing well — was not distinguished from the question of whether it was a good measure of economic output. The distinction collapsed in practice: countries measured their progress in GDP growth, politicians ran on GDP growth platforms, international development organizations measured development in GDP per capita. The measurement tool became the definition of success.
By the time the Marshall Plan was deployed in 1948, GDP growth was the explicit measure of reconstruction success. By the 1950s, GDP growth rates were the primary economic indicator cited in policy debates across the United States, the United Kingdom, and Western Europe. The substitution of economic activity for welfare in the fundamental measure of national progress was complete within a generation of the 1934 report that had introduced both the metric and its designer's warning.
What GDP Measures — A Precise Accounting
To understand what GDP cannot see, it helps to understand precisely what it does see. GDP counts the market value of all final goods and services produced in an economy in a given period. “Market value” means that only transactions with a price are counted. “Final goods and services” means that intermediate goods — materials used to produce other goods — are not double-counted. These design choices produce an accurate measure of economic activity, and they also produce systematic blind spots that are as important as what the measure captures.
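The rule against double-counting is easier to see with numbers. The sketch below uses an invented three-stage supply chain (the stage names and prices are hypothetical, chosen only for illustration) to show that summing final sales gives the same total as summing the value added at each stage, while summing every transaction overstates output.

```python
# Hypothetical supply chain; every figure is invented for illustration.
# Each stage: (name, sale price of its output, cost of intermediate inputs bought)
stages = [
    ("wheat farm", 100, 0),    # sells wheat to the mill
    ("flour mill", 160, 100),  # buys wheat, sells flour
    ("bakery", 250, 160),      # buys flour, sells bread to households
]

# Naive total: adds every sale, so the wheat is counted three times.
all_sales = sum(price for _, price, _ in stages)

# GDP counts only the final sale (bread bought by households).
final_sales = stages[-1][1]

# The same figure is reached by summing value added at each stage.
value_added = sum(price - inputs for _, price, inputs in stages)

print(all_sales, final_sales, value_added)  # 510 250 250
```

The final-sales and value-added totals agree by construction, which is why either can serve as the basis for the measure.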
What rises in GDP: emergency room visits after car crashes; hospital services for patients with preventable diseases; prison construction and guard salaries; insurance claims and legal services after natural disasters; the reconstruction of homes and infrastructure destroyed by storms; the manufacture and sale of alcohol, tobacco, and addictive substances; the legal fees and litigation costs generated by conflict; environmental cleanup costs. All of these represent economic activity — transactions with market prices. They all add to GDP. The crash, the disease, the crime, the storm, the addiction, the conflict, the pollution — none of these subtract from GDP. Only the market response to them is counted.
What does not appear in GDP: the care a parent provides for her children; the time a neighbor spends helping an elderly resident; the work a volunteer performs for a community organization; the quiet of an afternoon spent in nature; the health of an ecosystem; the quality of a child's education (as opposed to the amount spent on it); the safety of a community; the strength of social bonds; the wellbeing of citizens measured directly. These are not market transactions. They produce no GDP. A policy that maximizes GDP at their expense — that converts unpaid care into paid care, wilderness into extractable resources, community bonds into commercial transactions — registers as progress by the only measure that most institutions report.
The Invisible Economy — What GDP Cannot See
The “invisible economy” of non-market production is not a marginal phenomenon. Estimates of its scale place it at 25 to 35% of GDP-equivalent value in developed economies, comparable to the entire manufacturing sector. Unpaid childcare and eldercare in the United States, if valued at market rates, would add approximately $3.8 trillion annually to measured economic output. The environmental services that ecosystems provide worldwide (carbon sequestration, water filtration, pollination, flood control, soil formation) have been valued by ecological economists at $125 trillion annually. These services are rendered continuously, for free, and they do not appear in GDP.
The invisibility of these systems in GDP has policy consequences. Policy that maximizes GDP can simultaneously destroy ecosystems, require market substitutes for their services, and record the destruction as net positive: the ecosystem services that were provided for free disappear from the accounting system because they were never in it, and the market substitutes (water filtration plants, pesticide-assisted agriculture, engineered flood control) add to GDP. From the perspective of GDP accounting, converting a functioning ecosystem into a degraded one that requires expensive engineering to provide the services the ecosystem formerly provided for free is indistinguishable from — and may register as — economic progress.
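A toy ledger makes the accounting asymmetry concrete. The sketch below assumes a wetland that filters water at no charge and is then replaced by a treatment plant delivering the same service for a fee. All names and values are hypothetical, and other output is held fixed as a simplifying assumption.

```python
# Toy ledger in arbitrary units per year; every value here is hypothetical.

def measured_gdp(market):
    """GDP sees only priced transactions."""
    return sum(market.values())


def total_services(market, unpriced):
    """Everything people actually receive, whether or not it has a price."""
    return sum(market.values()) + sum(unpriced.values())


# Before: the wetland filters water for free.
before_market = {"other output": 1000}
before_unpriced = {"wetland water filtration": 50}

# After: the wetland is gone and a plant sells the same filtration.
after_market = {"other output": 1000, "treatment plant filtration": 50}
after_unpriced = {}

gdp_change = measured_gdp(after_market) - measured_gdp(before_market)
services_change = (
    total_services(after_market, after_unpriced)
    - total_services(before_market, before_unpriced)
)

print(gdp_change, services_change)  # 50 0
```

Measured output rises by the full price of the substitute while the services people receive do not change at all. If the plant delivered less than the wetland did, the ledger would show GDP growth alongside a net loss.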
The policy implication is severe. If the metric used to assess the success of policy decisions cannot see the goods that policy destroys, the metric provides no signal when those goods are degraded. The United States has spent forty years pursuing GDP-maximizing policies that have degraded natural ecosystems, weakened community bonds, increased incarceration, produced longer commutes, and shifted care from families to markets. Each of these trends registers as GDP growth. GDP accounting cannot tell us whether we are better off. That is precisely what Kuznets said in 1934.
Alternatives — What Other Metrics Capture
The Genuine Progress Indicator (GPI) was developed in the 1990s as a direct response to GDP's welfare blindness. GPI begins with personal consumption expenditures (itself a component of GDP) and then makes a series of adjustments: it adds the value of household work, volunteer work, and the services of consumer durables; it subtracts for income inequality (additional income matters less to wealthy households than to poor ones), for crime and its costs, for current environmental degradation, for the loss of leisure time, and for long-term environmental damage. The result is a measure that aims to track economic progress in terms of human welfare rather than economic activity.
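The adjustments listed above amount to a short piece of arithmetic, sketched here under stated assumptions. The field names, the figures, and in particular the choice to model the inequality adjustment as division by an index are simplifications made for this illustration, not the published GPI methodology.

```python
from dataclasses import dataclass


@dataclass
class GpiInputs:
    """Annual money values, except inequality_index (a unitless ratio)."""
    personal_consumption: float
    inequality_index: float  # assumed: 1.0 in a baseline year, higher = more unequal
    household_work: float
    volunteer_work: float
    consumer_durables_services: float
    crime_costs: float
    environmental_degradation: float
    lost_leisure: float
    long_term_environmental_damage: float


def genuine_progress(x: GpiInputs) -> float:
    # Discount consumption for inequality (a simplifying assumption of this sketch).
    weighted_consumption = x.personal_consumption / x.inequality_index
    additions = x.household_work + x.volunteer_work + x.consumer_durables_services
    subtractions = (
        x.crime_costs
        + x.environmental_degradation
        + x.lost_leisure
        + x.long_term_environmental_damage
    )
    return weighted_consumption + additions - subtractions


# Invented figures, purely to exercise the function.
example = GpiInputs(
    personal_consumption=1000.0,
    inequality_index=1.25,
    household_work=200.0,
    volunteer_work=20.0,
    consumer_durables_services=80.0,
    crime_costs=30.0,
    environmental_degradation=90.0,
    lost_leisure=40.0,
    long_term_environmental_damage=60.0,
)
result = genuine_progress(example)
print(result)  # 880.0
```

With these invented inputs, consumption of 1000 yields an indicator of 880: the unpaid work that GDP omits pushes the figure up, and the costs that GDP ignores or counts as output pull it down.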
The GPI finding for the United States is striking and consistent across multiple research teams: US GPI peaked in the late 1970s and has declined since, while GDP has continued to rise. The two measures tracked together from the 1950s through the late 1970s; after that, they diverged. GDP continued upward; GPI stagnated and then declined. In concrete terms, the divergence means that the economic activities added to GDP after the late 1970s were, on balance, not improving human welfare as GPI measures it, while the costs that GPI subtracts were increasing.
Bhutan's Gross National Happiness index provides a different alternative: four pillars (sustainable development, preservation of cultural values, conservation of the natural environment, and good governance) operationalized into a composite index that guides policy prioritization. Over three decades, Bhutan has made policy choices that differ measurably from GDP-maximizing alternatives in areas including forest preservation, cultural institution investment, and working-hour norms. Whether Bhutan's specific choices are optimal is a separate question from whether the framework that produced them — taking all four pillars seriously — produces different outcomes than GDP-only optimization. The evidence is that it does.
Policy Distortion — Decisions Made in GDP's Image
The deepest consequence of GDP's dominance is not that it misses things — it is that it actively misdirects policy. When the primary indicator of policy success is GDP growth, policy makers face a systematic incentive to prefer activities that increase GDP over activities that improve welfare but are invisible to GDP. The distortions this produces are not incidental — they are the predictable output of optimizing for the wrong metric.
Healthcare is the most striking example. The United States spends more on healthcare as a percentage of GDP than any other developed country. In the GDP accounting framework, this is a sign of a large, productive healthcare sector — a contribution to economic output. In welfare terms, it is a sign of a population that is sicker, receiving more expensive treatment for conditions that other countries prevent, in a system that generates revenue through procedures regardless of whether those procedures improve health outcomes. The GDP signal is positive. The welfare signal is alarming. American health outcomes — life expectancy, infant mortality, rates of chronic disease — are among the worst in the developed world despite the highest healthcare spending.
Urban form illustrates the distortion differently. GDP-maximizing urban form features long commutes, car-dependent transportation, high vehicle and fuel sales, suburban real estate development, and high rates of commercial activity. This form generates substantial GDP: Americans spend trillions annually on vehicles, fuel, insurance, and the commercial infrastructure of car-dependent communities. The welfare costs — longer commutes, less time for family and community, higher accident rates, air quality degradation, carbon emissions — are invisible in GDP. The policy choice between dense, walkable, transit-served urban form and sprawling, car-dependent form is made in a measurement environment where one choice shows up as economic activity and the other is invisible.
The standard defense of GDP is that countries with higher GDP per capita tend to have better welfare outcomes. That cross-national correlation is real and documented. It reflects a genuine relationship between economic resources and the capacity to provide welfare-improving goods and services. The argument from cross-national correlation, however, does not address the within-country, over-time divergence that GPI research documents: within the United States, over the last forty years, GDP has risen while GPI has declined. The cross-national correlation captures the fact that having more economic resources is better than having fewer; it does not capture whether marginal increases in GDP at current US income levels are producing welfare improvements. The evidence suggests they are not, and that the metric that tells us they are is the problem.
What a Welfare Metric Would Measure — The Design Brief for GDP's Replacement
Replacing GDP as the primary indicator of national progress does not require abandoning GDP as a measure of economic activity. GDP remains a useful instrument for its original purpose: tracking changes in economic output over time and comparing economic scale across countries. The problem is not GDP's existence but its promotion to a role it was not designed to perform. A welfare metric would complement GDP, not substitute for it.
The elements of a welfare metric, based on the research literature and existing alternative metric frameworks, include: the distribution of economic benefits (median income, poverty rates, and measures of income inequality that affect whether aggregate income growth reaches most people); health outcomes (life expectancy, healthy life expectancy, mental health prevalence, and rates of preventable mortality); education quality (not years of schooling but learning outcomes and the reduction of learning gaps across socioeconomic lines); environmental health (air and water quality, ecosystem integrity, and carbon intensity of economic output); time use (the balance between paid work, unpaid care, leisure, and civic engagement); and social connection (community participation, trust, and the prevalence of social isolation).
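One way such elements could be combined is sketched below: each indicator is rescaled to a 0-to-1 score between a worst-case and a best-case goalpost, and the scores are averaged with equal weights. The indicator names, observed values, goalposts, and the equal weighting are all assumptions made for this illustration; a real index would have to justify each of them.

```python
# Each indicator: (observed value, worst-case goalpost, best-case goalpost).
# Names, values, and goalposts are all hypothetical.
INDICATORS = {
    "median_income_usd": (60_000, 20_000, 100_000),
    "healthy_life_expectancy_years": (66.0, 50.0, 75.0),
    "learning_outcome_score": (480.0, 300.0, 600.0),
    "pm25_micrograms_per_m3": (12.0, 35.0, 5.0),     # lower is better: worst > best
    "weekly_leisure_hours": (30.0, 10.0, 50.0),
    "share_reporting_isolation": (0.25, 0.50, 0.0),  # lower is better
}


def normalize(value, worst, best):
    """Map a raw value onto 0..1, where 0 is the worst goalpost and 1 the best."""
    score = (value - worst) / (best - worst)
    return max(0.0, min(1.0, score))  # clamp values outside the goalposts


def welfare_index(indicators):
    """Equal-weight arithmetic mean of the normalized scores."""
    scores = {name: normalize(*triple) for name, triple in indicators.items()}
    return sum(scores.values()) / len(scores), scores


index, scores = welfare_index(INDICATORS)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"composite: {index:.2f}")
```

Listing the worst goalpost first lets the same formula handle indicators where lower is better. The choice of weights and of averaging rule is a value judgment rather than a technical detail, which is part of why such indices are contested.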
The technical challenges of such a metric are real but not insurmountable — the GPI framework and the UNDP's Human Development Index demonstrate that composite welfare metrics can be constructed, maintained, and compared across time and jurisdictions. The political challenge is more difficult: the industries and institutional actors that benefit from GDP-maximizing policy (whose benefits register in GDP even when they impose welfare costs that do not) have strong incentives to resist the adoption of a metric that would make those costs visible and include them in policy evaluation. Kuznets identified the technical problem in 1934. The political problem is still being worked out ninety years later.
Sources
- Kuznets, Simon. National Income, 1929–32. Senate Document No. 124, 73rd Congress, 2nd Session. U.S. Government Printing Office, 1934. Contains the original warning against using national income to measure welfare.
- Costanza, Robert, et al. “The Value of the World's Ecosystem Services and Natural Capital.” Nature 387 (1997): 253–260. Estimate of global ecosystem service value at $33 trillion; updated estimate $125 trillion (2014).
- Oxfam International. Time to Care: Unpaid and Underpaid Care Work and the Global Inequality Crisis. 2020. Estimate of unpaid care work value.
- Kubiszewski, Ida, et al. “Beyond GDP: Measuring and Achieving Global Genuine Progress.” Ecological Economics 93 (2013): 57–68. Documents GPI peaking in late 1970s while GDP continued to rise.
- Daly, Herman E., and John B. Cobb Jr. For the Common Good: Redirecting the Economy toward Community, the Environment, and a Sustainable Future. Beacon Press, 1989. Introduced the Index of Sustainable Economic Welfare, the framework from which GPI was developed.
- UNDP. Human Development Report. Annual publication. Documents HDI methodology and cross-national comparisons.
- Waring, Marilyn. If Women Counted: A New Feminist Economics. HarperCollins, 1988. Analysis of GDP's invisibility of women's unpaid work.
- Stiglitz, Joseph, Amartya Sen, and Jean-Paul Fitoussi. Report by the Commission on the Measurement of Economic Performance and Social Progress. 2009. Comprehensive assessment of GDP's limitations and alternatives.
- World Bank. The Changing Wealth of Nations 2021. Broad wealth accounting including natural and human capital.
- Goodhart, C.A.E. “Problems of Monetary Management: The U.K. Experience.” Papers in Monetary Economics (Reserve Bank of Australia), 1975. Original statement of Goodhart's Law.