Government statistical agencies commonly report official economic statistics as point estimates. Agency documents describing data and methods may acknowledge that estimates are subject to error, but they typically do not quantify error magnitudes. News releases present estimates with little if any mention of potential error.
In the absence of agency guidance, users of official statistics may misinterpret the information that the statistics provide. I urge statistical agencies to measure uncertainty and report it in their news releases and technical publications.
Why is it important to communicate uncertainty in official statistics? A broad reason is that governments, firms, and individuals use the statistics when making numerous decisions. The quality of decisions may suffer if decision makers incorrectly take reported statistics at face value or incorrectly conjecture error magnitudes. For example, a central bank may mis-evaluate the status of the economy and consequently set inappropriate monetary policy (Lindé et al. 2008). Agency communication of uncertainty would enable decision makers to better understand the information actually available regarding key economic variables.
Sampling error for statistics based on survey data can be measured using established statistical principles. The challenge is to satisfactorily measure nonsampling error. There are many sources of such errors and there has been no consensus about how to measure them. I find it useful to distinguish transitory statistical uncertainty, permanent statistical uncertainty, and conceptual uncertainty.
Transitory statistical uncertainty arises because data collection takes time. Agencies sometimes release a preliminary estimate of an official statistic in an early stage of data collection and revise the estimate as new data arrive. Hence, uncertainty may be substantial early on but diminish as data accumulate.
Permanent statistical uncertainty arises from incompleteness or inadequacy of data collection that does not diminish with time. In survey research, considerable permanent uncertainty may stem from non-response and from the possibility that some respondents may provide inaccurate data.
Conceptual uncertainty arises from incomplete understanding of the information that official statistics provide about well-defined economic concepts or from lack of clarity in the concepts themselves. Thus, conceptual uncertainty concerns the interpretation of statistics rather than their magnitudes.
This column, which summarises material from Manski (2014), illustrates each form of uncertainty and discusses strategies for measurement and communication.
Transitory uncertainty in national income accounts
In the United States, the Bureau of Economic Analysis (BEA) reports multiple vintages of quarterly GDP estimates. An advance estimate combines data available one month after the end of a quarter with trend extrapolations. Second and third estimates are released after two and three months, when new data become available. A first annual estimate is released in the summer, based on more extensive data collected annually. There are subsequent annual and five-year revisions.
BEA analysts have provided an upbeat perspective on the accuracy of GDP statistics. Landefeld, Seskin, and Fraumeni (2008) state (p. 213): “In terms of international comparisons, the U.S. national accounts meet or exceed internationally accepted levels of accuracy and comparability. The US real GDP estimates appear to be at least as accurate (based on a comparison of GDP revisions across countries) as the corresponding estimates from other major developed countries.”
Croushore (2011) offers a considerably more cautionary perspective (p. 73): “Until recently, macroeconomists assumed that data revisions were small and random and thus had no effect on structural modelling, policy analysis, or forecasting. But real time research has shown that this assumption is false and that data revisions matter in many unexpected ways.”
Measurement of transitory uncertainty in GDP estimates is straightforward if it is credible to assume that the revision process is time-stationary. Then, historical data on revisions can be extrapolated to measure the uncertainty of future revisions. A simple extrapolation would be to suppose that the empirical distribution of revisions will persist.
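The simple extrapolation described above can be sketched in a few lines of Python. This is only an illustration under the stated time-stationarity assumption: the function name `revision_interval`, the choice of coverage level, and the revision history are all hypothetical and are not BEA data or a BEA method.

```python
import numpy as np

def revision_interval(preliminary, past_revisions, coverage=0.90):
    """Interval for the eventual revised estimate.

    Assumes the empirical distribution of past revisions
    (final estimate minus preliminary estimate) will persist.
    """
    revisions = np.asarray(past_revisions, dtype=float)
    tail = (1.0 - coverage) / 2.0
    # Empirical quantiles of historical revisions.
    lo, hi = np.quantile(revisions, [tail, 1.0 - tail])
    # Shift the preliminary estimate by the plausible range of revisions.
    return preliminary + lo, preliminary + hi

# Hypothetical revision history, in percentage points of growth.
past_revisions = [-1.0, -0.5, 0.0, 0.5, 1.0]
low, high = revision_interval(2.0, past_revisions, coverage=0.5)
print(low, high)
```

With a longer revision history, the same calculation could be repeated at several coverage levels to produce nested bands around a preliminary estimate.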
A notable precedent is the regular release of fan charts by the Bank of England. The figure below reproduces a fan chart for annual GDP growth in the February 2014 Inflation Report (Bank of England 2014). The part of the plot showing growth from late 2013 on is a probabilistic forecast that expresses the uncertainty of the Bank’s Monetary Policy Committee regarding future GDP growth. The part showing growth in the period 2009 through mid-2013 is a probabilistic forecast that expresses uncertainty regarding the revisions that the UK Office for National Statistics will henceforth make to its estimates of past GDP. The Bank explains as follows (p. 7): “In the GDP fan chart, the distribution to the left of the vertical dashed line reflects the likelihood of revisions to the data over the past.”
Observe that the figure expresses considerable uncertainty about GDP growth in the period 2010-2013. Thus, the Bank judges that future revisions to estimates of past GDP may be large in magnitude.
Figure 1. Fan chart for annual GDP growth
Source: Bank of England (2014)
Permanent uncertainty due to survey non-response
Each year the United States Census Bureau reports statistics on the household income distribution based on income data collected in a supplement to the Current Population Survey (CPS). There is considerable non-response. During 2002-2012, 7 to 9% of the sampled households yielded no income data due to unit non-response, and 41 to 47% of the interviewed households yielded incomplete income data due to item non-response (Manski 2013). Nevertheless, Census publications give the impression that statistics on the income distribution are exact.
To produce point estimates, the Census Bureau applies hot-deck imputations, stating (US Census Bureau 2006, p. 9-2): “This method assigns a missing value from a record with similar characteristics, which is the hot deck. Hot decks are defined by variables such as age, race, and sex. Other characteristics used in hot decks vary depending on the nature of the unanswered question. For instance, most labour force questions use age, race, sex, and occasionally another correlated labour force item such as full- or part-time status.”
CPS documentation offers no evidence that the hot-deck method yields a distribution for missing data that is close to the actual distribution. Another Census document describing the American Housing Survey is revealing (US Census Bureau 2011, p. D-2): “Some people refuse the interview or do not know the answers. When the entire interview is missing, other similar interviews represent the missing ones [...] For most missing answers, an answer from a similar household is copied. The Census Bureau does not know how close the imputed values are to the actual values.”
Econometric research has shown how to measure uncertainty due to non-response without making assumptions about the nature of the missing data. One contemplates all values that the missing data can take. Then, the data yield interval estimates of official statistics. The literature derives these intervals for population means, quantiles, and other parameters (Manski 2007, Chapter 2). The literature also shows how to form confidence intervals that jointly measure sampling and non-response error (Imbens and Manski 2004).
To illustrate, I have used CPS data to form interval estimates of median household income and the fraction of families with income below the poverty line in 2001-2011 (Manski 2013). One set of estimates takes into account item non-response alone, and another recognises unit non-response as well. The estimates show that item non-response poses a huge potential problem for inference on the American income distribution. Unit non-response exacerbates the problem.
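The idea of contemplating all values that the missing data can take can be made concrete with a small Python sketch. It is a toy illustration, not the calculation in Manski (2013): the function names, the encoding of missing incomes as NaN, and the five-household sample are all hypothetical. It also reflects non-response error only and ignores sampling error.

```python
import numpy as np

def poverty_rate_bounds(incomes, poverty_line):
    """Worst-case bounds on the fraction with income below poverty_line.

    Missing incomes are encoded as NaN; nothing is assumed about them.
    """
    y = np.asarray(incomes, dtype=float)
    missing = np.isnan(y)
    n = y.size
    below_observed = np.sum(y[~missing] < poverty_line)
    lower = below_observed / n                    # all missing incomes at or above the line
    upper = (below_observed + missing.sum()) / n  # all missing incomes below the line
    return lower, upper

def median_bounds(incomes):
    """Worst-case bounds on the sample median when some incomes are missing."""
    y = np.asarray(incomes, dtype=float)
    missing = np.isnan(y)
    lower = np.median(np.where(missing, -np.inf, y))  # missing incomes as low as possible
    upper = np.median(np.where(missing, np.inf, y))   # missing incomes as high as possible
    return lower, upper

# Hypothetical sample: incomes in thousands, one household did not respond.
incomes = [10.0, 20.0, 30.0, 40.0, np.nan]
rate_lo, rate_hi = poverty_rate_bounds(incomes, poverty_line=25.0)
med_lo, med_hi = median_bounds(incomes)
print(rate_lo, rate_hi)  # 0.4 0.6
print(med_lo, med_hi)    # 20.0 30.0
```

The width of each interval grows with the share of missing responses, which is why non-response rates of the magnitude reported for the CPS can produce wide intervals.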
Conceptual uncertainty in seasonal adjustment
Viewed from a sufficiently high altitude, the purpose of seasonal adjustment of official statistics appears straightforward. The US Bureau of Labor Statistics explains seasonal adjustment of employment statistics this way (US Bureau of Labor Statistics 2001): “What is seasonal adjustment? Seasonal adjustment is a statistical technique that attempts to measure and remove the influences of predictable seasonal patterns to reveal how employment and unemployment change from month to month.”
It is less clear from ground level how one should actually perform seasonal adjustment. Statistical agencies in the US use the X-12-ARIMA method (Findley et al. 1998). X-12 may be a sophisticated and successful algorithm. Or it may be an unfathomable black box containing a complex set of statistical operations that lack economic foundation. Wright (2013) expresses the difficulty of understanding X-12 this way (p. 2): “Most academics treat seasonal adjustment as a very mundane job, rumoured to be undertaken by hobbits living in holes in the ground. I believe that this is a terrible mistake.” He goes on to say that “Seasonal adjustment is extraordinarily consequential.”
At present there is no clearly appropriate way to measure the uncertainty associated with seasonal adjustment. X-12 is a standalone algorithm, not a method based on a well-specified dynamic theory of the economy. It is not obvious how to evaluate the extent to which it accomplishes the objective of removing the influences of predictable seasonal patterns.